by C.A. Bertulani - e-mail: bertulani@physics.arizona.edu, URL: http://www.physics.arizona.edu/~bertulani
These notes are intended to give an introduction to the C++ programming
language. They have been adapted and extended from several sources (see credits
at the end of these notes). The material can be covered in about a week by
a reader with some experience in programming, who should then be able to
write and run his or her own programs.
C++ was developed by Bjarne Stroustrup of AT&T Bell Laboratories in the early 1980's, and is based on the C language. The "++" is a syntactic construct used in C (to increment a variable), and C++ is intended as an incremental improvement of C. Most of C is a subset of C++, so that most C programs can be compiled (i.e. converted into a series of low-level instructions that the computer can execute directly) using a C++ compiler.
C is in many ways hard to categorise. Compared to assembly language it is high-level, but it nevertheless includes many low-level facilities to directly manipulate the computer's memory. It is therefore an excellent language for writing efficient "systems" programs. But for other types of programs, C code can be hard to understand, and C programs can therefore be particularly prone to certain types of error. The extra object-oriented facilities in C++ are partly included to overcome these shortcomings.
The American National Standards Institute (ANSI) provides "official" and generally accepted standard definitions of many programming languages, including C and C++. Such standards are important. A program written only in ANSI C++ is guaranteed to run on any computer whose supporting software conforms to the ANSI standard. In other words, the standard guarantees that ANSI C++ programs are portable. In practice most versions of C++ include ANSI C++ as a core language, but also include extra machine-dependent features to allow smooth interaction with different computers' operating systems. These machine-dependent features should be used sparingly. Moreover, when parts of a C++ program use non-ANSI components of the language, these should be clearly marked, and as far as possible separated from the rest of the program, so as to make modification of the program for different machines and operating systems as easy as possible.
We need several pieces of software:
· An editor with which to write and modify the
C++ program components or source code,
· A compiler with which to convert the source
code into machine instructions which can be executed by the computer directly,
· A linking program with which to link the
compiled program components with each other and with a selection of routines
from existing libraries of computer code, in order to form the complete
machine-executable object program,
· A debugger to help diagnose problems, either
in compiling programs in the first place, or if the object program runs but
gives unintended results.
Here is an example of a complete C++ program:
// The C++ compiler ignores comments which start with
// double slashes like this, up to the end of the line.
/* Comments can also be written starting with a slash
followed by a star, and ending with a star followed by
a slash. As you can see, comments written in this way
can span more than one line. */
/* This program prompts the user for the current year, the user's
current age, and another year. It then calculates the age
that the user was or will be in the second year entered. */
#include <iostream.h>
int main()
{
    int year_now, age_now, another_year, another_age;
    cout << "Enter the current year then press RETURN.\n";
    cin >> year_now;
    cout << "Enter your current age in years.\n";
    cin >> age_now;
    cout << "Enter the year for which you wish to know your age.\n";
    cin >> another_year;
    another_age = another_year - (year_now - age_now);
    if (another_age >= 0) {
        cout << "Your age in " << another_year << ": ";
        cout << another_age << "\n";
    }
    else {
        cout << "You weren't even born in ";
        cout << another_year << "!\n";
    }
    return 0;
}
This program illustrates several general features of all C++ programs. It
begins (after the comment lines) with the statement
#include <iostream.h>
This statement is called an include directive. It tells the compiler to read in the header file "iostream.h", which contains basic information about a library of routines that handle input from the keyboard and output to the screen; the linker later links the program to that library. (Compilers that follow the later ISO C++ standard name this header <iostream> and place cout and cin in the namespace std.)
Because the program is short, it is easily packaged up into a single list of
program statements and commands. After the include directive, the basic
structure of the program is:
int main()
{
    First statement;
    ...
    ...
    Last statement;
    return 0;
}
All C++ programs have this basic "top-level" structure. Notice that each statement in the body of the program ends with a semicolon. In a well-designed large program, many of these statements will include references or calls to sub-programs, listed after the main program or in a separate file. These sub-programs have roughly the same outline structure as the program here, but there is always exactly one such structure called main. Again, you will learn more about sub-programs later in the course.
At the end of the main program, the line
return 0;
means "return the value 0 to the computer's operating system to signal that the program has completed successfully". More generally, return statements signal that the particular sub-program has finished, and return a value, along with the flow of control, to the program level above. More about this later.
Our example program uses four variables:
year_now, age_now, another_year and another_age
Program variables are not like variables in mathematics. They are more like
symbolic names for "pockets of computer memory" which can be used to
store different values at different times during the program execution. These
variables are first introduced in our program in the variable declaration
int year_now, age_now, another_year, another_age;
which signals to the compiler that it should set aside enough memory to store four variables of type "int" (integer) during the rest of the program execution. Hence variables should always be declared before being used in a program. Indeed, it is considered good style and practice to declare all the variables to be used in a program or sub-program at the beginning. Variables can be one of several different types in C++, and we will discuss variables and types at some length later.
After we have compiled the program above, we can run it. The result will be something like
Enter the current year then press RETURN.
1996
Enter your current age in years.
36
Enter the year for which you wish to know your age.
2001
Your age in 2001: 41
The first, third, fifth and seventh lines above are produced on the screen by the program. In general, the program statement
cout << Expression1 << Expression2 << ... << ExpressionN;
will produce the screen output
Expression1Expression2...ExpressionN
The series of statements
cout << Expression1;
cout << Expression2;
...
...
cout << ExpressionN;
will produce an identical output. If spaces or new lines are needed between the output expressions, these have to be included explicitly, with a " " or a "\n" respectively.
The numbers 1996, 36 and 2001 in
the example screen output above have been typed in by the user. In this
particular program run, the program statement
cin >> year_now;
has resulted in the variable year_now being assigned the value 1996 at the point when the user pressed RETURN after typing in "1996". Programs can also include assignment statements, a simple example of which is the statement
another_age = another_year - (year_now - age_now);
Hence the symbol = means "is assigned the value of". ("Equals" is represented in C++ as ==.)
The last few lines of our example program (other than "return 0") are:
if (another_age >= 0) {
    cout << "Your age in " << another_year << ": ";
    cout << another_age << "\n";
}
else {
    cout << "You weren't even born in ";
    cout << another_year << "!\n";
}
The "if ... else ..." branching mechanism is a familiar construct in many procedural programming languages. In C++, it is simply called an if statement, and the general syntax is
if (condition) {
    Statement1;
    ...
    ...
    StatementN;
}
else {
    StatementN+1;
    ...
    ...
    StatementN+M;
}
The "else" part of an "if statement" may be omitted, and furthermore, if there is just one Statement after the "if (condition)", it may be simply written as
if (condition)
    Statement;
It is quite common to find "if statements" strung together in
programs, as follows:
...
...
if (total_test_score < 50)
    cout << "You are a failure. You must study much harder.\n";
else if (total_test_score < 65)
    cout << "You have just scraped through the test.\n";
else if (total_test_score < 80)
    cout << "You have done quite well.\n";
else if (total_test_score < 95)
    cout << "Your score is excellent. Well done.\n";
else {
    cout << "You cheated!\n";
    total_test_score = 0;
}
...
...
This program fragment has quite a complicated logical structure, but we can confirm that it is legal in C++ by referring to the syntax diagram for "if statements". In such diagrams, the terms enclosed in ovals or circles refer to program components that literally appear in programs. Terms enclosed in boxes refer to program components that require further definition, perhaps with another syntax diagram. A collection of such diagrams can serve as a formal definition of a programming language's syntax (although they do not help distinguish between good and bad programming style!).
Below is the syntax diagram for an "if statement". It is best
understood in conjunction with the syntax diagram for a "statement" .
In particular, notice that the diagram doesn't explicitly include the ";" or "{}" delimiters, since these are built
into the definition (syntax diagram) of "statement".
[Figure 1.7.1: Syntax diagram for an If Statement]
The C++ compiler accepts the program fragment in our example by counting everything
that follows the first else (from "if (total_test_score < 65)" down to the final "}") in
...
...
if (total_test_score < 50)
    cout << "You are a failure. You must study much harder.\n";
else if (total_test_score < 65)
    cout << "You have just scraped through the test.\n";
else if (total_test_score < 80)
    cout << "You have done quite well.\n";
else if (total_test_score < 95)
    cout << "Your score is excellent. Well done.\n";
else {
    cout << "You cheated!\n";
    total_test_score = 0;
}
...
...
as the single statement which must follow the first else.
As far as the C++ compiler is concerned, the following program is exactly the same as the program in section 1.5:
#include <iostream.h> int main() { int year_now, age_now, another_year, another_age; cout <<
"Enter the current year then press RETURN.\n"; cin >> year_now;
cout << "Enter your current age in years.\n"; cin >> age_now; cout <<
"Enter the year for which you wish to know your age.\n"; cin >>
another_year; another_age = another_year - (year_now - age_now); if
(another_age >= 0) { cout << "Your age in " << another_year << ": ";
cout << another_age << "\n"; } else { cout <<
"You weren't even born in "; cout << another_year << "!\n"; } return
0; }
However, the lack of program comments, spaces, new lines and indentation makes this program unacceptable. There is much more to developing a good programming style than learning to lay out programs properly, but it is a good start! Be consistent with your program layout, and make sure the indentation and spacing reflect the logical structure of your program. It is also a good idea to pick meaningful names for variables; "year_now", "age_now", "another_year" and "another_age" are better names than "y_n", "a_n", "a_y" and "a_a", and much better than "w", "x", "y" and "z". Remember that your programs might need modification by other programmers at a later date.
As we have seen, C++ programs can be written using many English words. It is useful to think of words found in a program as being one of three types: reserved words, library identifiers (such as cin and cout) and programmer-supplied identifiers (such as year_now). Reserved words, such as if, int and else, have a predefined meaning that cannot be changed. They include:
asm       auto      break     case       catch     char      class     const
continue  default   delete    do         double    else      enum      extern
float     for       friend    goto       if        inline    int       long
new       operator  private   protected  public    register  return    short
signed    sizeof    static    struct     switch    template  this      throw
try       typedef   union     unsigned   virtual   void      volatile  while
An identifier cannot be just any sequence of symbols. A valid identifier must start with a letter of the alphabet or an underscore ("_") and must consist only of letters, digits, and underscores.
C++ requires that all variables used in a program be given a data type. We have already seen the data type int. Variables of this type are used to represent integers (whole numbers). Declaring a variable to be of type int signals to the compiler that it must associate enough memory with the variable's identifier to store an integer value or integer values as the program executes. But there is a (system dependent) limit on the largest and smallest integers that can be stored. Hence C++ also supports the data types short int and long int which represent, respectively, a possibly smaller and a possibly larger range of integer values than int (on some systems two of these types have the same range). Adding the prefix unsigned to any of these types means that you wish to represent non-negative integers only. For example, the declaration
unsigned short int year_now, age_now, another_year, another_age;
reserves memory for representing four relatively small non-negative integers.
Some rules have to be observed when writing integer values in programs:
Variables of type "float" are used to store real numbers. Plus and minus signs for data of type "float" are treated exactly as with integers, and trailing zeros to the right of the decimal point are ignored. Hence "+523.5", "523.5" and "523.500" all represent the same value. The computer also accepts real numbers in floating-point form (or "scientific notation"). Hence 523.5 could be written as "5.235e+02" (i.e. 5.235 x 10 x 10), and -0.0034 as "-3.4e-03". In addition to "float", C++ supports the types "double" and "long double", which give increasingly precise representation of real numbers, but at the cost of more computer memory.
Sometimes it is important to guarantee that a value is stored as a real number, even if it is in fact a whole number. A common example is where an arithmetic expression involves division. When applied to two values of type int, the division operator "/" signifies integer division, so that (for example) 7/2 evaluates to 3. In this case, if we want an answer of 3.5, we can simply add a decimal point and zero to one or both numbers - "7.0/2", "7/2.0" and "7.0/2.0" all give the desired result. However, if both the numerator and the divisor are variables, this trick is not possible. Instead, we have to use a type cast. For example, we can convert "7" to a value of type double using the expression "double(7)". Hence in the expression
answer = double(numerator) / denominator
the "/" will always be interpreted as real-number division, even when both "numerator" and "denominator" have integer values. Other type names can also be used for type casting. For example, "int(14.35)" has an integer value of 14.
Variables of type "char" are used to store character data. In standard C++, data of type "char" can only be a single character (which could be a blank space). These characters come from an available character set which can differ from computer to computer. However, it always includes upper and lower case letters of the alphabet, the digits 0, ... , 9, and some special symbols such as #, £, !, +, -, etc. Perhaps the most common collection of characters is the ASCII character set.
Character constants of type "char" must be enclosed in single quotation marks when used in a program, otherwise they will be misinterpreted and may cause a compilation error or unexpected program behaviour. For example, 'A' is a character constant, but A (without quotation marks) will be interpreted as a program variable. Similarly, '9' is a character, but 9 is an integer.
There is, however, an important (and perhaps somewhat confusing) technical point concerning data of type "char". Characters are represented as integers inside the computer. Hence the data type "char" is simply a subset of the data type "int". We can even do arithmetic with characters. For example, the following expression is evaluated as true on any computer using the ASCII character set:
'9' - 48 == 9
However, declaring a variable to be of type "char" rather than type "int" makes an important difference as
regards the type of input the program expects, and the format of the output it
produces. For example, the program
#include <iostream.h>
int main()
{
    int number;
    char character;
    cout << "Type in a character:\n";
    cin >> character;
    number = character;
    cout << "The character '" << character;
    cout << "' is represented as the number ";
    cout << number << " in the computer.\n";
    return 0;
}
produces output such as
Type in a character:
9
The character '9' is represented as the number 57 in the computer.
We could modify the above program to print out the whole ASCII table of
characters using a "for loop". The "for loop" is an example
of a repetition statement - we will discuss these in more detail later.
The general syntax is:
for (initialisation; repetition_condition; update) {
    Statement1;
    ...
    ...
    StatementN;
}
C++ executes such statements as follows: (1) it executes the initialisation statement. (2) it checks to see if repetition_condition is true. If it isn't, it finishes with the "for loop" completely. But if it is, it executes each of the statements Statement1 ... StatementN in turn, and then executes the expression update. After this, it goes back to the beginning of step (2) again.
Hence to print out the ASCII table, the program above can be modified to:
#include <iostream.h>
int main()
{
    int number;
    char character;
    for (number = 32 ; number <= 126 ; number = number + 1) {
        character = number;
        cout << "The character '" << character;
        cout << "' is represented as the number ";
        cout << number << " in the computer.\n";
    }
    return 0;
}
which produces the output:
The character ' ' is represented as the number 32 in the computer.
The character '!' is represented as the number 33 in the computer.
...
...
The character '}' is represented as the number 125 in the computer.
The character '~' is represented as the number 126 in the computer.
Our example programs have made extensive use of the type "string" in their output. As we have seen, in C++ a string constant must be enclosed in double quotation marks. Hence we have seen output statements such as
cout << "' is represented as the number ";
in programs. In fact, "string" is not a fundamental data type such as "int", "float" or "char". Instead, strings are represented as arrays of characters, so we will return to the subject of strings later, when we discuss arrays in general.
Later in the course we will study the topic of data types in
much more detail. We will see how the programmer may define his or her own data
types. This facility provides a powerful programming tool when complex
structures of data need to be represented and manipulated by a C++ program.
When program output contains values of type "float", "double" or "long double", we may wish to restrict the precision with which these values are displayed on the screen, or specify whether the value should be displayed in fixed or floating point form. The following example program uses the library identifier "sqrt" to refer to the square root function, a standard definition of which is given in the header file "math.h".
#include <iostream.h>
#include <math.h>
int main()
{
    float number;
    cout << "Type in a real number.\n";
    cin >> number;
    cout.setf(ios::fixed); // LINE 10
    cout.precision(2);
    cout << "The square root of " << number << " is approximately ";
    cout << sqrt(number) << ".\n";
    return 0;
}
This produces the output
Type in a real number.
200
The square root of 200.00 is approximately 14.14.
whereas replacing line 10 with "cout.setf(ios::scientific)" produces the output:
Type in a real number.
200
The square root of 2.00e+02 is approximately 1.41e+01.
We can also include tabbing in the output using a statement such as "cout.width(20)". This specifies that the next item output will have a width of at least 20 characters (with blank space appropriately added if necessary). This is useful in generating tables:
#include <iostream.h>
#include <math.h>
int main()
{
    int number;
    cout.width(20);
    cout << "Number" << "Square Root\n\n";
    cout.setf(ios::fixed);
    cout.precision(2);