C++ tidbits that don't fit in lecture

First, a simple change from C to C++: the data type bool. As you might have guessed, it's for boolean values. Like Java, it uses true and false as keywords (unlike the TRUE and FALSE macros commonly used in C). Although it describes a binary value, it is almost always implemented as a byte, which can cause surprises when creating an array of bools: each element takes a whole byte, not a single bit. This is due to hardware constraints (memory is addressed in bytes, not bits), and is covered in 370 or 378.
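Here is a small sketch of the size surprise. The function names are made up for illustration, and the exact value of sizeof(bool) depends on your compiler, so the checks only rely on what holds everywhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of bytes taken by an array of eight bools.
std::size_t eightBoolsInBytes() {
    return sizeof(bool[8]);
}

void boolDemo() {
    bool done = (1 + 1 == 2);  // comparisons yield a bool
    assert(done == true);
    // Each element occupies a whole bool, so eight flags do not fit in one byte.
    assert(eightBoolsInBytes() == 8 * sizeof(bool));
    assert(sizeof(bool) >= 1);
}
```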

As mentioned in class, the default access modifier for a class in C++ is private. Structs are still around in C++ too, but remember that in C everything in a struct is public. To stay backwards compatible, the default modifier for structs in C++ is public, but structs have still changed from C. They can now have methods, and you can give members of a struct private or protected access; however, doing either of these is not backwards compatible with C (note that you can simulate member methods in C by using function pointers). In C++, the only difference between structs and classes is the default access (public vs. private), which applies to both members and base classes.
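A minimal sketch of the default-access difference. PointStruct, PointClass and accessDemo are hypothetical names invented for this example.

```cpp
#include <cassert>

struct PointStruct {
    int x;                            // public by default in a struct
    int doubled() { return 2 * x; }   // structs may have methods in C++
};

class PointClass {
    int x;                            // private by default in a class
  public:
    PointClass(int v) : x(v) {}
    int doubled() { return 2 * x; }
};

void accessDemo() {
    PointStruct s;
    s.x = 3;              // fine: x is public
    PointClass c(4);
    // c.x = 4;           // would not compile: x is private
    assert(s.doubled() == 6);
    assert(c.doubled() == 8);
}
```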

There's a complicated addendum to the access modifiers. As you know, any private members cannot be accessed by anything outside of the class (including subclasses). However, there is another keyword, friend, which allows functions or classes to access these private data members. As the ACM tutorial puts it, "friends can touch each others' privates." A friend function is usually not a member of the class but can still access private or protected data; because of this, it is declared both inside and outside of the class:

class myString {
    char* data;
  public:
    friend void readData(myString&); // give friend status to the function
};
void readData(myString&); // declare the function
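To see a friend function actually reach into private data, here is a complete sketch. It uses a hypothetical Account class and peekBalance function (not from lecture) so that no memory management gets in the way.

```cpp
#include <cassert>

class Account {
    int balance;                          // private by default
  public:
    Account(int b) : balance(b) {}
    friend int peekBalance(Account&);     // give friend status to the function
};

// Not a member of Account, but allowed to read the private member.
int peekBalance(Account& a) {
    return a.balance;
}

void friendDemo() {
    Account acct(42);
    // int b = acct.balance;              // would not compile: balance is private
    assert(peekBalance(acct) == 42);
}
```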

Function overloading is basically the same as in Java. You just declare a function with the same name but a unique parameter signature. The tricky part is to remember implicit conversions and default parameters. For example, consider the following code:

int foo();
int foo(std::string s);

int bar();
int bar(int x = 0); // the call bar() is now ambiguous

int baz(int x);
int baz(int x, int y = 0); // the call baz(1) is now ambiguous

The declarations of foo have parameter lists that are mutually exclusive; that is, which one to use when actually making the call is apparent based on the arguments given. However, the declarations of bar and baz set up ambiguous calls. The declarations themselves compile, but the compiler reports an error at any call it cannot resolve. Calling bar with no parameters could either call the first declaration or call the second declaration and use the default value. Similarly, calling baz with only one parameter could match either declaration.
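The following sketch shows which calls to bar and baz still work. The function bodies and return values are invented here purely so the calls can be told apart.

```cpp
#include <cassert>

int bar() { return 1; }
int bar(int x = 0) { return 2 + x; }

int baz(int x) { return x; }
int baz(int x, int y = 0) { return 100 + x + y; }

void overloadDemo() {
    // bar();                  // would not compile: both overloads match
    assert(bar(5) == 7);       // only bar(int) accepts one argument

    // baz(1);                 // would not compile: both overloads match
    assert(baz(1, 2) == 103);  // only the two-parameter baz accepts two arguments

    // Naming the exact function type selects the no-argument overload.
    int (*noArgBar)() = bar;
    assert(noArgBar() == 1);
}
```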

This problem can also occur when an argument can be implicitly converted to more than one class type:

class myString {
    char* data;
  public:
    myString();
    myString(char*); 
};
void foo(myString str);
void foo(std::string str);

Again in this code segment, calling foo with a char* is ambiguous, since it could create either a myString or a std::string using each of their constructors.
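One way out is to construct the object you want explicitly. In this sketch foo returns an int (unlike the declarations above) only so that the chosen overload can be checked, and the myString constructor simply stores the pointer.

```cpp
#include <cassert>
#include <string>

class myString {
    char* data;
  public:
    myString(char* s) : data(s) {}   // assumption: stores the pointer, no copy
};

int foo(myString) { return 1; }
int foo(std::string) { return 2; }

void ambiguityDemo() {
    char text[] = "hi";
    char* p = text;
    // foo(p);                            // would not compile: either constructor could be used
    assert(foo(myString(p)) == 1);        // explicitly pick myString
    assert(foo(std::string(p)) == 2);     // explicitly pick std::string
}
```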

Another useful feature of C++ is operator overloading. As Dan's mentioned, the std::string class allows you to use + for concatenation. This is allowed in C++ because operators (such as +, -, ==, /, etc.) are all basically functions. As such, you can define and overload them as you please. There are a couple of interesting operators that you can overload: for instance, new and delete are also operators and as such can be overloaded if you know what you're doing. The syntax for defining an operator is like that of a function (e.g. returnType name(parameters)) except that the name is "operator X", where X is the operator being overloaded (this is the only time a function name has a space in it; the space is optional). A couple of examples:

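class myString {
    char* data;
  public:
    myString (const char*); // const is discussed below; it lets string literals be passed
    bool operator == (const myString&); // a const reference also accepts temporaries
    bool operator != (const myString&);
    bool operator > (const myString&);
    myString operator + (const myString&);
    myString& operator += (const myString&); // returning a reference is discussed towards the bottom
};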

With this declaration, you can compare a myString object to other myString objects or to C-strings, because a C-string on the right-hand side is implicitly converted to a myString by the constructor. Note, however, that this will not compare a C-string to a myString, that is, with the C-string on the left-hand side (at least before C++20, which can try the operands of == in reverse order). For that you need to declare a global (non-member) operator. Using the myString declaration above, if the operator == compares the contents of the data, the global operator will need to be a friend function (see above). Also, the global operator will have two parameters: the C-string, and the myString it's being compared to. Updated code:

class myString {
    char* data;
  public:
    myString (const char*);
    bool operator == (const myString&);
    bool operator == (const char*);
    friend bool operator == (const char*, const myString&); // give friend status
};
bool operator == (const char*, const myString&); // declare operator
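Here is one way the comparison operators might be filled in. This is only a sketch: the class stores a non-owning pointer to keep things short (an assumption, a real string class would copy the characters), and it uses const, which is explained further down.

```cpp
#include <cassert>
#include <cstring>

class myString {
    const char* data;   // assumption: non-owning pointer, no copy is made
  public:
    myString(const char* s) : data(s) {}
    bool operator==(const myString& other) const {
        return std::strcmp(data, other.data) == 0;
    }
    friend bool operator==(const char* lhs, const myString& rhs); // give friend status
};

// Non-member operator for the case where the C-string is on the left.
bool operator==(const char* lhs, const myString& rhs) {
    return std::strcmp(lhs, rhs.data) == 0;
}

void compareDemo() {
    myString a("abc"), b("abc"), c("xyz");
    assert(a == b);
    assert(!(a == c));
    assert(a == "abc");   // myString compared with a C-string
    assert("abc" == a);   // C-string on the left: handled by the non-member operator
}
```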

One very odd operator is the conversion operator. It has no return type, and takes no parameters. It is of the general form operator newType ();. Here's an example of the conversion operator:

class myString {
    char* data;
  public:
    myString (char*);
    operator const char* () // return const so that the data doesn't change; see below
    { return data; } // note that it is returning the pointer to private data
    operator std::string () // convert to an object of the std::string type
    { return std::string(data); } // this will copy the private data in the constructor for std::string
};

As you can see, you can either return data members from inside your class (using const is recommended; again, see below for how const works), or you can return a new object of a different type. Defining conversion operators allows the program to implicitly convert from your type to another (the opposite direction, from another type to your type, is done with constructors). This can also cause ambiguity for overloaded functions, as mentioned above. Some C++ programmers frown upon conversion operators because of the complication they add to the code.
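A short sketch of the two conversion operators in use. As an assumption to keep it small, this version of myString just stores the pointer it is given instead of copying the characters.

```cpp
#include <cassert>
#include <cstring>
#include <string>

class myString {
    const char* data;   // assumption: non-owning pointer
  public:
    myString(const char* s) : data(s) {}
    operator const char*() const { return data; }                // exposes the private pointer
    operator std::string() const { return std::string(data); }   // makes a copy
};

void conversionDemo() {
    myString s("hello");
    const char* raw = s;     // implicit conversion through operator const char*
    std::string copy = s;    // implicit conversion through operator std::string
    assert(std::strlen(raw) == 5);
    assert(copy == "hello");
}
```

Note that both lines use the = form of initialization. Writing std::string copy(s); instead can be rejected as ambiguous, because the std::string constructors could then use either conversion operator.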

C has a feature that we didn't discuss in lecture but that comes up often in C++ as well: const. It is similar to the final keyword in Java, but it also has some different uses. The first and most basic use of const is to create constant values (in C we often used macros for this):

const int BUFFER_SIZE = 128;
char buffer [BUFFER_SIZE];
const std::string message = "Hello World";

As with pointers, passing parameters by reference is a lot more efficient than passing a copy of the object. As such, you'll often see things passed as const references, such as in void foo(const SomeType& param);. The const here serves two purposes: first, it ensures that the object passed as a parameter will not be changed; second, it allows the function to take a const object as a parameter, because the object is already guaranteed not to change. If the const were left out, the function would not be able to take a constant object as a parameter, since nothing would guarantee that the object stays unchanged.
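A quick sketch of a const reference parameter. The function countSpaces is a made-up example, not something from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Takes the string by const reference: no copy is made, and no change is allowed.
std::size_t countSpaces(const std::string& text) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == ' ') ++n;
    }
    // text += "!";   // would not compile: text is const inside this function
    return n;
}

void constRefDemo() {
    const std::string message = "Hello World";
    std::string other = "a b c";
    assert(countSpaces(message) == 1);   // a const object is accepted
    assert(countSpaces(other) == 2);     // so is a non-const object
    assert(countSpaces("x y") == 1);     // and a temporary built from a C-string
}
```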

When used with pointers, however, const becomes a lot more confusing. There are actually two uses of const in this case:

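int x = 0;
const int* p1 = &x; // pointer to a const int
int* const p2 = &x; // const pointer to an int; it must be initialized when declared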

The first use of const will create a pointer to a const int. The code *p1 = 2; will not work because you are not allowed to change the value pointed to by p1. However, you can still point p1 at another int. The second use of const is different; it creates a constant pointer to an int. You can now use the code *p2 = 2; but you cannot run the code p2 = new int; as the pointer location cannot be changed.
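The rules can be tried out in a few lines. This sketch uses local variables a and b (names chosen here) and leaves the disallowed statements commented out.

```cpp
#include <cassert>

void pointerConstDemo() {
    int a = 1, b = 2;

    const int* p1 = &a;   // pointer to const int
    // *p1 = 5;           // would not compile: cannot modify through p1
    p1 = &b;              // fine: the pointer itself may change
    assert(*p1 == 2);

    int* const p2 = &a;   // const pointer to int
    *p2 = 7;              // fine: the int may change
    // p2 = &b;           // would not compile: the pointer is fixed
    assert(a == 7);

    const int* const p3 = &b;  // both at once: neither may change
    assert(*p3 == 2);
}
```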

There's also a third use of const, for use with OOP. When you have a constant object, e.g. const myString str = "hello";, it cannot be modified by any of its member methods. In order to assure that this is upheld by the code, the const keyword is used at the end of a method declaration and definition to show that the object calling the method will not be modified:

class myString {
    char* data;
  public:
    myString(char*);
    myString substr(int, int) const; // returns a new myString, this one isn't changed
    bool operator == (const myString&) const; 
    bool operator == (const char*) const;
};

This new modifier can lead to code duplication if the function has different semantics for const or non-const objects. Take, for example, the indexing operator for a string:

#include <cstring> // for strlen

class myString {
    char* data;
  public:
    myString(char*);
    int length() const // same code for const and non-const objects
    { return strlen(data); }
    char operator [] (int index) const // returns a copy of the character from a const string
    { return data[index]; }
    char& operator [] (int index) // returns a character reference from a string, which can change the string
    { return data[index]; }
};

As you can see, the code for each one is exactly the same, but the semantics of the method are quite different.
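To see which overload gets picked, here is a sketch that uses both. As an assumption for brevity, the constructor just stores the pointer to a caller-owned buffer.

```cpp
#include <cassert>
#include <cstring>

class myString {
    char* data;
  public:
    myString(char* s) : data(s) {}   // assumption: non-owning, caller keeps the buffer alive
    char operator[](int index) const { return data[index]; }   // copy of the character
    char& operator[](int index) { return data[index]; }        // reference into the string
};

void indexDemo() {
    char buffer[] = "cat";
    myString s(buffer);
    const myString& view = s;   // same object, seen as const

    s[0] = 'b';                 // non-const overload: writes through the reference
    assert(view[0] == 'b');     // const overload: reads a copy
    // view[0] = 'x';           // would not compile: cannot assign to the returned copy
    assert(std::strcmp(buffer, "bat") == 0);
}
```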

As a convention, objects are quite often passed either as pointers or references. This avoids copying objects, which can be very inefficient for large data structures. Since you'll often want to know that the object passed by reference doesn't get changed, using const will make that check happen at compile time. Another note: in C, when you wanted to change a pointer, you would have to pass a pointer to it (a pointer to the pointer). Because this is easily confusing, especially when dereferencing a pointer yields another pointer, in C++ you can pass a pointer by reference. This allows you to change the pointer without the extra layer of indirection. Here's a sample program:

#include <iostream>
void foo(int*& p) {
  delete p;
  p = new int(42);
}
int main() {
  int* p = new int(3);
  std::cout << *p << std::endl; // outputs 3
  foo(p);
  std::cout << *p << std::endl; // outputs 42
  delete p;
  return 0;
}

Another trick often used in C++ is to return a reference to an object. This is quite often used with OOP to return the object a method was called on, so that it can be used again in the same expression. Perhaps the easiest example of this is the output operator. The operator method returns a reference to the ostream object, which then allows you to call the operator again, as in the code std::cout << "first use of operator, " << "second use of operator" << std::endl;. Here is how we would use this with the myString class:

#include <iostream>

class myString {
    char* data;
  public:
    myString(char*);

    friend std::ostream& operator << (std::ostream&, const myString&); // give friend status
};
std::ostream& operator << (std::ostream& out, const myString& str) { // function definition
  out << str.data; // calls an overload to output a char*
  return out; // return the ostream now that it has changed; it can then be used again
}

Note that this is very different from returning just an ostream object. Doing that would require copying the ostream, and stream objects cannot be copied at all; even for a type that can be copied, the copy would be inefficient and the original object would not be changed. With any type of object where state has to be maintained (especially I/O streams), the changes have to be kept inside the original object. This could be done with pointers, but it would be more complicated.
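The chaining can be checked without looking at the console by writing into a std::ostringstream. Point and describe are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Point {
    int x, y;
  public:
    Point(int x_, int y_) : x(x_), y(y_) {}
    friend std::ostream& operator<<(std::ostream& out, const Point& p);
};

std::ostream& operator<<(std::ostream& out, const Point& p) {
    out << '(' << p.x << ", " << p.y << ')';
    return out;   // hand the same stream back so the next << can use it
}

// Builds a string by chaining << three times on one stream.
std::string describe(const Point& a, const Point& b) {
    std::ostringstream out;
    out << a << " -> " << b;
    return out.str();
}
```

Each << in describe returns the same stream it was given, so all three pieces end up in a single buffer.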

There are a lot more interesting features in C++, but going into detail about them would be way beyond the scope of this class (even this page is). If you're interested, you can look up the various cast operators (static_cast, dynamic_cast, const_cast, and reinterpret_cast), templates (vaguely similar to Java generics, but they also allow for a powerful style of coding known as metaprogramming), and multiple inheritance (which allows classes to have more than one base class but causes a lot of big messes). These topics can be very useful but can also be difficult to understand. They also bring up some very weird quirks, so if you ever use them, be sure to understand how they work. Also, the C++ standard library has a lot of useful classes and functions, such as std::string, std::vector (like an ArrayList in Java), iterators, an assortment of things in the <algorithm> header, and more. It's not as full featured (or, as C++ programmers would say, bloated) as the Java or .NET libraries, but there are also a ton of 3rd party libraries available to fit specific needs (e.g. the Boost libraries, available from boost.org, are quite popular for several different things, or OpenGL libraries for graphics).