C++ Week 21

Templates, preprocessor commands, the STL

Template classes

We have already seen how we can implement a stack of integers. It would be nice if we could reuse that code to allow us to declare a stack of strings or doubles, in the same way as we can declare vectors of different datatypes. In fact the vector is defined with a template and we will now look at how to implement this ourselves, taking the Stack class and modifying it.

The steps are as follows:

Put the line template <typename T>in front of the class header and in front of all the member function definitions. There is nothing special about the name T – any identifier will do; it's just a placeholder for the data type we will eventually put in our stack, e.g integer or string. We are, in effect, parameterizing the type, and this identifier is a bit like a parameter.
Add a <T> to the class identifier, i.e. change Stack to Stack<T>. Compilers seem to vary in how fussy they are about which ones you attach it to. Some are happy to have every Stack changed to Stack<T>; others get upset if you attach it to the Stack in class Stack or to the names of the constructor(s) or the destructor. The essential ones are the ones before the scope operator – Stack<T>:: – and any that do not follow a Stack<T>::, such as when the Stack is a return type (which therefore precedes the function name with its Stack<T>::), or when it is the parameter of a non-member function (a non-member function name is obviously not preceded by a Stack<T>::).
Replace all the occurrences of the data type you want to change with T, in our example you would change the instances of int to T. You should be careful to change only those occurrences of int that refer to the contents of the stack; you would not want to change, for example, an int that was the counter of a for loop.

Typical changes are shown below. Click here for a full code listing.

template<typename T>
class Stack    //or Stack<T> on some compilers
{ public:
  Stack();     // or Stack<T>();
  ...

template<typename T>
Stack<T>& Stack<T>::operator=(const Stack<T>& sx)  // of the three occurrences of <T> in this line
{	if (this != &sx)                          // the first two are essential, the third optional
	{	free(top);
		copy(top, sx.top);
	}
	return *this;
}

In main() we can use syntax familiar from creating a vector, which is also a type of container class:

Stack<int> sti;
Stack<string> ststr;
sti.push(234);
ststr.push("Cinderella");

Reusing code and creating components

Components are standard fragments of code that have been tested and are largely bug-free and stable. Code like this is very useful as it saves us re-inventing the wheel and cuts down on testing efforts. There are two ways in which you can deliver components.

Source code
object code and its associated headers

The first solution seems useful, but is fraught with danger. Open source software can be tweaked and broken. There are also commercial disadvantages. Someone could easily pinch your code, make a few improvements and sell it on at a large mark up. Moreover, tweaked code may end up being incompatible with other components and it is hard to maintain. For this reason components are often delivered as a source code header (.h ) file and an object file that contains all the binaries from the function definitions of your class. At final build time you can compile your other code into object files and link all the object code together to produce an executable. Under Unix CCN the command to produce an object file is, for example:

CCN -c date.C

Using Borland the command is:

bcc32 -c -odate date.cpp

Header files

A header file normally only contains the class definition. This is the source code file that you deliver and shows all the interfaces used by your class. Without this code it would be hard for anyone else to use your object code. Say we have two files Date.h and Month.h, which are the headers for the Date and Month classes respectively. To use them you will place a #include "Date.h" at the beginning of your code. Quite possibly Date.h will include a #include "Month.h" (There is nothing to prevent one header file including another.)

The source code for Date.cpp will begin with the usual includes augmented by #include "Date.h", followed by the definitions of the member functions of the Month and Date classes, e.g.

#include <iostream>

using namespace std;

#include "Date.h"

const Month Date::months = {{"",0},{"Jan",31}, {"Feb",28},....,{"Dec",31}};

Date::Date()
{  day = 1;  month = 1;  }

void  Date::add_day()
{ ....
}

....

bool Date::operator<(const Date& dx) const
{	return month < dx.month ||
		(month == dx.month && day < dx.day);
}

We can compile this into the Date.o object file (the compiler does not insist on having a main if it is compiling into object code - it is happy to compile a set of functions).

Using header files

When we come to write code that will make use of the object code, we just include the header file so that the compiler can see what the class interfaces are. It doesn't need any more than that for the time being. For example:

//file Usedate.C
#include <iostream>
using namespace std;
#include "Date.h"

int main( )
{ Date D;         // To use the Date class, we only need to know its interface
  ....
  D.add_day();
  ...
}

We compile our Usedate.C into object code and pass this file of object code, along with the file of object code generated from Date.C, to the linker. The linker links it all up to produce an executable.

Conditional compilation

Things can get complicated when we use several headers that themselves contain a #include "Date.h". Suppose we had #include "Calendar.h" and #include "Report.h" at the start of our program, and that both Calendar.h and Report.h contained #include "Date.h" The preprocessor will pull in Date.h twice. When the resulting file is handed on to the compiler, it will not accept it since it will not accept multiple definitions of the same class, even if the definitions are identical, so we are in trouble. To work around this problem, we use conditional compilation.

The preprocessor accepts commands of the form #define, #ifdef and #endif. We put the command #ifndef DATE at the top of Date.h to test whether the flag DATE has been defined. If it hasn't (i.e. this is the first time that Date.h has been included in the file being processed), then we #define DATE and add the rest of the class definition. Finally we use #endif to close the conditional statement.

#ifndef DATE
  #define DATE
  #include "Month.h"
  class Date
  { public:
      Date();
      ...
  };
#endif

The next time the preprocessor encounters a #include "Date.h", it will detect that DATE has already been set and skip the code up to the subsequent #endif, so Date.h will only ever get included once.

#defines are also useful when you want to compile your code for different environments. You can set the preprocessor flags from the command line, or by adding a line to the top of your .C file, so that the preprocessor will only bring in code that has been flagged for a particular platform, such as Windows or Unix. For example:

#define SUN
#ifdef SUN
//code for SOLARIS in here

#endif

#ifdef WINDOWS
//code for Windows in here
#endif

You edit the first line to be #define SUN if you want SUN code to be included, #define WINDOWS if you want Windows code to be included.

Similarly this can be a useful way to set up a debug version of your program that prints out messages. You bracket each of your debug sections between #ifdef DEBUG and #endif. If you want the debug version to be compiled, you insert the line #define DEBUG at the start; if not, you take it out.

Conditional compilation can also be helpful when want to exclude part of the program, in the course of debugging. You can also achieve the same effect by putting part of the program between /* and */, i.e. pretending that part of the program is a comment and therefore to be ignored by the compiler. But this will not work if the part to be excluded itself contains a (genuine) comment because comments cannot be nested. Better is to insert a #if 0 before the part to be excluded and a #endif at the end.

The Standard Template Library

The STL contains a range of container classes. The vector class is one of them. Another is the list class.

To use the list container class you should #include the <list> library. Lists are like vectors in that they are template classes and can hold collections of types, such as objects or integers. They are declared as follows:

list<int> ilist;  // create a list of integers

The list class supports the member functions push_back() and push_front(), which work as you might expect. e.g.

ilist.push_back(1);  //  1

ilist.push_back(2);  //  1 2

ilist.push_front(3); //  3 1 2

ilist.push_front(4); //  4 3 1 2

Other container classes include the deque ("deque" is short for "double-ended queue", though the STL deque is not really a deque in the Data-Structures sense but a double-ended vector). Also the set, which you can think of as a binary search tree, the multiset (like a set but allowing duplicates), the map, which is a binary search tree of key-value pairs, and a multimap (a map allowing duplicates). Specialised versions of these are also available - the stack, the queue and the priority queue. This last is like a queue but the next item to get removed is not necessarily the one that has been there longest (as in a regular queue) but the one with the highest value on some attribute. For instance, if age were the attribute, then the person at the "front" of the queue would be the oldest person in the queue.

Iterators

Each of the container classes has its own type of iterator. An iterator is like a pointer, only it's a C++ object and therefore has its own functionality. For example, as with a pointer, you can apply the dereferencing operator to an iterator to get at the thing it is pointing at, but unlike a pointer, you can also use the auto-increment and auto-decrement operators meaning "Go on to the next item," or "Go back to the previous item." (You can use ++ and -- with a pointer but only if it is pointing at an element of an array. You cannot use them to move along a linked list or a binary tree. With iterators, you can.)

Iterators allow you to move along lists, using the begin() and end() member functions as reference points. Note that, as with vectors, end() is actually the post-ultimate position, i.e. whereas begin() is the position of the first item in the structure, end() is a position after the last one. You declare an iterator variable and use it as an index to be moved along the list, much as you might use a pointer.

list<int>::iterator intiter;  // declare an iterator; each container class has its own iterator type
intiter = ilist.begin();

To move an iterator along the list, use the ++ and -- functions. The following procedure prints the contents of a list:

list<int> ilist;
...

list<int>::iterator intiter;
for (intiter = ilist.begin(); intiter != ilist.end(); intiter++)
  cout << *intiter  << " ";

insert() takes an iterator and an item to insert, e.g.

ilist.insert(intiter, 5);      //  inserts a 5 before the item intiter is pointing at
ilist.insert(ilist.end(), 6);  // equivalent to a push_back()

erase() takes an iterator and removes the item from the list, e.g.

ilist.erase(intiter);      // remove selected item
intiter = ilist.end();
intiter--;
ilist.erase(intiter);   // last three lines equivalent to pop_back()
// You can't do arithmetic with iterators, eg you can't say ilist.end() - 1.

Generic algorithms

There is a set of algorithms provided in the standard library that operate on container classes - you could use the same algorithm on a vector or a list, for example (or a set or a map), hence the name "generic". They are in the <algorithm> library.

Consider a vector of integers v that contains the following values

If we #include the <algorithm> library, we can use the generic algorithm sort() as follows:

sort(v.begin(), v.end());

which sorts the vector v in ascending order from start to finish. begin() and end() are iterator values corresponding to the position of the first element of the vector and a position immediately after the last element. The results of this procedure are thus:

Similarly we can use the algorithm reverse() as follows to reverse the order of elements in v:

reverse(v.begin(), v.end());

Another of the generic algorithms is find. find() takes three arguments, the start of the range, the end of the range, and the item being sought, and returns an iterator (sort of pointer) to the item, e.g.

intiter = find(ilist.begin(), ilist.end(), 1);  // find the number 1 and return its position.

If the item sought is not in the list, find returns ilist.end(), which is not actually in the list.

There are a number of these generic algorithms - see the appendix to Lippman and Lajoie. I have not said much about them in this course since the course is an introduction to programming in general, using C++, rather than a course specifically about C++. If you were going to be a serious C++ programmer, you would need to get acquainted with the generic algorithms - there are about 70 of them - and you would make extensive use of them, but I haven't expected you to do that on this course, and indeed I wouldn't want you to. Working out how to reverse a vector using a loop teaches you something about programming in general; learning how to use the reverse algorithm just teaches you something about C++.

Notes on R. Mitton's lectures by S.P. Connolly, edited by R. Mitton, 2000