[13] Operator overloading
(Part of C++ FAQ Lite, Copyright © 1991-99, Marshall Cline, cline@parashift.com)

[13.1] What's the deal with `operator` overloading?

It allows you to provide an intuitive interface to users of your class.

Operator overloading allows C/C++ operators to have user-defined meanings on user-defined types (classes). Overloaded operators are syntactic sugar for function calls:

    class Fred {
    public:
        // ...
    };

    #if 0

    // Without operator overloading:
    Fred add(Fred, Fred);
    Fred mul(Fred, Fred);

    Fred f(Fred a, Fred b, Fred c)
    {
        return add(add(mul(a,b), mul(b,c)), mul(c,a));    // Yuk...
    }

    #else

    // With operator overloading:
    Fred operator+ (Fred, Fred);
    Fred operator* (Fred, Fred);

    Fred f(Fred a, Fred b, Fred c)
    {
        return a*b + b*c + c*a;
    }

    #endif

[13.2] What are the benefits of operator overloading?

By overloading standard operators on a class, you can exploit the intuition of the users of that class. This lets users program in the language of the problem domain rather than in the language of the machine.

The ultimate goal is to reduce both the learning curve and the defect rate.

[13.3] What are some examples of operator overloading?

Here are a few of the many examples of operator overloading:

myString + yourString might concatenate two string objects
myDate++ might increment a Date object
a * b might multiply two Number objects
a[i] might access an element of an Array object
x = *p might dereference a "smart pointer" that actually "points" to a disk record — it could actually seek to the location on disk where p "points" and return the appropriate record into x
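
Two of these examples can be sketched in a few lines. The Number and Date classes below are hypothetical (this FAQ does not define them), and storing a Date as a plain day count is a simplification made only for illustration:

```cpp
#include <cassert>

// Hypothetical Number class: wraps a double so that a * b reads naturally.
class Number {
public:
    explicit Number(double v) : v_(v) { }
    double value() const { return v_; }
private:
    double v_;
};

// a * b multiplies two Number objects.
inline Number operator* (const Number& a, const Number& b)
{
    return Number(a.value() * b.value());
}

// Hypothetical Date class: stores a day count (an illustrative simplification).
class Date {
public:
    explicit Date(long day) : day_(day) { }
    long day() const { return day_; }
    Date& operator++ ()    { ++day_; return *this; }                 // ++myDate
    Date  operator++ (int) { Date old(*this); ++day_; return old; }  // myDate++
private:
    long day_;
};

// Example use; the assertions stand in for the expected results.
inline void operatorExamples()
{
    Number a(3.0), b(4.0);
    assert((a * b).value() == 12.0);

    Date myDate(100);
    myDate++;
    assert(myDate.day() == 101);
}
```

The user code inside operatorExamples() never mentions a function name; it reads like arithmetic on built-in types.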

[13.4] But `operator` overloading makes my class look ugly; isn't it supposed to make my code clearer?

[Recently added return type to main() (on 10/99).]

Operator overloading makes life easier for the users of a class, not for the developer of the class!

Consider the following example.

    class Array {
    public:
        int& operator[] (unsigned i);      // Some people don't like this syntax
        // ...
    };

    inline
    int& Array::operator[] (unsigned i)    // Some people don't like this syntax
    {
        // ...
    }

Some people don't like the keyword operator or the somewhat funny syntax that goes with it in the body of the class itself. But the operator overloading syntax isn't supposed to make life easier for the developer of a class. It's supposed to make life easier for the users of the class:

    int main()
    {
        Array a;
        a[3] = 4;   // User code should be obvious and easy to understand...
    }

Remember: in a reuse-oriented world, there will usually be many people who use your class, but there is only one person who builds it (yourself); therefore you should do things that favor the many rather than the few.

[13.5] What operators can/cannot be overloaded?

[Recently added return type to main() (on 10/99).]

Most can be overloaded. The only C operators that can't be are . and ?: (and sizeof, which is technically an operator). C++ adds a few of its own operators, most of which can be overloaded except :: and .*.

Here's an example of the subscript operator (it returns a reference). The #if 0 branches show the code without operator overloading; the #else branches show the same logic with it:

    class Array {
    public:
    #if 0
        int& elem(unsigned i)        { if (i > 99) error(); return data[i]; }
    #else
        int& operator[] (unsigned i) { if (i > 99) error(); return data[i]; }
    #endif
    private:
        int data[100];
    };

    int main()
    {
        Array a;

    #if 0
        a.elem(10) = 42;
        a.elem(12) += a.elem(13);
    #else
        a[10] = 42;
        a[12] += a[13];
    #endif
    }

[13.6] Can I overload `operator==` so it lets me compare two `char[]` using a string comparison?

[Recently replaced "class type" with "user-defined type" in first paragraph thanks to Daryle Walker (on 10/99).]

No: at least one operand of any overloaded operator must be of some user-defined type (most of the time that means a class).

But even if C++ allowed you to do this, which it doesn't, you wouldn't want to do it anyway: you really should be using a string-like class rather than an array of char in the first place, since arrays are evil.
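
To see the difference, here is a small sketch. It uses std::string from the standard library as the string-like class; the helper name sameCString is made up for this example:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// For char arrays you have to call a function; strcmp() returns 0 on a match.
inline bool sameCString(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}

// Example use; the assertions stand in for the expected results.
inline void comparisonExamples()
{
    char x[] = "hello";
    char y[] = "hello";
    // x == y would compare the addresses of the two arrays, not their characters.
    assert(sameCString(x, y));

    // std::string already overloads == and != to compare contents.
    std::string s = x;
    std::string t = y;
    assert(s == t);
    assert(s != "help");
}
```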

[13.7] Can I create an `operator**` for "to-the-power-of" operations?

Nope.

The names of, precedence of, associativity of, and arity of operators are fixed by the language. There is no operator** in C++, so you cannot create one for a class type.

If you're in doubt, consider that x ** y is the same as x * (*y) (in other words, the compiler assumes y is a pointer). Besides, operator overloading is just syntactic sugar for function calls. Although this particular syntactic sugar can be very sweet, it doesn't add anything fundamental. I suggest you overload pow(base,exponent) (a double precision version is in <math.h>).

By the way, operator^ can work for to-the-power-of, except it has the wrong precedence and associativity.
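
A minimal sketch of the pow(base,exponent) suggestion, assuming a hypothetical Fraction class and a non-negative integer exponent (both are illustrative choices, not something this FAQ prescribes):

```cpp
#include <cassert>

// Hypothetical user-defined type: a fraction num/den (never reduced, den != 0).
class Fraction {
public:
    Fraction(long num, long den) : num_(num), den_(den) { assert(den != 0); }
    long num() const { return num_; }
    long den() const { return den_; }
private:
    long num_, den_;
};

inline Fraction operator* (const Fraction& a, const Fraction& b)
{
    return Fraction(a.num() * b.num(), a.den() * b.den());
}

// Overload of pow() for Fraction: multiplies base by itself exponent times.
inline Fraction pow(const Fraction& base, unsigned exponent)
{
    Fraction result(1, 1);
    for (unsigned i = 0; i < exponent; ++i)
        result = result * base;
    return result;
}
```

Because pow() is an ordinary function call, an expression such as pow(a, 2) * b groups exactly the way it reads, which is not true of a ^ 2 * b.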

[13.8] How do I create a subscript `operator` for a `Matrix` class?

[Recently added return type to main(); added parameters to the instantiation of m in main() thanks to Boris Pulatov (on 10/99).]

Use operator() rather than operator[].

When you have multiple subscripts, the cleanest way to do it is with operator() rather than with operator[]. The reason is that operator[] always takes exactly one parameter, but operator() can take any number of parameters (in the case of a rectangular matrix, two parameters are needed).

For example:

    // BadIndex is an exception class that is assumed to be defined elsewhere.

    class Matrix {
    public:
        Matrix(unsigned rows, unsigned cols);
        double& operator() (unsigned row, unsigned col);
        double  operator() (unsigned row, unsigned col) const;
        // ...
        ~Matrix();                              // Destructor
        Matrix(const Matrix& m);                // Copy constructor
        Matrix& operator= (const Matrix& m);    // Assignment operator
        // ...
    private:
        unsigned rows_, cols_;
        double* data_;
    };

    inline
    Matrix::Matrix(unsigned rows, unsigned cols)
      : rows_ (rows),
        cols_ (cols),
        data_ (0)    // Allocated below, after the size check, so a throw leaks nothing
    {
        if (rows == 0 || cols == 0)
            throw BadIndex("Matrix constructor has 0 size");
        data_ = new double[rows * cols];
    }

    inline
    Matrix::~Matrix()
    {
        delete[] data_;
    }

    inline
    double& Matrix::operator() (unsigned row, unsigned col)
    {
        if (row >= rows_ || col >= cols_)
            throw BadIndex("Matrix subscript out of bounds");
        return data_[cols_*row + col];
    }

    inline
    double Matrix::operator() (unsigned row, unsigned col) const
    {
        if (row >= rows_ || col >= cols_)
            throw BadIndex("const Matrix subscript out of bounds");
        return data_[cols_*row + col];
    }

Then you can access an element of Matrix m using m(i,j) rather than m[i][j]:

    #include <iostream>

    int main()
    {
        Matrix m(10,10);
        m(5,8) = 106.15;
        std::cout << m(5,8);
        // ...
    }

[13.9] Why shouldn't my `Matrix` class's interface look like an array-of-array?

[Recently created (on 10/99).]

Here's what this FAQ is really all about: Some people build a Matrix class that has an operator[] that returns a reference to an Array object, and that Array object has an operator[] that returns an element of the Matrix (e.g., a reference to a double). Thus they access elements of the matrix using syntax like m[i][j] rather than syntax like m(i,j).
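
For concreteness, the array-of-array design might look something like the sketch below. The names ArrayOfArrayMatrix and Row are hypothetical, std::vector is used for storage to keep it short, and the Row is returned by value as a small proxy rather than by reference to a stored Array object; that is one common way to build it, not the only one:

```cpp
#include <cassert>
#include <vector>

// Hypothetical Matrix whose operator[] returns a proxy for one row.
class ArrayOfArrayMatrix {
public:
    // Proxy object standing for a single row of the matrix.
    class Row {
    public:
        Row(double* start) : start_(start) { }
        double& operator[] (unsigned col) { return start_[col]; }  // No column check
    private:
        double* start_;   // First element of this row
    };

    ArrayOfArrayMatrix(unsigned rows, unsigned cols)
        : cols_(cols), data_(rows * cols, 0.0) { }

    // m[i] yields a Row; m[i][j] then indexes into that row.
    // This quietly assumes each row is contiguous in memory.
    Row operator[] (unsigned row)
    {
        assert(cols_ * row < data_.size());   // Checks the row index only
        return Row(&data_[cols_ * row]);
    }

private:
    unsigned cols_;
    std::vector<double> data_;
};
```

Notice that the Row proxy hands out a pointer into the storage, so the physical layout leaks into the interface.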

The array-of-array solution obviously works, but it is less flexible than the operator() approach. Specifically, there are easy performance tuning tricks that can be done with the operator() approach that are more difficult in the [][] approach, and therefore the [][] approach is more likely to lead to bad performance, at least in some cases.

For example, the easiest way to implement the [][] approach is to use a physical layout of the matrix as a dense matrix that is stored in row-major form, since m[i] has to hand back something that represents row i. In contrast, the operator() approach totally hides the physical layout of the matrix, and that can lead to better performance in some cases.

Put it this way: the operator() approach is never worse than, and sometimes better than, the [][] approach.

The operator() approach is never worse because it is easy to implement the dense, row-major physical layout using the operator() approach, so when that configuration happens to be the optimal layout from a performance standpoint, the operator() approach is just as easy as the [][] approach (perhaps the operator() approach is a tiny bit easier, but I won't quibble over minor nits).
The operator() approach is sometimes better because whenever the optimal layout for a given application happens to be something other than dense, row-major, the implementation is often significantly easier using the operator() approach compared to the [][] approach.

As an example of when a physical layout makes a significant difference, a recent project happened to access the matrix elements in columns (that is, the algorithm accesses all the elements in one column, then the elements in another, etc.), and if the physical layout is row-major, the accesses can "stride the cache". For example, if the rows happen to be almost as big as the processor's cache size, the machine can end up with a "cache miss" for almost every element access. In this particular project, we got a 20% improvement in performance by changing the mapping from the logical layout (row,column) to the physical layout (column,row).

Of course there are many examples of this sort of thing from numerical methods, and sparse matrices are a whole other dimension on this issue. Since it is, in general, easier to implement a sparse matrix or swap row/column ordering using the operator() approach, the operator() approach loses nothing and may gain something -- it has no down-side and a potential up-side.
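
As a sketch of how little changes when the layout is swapped, here is a hypothetical ColMajorMatrix with the same operator() interface as the Matrix in [13.8]. It uses std::vector for storage and assert() instead of exceptions to stay short; both are choices made for this illustration only:

```cpp
#include <cassert>
#include <vector>

// Hypothetical matrix that stores its elements column by column.
class ColMajorMatrix {
public:
    ColMajorMatrix(unsigned rows, unsigned cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) { }

    // Users still write m(row, col); only this mapping knows the layout.
    double& operator() (unsigned row, unsigned col)
    {
        assert(row < rows_ && col < cols_);
        return data_[rows_ * col + row];   // Column-major: each column is contiguous
    }

    double operator() (unsigned row, unsigned col) const
    {
        assert(row < rows_ && col < cols_);
        return data_[rows_ * col + row];
    }

private:
    unsigned rows_, cols_;
    std::vector<double> data_;
};

// User code that walks down one column; it would read the same for any layout.
inline double sumColumn(const ColMajorMatrix& m, unsigned rows, unsigned col)
{
    double sum = 0.0;
    for (unsigned row = 0; row < rows; ++row)
        sum += m(row, col);
    return sum;
}
```

The only line that differs from a row-major version is the index expression inside operator(); sumColumn() and every other caller stay untouched.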

Use the operator() approach.

[13.10] Should I design my classes from the outside (interfaces first) or from the inside (data first)?

[Recently added an admonition to not "roll your own" container classes (on 10/99).]

From the outside!

A good interface provides a simplified view that is expressed in the vocabulary of a user. In the case of OO software, the interface is normally to a class or a tight group of classes.

First think about what the object logically represents, not how you intend to physically build it. For example, suppose you have a Stack class that will be built by containing a LinkedList:

    class Stack {
    public:
        // ...
    private:
        LinkedList list_;
    };

Should the Stack have a get() method that returns the LinkedList? Or a set() method that takes a LinkedList? Or a constructor that takes a LinkedList? Obviously the answer is No, since you should design your interfaces from the outside-in. I.e., users of Stack objects don't care about LinkedLists; they care about pushing and popping.
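
A sketch of what an outside-in Stack might look like. Here std::list<int> stands in for LinkedList (whose interface has not been shown yet), and the member names push, pop, and empty are assumptions:

```cpp
#include <cassert>
#include <list>

// Stack exposes only stack operations; the list inside is never handed out.
class Stack {
public:
    void push(int elem) { list_.push_back(elem); }

    int pop()
    {
        assert(!list_.empty());   // Popping an empty Stack is a caller error
        int top = list_.back();
        list_.pop_back();
        return top;
    }

    bool empty() const { return list_.empty(); }

private:
    std::list<int> list_;         // Stand-in for LinkedList; an implementation detail
};
```

Nothing in the public part mentions a list, so the container could later be replaced without touching any user code.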

Now for another example that is a bit more subtle. Suppose class LinkedList is built using a linked list of Node objects, where each Node object has a pointer to the next Node:

    class Node { /*...*/ };

    class LinkedList {
    public:
        // ...
    private:
        Node* first_;
    };

Should the LinkedList class have a get() method that will let users access the first Node? Should the Node object have a get() method that will let users follow that Node to the next Node in the chain? In other words, what should a LinkedList look like from the outside? Is a LinkedList really a chain of Node objects? Or is that just an implementation detail? And if it is just an implementation detail, how will the LinkedList let users access each of the elements in the LinkedList one at a time?

One man's answer: A LinkedList is not a chain of Nodes. That may be how it is built, but that is not what it is. What it is is a sequence of elements. Therefore the LinkedList abstraction should provide a "LinkedListIterator" class as well, and that "LinkedListIterator" might have an operator++ to go to the next element, and it might have a get()/set() pair to access its value stored in the Node (the value in the Node element is solely the responsibility of the LinkedList user, which is why there is a get()/set() pair that allows the user to freely manipulate that value).

Starting from the user's perspective, we might want our LinkedList class to support operations that look similar to accessing an array using pointer arithmetic:

    void userCode(LinkedList& a)
    {
        for (LinkedListIterator p = a.begin(); p != a.end(); ++p)
            std::cout << *p << '\n';
    }

To implement this interface, LinkedList will need a begin() method and an end() method. These return a "LinkedListIterator" object. The "LinkedListIterator" will need a method to go forward, ++p; a method to access the current element, *p; and a comparison operator, p != a.end().

The code follows. The key insight is that the LinkedList class does not have any methods that let users access the Nodes. Nodes are an implementation technique that is completely buried. The LinkedList class could have its internals replaced with a doubly linked list, or even an array, and the only difference would be some performance differences with the prepend(elem) and append(elem) methods.

    #include <assert.h>   // Poor man's exception handling
    #include <stddef.h>   // NULL

    class LinkedListIterator;
    class LinkedList;

    class Node {
        // No public members; this is a "private class"
        friend class LinkedListIterator;   // A friend class
        friend class LinkedList;
        Node* next_;
        int elem_;
    };

    class LinkedListIterator {
    public:
        bool operator== (LinkedListIterator i) const;
        bool operator!= (LinkedListIterator i) const;
        void operator++ ();   // Go to the next element
        int& operator*  ();   // Access the current element
    private:
        LinkedListIterator(Node* p);
        Node* p_;
        friend class LinkedList;   // So begin() and end() can construct iterators
    };

    class LinkedList {
    public:
        void append(int elem);    // Adds elem after the end
        void prepend(int elem);   // Adds elem before the beginning
        // ...
        LinkedListIterator begin();
        LinkedListIterator end();
        // ...
    private:
        Node* first_;
    };

Here are the methods that are obviously inlinable (probably in the same header file):

    inline bool LinkedListIterator::operator== (LinkedListIterator i) const
    {
        return p_ == i.p_;
    }

    inline bool LinkedListIterator::operator!= (LinkedListIterator i) const
    {
        return p_ != i.p_;
    }

    inline void LinkedListIterator::operator++()
    {
        assert(p_ != NULL);   // or if (p_==NULL) throw ...
        p_ = p_->next_;
    }

    inline int& LinkedListIterator::operator*()
    {
        assert(p_ != NULL);   // or if (p_==NULL) throw ...
        return p_->elem_;
    }

    inline LinkedListIterator::LinkedListIterator(Node* p)
      : p_(p)
    { }

    inline LinkedListIterator LinkedList::begin()
    {
        return first_;
    }

    inline LinkedListIterator LinkedList::end()
    {
        return NULL;
    }

Conclusion: The linked list had two different kinds of data. The first is the values of the elements stored in the linked list, which are the responsibility of the user of the linked list (and only the user; the linked list itself makes no attempt to prohibit users from changing the third element to 5). The second is the linked list's infrastructure data (next pointers, etc.), whose values are the responsibility of the linked list (and only the linked list; e.g., the linked list does not let users change (or even look at!) the various next pointers).

Thus the only get()/set() methods were to get and set the elements of the linked list, but not the infrastructure of the linked list. Since the linked list hides the infrastructure pointers/etc., it is able to make very strong promises regarding that infrastructure (e.g., if it was a doubly linked list, it might guarantee that every forward pointer was matched by a backwards pointer from the next Node).

So, we see here an example of where the values of some of a class's data are the responsibility of users (in which case the class needs to have get()/set() methods for that data), while the data that the class wants to control does not necessarily have get()/set() methods.

Note: the purpose of this example is not to show you how to write a linked-list class. In fact you should not "roll your own" linked-list class since you should use one of the "container classes" provided with your compiler. Ideally you'll use one of the standard container classes such as the list<T> template.
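
For comparison, here is the earlier user code written against the standard std::list<int>. The function names printAll and printedExample, and the std::ostream& parameter, are illustrative assumptions:

```cpp
#include <cassert>
#include <list>
#include <ostream>
#include <sstream>
#include <string>

// Same shape as userCode() above: begin(), end(), ++p, and *p.
inline void printAll(const std::list<int>& a, std::ostream& out)
{
    for (std::list<int>::const_iterator p = a.begin(); p != a.end(); ++p)
        out << *p << '\n';
}

// Example use: builds a small list and returns what printAll() writes.
inline std::string printedExample()
{
    std::list<int> a;
    a.push_back(2);    // Plays the role of append(elem)
    a.push_front(1);   // Plays the role of prepend(elem)
    a.push_back(3);
    assert(a.size() == 3);

    std::ostringstream out;
    printAll(a, out);
    return out.str();
}
```

As with the hand-built LinkedList, the user never sees a node or a next pointer; only elements and iterators are visible.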
