welcome to the second part of the multivariate polynomials in c++ using templates series. this part will explain how to store multivariate polynomials via templates, how to implement storage and operations.

### multivariate polynomial rings in mathematics.

as you might recall from your algebra courses, denotes the (ring, in case itself is a ring) of multivariate polynomials in the indeterminates (also called variables) with coefficients in . note that i numerate the indeterminates beginning with zero, which is rather intuitive for a c++ programmer, though often not for a mathematician. there are different ways to treat such polynomials; for example, one can work with multiindices ; for such an , let be an element of and define ; then a polynomial can be represented as . clearly, one does not want to work with infinite objects, whence one restricts to a subset of , for example of type . this is one way of representing multivariate polynomials, and fine for several applications, but not the most easy one to implement in c++ as a template depending on .

another way to treat multivariate polynomials is to iteratively write (one could also use , but i will stick to the other as it is more natural for the implementation i'm describing). hence, a polynomial is interpreted as an <i>univariate</i> polynomial with coefficients in the ring . this allows a recursive definition which is well-suited for templates.

### implementing the data structure using templates.

now let us leave the realm of mathematics for a while and talk about c++. the ring mentioned above will be replaced by an abstract type `T`

, which will be a template parameter. note that in the discussion below, we will ignore allocators, and just talk about them at the very end of this article. our main template type will be `template<int n, class T> class poly;`

, which essentially contains a `std::vector<poly<n - 1, T> >`

to store coefficients. (the storage is in fact more delicate, we will discuss this below.)

clearly, this is not enough yet. the compiler has no clue that we are talking about multivariate polynomials, and that `n`

should always be non-negative, and it also does not knows that `poly<0, T>`

is essentially an element of `T`

. the way to solve this is partial template specialization: we create a specialized version of `poly<n, T>`

for the case `n == 0`

. the declaration is as follows:

1template<int n, class T> 2class poly; // forward declaration 3 4template<class T> 5class poly<0, T> // specialize for n = 0 6{ 7private: 8 T d_value; 9public: 10 ... 11} 12 13template<int n, class T> 14class poly // the general implementation 15{ 16private: 17 std::vector<poly<n - 1, T> > d_value; 18public: 19 ... 20}

the specialization for `n == 0`

has a rather simple implementation. besides a constructor accepting an object of type `T`

, with default value `T()`

, it has a casting operator to `T`

, provides an `isZero()`

predicate, assignment and arithmetic operations, output to `std::ostream`

s, and using `std::swap()`

to swap two `poly<0, T>`

objects, it has a simple “evaluate” function implemented as `operator()`

which is essentially the same as the cast operator `operator T()`

.

the general implemenation is more interesting. there is a method `normalize()`

which ensures that `d_value.back()`

is the leading coefficient and `d_value.size() - 1`

the degree of the polynomial; note that we use -1 for the degree of the zero polynomial. these conditions are used for example in the implementation of `isZero()`

, `leading()`

and `degree()`

. for most operations, the polynomial itself makes sure it is normalized, but when accessing a non-`const`

polynomial with `operator[]`

, it might change its degree. after that, one better calls `normalize()`

. it would also be possible to do this automatically using more template tricks, but i wanted to leave some responsibility with the user of the template. :-)

the arithmetic operations are pretty straightforward to implement; one has to resize `d_value`

and later call `normalize()`

to make sure nothing bad happens. for multiplication, one creates a temporary which contains the product (the maximal degree of the product is known, but `normalize()`

should be run afterwards since there is no garuantee that the ring containing the objects of type `T`

is zero-divisor free). as an example, adding two polynomials is accomplished by the following operator:

1 inline poly operator + (const poly & q) const 2 { 3 poly r(*this); 4 if (r.d_value.size() < q.d_value.size()) 5 r.d_value.resize(q.d_value.size()); 6 for (unsigned i = 0; i < q.d_value.size(); ++i) 7 r[i] += q[i]; 8 r.normalize(); 9 return r; 10 }

first, a copy of the first argument (`*this`

) is made. then, its size is adjusted to make sure there's enough room. then, simply all coefficients of the second argument are added, and finally the result is `normalize()`

d and returned. as another example, here's polynomial multiplication:

1 inline poly operator * (const poly & p) const 2 { 3 int d = p.degree() + degree(); 4 if (d < -1) 5 d = -1; 6 poly r(true, d + 1); 7 if (!isZero() && !p.isZero()) 8 for (unsigned i = 0; i < r.d_value.size(); ++i) 9 for (unsigned j = 0; j < d_value.size(); ++j) 10 if (i < j + p.d_value.size()) 11 r[i] += d_value[j] * p[i - j]; 12 r.normalize(); 13 return r; 14 }

note that as we are inside the template, we can write `poly`

instead of `poly<n, T>`

. moreover, here, use is made of a special constructor `poly(bool, unsigned size)`

which initializes the polynomial with `size`

coefficients. the `bool`

is just to ensure that the compiler won't confuse this constructor with any of the others. this constructor is made `private`

.

printing the polynomial to a string is slightly more involved, as the correct names should be used for the indeterminates. for this, i use a method `void print(std::ostream & s, int N = n) const`

which keeps track of the maximal number of indeterminates. the `operator<<`

is then implemented as follows:

1 friend inline std::ostream & operator << (std::ostream & s, const poly & p) 2 { 3 p.print(s); 4 return s; 5 }

the implementation of `print(...)`

works recursively, and `N`

keeps track of `n`

of the top-level instance. the indeterminate at the current level has index `N - n`

.

finally, let me announce evaluation of polynomials. this will be done with `operator()`

. the implementation inside `poly<n, T>`

is very easy:

1 template<class S> 2 inline poly_evaluator<n, T, S> operator() (const S & x) const 3 { 4 return poly_evaluator<n, T, S>(*this, x); 5 }

the interesting part, namely the implementation of the `poly_evaluator<n, T, S>`

template, will be discussed in part four of this series.

### making the storage more intelligent.

in the above code, we use a simple `std::vector<poly<n - 1, T> >`

to store the coefficients, which are (except for `n=1`

) itself polynomials. for example, doing a simple resize of `d_value`

in `poly<11, T>`

will result in maybe many objects of type `poly<10, T>`

to be created, copied and destroyed, and this in turn causes maybe many more objects of type `poly<9, T>`

to be created, copied and destroyed, and so on. this is not exactly optimal. one solution is to replace `std::vector<poly<n, T> >`

by `std::vector<poly<n, T>*>`

, i.e. store pointers to polynomials instead of polynomials itself. this will make resizes very efficient, since only pointers have to moved around. unfortunately, it has a drawback: in the case `n == 1`

, this is often not very optimal, as if `T==int`

, we are storing pointers to `int`

s, which are usually at least the same size as the `int`

s themselves. this leads to inefficient code. one solution is to introduce another partial specialization for `n=1`

, but this results in a lot of duplicated code. a more clever solution is to create another template, like `template<class T, bool usePointers> class poly_ivector;`

, which stores pointers to `T`

and makes sure objects are properly created and destroyed in case `usePointers == true`

, and which is simply `std::vector<T>`

in disguise in case `usePointers == false`

. it provides a simple interface, offering the number of stored elements (`unsigned size() const`

), to resize the array (`void resize(unsigned)`

), accessing elements via `operator[]`

, and accessing the last element via `back()`

. finally, it offers a `void swap(poly_ivector<T, usePointers> &)`

method to swap the contents of two `poly_ivector<T, usePointers>`

instances. if you are interested in the implementation of `poly_ivector<T, bool>`

, please look at the source code.

### allocators.

in the real implementation, i've added basic support for allocators. these allow to provide own allocation functions (`malloc`

/`free`

) for memory used to store the coefficients. note that my template is not working perfectly with allocators having states, but for state-less allocators, it works fine. if you need to use it with allocators having states, better check if the code does exactly what you want.

i used the polynomial class with a self-written allocator which provides thread local memory management. this was necessary for the aforementioned toy project, for which i wanted to do massive parallel computation (128 threads). it turned out that most threads were waiting for others, as there was a lot of allocation/deallocation going on and the global memory manager, protected by a mutex, made everyone wait. after throwing in a not very efficient, but working memory manager with thread-local storage for the management data, the program ran perfectly parallel.

## comments.