skip to main content.

posts for 2011. (page 12.)

welcome to the second part of the multivariate polynomials in c++ using templates series. this part will explain how to store multivariate polynomials via templates, how to implement storage and operations.

multivariate polynomial rings in mathematics.

as you might recall from your algebra courses, R[x_0, \dots, x_{n-1}] denotes the (ring, in case R itself is a ring) of multivariate polynomials in the n indeterminates (also called variables) X_0, \dots, X_{n-1} with coefficients in R. note that i numerate the indeterminates beginning with zero, which is rather intuitive for a c++ programmer, though often not for a mathematician. there are different ways to treat such polynomials; for example, one can work with multiindices i = (i_0, \dots, i_{n-1}) \in \N^n; for such an i, let a_i be an element of R and define X^i := X_0^{i_0} \cdots X_{n-1}^{e_{n-1}}; then a polynomial f \in R[X_0, \dots, X_{n-1}] can be represented as f = \sum_{i \in \N^n} a_i X^i. clearly, one does not want to work with infinite objects, whence one restricts to a subset of \N^n, for example of type [0, (k_0, \dots, k_{n-1})] \cap \N^n = \{ i \in \N^n \mid 0 \le i_j \le k_j \text{ for } j = 0, \dots, n-1 \}. this is one way of representing multivariate polynomials, and fine for several applications, but not the most easy one to implement in c++ as a template depending on n.

another way to treat multivariate polynomials is to iteratively write R[X_0, \dots, X_{n-1}] = (R[X_1, \dots, X_{n-1}])[X_0] (one could also use R[X_0, \dots, X_{n-1}] = (R[X_0, \dots, X_{n-2}])[X_{n-1}], but i will stick to the other as it is more natural for the implementation i'm describing). hence, a polynomial f \in R[X_0, \dots, X_{n-1}] is interpreted as an <i>univariate</i> polynomial with coefficients in the ring R[X_1, \dots, X_{n-1}]. this allows a recursive definition which is well-suited for templates.

implementing the data structure using templates.

now let us leave the realm of mathematics for a while and talk about c++. the ring R mentioned above will be replaced by an abstract type T, which will be a template parameter. note that in the discussion below, we will ignore allocators, and just talk about them at the very end of this article. our main template type will be template<int n, class T> class poly;, which essentially contains a std::vector<poly<n - 1, T> > to store coefficients. (the storage is in fact more delicate, we will discuss this below.)

clearly, this is not enough yet. the compiler has no clue that we are talking about multivariate polynomials, and that n should always be non-negative, and it also does not knows that poly<0, T> is essentially an element of T. the way to solve this is partial template specialization: we create a specialized version of poly<n, T> for the case n == 0. the declaration is as follows:

 1template<int n, class T>
 2class poly; // forward declaration
 3
 4template<class T>
 5class poly<0, T> // specialize for n = 0
 6{
 7private:
 8    T d_value;
 9public:
10    ...
11}
12
13template<int n, class T>
14class poly // the general implementation
15{
16private:
17    std::vector<poly<n - 1, T> > d_value;
18public:
19    ...
20}

the specialization for n == 0 has a rather simple implementation. besides a constructor accepting an object of type T, with default value T(), it has a casting operator to T, provides an isZero() predicate, assignment and arithmetic operations, output to std::ostreams, and using std::swap() to swap two poly<0, T> objects, it has a simple “evaluate” function implemented as operator() which is essentially the same as the cast operator operator T().

the general implemenation is more interesting. there is a method normalize() which ensures that d_value.back() is the leading coefficient and d_value.size() - 1 the degree of the polynomial; note that we use -1 for the degree of the zero polynomial. these conditions are used for example in the implementation of isZero(), leading() and degree(). for most operations, the polynomial itself makes sure it is normalized, but when accessing a non-const polynomial with operator[], it might change its degree. after that, one better calls normalize(). it would also be possible to do this automatically using more template tricks, but i wanted to leave some responsibility with the user of the template. :-)

the arithmetic operations are pretty straightforward to implement; one has to resize d_value and later call normalize() to make sure nothing bad happens. for multiplication, one creates a temporary which contains the product (the maximal degree of the product is known, but normalize() should be run afterwards since there is no garuantee that the ring containing the objects of type T is zero-divisor free). as an example, adding two polynomials is accomplished by the following operator:

 1    inline poly operator + (const poly & q) const
 2    {
 3        poly r(*this);
 4        if (r.d_value.size() < q.d_value.size())
 5            r.d_value.resize(q.d_value.size());
 6        for (unsigned i = 0; i < q.d_value.size(); ++i)
 7            r[i] += q[i];
 8        r.normalize();
 9        return r;
10    }

first, a copy of the first argument (*this) is made. then, its size is adjusted to make sure there's enough room. then, simply all coefficients of the second argument are added, and finally the result is normalize()d and returned. as another example, here's polynomial multiplication:

 1    inline poly operator * (const poly & p) const
 2    {
 3        int d = p.degree() + degree();
 4        if (d < -1)
 5            d = -1;
 6        poly r(true, d + 1);
 7        if (!isZero() && !p.isZero())
 8            for (unsigned i = 0; i < r.d_value.size(); ++i)
 9                for (unsigned j = 0; j < d_value.size(); ++j)
10                    if (i < j + p.d_value.size())
11                        r[i] += d_value[j] * p[i - j];
12        r.normalize();
13        return r;
14    }

note that as we are inside the template, we can write poly instead of poly<n, T>. moreover, here, use is made of a special constructor poly(bool, unsigned size) which initializes the polynomial with size coefficients. the bool is just to ensure that the compiler won't confuse this constructor with any of the others. this constructor is made private.

printing the polynomial to a string is slightly more involved, as the correct names should be used for the indeterminates. for this, i use a method void print(std::ostream & s, int N = n) const which keeps track of the maximal number of indeterminates. the operator<< is then implemented as follows:

1    friend inline std::ostream & operator << (std::ostream & s, const poly & p)
2    {
3        p.print(s);
4        return s;
5    }

the implementation of print(...) works recursively, and N keeps track of n of the top-level instance. the indeterminate at the current level has index N - n.

finally, let me announce evaluation of polynomials. this will be done with operator(). the implementation inside poly<n, T> is very easy:

1    template<class S>
2    inline poly_evaluator<n, T, S> operator() (const S & x) const
3    {
4        return poly_evaluator<n, T, S>(*this, x);
5    }

the interesting part, namely the implementation of the poly_evaluator<n, T, S> template, will be discussed in part four of this series.

making the storage more intelligent.

in the above code, we use a simple std::vector<poly<n - 1, T> > to store the coefficients, which are (except for n=1) itself polynomials. for example, doing a simple resize of d_value in poly<11, T> will result in maybe many objects of type poly<10, T> to be created, copied and destroyed, and this in turn causes maybe many more objects of type poly<9, T> to be created, copied and destroyed, and so on. this is not exactly optimal. one solution is to replace std::vector<poly<n, T> > by std::vector<poly<n, T>*>, i.e. store pointers to polynomials instead of polynomials itself. this will make resizes very efficient, since only pointers have to moved around. unfortunately, it has a drawback: in the case n == 1, this is often not very optimal, as if T==int, we are storing pointers to ints, which are usually at least the same size as the ints themselves. this leads to inefficient code. one solution is to introduce another partial specialization for n=1, but this results in a lot of duplicated code. a more clever solution is to create another template, like template<class T, bool usePointers> class poly_ivector;, which stores pointers to T and makes sure objects are properly created and destroyed in case usePointers == true, and which is simply std::vector<T> in disguise in case usePointers == false. it provides a simple interface, offering the number of stored elements (unsigned size() const), to resize the array (void resize(unsigned)), accessing elements via operator[], and accessing the last element via back(). finally, it offers a void swap(poly_ivector<T, usePointers> &) method to swap the contents of two poly_ivector<T, usePointers> instances. if you are interested in the implementation of poly_ivector<T, bool>, please look at the source code.

allocators.

in the real implementation, i've added basic support for allocators. these allow to provide own allocation functions (malloc/free) for memory used to store the coefficients. note that my template is not working perfectly with allocators having states, but for state-less allocators, it works fine. if you need to use it with allocators having states, better check if the code does exactly what you want.

i used the polynomial class with a self-written allocator which provides thread local memory management. this was necessary for the aforementioned toy project, for which i wanted to do massive parallel computation (128 threads). it turned out that most threads were waiting for others, as there was a lot of allocation/deallocation going on and the global memory manager, protected by a mutex, made everyone wait. after throwing in a not very efficient, but working memory manager with thread-local storage for the management data, the program ran perfectly parallel.

in this (short) series, i want to explain how to implement support for multivariate polynomials in c++ using templates. for a toy project i worked on mostly last weekend, i needed to work with polynomials in different ways:

  • create a polynomial in three variables with double coefficients;
  • computing the partial derivatives of this polynomial;
  • evaluate the above polynomials in many points (with double coordinates);
  • plug in different univariate polynomials (of form \alpha_i T + \beta_i) into the first polynomial to obtain a univariate polynomial whose zeros correspond to the intersection of the surface defined by the multivariate polynomial together with a parameterized line;
  • test whether a univariate polynomial has zeros in (0, \infty) using sturm chains.

for a given, simple enough polynomial, everything can be done “by hand”, but sooner or later, this is annoying and too restricting. and it gets really annoying for the second part, where i wanted to obtain the univariate polynomial. therefore, i thought about how to create a multivariate polynomial template which provides means to implement all of the above in an easy manner. in the end, i had a multivariate polynomial template with a nice interface, for which – as a small example – the following test program compiles:

 1#include <iostream>
 2#include "polynomial.hpp"
 3
 4int main()
 5{
 6    poly<2, int> f = Monomial<int>(1, 2) + 3 * Monomial<int>(4, 5);
 7    std::cout << f << "\n";
 8    int e = f(4)(2);
 9    std::cout << e << "\n";
10    std::cout << derivative<1>(f) << "\n";
11    poly<1, int> g = f(2), h = f(3);
12    std::cout << g << " and " << h << "\n";
13    std::cout << GCD<double>(g, h) << "\n"; 
14    // cast polynomials to poly<1, double> first; otherwise,
15    // result will be incorrect since int is not a field
16    std::cout << GCD(g, h) << "\n";
17    // to prove our point, check for yourself that the
18    // result is X_0^5 instead of X_0^2
19    poly<1, double> l = f(Monomial<double>(1) - 1)(Monomial<double>(1) - 2);
20    std::cout << l << "\n";
21    std::cout << derivative<0>(l) << "\n";
22    std::cout << GCD(l, derivative<0>(l)) << "\n";
23}

compiling it via g++ -O3 -Wall -o poly-test poly-test.cpp, it yielded the following output when run:

1(1 X_1^2 X_0 + 3 X_1^5 X_0^4)
224592
3(2 X_1 X_0 + 15 X_1^4 X_0^4)
4(2 X_0^2 + 48 X_0^5) and (3 X_0^2 + 243 X_0^5)
51 X_0^2
61 X_0^5
7(-100 + 632 X_0 + -1781 X_0^2 + 2905 X_0^3 + -3006 X_0^4 + 2043 X_0^5 + -912 X_0^6 + 258 X_0^7 + -42 X_0^8 + 3 X_0^9)
8(632 + -3562 X_0 + 8715 X_0^2 + -12024 X_0^3 + 10215 X_0^4 + -5472 X_0^5 + 1806 X_0^6 + -336 X_0^7 + 27 X_0^8)
91

let me explain this example. first, a polynomial f = X_0 X_1^2 + 3 X_0^4 X_1^5 in two indeterminates with int coefficients is created as the linear combination of two monomials (X_0 X_1^2 and X_0^4 X_1^5), which is the first line of the output. then the polynomial is evaluated in (4, 2), and the result is printed as line 2. after that, the partial derivative of the polynomial f with respect to X_1 is computed and written as line 3. after that, the polynomial is evaluated in X_0 in 2 and 3, yielding two new polynomials g(X_0) = f(2, X_0) and h(X_0) = f(3, X_0) which are printed in line 4. in line 5, the monic greatest common divisor of g and h is computed, for which the polynomials are treated as polynomials with double coefficients. that this is necessary is demonstrated by the sixth line, in which the output is printed when ints are used to compute the greatest common divisor: clearly, the result is wrong. then, in line 7 of the output, we evaluate f at (T - 1, T - 2), yielding a univariate polynomial \ell(X_0) = f(X_0 - 1, X_0 - 2) with double coefficients. line 8 of the output shows its derivative \ell', and line 9 the monic greatest common divisor of \ell and \ell'.

note that the program is easy to read for people used to the notion of multivariate polynomials and c++.

in the second part of the series, i want to describe how to represent the polynomials. in the third part, i want to explain how partial derivatives are computed. the fourth part will describe how to do efficient evaluation, and the fifth part will explain the long division and euclidean algorithm implementations.

finally, the source code of the library polynomial.hpp, the test program poly-test.cpp and a simple makefile (using g++) can be downloaded as a zip file here (10 kb).

last wednesday, i was with a friend at another concert in the z-7. according to orphaned land‘s singer kobi farhi, this concert was part of the first (european?) oriental metal tour ever. the tour features four bands: the french hardcore/punk/metal band artweg, the tunesian progressive metal band myrath, the algerian/french melodic death metal band arkan as well as the israelian progressive death metal band orphaned land.
the opening band, artweg, wasn’t my personal favorite, mostly because i’m not that much into hardcore. nonetheless, they delivered a very good stage show. the second band was myrath from tunesia (according to encyclopaedia metallum, the first tunesian (metal?) band to be signed on a label). they are playing very progressive metal, and influences from bands like symphony x and dream theater are very audible, but also their oriental influences. this band is really fantastic, and made me buy their two albums which were available on the merch stand ;-) after myrath, arkan entered the stage. during the first song, there were only screamed male vocals, which i didn’t liked too much, but as soon as their female singer entered the stage and the vocal quality increased a lot :-) from that point on, i really enjoyed their music. i ended up buying one of their albums as well… finally, orphaned land entered the stage. i’ve been waiting for some years to finally see them live, and i have to say that it was totally worth it. (no need to buy albums here, since i already have them all ;-) )
one thing i felt really bad about was that the concert was very poorly attended. there were maybe little over 50 attendees, in a hall which fits many hundreds. this was really a shame, as the concert was – at least for me – probably the best one this year. i really hope the bands are more lucky at their other gigs!