implementing multivariate polynomials in c++ using templates [part 5/5].

welcome to the fifth (also last and easiest) part of the multivariate polynomials in c++ using templates series. this part will explain how i implemented long division and the euclidean algorithm.

long division.

long division in is defined as follows: given polynomials , find polynomials such that and . provided that the leading term of is a unit in (or at least divides most things), and exist and are unique (in fact, i'm implicitly assuming that is commutative and unitary). this division makes an euclidean ring in case is a field (i.e. all non-zero elements are invertible). our T is in general not a field, but we can still try to do long division, and let the user worry if this actually makes sense. for example, if T == int, one can divide two elements (assuming the second is not zero), but this is not the division a mathematician would expect. hence, if is not monic in case of T == int, our implmentation will output <i>something</i> which does not necessarily make sense.

anyway. my implementation is rather straightforward, which is not very surprising:

 1template<class T>
2void eucliddiv(const poly<1, T> & f, const poly<1, T> & g, poly<1, T> & q, poly<1, T> & r)
3{
4    if (f.degree() < g.degree())
5    {
6        // Ignore the trivial case
7        q = poly<1, T>();
8        r = f;
9        return;
10    }
11    q = poly<1, T>(true, f.degree() - g.degree() + 1);
12    r = f;
13    for (int i = q.degree(); i >= 0; --i)
14    {
15        q[i] = r[i + g.degree()] / g.leading();
16        for (int j = g.degree(); j >= 0; --j)
17            r[i + j] -= q[i] * g[j];
18    }
19    r.normalize();
20    // Note that the degree of q is already correct.
21}


here, we assume that f.leading() / g.leading() is non-zero; otherwise, q.normalize() would have to be called, but in that case the whole result will probably be broken anyway. we create the polynomial q using the special constructor mentioned here.

euclidean algorithm.

the implementation of the euclidean algorithm is not depending on the implementation anymore (long division made only use of the special constructor). first, we take care of special cases (one of the two arguments is zero), and then proceed with the main loop. we normalize the remainder in every iteration of the loop, and also the result, to (hopefully) increase numerical stability a bit. hence, for this to work, we really need that T is a field, or otherwise it will only work in very special cases.

 1template<class T>
2poly<1, T> GCD(const poly<1, T> & f, const poly<1, T> & g)
3{
4    if (f.isZero())
5    {
6        if (g.isZero())
7            return f; // both are zero
8        else
9            return g / g.leading(); // make g monic
10    }
12    while (!d2.isZero())
13    {
14        eucliddiv(d1, d2, q, r);
15        d1.swap(d2);
16        d2 = r;
17        d2 /= r.leading(); // make r monic
18    }
19    return d1;
20}


i don't think there's much to write about this, which isn't already written somewhere else. this just shows how to implement a generic polynomial based algorithm.

implementing multivariate polynomials in c++ using templates [part 4/5].

welcome to the fourth part of the multivariate polynomials in c++ using templates series. in this part, i want to explain how to implement efficient polynomial evaluation using a collection of interacting templates. the output will be quite optimized code (assuming the compiler has a decent enough optimizer).

evaluating multivariate polynomials is more complicated. i started with a naive implementation of operator() in poly<n, T>:

 1    template<class S>
2    poly<n - 1, S> operator() (const S & x) const
3    {
4        poly<n - 1, S> res;
5        S xx = (S)1;
6        for (unsigned i = 0; i < d_value.size(); ++i)
7        {
8            res += d_value[i] * xx;
9            xx *= x;
10        }
11        return res;
12    }


if f is of type poly<2, int>, then f(4) will be of type poly<1, int>, whence f(4)(2) will call poly<1, int>::operator()<int>(const int &) to yield a polynomial of type poly<0, int>, which automatically casts to an int. unfortunately, in this process, first f(4) will be created, which requires arithmetic of polynomials of type poly<1, int>, and then the resulting polynomial will be evaluated again. in particular, if n is larger, this is far from being optimal. hence, this solution is fine if evaluations are seldomly done, but if they are done more often, it is too slow.

to stay with our example of f of type poly<2, int>. one could evaluate , written as f(4)(2), directly as follows:

 1int result = 0, xx1 = 1;
2for (int i = 0; i <= f.degree(); ++i)
3{
4    int result2 = 0, xx2 = 1;
5    for (int j = 0; j <= f[i].degree(); ++j)
6    {
7        result2 += f[i][j] * xx2;
8        xx2 *= 2;
9    }
10    result += result2 * xx1;
11    xx1 *= 4;
12}


clearly, this is tiresome (and prone to typos) to write every time. moreover, this is also not optimal, as our operator[] will do more than just returning an element. in particular, if f is not const in this context, the compiler will insert code at every operation f[i] and f[i][j] to check whether the index is out of range (to resize d_value in that case). writing more directly

 1int result = 0, xx1 = 1;
2for (int i = 0; i < f.d_value.size(); ++i)
3{
4    int result2 = 0, xx2 = 1;
5    for (int j = 0; j < f.d_value[i].d_value.size(); ++j)
6    {
7        result2 += f.d_value[i].d_value[j] * xx2;
8        xx2 *= 2;
9    }
10    result += result2 * xx1;
11    xx1 *= 4;
12}


would result in faster code, if it would compile – d_value is private. for this reason, i came up with two templates:

1template<int n, class T, class S = T>
2class poly_evaluator;
3template<int n, class T, class HL, class S>
4class poly_evaluator_impl;


the template poly_evaluator<n, T, S> is instanciated in poly<n, T>::operator()<S>(const S &), which is defined as follows:

1    template<class S>
2    inline poly_evaluator<n, T, S> operator() (const S & x) const
3    {
4        return poly_evaluator<n, T, S>(*this, x);
5    }


the idea is that if operator() of poly_evaluator<n, T, S>(*this, x) is called, it will spawn an object of type poly_evaluator_impl<n-1, T, poly_evaluator<n, T, S>, S>. if operator() of this new object is called, it will create an object of type poly_evaluator_impl<n-2, T, poly_evaluator_impl<n-1, T, poly_evaluator<n, T, S>, S>, S>, and so on. the purpose of carrying the type of the caller around is that the outer loop of the evaluation function is the loop for the inner-most object (of type poly_evaluator<n, T, S>). if operator() would be right-associative instead of left-associative (as it is), this trouble would not be necessary.

the idea is that both templates poly_evaluator<n, T, S> and poly_evaluator<n, T, HL, S> provide an internal function template<class SS, class Fun> evaluate(SS & res, Fun evalfun), which in case of poly_evaluator<n, T, HL, S> calls the corresponding evaluate function of its owner (with evalfun replaced), and in the case of poly_evaluator<n, T, S> implements the outermost loop, which loops over the coefficients of and uses evalfun to evaluate the coefficients. hence, the evalful handed upwards must be extended in every step to include the next variable.

assume that we have a polynomial f of type poly<3, int>, and we want to evaluate f(10)(20)(30). this is an object of type poly_evaluator_impl<1, int, poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int>, int>, whose cast operator operator int() will start evaluation by defining a local class poly_evaluator_impl<1, int, poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int>, int>::eval_fun, defined as follows:

 1    class eval_fun
2    {
3        const poly_evaluator_impl<1, int, poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int>, int> & d_owner;
4
5    public:
6        inline eval_fun(const poly_evaluator_impl<1, int, poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int>, int> & owner)
7            : d_owner(owner)
8        {
9        }
10
11        inline int operator() (const poly<1, int> & p) const
12        {
13            int res = 0;
14            int xx = 1;
15            for (int i = 0; i < (int)p.d_value.size(); ++i)
16            {
17                res += p.d_value[i] * xx;
18                xx = xx * d_owner.d_evalpoint;
19            }
20            return res;
21        }
22    };


an object of it is created by operator int() as follows:

1    inline operator int() const
2    {
3        int res = 0;
4        d_owner.evaluate(res, eval_fun(*this));
5        return res;
6    }


the object eval_fun(*this) can now be used to evaluate polynomials of type poly<1, int> at the specified evaluation point d_owner.d_evalpoint, which equals 30 in this case.

the operator int() calls poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int>::evaluate<int, Fun>(int &, Fun), which in turn is implemented as follows:

1    template<class Fun>
2    inline void evaluate(int & res, const Fun & evalfun) const
3    {
4        d_owner.evaluate(res, eval_fun<int, Fun>(*this, evalfun));
5    }


the newly created object of type eval_fun<int, Fun> will now evaluate polynomials of type poly<2, int> by using the provided functor evalfun, which is an object of the class poly_evaluator_impl<1, int, poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int>, int>::eval_fun defined above. the class eval_fun in poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int> is defined as follows:

 1    template<class Fun>
2    class eval_fun
3    {
4        const poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int> & d_owner;
5        const Fun & d_evalfun;
6
7    public:
8        inline eval_fun(const poly_evaluator_impl<2, int, poly_evaluator<3, int, int>, int> & owner, const Fun & evalfun)
9            : d_owner(owner), d_evalfun(evalfun)
10        {
11        }
12
13        inline int operator() (const poly<2, int> & p) const
14        {
15            int res = 0;
16            int xx = 1;
17            for (int i = 0; i < (int)p.d_value.size(); ++i)
18            {
19                res += d_evalfun(p.d_value[i]) * xx;
20                xx = xx * d_owner.d_evalpoint;
21            }
22            return res;
23        }
24    };


note that d_owner.d_evalpoint is 20 in this case. finally, we are at the top level, namely in the function poly_evaluator<3, int, int>::evaluate<int, Fun>(), defined as follows:

 1    template<class Fun>
2    inline void evaluate(int & res, const Fun & evalfun) const
3    {
4        int xx = 1;
5        for (int i = 0; i < (int)d_poly.d_value.size(); ++i)
6        {
7            res += evalfun(d_poly.d_value[i]) * xx;
8            xx = xx * d_evalpoint;
9        }
10    }


(here, d_evalpoint is 10.) this is the outer loop. the middle loop is inserted when calling evalfun(d_poly.d_value[i]), and into that the innermost loop is inserted when it calles d_evalfun(p.d_value[i]). this all essentially boils down to the following combination:

 1    {
2        int res = 0;
3        int xx = 1;
4        for (int i = 0; i < (int)d_poly.d_value.size(); ++i)
5        {
6            int tmp;
7            const poly<2, int> & p = d_poly.d_value[i];
8            {
9                int res2 = 0;
10                int xx2 = 1;
11                for (int i2 = 0; i2 < (int)p.d_value.size(); ++i2)
12                {
13                    int tmp2;
14                    const poly<2, int> & p2 = p.d_value[i2];
15                    {
16                        int res3 = 0;
17                        int xx3 = 1;
18                        for (int i3 = 0; i3 < (int)p2.d_value.size(); ++i3)
19                        {
20                            res3 += p2.d_value[i3] * xx3;
21                            xx3 = xx3 * d_owner.d_evalpoint;
22                        }
23                        tmp2 = res3;
24                    }
25                    res2 += tmp2 * xx2;
26                    xx2 = xx2 * d_owner.d_evalpoint;
27                }
28                tmp = res2;
29            }
30            res += tmp * xx;
31            xx = xx * d_evalpoint;
32        }
33        return res;
34    }


of course, this is just an intermediate step (to highlight the similiarities to all the code snippets from above); the compiler will optimize this to something more compact.

this “example” already presented most of the concept. the implementation itself is more complicated, since different types can be used for evaluation (at different stages, even), and since the implementation has support for allocators.

an important point is that both templates poly_evaluator<n, T, S> and poly_evaluator<n, T, HL, S> are specialized for n == 1, as in that case, there is no lower level, and hence these specializations provide no evaluate template and try to pass evaluation further down, but right away pass it higher up to their owner, or in case of poly_evaluator<1, T, S>, it will just do the evaluation.

this concludes the fourth part. the last part will concentrate something less complicated: implementing long division and the euclidean algorithm.

implementing multivariate polynomials in c++ using templates [part 3/5].

welcome to the third part of the multivariate polynomials in c++ using templates series. this part will explain how to compute partial derivatives using partial template specialization tricks.

the idea is very simple. given , written as with , we have

and

this allows us to implement partial derivatives iteratively. we assume that the number of the indeterminate by which one should differentiate is already known during compile time. the implementation will be done by a templated function

1template<int N, int n, class T>
2inline poly<n, T> derivative(const poly<n, T> & p);


the purpose of the template parameter N is to say that should be computed. given a polynomial p, the function can be called simply by derivative<N>(p). to implement derivative<N>(p), we would like to again use partial template specialization to distinguish between the cases N == 0 and N > 0. unfortunately, partial template specialization is not available for template functions. hence, we need to introduce a helper template class:

1template<int N, int n, class T>
2class derivative_computer;


it should contain one static function, static inline void computeDerivative(const poly<n, T> & src, poly<n, T> & dest); which allows to compute the partial derivative of the polynomial in src and store it into dest. then derviative<N>(p) can simply be implemented by

1template<int N, int n, class T>
2inline poly<n, T, std::allocator<T> > derivative(const poly<n, T, std::allocator<T> > & p)
3{
4    poly<n, T> res;
5    derivative_computer<N, n, T>::computeDerivative(p, res);
6    return res;
7}


now let us focus on the class template derivative_computer<N, n, T>. the general implementation, for N > 0, uses the second formula from above and iteratively calls derivative_computer<N-1, n-1, T>::computeDerivative():

 1template<int N, int n, class T>
2class derivative_computer
3{
4public:
5    static inline void computeDerivative(const poly<n, T> & src, poly<n, T> & dest)
6    {
7        dest.d_value.resize(src.degree() + 1);
8        for (int i = 0; i <= src.degree(); ++i)
9            derivative_computer<N - 1, n - 1, T>::computeDerivative(src[i], dest[i]);
10        dest.normalize();
11    }
12};


now we are left with the case that N = 0. the implementation here is also simple:

 1template<int n, class T>
2class derivative_computer<0, n, T>
3{
4public:
5    static inline void computeDerivative(const poly<n, T> & src, poly<n, T> & dest)
6    {
7        dest.d_value.resize(src.degree() >= 0 ? src.degree() : 0);
8        for (int i = 1; i <= src.degree(); ++i)
9            dest[i - 1] = src[i] * (T)i;
10        dest.normalize();
11    }
12};


here, we cast i to an object of type T before multiplication. from this point on, no further iteration is needed.

we are left with the case that some (malicious or just confused) user calls derivative<N>(p) for a polynomial p of type poly<n, T> with n <= N. in that case, the compiler tries to instanciate derivative_computer<k, 0, T>::computeDerivative() and tries to treat poly<0, T> as a polynomial. this will result in strange error messages, and it would be nice if there is something more readable. for that reason, i added another two specializations:

 1template<class T>
2class derivative_computer<0, 0, T>
3{
4public:
5    class ERROR_N_must_be_strictly_less_than_n_in_derivative_template;
6
7    static inline void computeDerivative(const poly<0, T> &, poly<0, T> &)
8    {
9        ERROR_N_must_be_strictly_less_than_n_in_derivative_template();
10    }
11};
12
13template<int N, class T>
14class derivative_computer<N, 0, T>
15{
16public:
17    class ERROR_N_must_be_strictly_less_than_n_in_derivative_template;
18
19    static inline void computeDerivative(const poly<0, T> &, poly<0, T> &)
20    {
21        ERROR_N_must_be_strictly_less_than_n_in_derivative_template();
22    }
23};


these two definitions compile fine, but as soon as one of them is instanciated by the compiler, it will try to find out something about the class ERROR_N_must_be_strictly_less_than_n_in_derivative_template and not succeed, and spit out some error message hopefully containing the string ERROR_N_must_be_strictly_less_than_n_in_derivative_template. for example, if i replace the line computing the partial derivative of the intersection equation in poly-test.cpp by std::cout << derivative<2>(l) << "\n";, g++ will complain as follows:

1In file included from poly-test.cpp:2:0:
2polynomial.hpp: In static member function ‘static void derivative_computer<N, 0, T>::computeDerivative(const poly<0, T>&, poly<0, T>&) [with int N = 1, T = double]’:
3polynomial.hpp:1200:13:   instantiated from ‘static void derivative_computer<N, n, T>::computeDerivative(const poly<n, T>&, poly<n, T>&) [with int N = 2, int n = 1, T = double]’
4polynomial.hpp:1247:5:   instantiated from ‘poly<n, T> derivative(const poly<n, T>&) [with int N = 2, int n = 1, T = double]’
5poly-test.cpp:19:33:   instantiated from here
6polynomial.hpp:1227:9: error: invalid use of incomplete type ‘struct derivative_computer<1, 0, double>::ERROR_N_must_be_strictly_less_than_n_in_derivative_template’
7polynomial.hpp:1223:11: error: declaration of ‘struct derivative_computer<1, 0, double>::ERROR_N_must_be_strictly_less_than_n_in_derivative_template’
8make: *** [poly-test.o] Error 1


this is not exactly readable, but the uppercase ERROR quickly lets you look at the important part of that message. (note that i removed the allocators from the error messages, so that it is in sync with the above writing. if you try it out yourself with the source code, you'll see a slightly more complicated error message.)

this completes part three of the series. in the next part, i will discuss how to evaluate polynomials efficiently using templates.

implementing multivariate polynomials in c++ using templates [part 2/5].

welcome to the second part of the multivariate polynomials in c++ using templates series. this part will explain how to store multivariate polynomials via templates, how to implement storage and operations.

multivariate polynomial rings in mathematics.

as you might recall from your algebra courses, denotes the (ring, in case itself is a ring) of multivariate polynomials in the indeterminates (also called variables) with coefficients in . note that i numerate the indeterminates beginning with zero, which is rather intuitive for a c++ programmer, though often not for a mathematician. there are different ways to treat such polynomials; for example, one can work with multiindices ; for such an , let be an element of and define ; then a polynomial can be represented as . clearly, one does not want to work with infinite objects, whence one restricts to a subset of , for example of type . this is one way of representing multivariate polynomials, and fine for several applications, but not the most easy one to implement in c++ as a template depending on .

another way to treat multivariate polynomials is to iteratively write (one could also use , but i will stick to the other as it is more natural for the implementation i'm describing). hence, a polynomial is interpreted as an <i>univariate</i> polynomial with coefficients in the ring . this allows a recursive definition which is well-suited for templates.

implementing the data structure using templates.

now let us leave the realm of mathematics for a while and talk about c++. the ring mentioned above will be replaced by an abstract type T, which will be a template parameter. note that in the discussion below, we will ignore allocators, and just talk about them at the very end of this article. our main template type will be template<int n, class T> class poly;, which essentially contains a std::vector<poly<n - 1, T> > to store coefficients. (the storage is in fact more delicate, we will discuss this below.)

clearly, this is not enough yet. the compiler has no clue that we are talking about multivariate polynomials, and that n should always be non-negative, and it also does not knows that poly<0, T> is essentially an element of T. the way to solve this is partial template specialization: we create a specialized version of poly<n, T> for the case n == 0. the declaration is as follows:

 1template<int n, class T>
2class poly; // forward declaration
3
4template<class T>
5class poly<0, T> // specialize for n = 0
6{
7private:
8    T d_value;
9public:
10    ...
11}
12
13template<int n, class T>
14class poly // the general implementation
15{
16private:
17    std::vector<poly<n - 1, T> > d_value;
18public:
19    ...
20}


the specialization for n == 0 has a rather simple implementation. besides a constructor accepting an object of type T, with default value T(), it has a casting operator to T, provides an isZero() predicate, assignment and arithmetic operations, output to std::ostreams, and using std::swap() to swap two poly<0, T> objects, it has a simple “evaluate” function implemented as operator() which is essentially the same as the cast operator operator T().

the general implemenation is more interesting. there is a method normalize() which ensures that d_value.back() is the leading coefficient and d_value.size() - 1 the degree of the polynomial; note that we use -1 for the degree of the zero polynomial. these conditions are used for example in the implementation of isZero(), leading() and degree(). for most operations, the polynomial itself makes sure it is normalized, but when accessing a non-const polynomial with operator[], it might change its degree. after that, one better calls normalize(). it would also be possible to do this automatically using more template tricks, but i wanted to leave some responsibility with the user of the template. :-)

the arithmetic operations are pretty straightforward to implement; one has to resize d_value and later call normalize() to make sure nothing bad happens. for multiplication, one creates a temporary which contains the product (the maximal degree of the product is known, but normalize() should be run afterwards since there is no garuantee that the ring containing the objects of type T is zero-divisor free). as an example, adding two polynomials is accomplished by the following operator:

 1    inline poly operator + (const poly & q) const
2    {
3        poly r(*this);
4        if (r.d_value.size() < q.d_value.size())
5            r.d_value.resize(q.d_value.size());
6        for (unsigned i = 0; i < q.d_value.size(); ++i)
7            r[i] += q[i];
8        r.normalize();
9        return r;
10    }


first, a copy of the first argument (*this) is made. then, its size is adjusted to make sure there's enough room. then, simply all coefficients of the second argument are added, and finally the result is normalize()d and returned. as another example, here's polynomial multiplication:

 1    inline poly operator * (const poly & p) const
2    {
3        int d = p.degree() + degree();
4        if (d < -1)
5            d = -1;
6        poly r(true, d + 1);
7        if (!isZero() && !p.isZero())
8            for (unsigned i = 0; i < r.d_value.size(); ++i)
9                for (unsigned j = 0; j < d_value.size(); ++j)
10                    if (i < j + p.d_value.size())
11                        r[i] += d_value[j] * p[i - j];
12        r.normalize();
13        return r;
14    }


note that as we are inside the template, we can write poly instead of poly<n, T>. moreover, here, use is made of a special constructor poly(bool, unsigned size) which initializes the polynomial with size coefficients. the bool is just to ensure that the compiler won't confuse this constructor with any of the others. this constructor is made private.

printing the polynomial to a string is slightly more involved, as the correct names should be used for the indeterminates. for this, i use a method void print(std::ostream & s, int N = n) const which keeps track of the maximal number of indeterminates. the operator<< is then implemented as follows:

1    friend inline std::ostream & operator << (std::ostream & s, const poly & p)
2    {
3        p.print(s);
4        return s;
5    }


the implementation of print(...) works recursively, and N keeps track of n of the top-level instance. the indeterminate at the current level has index N - n.

finally, let me announce evaluation of polynomials. this will be done with operator(). the implementation inside poly<n, T> is very easy:

1    template<class S>
2    inline poly_evaluator<n, T, S> operator() (const S & x) const
3    {
4        return poly_evaluator<n, T, S>(*this, x);
5    }


the interesting part, namely the implementation of the poly_evaluator<n, T, S> template, will be discussed in part four of this series.

making the storage more intelligent.

in the above code, we use a simple std::vector<poly<n - 1, T> > to store the coefficients, which are (except for n=1) itself polynomials. for example, doing a simple resize of d_value in poly<11, T> will result in maybe many objects of type poly<10, T> to be created, copied and destroyed, and this in turn causes maybe many more objects of type poly<9, T> to be created, copied and destroyed, and so on. this is not exactly optimal. one solution is to replace std::vector<poly<n, T> > by std::vector<poly<n, T>*>, i.e. store pointers to polynomials instead of polynomials itself. this will make resizes very efficient, since only pointers have to moved around. unfortunately, it has a drawback: in the case n == 1, this is often not very optimal, as if T==int, we are storing pointers to ints, which are usually at least the same size as the ints themselves. this leads to inefficient code. one solution is to introduce another partial specialization for n=1, but this results in a lot of duplicated code. a more clever solution is to create another template, like template<class T, bool usePointers> class poly_ivector;, which stores pointers to T and makes sure objects are properly created and destroyed in case usePointers == true, and which is simply std::vector<T> in disguise in case usePointers == false. it provides a simple interface, offering the number of stored elements (unsigned size() const), to resize the array (void resize(unsigned)), accessing elements via operator[], and accessing the last element via back(). finally, it offers a void swap(poly_ivector<T, usePointers> &) method to swap the contents of two poly_ivector<T, usePointers> instances. if you are interested in the implementation of poly_ivector<T, bool>, please look at the source code.

allocators.

in the real implementation, i've added basic support for allocators. these allow to provide own allocation functions (malloc/free) for memory used to store the coefficients. note that my template is not working perfectly with allocators having states, but for state-less allocators, it works fine. if you need to use it with allocators having states, better check if the code does exactly what you want.

i used the polynomial class with a self-written allocator which provides thread local memory management. this was necessary for the aforementioned toy project, for which i wanted to do massive parallel computation (128 threads). it turned out that most threads were waiting for others, as there was a lot of allocation/deallocation going on and the global memory manager, protected by a mutex, made everyone wait. after throwing in a not very efficient, but working memory manager with thread-local storage for the management data, the program ran perfectly parallel.

implementing multivariate polynomials in c++ using templates [part 1/5].

in this (short) series, i want to explain how to implement support for multivariate polynomials in c++ using templates. for a toy project i worked on mostly last weekend, i needed to work with polynomials in different ways:

• create a polynomial in three variables with double coefficients;
• computing the partial derivatives of this polynomial;
• evaluate the above polynomials in many points (with double coordinates);
• plug in different univariate polynomials (of form ) into the first polynomial to obtain a univariate polynomial whose zeros correspond to the intersection of the surface defined by the multivariate polynomial together with a parameterized line;
• test whether a univariate polynomial has zeros in using sturm chains.

for a given, simple enough polynomial, everything can be done “by hand”, but sooner or later, this is annoying and too restricting. and it gets really annoying for the second part, where i wanted to obtain the univariate polynomial. therefore, i thought about how to create a multivariate polynomial template which provides means to implement all of the above in an easy manner. in the end, i had a multivariate polynomial template with a nice interface, for which – as a small example – the following test program compiles:

 1#include <iostream>
2#include "polynomial.hpp"
3
4int main()
5{
6    poly<2, int> f = Monomial<int>(1, 2) + 3 * Monomial<int>(4, 5);
7    std::cout << f << "\n";
8    int e = f(4)(2);
9    std::cout << e << "\n";
10    std::cout << derivative<1>(f) << "\n";
11    poly<1, int> g = f(2), h = f(3);
12    std::cout << g << " and " << h << "\n";
13    std::cout << GCD<double>(g, h) << "\n";
14    // cast polynomials to poly<1, double> first; otherwise,
15    // result will be incorrect since int is not a field
16    std::cout << GCD(g, h) << "\n";
17    // to prove our point, check for yourself that the
18    // result is X_0^5 instead of X_0^2
19    poly<1, double> l = f(Monomial<double>(1) - 1)(Monomial<double>(1) - 2);
20    std::cout << l << "\n";
21    std::cout << derivative<0>(l) << "\n";
22    std::cout << GCD(l, derivative<0>(l)) << "\n";
23}


compiling it via g++ -O3 -Wall -o poly-test poly-test.cpp, it yielded the following output when run:

1(1 X_1^2 X_0 + 3 X_1^5 X_0^4)
224592
3(2 X_1 X_0 + 15 X_1^4 X_0^4)
4(2 X_0^2 + 48 X_0^5) and (3 X_0^2 + 243 X_0^5)
51 X_0^2
61 X_0^5
7(-100 + 632 X_0 + -1781 X_0^2 + 2905 X_0^3 + -3006 X_0^4 + 2043 X_0^5 + -912 X_0^6 + 258 X_0^7 + -42 X_0^8 + 3 X_0^9)
8(632 + -3562 X_0 + 8715 X_0^2 + -12024 X_0^3 + 10215 X_0^4 + -5472 X_0^5 + 1806 X_0^6 + -336 X_0^7 + 27 X_0^8)
91


let me explain this example. first, a polynomial in two indeterminates with int coefficients is created as the linear combination of two monomials ( and ), which is the first line of the output. then the polynomial is evaluated in , and the result is printed as line 2. after that, the partial derivative of the polynomial with respect to is computed and written as line 3. after that, the polynomial is evaluated in in 2 and 3, yielding two new polynomials and which are printed in line 4. in line 5, the monic greatest common divisor of and is computed, for which the polynomials are treated as polynomials with double coefficients. that this is necessary is demonstrated by the sixth line, in which the output is printed when ints are used to compute the greatest common divisor: clearly, the result is wrong. then, in line 7 of the output, we evaluate at , yielding a univariate polynomial with double coefficients. line 8 of the output shows its derivative , and line 9 the monic greatest common divisor of and .

note that the program is easy to read for people used to the notion of multivariate polynomials and c++.

in the second part of the series, i want to describe how to represent the polynomials. in the third part, i want to explain how partial derivatives are computed. the fourth part will describe how to do efficient evaluation, and the fifth part will explain the long division and euclidean algorithm implementations.

finally, the source code of the library polynomial.hpp, the test program poly-test.cpp and a simple makefile (using g++) can be downloaded as a zip file here (10 kb).