Using the polynomial module

In this notebook we give a short tutorial on how to use the Lazylinop polynomial module.

First of all, install Lazylinop if you have not done so already (see the installation guide). If you are not yet familiar with the basics of Lazylinop, we invite you to start with the quick start guide.

0. Needed imports

The notebook of course relies on Lazylinop, and additionally on NumPy and SciPy, so let's import them.

[1]:
import lazylinop as lz
from lazylinop.polynomial import Polynomial, Chebyshev
import numpy as np
import scipy as sp

1. Introduction

The polynomial module provides a set of classes and functions dedicated to the manipulation of polynomials that can act on linear operators.

Mathematically, given a polynomial \(P(x) = c_0 + c_1 x + \cdots + c_n x^n\) and any square matrix \(M\), one can define \(P(M) = c_0 Id + c_1 M + \cdots + c_n M^n\). Convenient polynomial classes are implemented in NumPy, yet they cannot readily be applied in this matrix sense to NumPy arrays or SciPy LinearOperators. The polynomial module fixes this by implementing the needed classes, which allow one to define and manipulate \(K = P\left(L\right)\), where \(L\) is any square LazyLinOp, and the resulting \(K\) is again a LazyLinOp.
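To make this concrete, here is a minimal sketch (assuming only the imports of section 0 and the lazylinop.ones operator used later in this notebook) that builds \(P(L) = 2 Id + 3 L + L^2\) and checks it against a dense evaluation:

import numpy as np
import lazylinop as lz
from lazylinop.polynomial import Polynomial

P = Polynomial([2.0, 3.0, 1.0])  # P(x) = 2 + 3x + x^2
L = lz.ones((4, 4))              # LazyLinOp of ones
M = np.ones((4, 4))              # dense counterpart of L
x = np.random.randn(4)

K = P(L)                                # K = P(L) is again a LazyLinOp
dense = 2 * np.eye(4) + 3 * M + M @ M   # dense P(M)
print(np.allclose(K @ x, dense @ x))    # expected: True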

Relationship with NumPy polynomials

The polynomial module actually relies heavily on the NumPy representation of polynomials, but adds support for LazyLinOp indeterminates. As in NumPy, it allows you to perform standard operations on polynomials (e.g. p1 + p2) but also to lazily evaluate a polynomial on an indeterminate (naturally written P(L) in the code), pretty much as you would write it in mathematics. Note however that, contrary to NumPy's, the Lazylinop polynomials are defined with respect to the matrix power, that is, L**n = L @ L @ ... @ L (with L a LazyLinOp), while NumPy's are defined with the elementwise array power A**n = A * A * ... * A (with A a NumPy array).
One last remark on the NumPy/Lazylinop comparison: it is not advised to use Lazylinop polynomials with scalar indeterminates defined as LazyLinOp objects of dimensions \(1 \times 1\). It is possible to do so, but to the best of our knowledge NumPy is faster for that use case.
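Coming back to the matrix-power convention, the difference with NumPy can be checked directly. The sketch below wraps a small NumPy array as a LazyLinOp through lz.kron with a \(1 \times 1\) identity, only to reuse an operator constructor that appears later in this notebook:

import numpy as np
import numpy.polynomial as npp
import lazylinop as lz
from lazylinop.polynomial import Polynomial

A = np.array([[1.0, 2.0], [3.0, 4.0]])
x = np.array([1.0, 1.0])

# NumPy polynomials use elementwise powers: evaluating x^2 at A gives A * A
print(np.allclose(npp.Polynomial([0.0, 0.0, 1.0])(A), A * A))        # True

# Lazylinop polynomials use matrix powers: p(L) @ x equals L @ (L @ x)
L = lz.kron(A, np.eye(1))  # a LazyLinOp whose underlying matrix is A
print(np.allclose(Polynomial([0.0, 0.0, 1.0])(L) @ x, A @ (A @ x)))  # True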

Lazy polynomial use cases & quick API tour

Of course, many use cases exist. An example is the approximation of an operator exponential \(e^{L}\approx c_0 Id + c_1 L + c_2 L^2 + \cdots + c_n L^n\), where the \(c_i\) are the coefficients of the polynomial. The main interest of this approximation is to save compute time; Section 5 says more about this kind of approximation. More generally, the polynomial module follows the lazy and sparse computing philosophy of Lazylinop: you can compute the result of \(P\left(L\right)\) applied to a batch \(X\) of column-vectors without ever explicitly computing the matrix associated with \(P(L)\). This saves memory and delays all computations to the moment \(P\left(L\right) @ X\) is actually evaluated.
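For instance, here is a small sketch of the batch behaviour just described (assuming the imports of section 0):

import numpy as np
import lazylinop as lz
from lazylinop.polynomial import Polynomial

P = Polynomial([1.0, 1.0, 0.5])   # 1 + x + x^2/2
L = lz.ones((1000, 1000))
X = np.random.randn(1000, 3)      # a batch of 3 column-vectors

Y = P(L) @ X    # computed only now, without ever building the matrix of P(L)
print(Y.shape)  # (1000, 3)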

A number of functions and classes are hence available to lazily evaluate polynomials; the most important ones are listed here with their API documentation:

For general polynomials:

For the evaluation of polynomials in monomial form:

  • Polynomial(coef)(L) @ x or xpoly(coef, kind='monomial')(L) @ x

For the evaluation of polynomials defined by their roots:

  • Polynomial([roots, leading_coef], fromRoots=True)(L) @ x or xpoly([roots, leading_coef], kind='roots')(L) @ x

The notebook does not show a use case for kind='roots'; please refer to the documentation for more details.

For the evaluation of polynomials in Chebyshev form:

  • Chebyshev(coef)(L) @ x or xpoly(coef, kind='chebyshev')(L) @ x

For the exponentiation:

These APIs are demonstrated in the rest of this notebook. In particular, the available polynomial operations are introduced more gradually than in the API documentation.

Many examples are given in this notebook, from the simplest to more advanced ones.
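As a quick preview of the correspondences listed above, the class constructors and the xpoly helper should build equivalent operators. The sketch below assumes that xpoly can be imported from lazylinop.polynomial; please check its API documentation.

import numpy as np
import lazylinop as lz
from lazylinop.polynomial import Polynomial, Chebyshev, xpoly  # the xpoly import path is our assumption

coef = [1.0, -2.0, 0.5]
L = lz.ones((100, 100))
x = np.random.randn(100)

print(np.allclose(Polynomial(coef)(L) @ x, xpoly(coef, kind='monomial')(L) @ x))  # expected: True
print(np.allclose(Chebyshev(coef)(L) @ x, xpoly(coef, kind='chebyshev')(L) @ x))  # expected: True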

2. N-th power of an operator

The simplest syntax to create a polynomial is p = Polynomial([c_0,..., c_n]), which creates a polynomial instance such that \(p(L) = c_0 Id + c_1 L + \cdots + c_n L^n\) for any LazyLinOp \(L\). Hence, the polynomial instances p1 and p2 below are such that \(p_1(L) = L\) and \(p_2(L) = L^2\). Let us check that \((p_1^2)(L)\), \(p_2(L)\) and \(L@L\) all implement the same LazyLinOp: applying them to any vector \(x\) yields the same result. For simplicity we illustrate this with a simple LazyLinOp L associated with a square matrix full of ones, and with a random vector x.

[2]:
p1 = Polynomial([0.0, 1.0])
print("p1 is", p1)
p2 = Polynomial([0.0, 0.0, 1.0])
print("p2 is", p2)
print("p1 and p2 are instances of:", p2.__class__)

N = 5000
L = lz.ones((N, N)) # L is a LazyLinOp built by lazylinop.ones
M = np.ones((N, N)) # NumPy array of ones
x = np.random.randn(N)

print("(A) lazylinop time for (p1 ** 2)(L) @ x:")
%timeit -r 100 -n 10 (p1 ** 2)(L) @ x
print("(B) lazylinop time for p2(L) @ x:")
%timeit -r 100 -n 10 p2(L) @ x
print("(C) numpy time for M @ (M @ x):")
%timeit -r 100 -n 10 M @ (M @ x)
p1 is 0.0 + 1.0·x
p2 is 0.0 + 0.0·x + 1.0·x²
p1 and p2 are instances of: <class 'lazylinop.polynomial.Polynomial'>
(A) lazylinop time for (p1 ** 2)(L) @ x:
4.85 ms ± 501 µs per loop (mean ± std. dev. of 100 runs, 10 loops each)
(B) lazylinop time for p2(L) @ x:
4.77 ms ± 138 µs per loop (mean ± std. dev. of 100 runs, 10 loops each)
(C) numpy time for M @ (M @ x):
10.4 ms ± 934 µs per loop (mean ± std. dev. of 100 runs, 10 loops each)

This quick benchmark shows that, in this specific example, evaluating the polynomial of the lazy ones operator is faster than the NumPy computation. The reason it is faster is actually not the Polynomial instances themselves but rather the Lazylinop ones operator. However, the two Polynomial instances p1 and p2 allow us to use this operator as an indeterminate, which is what makes the speedup achievable.
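You can check this yourself by timing the lazy ones operator alone, without any Polynomial instance (timings not reproduced here):

# Timing L @ (L @ x) isolates the contribution of the lazylinop.ones operator;
# compare with the (A), (B) and (C) timings above.
%timeit -r 100 -n 10 L @ (L @ x)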

3. Basic operations

We have already seen that, given \(p_1\), it is possible to compute the n-th power \(p_1^n\) via the usual ** syntax. Other simple operations are the addition, subtraction, or composition of polynomials. Please have a look at the following lines of code, which build a polynomial p5, apply it to a linear operator L, and apply the resulting operator p5(L) to a vector x.

[3]:
p1 = Polynomial([1.0, -2.0]) # First polynomial is 1 - 2 * x
p2 = Polynomial([0.0, 0.0, 1.0]) # Second polynomial is x^2
p3 = Polynomial([-1.0, 1.0]) # Third polynomial is x - 1
p4 = Polynomial([0.0, 0.0, 0.0, 0.0, 2.0]) # Fourth polynomial is 2 * x^4
print("p1 instance is", p1)
print("p2 instance is", p2)
print("p3 instance is", p3)
print("p4 instance is", p4)

p5 = 2 * (p1 + p2) ** 2 - p4(p3)
print("p5 instance is", p5)

N = 1000
L = lz.ones((N, N))
x = np.random.randn(N)

print("p5(L) @ x is all zero:", np.allclose(0.0, p5(L) @ x), "(as expected).")
p1 instance is 1.0 - 2.0·x
p2 instance is 0.0 + 0.0·x + 1.0·x²
p3 instance is -1.0 + 1.0·x
p4 instance is 0.0 + 0.0·x + 0.0·x² + 0.0·x³ + 2.0·x⁴
p5 instance is 0.0
p5(L) @ x is all zero: True (as expected).

Indeed p5(L) == 0 for any L, because p4(p3)(L) == 2 * (L - 1)**4 and 2 * ((p1 + p2)**2)(L) == 2 * (L**2 - 2 * L + 1)**2 == 2 * (L - 1)**4.

Now let's experiment with the composition p = t1(p1) of the two polynomials t1 and p1. The evaluation p(L) @ x can be written either as (t1(p1))(L) @ x or as t1(p1(L)) @ x. The two evaluations may not be equivalent in terms of computation time, even though the results are the same, as you can see in this example:

[4]:
p1 = Polynomial(np.random.randn(10)) # First polynomial
t1 = Chebyshev(np.random.randn(100)) # Second polynomial in Chebyshev form

N = 10
L = lz.kron(
    (1.0 / N ** 2) * np.random.randn(N, N),
    (1.0 / N ** 2) * np.random.randn(N, N)
)
x = np.random.randn(L.shape[1])

print("duration of (t1(p1))(L) @ x:")
%timeit -r 40 -n 10 (t1(p1))(L) @ x
print("duration of t1(p1(L)) @ x:")
%timeit -r 40 -n 10 t1(p1(L)) @ x

print("Are the two results the same ?" +
      " {0:s}".format(
          str(np.allclose(
            (t1(p1))(L) @ x,
            t1(p1(L)) @ x))))
duration of (t1(p1))(L) @ x:
27.1 ms ± 915 µs per loop (mean ± std. dev. of 40 runs, 10 loops each)
duration of t1(p1(L)) @ x:
21.8 ms ± 573 µs per loop (mean ± std. dev. of 40 runs, 10 loops each)
Are the two results the same ? True

Even though the calculations give the same results in both cases, the computation times are not the same: the evaluation (t1(p1))(L) @ x is slower than t1(p1(L)) @ x. That is worth keeping in mind when you use Lazylinop polynomials.

4. Changing a polynomial to the Chebyshev basis

As you may have noticed in our last example, one can create polynomials with coefficients expressed in different polynomial bases. We have already seen the syntax p = Polynomial([c0,..., c_n]) for the standard representation as a sum of monomials, but one can also use chebyshev, hermite (physicists' polynomials), hermite_e (probabilists' polynomials), laguerre, legendre and roots. For example, q = Chebyshev([c0,...,c_n]) creates a polynomial such that \(q(L) = c_0 Id + c_1 T_1(L) + \cdots + c_n T_n(L)\) where \(T_i\) is the \(i\)-th Chebyshev polynomial.

Following the previous section, we now show how p1 and p2, which are defined in the monomial basis, can be converted to a representation in the Chebyshev basis using the NumPy-style syntax convert(kind=Chebyshev). We then apply the resulting polynomial of L to x, as before.

[5]:
print(isinstance(p1 + p2, Polynomial))
print(isinstance((p1 + p2).convert(kind=Chebyshev), Chebyshev))
print(isinstance(p1 + p2, Polynomial))

L = lz.ones((x.shape[0], x.shape[0]))
ypoly = (p1 + p2)(L) @ x
ycheb = ((p1 + p2).convert(kind=Chebyshev))(L) @ x
np.linalg.norm(ypoly - ycheb) / np.linalg.norm(ypoly)
True
True
True
[5]:
1.2341363840556378e-16

We maintain the same accuracy using Chebyshev polynomials; all is fine!

5. Approximating the exponential of a Kronecker product using Polynomial

In the above examples, we mainly manipulated p(L) with low-degree polynomials p and a simple LazyLinOp L corresponding to a square matrix filled with ones. If you found these examples too trivial, you may be interested in the next one: we now use the polynomial module to efficiently approximate the exponential \(e^{\tau K}\) of a (lazy) Kronecker product \(K=A\otimes B\). We will test it and then compute an approximation of the action of the exponential of K on x (that is, \(e^{\tau K} x\)). First we construct the NumPy arrays A and B, the NumPy array K corresponding to their Kronecker product, and the LazyLinOp lK obtained via a lazy Kronecker product.

[8]:
A = np.random.randn(2, 2)
B = np.random.randn(4, 4)
K = np.kron(A, B)
lK = lz.kron(A, B)

Then we apply SciPy's expm to compute the matrix exponential of K and apply it to a random vector; this will serve as our reference. Note that it is important to first normalize our operator \(K\): large eigenvalues lead to numerical issues in the computation of the exponential.

[9]:
norm = np.linalg.norm(K)
lK = lK / norm
K = K / norm

x = np.random.rand(lK.shape[1])
y = sp.linalg.expm(K) @ x

Now, we construct an instance p1 of the Polynomial class corresponding to the degree-2 Taylor approximation \(1+x+\frac{x^2}{2}\) of the exponential function \(e^x\). This allows us to approximate the action of the exponential of lK on \(x\) with the simple instruction p1(lK) @ x.

[10]:
p1 = Polynomial([1.0, 1.0, 0.5])
approx = p1(lK) @ x
np.linalg.norm(y - approx) / np.linalg.norm(y)
[10]:
0.0017207308680879968

Well, the approximation is not so good, but we can increase the precision by increasing the degree of the Taylor approximation. To do so, we create a new polynomial gathering the third- to sixth-degree terms of the Taylor series, \(\frac{x^3}{3!}+\frac{x^4}{4!}+\frac{x^5}{5!}+\frac{x^6}{6!}\).

[11]:
p2 = Polynomial([0.0, 0.0, 0.0, 1.0 / 6.0, 1.0 / 24.0, 1.0 / 120.0, 1.0 / 720.0])
approx = (p1 + p2)(lK) @ x
np.linalg.norm(y - approx) / np.linalg.norm(y)
[11]:
2.8228073277124983e-08

This is not perfect but already much better.

6. Benchmarking the exponential of a Kronecker product

We know from the previous section that the polynomial approximation works; now we want to know whether it is also faster. So, as before, we create our Kronecker product \(K=A\otimes B\) of matrices \(A\) and \(B\), both as a LazyLinOp lK and as a NumPy array K.

Again, we normalize our operator \(K\) in order to avoid a diverging approximation. The norm should be chosen carefully to reduce the cost. You might also divide by the largest eigenvalue, which comes at a lower cost if you use SciPy's scipy.sparse.linalg.eigsh(A, k=1) (see the sketch below).
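As a side note, here is a sketch of the eigenvalue-based normalization just mentioned. Since scipy.sparse.linalg.eigsh expects a symmetric (Hermitian) operator, the sketch builds a symmetric matrix on purpose; for a general matrix you could estimate the spectral norm with scipy.sparse.linalg.svds instead.

import numpy as np
import scipy as sp

S = np.random.randn(16, 16)
A_sym = S + S.T   # a symmetric matrix, as required by eigsh
# largest-magnitude eigenvalue (k=1), cheaper than a full norm computation for large operators
lam = sp.sparse.linalg.eigsh(A_sym, k=1, return_eigenvectors=False)[0]
A_normalized = A_sym / abs(lam)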

[12]:
M, N = 16, 16
A = np.random.randn(M, N)
B = np.random.randn(M, N)
K = np.kron(A, B)

norm = np.linalg.norm(K)

lK = 1 / norm * lz.kron(A, B)
K = 1 / norm * np.kron(A, B)

Then we create a polynomial p9 in monomial form with the first nine coefficients of the Taylor series of the exponential, \(e^x\approx\sum_{i=0}^8\frac{x^i}{i!}\).

We check that our result and the one given by SciPy are close.

[13]:
x = np.random.randn(lK.shape[1])
p9 = Polynomial(1.0 / sp.special.factorial(np.arange(9), exact=True))
y = p9(lK) @ x
z1 = sp.linalg.expm(K) @ x
z2 = sp.sparse.linalg.expm_multiply(K, x)
print(np.linalg.norm(y - z1) / np.linalg.norm(z1))
print(np.linalg.norm(y - z2) / np.linalg.norm(z2))
3.9053842693899996e-16
1.7580767299374295e-16

Perfect! Now let's measure the computation time, comparing our results against the scipy.linalg.expm and scipy.sparse.linalg.expm_multiply functions.

[14]:
%timeit -r 10 -n 2 p9(lK) @ x
The slowest run took 4.43 times longer than the fastest. This could mean that an intermediate result is being cached.
979 µs ± 732 µs per loop (mean ± std. dev. of 10 runs, 2 loops each)
[15]:
%timeit -r 10 -n 2 sp.linalg.expm(K) @ x
The slowest run took 5.61 times longer than the fastest. This could mean that an intermediate result is being cached.
24.3 ms ± 22 ms per loop (mean ± std. dev. of 10 runs, 2 loops each)
[16]:
%timeit -r 10 -n 2 sp.sparse.linalg.expm_multiply(K, x)
The slowest run took 9.63 times longer than the fastest. This could mean that an intermediate result is being cached.
1.98 ms ± 1.81 ms per loop (mean ± std. dev. of 10 runs, 2 loops each)

On this particular example, the Lazylinop polynomial approximation clearly outperforms SciPy!

Note that, as one would expect, the larger the matrices, the better the performance of the polynomial approximation relative to SciPy.

You can follow the same idea to compute a Pascal matrix, which involves the exponential of a nilpotent matrix, as sketched below.
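Here is a sketch of that idea (our own construction, not taken from the Lazylinop documentation): the lower-triangular Pascal matrix equals \(e^{S}\) for the nilpotent matrix \(S\) with \(1, 2, \ldots, n-1\) on its subdiagonal, so a Taylor polynomial of degree \(n-1\) evaluates the exponential exactly.

import numpy as np
import scipy as sp
import lazylinop as lz
from lazylinop.polynomial import Polynomial

n = 5
S = np.diag(np.arange(1.0, n), k=-1)   # nilpotent: its n-th matrix power is zero
lS = lz.kron(S, np.eye(1))             # wrap S as a LazyLinOp (reusing lz.kron)

# degree n-1 Taylor polynomial of exp; exact here because S is nilpotent
taylor = Polynomial(1.0 / sp.special.factorial(np.arange(n), exact=True))
P = taylor(lS) @ np.eye(n)             # apply to the identity to materialize exp(S)

print(np.allclose(P, sp.linalg.pascal(n, kind='lower', exact=False)))  # expected: True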

Lazylinop provides many similar polynomial approximation functions with arbitrary order of approximation.

Feel free to test and explore them, and to write your own.

Thanks for reading this notebook! Find others on the Lazylinop website.