Using the FAµST Factorization Wrappers

Previous notebooks already addressed a part of the pyfaust's functionalities however pyfaust.fact which is a central module was not covered. It is mainly dedicated to the algorithms that (approximately) factorize a dense matrix into a Faust object. This is the subject covered (partly) in this notebook.

You will see how to launch the algorithms easily, how to feed them with your own complex set of parameters, how to run them for only few steps when possible, how to use a variant or another, how to enable debugging information, etc.

NOTE: the notebook is made to be executed sequentially, otherwise, skipping some cells, you would end up on an import error.

Table of Contents

1. The Hierarchical PALM4MSA Algorithm
1.1 Generating a Hadamard Matrix
1.2 Factorizing a Hadamard Matrix
1.2.1 Defining the Constraints
1.2.2 Setting the Rest of the Parameters and Running the Algorithm
1.2.3 Using Projectors instead of ConstraintList-s

2. The GFT Algorithm
2.0 The pygsp Toolbox
2.1 The Truncated Jacobi Algorithm

1. The Hierarchical PALM4MSA Algorithm

1.1 Generating a Hadamard Matrix

Before to tackle Hadamard matrices factorization, let's introduce one pyfaust's function which is actually directly related: wht.
This method allows you to generate sparse Hadamard transforms.

All the factors are the same, let's print the first one.

You might want to verify if $H$ is a proper Hadamard matrix.

The ugly code above basically verifies that all column or row vectors are mutually orthogonal and made only of -1, 0 and 1 coefficients, that is exactly the Hadamard matrix definition. The response is yes, so we can go ahead.

1.2 Factorizing a Hadamard Matrix

Let's begin by factorizing $H$ the easy way with the automatic parametrization provided by pyfaust.

Interestingly, the FH2's factors are not the same as FH's but the resulting numpy array is exactly the same. Just to be sure let's verify the relative error against H.

Good! The argument 'hadamard' of faust_fact is a shorthand to a set of parameters fitting the factorization of Hadamard matrices.

Speaking about shorthand, faust_fact is an alias of pyfaust.fact.hierarchical. This function implements the hiearchical factorization based on PALM4MSA. For details about the theory behind these algorithms you can read Luc Le Magoarou's thesis manuscript.

We won't go into further details nevertheless it's important to see how to define the parameters manually in order to proceed to your own custom factorizations. For that purpose we need to describe approximately the algorithm but please keep in mind this is just an insight (for instance, the norms below are voluntarily not defined).

Basically, the hierarchical algorithm works like this:

1.2.1 Defining the Constraints

The explanation above is eluding something essential: the sparsity constraints. Indeed, the purpose of the algorithm is not only to decompose a matrix but moreover to enhance its sparsity, as hinted by the FAµST acronym: Flexible Approximate Multi-layer Sparse Transform.

So you'll need to feed the algorithm with sparsity constraints. In fact, you'll define one pair of constraints per iteration, the first is for $S_{i}$ (the main factor) and the second for $R_i$ (the residual).

The pyfaust API is here to help you define the constraints in one shot but you can if you want define constraints one by one as we'll see later.

Let's unveil the factorization constraints to decompose the Hadamard matrix $H \in \mathbb R^{n \times n}$.
Let's go back to the code and define our constraints as lists (provided by the module pyfaust.factparams):

The 'splincol' constraint used here is for defining the maximum number of nonzeros elements to respect for any row or column of the considered matrix (here $S_i$ or $R_i$). Looking at S_constraints initialization, we see in the first line 'splincol', 2, n, n; the value 2 means we target 2 nonzeros per-column and per-row, the next arguments define the size of the matrix to constrain (its number of rows and columns).

So in the example of S_constraints all the targeted matrices have 2n nonzeros.

More generally in pyfaust, the constraints are defined in terms of norms, in our case of Hadamard factorization we used 0-norm (which is not actually a norm) as follows:

Enough of explanations! I let you decrypt why R_constraints correspond likewise to (C2).

But wait a minute, what would happen if we had a set of one hundred constraints? Don't worry! The pyfaust API allows to alternatively set the constraints one by one:

S_constraints and R_constraints are still the exact same set of constraints as before.

ConstraintInt is a family of integer constraints. More globally, there is a hierarchy of classes whose parent class is ConstraintGeneric where you'll find other kind of constraints ; ConstraintMat, ConstraintReal.

The table here can give you an idea of each constraint definition. Most constraints are associated to a proximal operator which projects the matrix into the set of matrices defined by the constraint or at least it tries to (because it's not necessarily possible).

In the latest version of FAµST it's possible to project matrices from the wrappers themselves, let's try one ConstraintInt:

We asked for a sparsity of 18 nonzeros and that's what we got after calling project.

The function project can help you debug a factorization or even understand exactly how a prox works for a specific constraint and matrix.

Another API doc link completes the definition of constraints : ConstraintName

If you find it complicated to parameterize your own set of constraints, well, you are not the only one! You may be happy to know that it is one of the priorities of the development team to provide a simplified API for the factorization algorithms in an upcoming release.
A simplification is already available with the pyfaust.proj package for which you might read the following specific notebook: Using The FAµST Projectors API. An example of the same configuration of constraints but the projector way is introduced in 1.2.3.

1.2.2 Setting the Rest of the Parameters and Running the Algorithm

OK, let's continue defining the algorithm parameters, the constraints are the most part of it. You have optional parameters too, but one key parameter is the stopping criterion of the PALM4MSA algorithm. You have to define two stopping criteria, one for the local optimization and another for the global optimization.

The type of stopping criterion used here is the number of iterations of PALM4MSA, that is the number of times all the current factors ($R_i$ included) are updated either for the local or the global optimization. So, with the initialization of loc_stop and glob_stop above, if you ask J factors (through the number of constraints you set) you'd count a total of $(30+30)*(J-1)$ iterations of PALM4MSA.

Ok, I think we're done with the algorithm parameters, we can pack them into one object and give it to faust_fact.

You might wonder what is the boolean is_update_way_R2L. Its role is to define if the PALM4MSA algorithm will update the $S_i$ and last $R_i$ factors from the left to right (if False) or toward the opposite direction (if True). It does change the results of factorization!

I must mention that there are several other optional arguments you'd want to play with for configuring, just dive into the documentation.

You also might call the PALM4MSA algorithm directly (not the hierarchical algorithm which makes use of it), the function is here. You can theoretically reproduce the hierarchical algorithm step by step calling palm4msa() by yourself.

Well, there would be a lot to say and show about these two pyfaust algorithms and their parameters but at least with this example, I hope, you got an insight of how it works.

1.2.3 Using Projectors instead of ConstraintList-s

Now that we know how to factorize a matrix, let's show in this short section how to use a more handy pyfaust API component than ConstraintList to define a set of projectors (or proximity operators) instead of constraints. Indeed when a Constraint* object is defined behind the scene a projector is used to compute the matrix image (with respect to a certain structure, e.g. the sparsity of the matrix). So why don't we directly define projectors? That is the thing, there is no reason not to do that, moreover it's simpler! So let's see how to define the Hadamard matrix factorization set of projectors and run again the algorithm.

As you see the factorization gives exactly the same Faust as the one before (with ConstraintList instead of pyfaust.proj list). It could seem to be just a syntactic detail here to use the pyfaust.proj package instead of ConstraintList but it is more than that, projectors like splincol are functors and can be used easily to project a matrix.
Again, I can only advise you to read the dedicated notebook about projectors if you want to go further into details about projectors and discover the collection of the projectors available in the pyfaust API.

2. The GFT Algorithm

Since version 2.4, FAµST is able to compute Fast Graph Fourier Transforms. Indeed, FAµST includes the Truncated Jabcobi algorithm for this goal. It allows to diagonalize symmetric positive definite matrices.

It is implemented in C++ and a Python wrapper makes it available from pyfaust.

Here again the theory isn't what matters but feel free to take a look at the following papers:

2.0 The pygsp Toolbox

Since we're going to work with graph Laplacians it would be a pity not to plot things. The pygsp toolbox provides what you need to easily obtain Laplacians of different graph families. Please install the package with pip (or maybe conda).

Let's plot the Minnesota graph (on the right) and its associated Laplacian matrix (on the left).

The goal of FGFT algorithms is to obtain an orthonormal Faust that nearly diagonalizes a symmetric matrix. In terms of graph signal processing, the considered symmetric matrix is typically a graph Laplacian $L$, and the Faust is expected to approximate the graph Fourier basis, i.e., the basis of eigenvectors of $L$. Given $L$, the algorithm outputs an orthonormal Faust $\hat U$ and a diagonal factor $\hat D$ such that: $L \approx \hat U \hat D \hat U^T$.

2.1 The Truncated Jacobi Algorithm

Let's run the truncated Jacobi algorithm (for eigen decomposition, hence the name eigtj) on the Minnesota Laplacian example. You'll see it's really straightforward!

Write some code to check the first factor of $\hat U $ is orthogonal and then that $\hat U$ itself is orthogonal (with a tolerance near to the machine espilon).

Ignoring the fact that $\hat U$ is approximate and maybe some slight numerical error, yes it is orthogonal.

Good! The Laplacian approximation defined by $\hat U$ and $\hat D$ seems not so bad according to the relative error. Do you want a smaller error ? It's time to read the documentation, the eigtj()'s keyword argument nGivens can help you in that goal.

Is that all? No, we can evaluate the memory saving $\hat U$ brings as a Faust relatively to the matrix U obtained by numpy.linalg.eig().

The memory space to store $\hat U$ is smaller since its number of nonzeros elements is less than that of $U$.

The execution time for Faust-vector multiplication is really better too (compared to the matrix-vector multiplication) ! Applying the Graph Fourier Transform $\hat U$ on x is faster.


The third notebook is ending here, I hope you'll be interested to dig into pyfaust API and maybe even give some feedback later. The API has made a lot of progress lastly but it remains a work-in-progress and some bugs might appear here or there. Besides, the documentation and the API can always receive some enhancement so don't hesitate, any suggestion is welcome!


Note: this notebook has been executed using the following version of pyfaust.