Butterfly module#
Preamble#
This module gathers utilities to manipulate and create a particular family of LazyLinOp associated with matrices having the so-called butterfly structure, i.e., matrices that can be expressed as a
product of a few compatible Kronecker-Sparse factors.
Classical examples of such operators are the Hadamard transform and the Discrete-Fourier-Transform.
Construction of a butterfly LazyLinOp from a given array.
The main function of the module is ksd() (Kronecker-sparse decomposition), which computes a near-best approximation of a given array or LazyLinOp by an operator with a prescribed butterfly structure [1].
Example: Consider an \(N\times N\) Hadamard matrix (with \(N\) a power of two):
>>> import scipy as sp
>>> N = 256
>>> H = sp.linalg.hadamard(N)
Turning this NumPy array into an efficient butterfly implementation requires a simple call L = ksd(H, chain) where chain is a Chain object specifying the nature of the Kronecker-sparse factors involved in the decomposition.
The classical chain associated to the Hadamard matrix is square-dyadic, and can be retrieved as follows:
>>> from lazylinop.butterfly import Chain, ksd
>>> chain = Chain.square_dyadic(H.shape)
>>> L = ksd(H, chain)
It is also possible to use other chains (see below), leading to implementations with different tradeoffs between memory consumption, energy consumption and computational speed.
>>> chain2 = Chain.monarch(H.shape, ...)
>>> L2 = ksd(H, chain2)
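Whatever the chain, the quality of the decomposition can be checked through the relative reconstruction error. A minimal sketch for the square-dyadic L above, assuming H is small enough to densify the operator; since the Hadamard matrix admits an exact square-dyadic factorization, this error is expected to be at machine-precision level [1]:
>>> import numpy as np
>>> error = np.linalg.norm(H - L.toarray()) / np.linalg.norm(H)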
Construction of individual or multiple Kronecker-sparse factors.
Under the hood, the operator L resulting from a call to ksd() is a (lazy) product of Kronecker-sparse factors, each of them benefiting from an optimized implementation on various backends via the function ksm() (Kronecker-sparse multiplication operator).
A single Kronecker-sparse factor is fully determined by a 4D array \(T\) of values \(T_{i,j,k,l}\) of shape (a, b, c, d): the corresponding operator is built as L = ksm(T).
The shape of L is then \(abd\times acd\), and converting it to an array would yield a structured sparse matrix with support (the set of possible nonzero entries) \(I_a\otimes 1_{b\times c}\otimes I_d\).
See [1] and the documentation of ksm() for further details and illustrations.
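For illustration, the following sketch checks the shape and support of a small factor (assuming, as in the examples further below, that ksm() accepts a NumPy 4D array and that toarray() densifies the operator):
>>> import numpy as np
>>> from lazylinop.butterfly import ksm
>>> a, b, c, d = 2, 3, 4, 2
>>> T = np.random.randn(a, b, c, d).astype(np.float32)
>>> L = ksm(T)
>>> L.shape == (a * b * d, a * c * d)
True
>>> # Entries outside the support I_a x 1_{b,c} x I_d are zero.
>>> support = np.kron(np.kron(np.eye(a), np.ones((b, c))), np.eye(d))
>>> np.allclose(L.toarray()[support == 0], 0)
True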
Implementation parameters.
Options of ksm() (and of ksd()) notably allow using either the CPU or the GPU with various implementation parameters, or directly building a lazy product of Kronecker-sparse factors L = ksm([T1, ..., Tn], ...) to minimize memory movements between CPU and GPU when computing L @ x.
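For instance, two compatible factors can be grouped into a single call (a sketch with shapes borrowed from the ksm() examples below; backend options are left at their defaults here):
>>> import numpy as np
>>> from lazylinop.butterfly import ksm
>>> T1 = np.random.randn(2, 4, 3, 3).astype(np.float32)
>>> T2 = np.random.randn(3, 3, 5, 2).astype(np.float32)
>>> # One operator holding both factors, equivalent to ksm(T1) @ ksm(T2).
>>> L = ksm([T1, T2])
>>> x = np.random.randn(L.shape[1]).astype(np.float32)
>>> y = L @ x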
Saving and loading butterfly operators.
The LazyLinOp resulting from a call to ksm() or ksd() can be saved to disk (as a .json file) and loaded using:
>>> lazylinop.butterfly.save(L, 'filename')
>>> L = lazylinop.butterfly.load('filename')
Plot butterfly operators.
The LazyLinOp resulting from a call to ksm() or ksd() can be plotted or saved to disk (as a .png file and a .svg file) using:
>>> N = 16
>>> L = lazylinop.butterfly.dft(N)
>>> lazylinop.butterfly.plot(L)
>>> lazylinop.butterfly.plot(L, "dft_16x16")
Pre-built Discrete-Fourier-Transform.
We provide dft(), a pre-built LazyLinOp L with the Butterfly structure corresponding to the Discrete-Fourier-Transform.
The L corresponding to a DFT of a signal of size \(256\) with \(8\) factors (square-dyadic decomposition) is given by:
>>> L = lazylinop.butterfly.dft(256)
Pre-built Fast-Walsh-Hadamard-Transform.
We provide fwht(), a pre-built LazyLinOp L with the Butterfly structure corresponding to the Fast-Walsh-Hadamard-Transform (FWHT).
The L corresponding to a FWHT of a signal of size \(256\) with \(8\) factors (square-dyadic decomposition) is given by:
>>> L = lazylinop.butterfly.fwht(256)
Chains and their manipulation.
A chain encodes the structure of the Butterfly operator L and corresponds to a sequence of tuples \((a_l,~b_l,~c_l,~d_l)\)
such that Kronecker-sparse operators \(L_l\) with the corresponding structure have compatible sizes, i.e., such that the product \(L = L_1\cdots L_n\) is well-defined. The shape of \(L\) is given by the attribute chain.shape.
The concatenation of compatible chains is implemented via chain = chain1 @ chain2.
Specifying it explicitly is simple:
>>> chain = Chain([(a1, b1, c1, d1), ..., (an, bn, cn, dn)])
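For instance, the chain below (which happens to be the square-dyadic chain for \(N=8\), see further down) satisfies the compatibility constraint \(a_lc_ld_l=a_{l+1}b_{l+1}d_{l+1}\) between consecutive patterns:
>>> sd8_chain = Chain([(1, 2, 2, 4), (2, 2, 2, 2), (4, 2, 2, 1)])
>>> sd8_chain.shape
(8, 8)
>>> sd8_chain.n_patterns
3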
The most standard chain is probably the so-called square-dyadic chain, which is behind usual fast implementations of the Hadamard transform and of DFTs. It is only defined when \(\mathtt{shape}=\left(N,~N\right)\) with \(N\) a power of two as:
>>> sd_chain = Chain.square_dyadic(shape)
Another common chain, available for any non-prime shape, is the Monarch chain with two factors.
>>> chain = Chain.monarch(shape, ...)
You can visualize the structure of all Kronecker-sparse factors in these chains as follows:
>>> sd_chain.plot()
>>> chain.plot()
and anticipate the memory footprint of the corresponding operators:
>>> sd_chain = Chain.square_dyadic((32, 32))
>>> sd_chain.mem(np.float32)
1280
>>> chain = Chain.monarch((32, 32))
>>> chain.mem(np.float32)
1536
As you can see, the memory footprint of a Monarch chain is higher than that of a square-dyadic one, but when it fits in the memory of a GPU it can lead to a faster implementation.
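These numbers can be recovered by hand, assuming chain.mem() simply counts the stored ks_values (that is, \(abcd\) values per pattern) times the dtype item size:
>>> import numpy as np
>>> nvalues = sum(a * b * c * d for (a, b, c, d) in sd_chain.ks_patterns)
>>> nvalues * np.dtype(np.float32).itemsize
1280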
Chainability.
The above chains satisfy a “chainability” property, which is crucial to ensure that ksd() can be run with approximation guarantees [1].
Chainability can be checked using the attribute chain.chainable, and ksd() will generate an error if the provided chain is not chainable.
References#
[1] Butterfly Factorization with Error Guarantees. Léon Zheng, Quoc-Tung Le, Elisa Riccietti, and Rémi Gribonval. https://hal.science/hal-04763712v1/document
Butterfly construction#
- lazylinop.butterfly.ksd(A, chain, ortho=True, order='l2r', svd_backend=None, **kwargs)#
Returns a LazyLinOp corresponding to the (often called “butterfly”) factorization of A into Kronecker-sparse factors with sparsity patterns determined by a chainable instance chain of Chain. L = ksd(...) returns a LazyLinOp corresponding to the factorization of A where L = ksm(...) @ ksm(...) @ ... @ ksm(...).
Note
L.ks_values contains the ks_values of the factorization (see ksm() for more details).
As an example, the DFT matrix is factorized as follows:
M = 16
# Build DFT matrix using lazylinop.signal.fft function.
from lazylinop.signal import fft
from lazylinop.butterfly import Chain, ksd
from lazylinop.basicops import bitrev
V = fft(M).toarray()
# Use square-dyadic decomposition.
sd_chain = Chain.square_dyadic(V.shape)
# Multiply the DFT matrix with the bit-reversal permutation matrix.
P = bitrev(M)
L = ksd(V @ P.T, chain=sd_chain)
- Args:
- A: np.ndarray, cp.array, torch.Tensor or LazyLinOp
Matrix or LazyLinOp to factorize. When A is a small LazyLinOp it is often faster to perform ksd(A.toarray(), ...) than ksd(A, ...) if the dense array A.toarray() fits in memory.
- chain: Chain
Chainable instance of the Chain class. See Chain documentation for more details.
- ortho: bool, optional
Whether to use orthonormalisation or not during the algorithm, see [1] for more details. Default is True.
- order: str, optional
Determines in which order partial factorizations are performed, see [1] for more details.
'l2r': left-to-right decomposition (default).
'balanced'
- svd_backend: str, optional
See documentation of linalg.svds() for more details.
- Kwargs:
Additional arguments ksm_backend, ksm_params to pass to the ksm() function. See ksm() for more details.
- Returns:
L is a LazyLinOp that corresponds to the product of chain.n_patterns LazyLinOp Kronecker-sparse factors.
The namespace and device of the ks_values of all factors are determined as follows:
If A is an array (or aslazylinop(array)) then its namespace and device are used;
otherwise, svd_backend determines the namespace and device.
References:
[1] Butterfly Factorization with Error Guarantees. Léon Zheng, Quoc-Tung Le, Elisa Riccietti, and Rémi Gribonval https://hal.science/hal-04763712v1/document
- Examples:
>>> import numpy as np
>>> from lazylinop.butterfly import Chain, ksd
>>> from lazylinop.basicops import bitrev
>>> from lazylinop.signal import fft
>>> N = 256
>>> M = N
>>> V = fft(M).toarray()
>>> chain = Chain.square_dyadic(V.shape)
>>> # Use the bit-reversal permutation matrix.
>>> P = bitrev(N)
>>> approx = (ksd(V @ P.T, chain) @ P).toarray()
>>> error = np.linalg.norm(V - approx) / np.linalg.norm(V)
>>> np.allclose(error, 0.0)
True
- lazylinop.butterfly.ksm(ks_values, params=None, backend='xp')#
Returns a LazyLinOp L for Kronecker Sparse Matrix Multiplication (KSMM, see [1]). The sparsity pattern (or support) of a Kronecker-Sparse factor is defined as \(I_a\otimes 1_{b,c}\otimes I_d\) while its values are given by a 4D np.ndarray of shape (a, b, c, d).
The shape of L is \(\left(abd,~acd\right)\).
To fill ks_values and the corresponding Kronecker-Sparse factor M:
ks_valuesand its Kronecker-Sparse factorM:M = np.zeros((a * b * d, a * c * d), dtype=np.float32) ks_values = np.empty((a, b, c, d), dtype=M.dtype) for i in range(a): for j in range(b): for k in range(c): for l in range(d): tmp = np.random.randn() ks_values[i, j, k, l] = tmp M[i * b * d + j * d + l, i * c * d + k * d + l] = tmp
For a = 3, b = 3, c = 5, d = 5 we have the following pattern.
Note
You can access the ks_values of L = ksm(...) using L.ks_values.
With the OpenCL and CUDA backends, L @ X will implicitly cast X to:
match the dtype of L.ks_values
be contiguous in memory
This can incur a loss of performance, as well as a loss of precision if the dtype of X was initially of higher precision than that of L.ks_values.
- Args:
- ks_values: CuPy/NumPy arrays, torch tensors or list of arrays
It can be:
A 4D array of values of the Kronecker-Sparse factor.
A list of values of the Kronecker-Sparse factors. The length of the list corresponds to the number of Kronecker-Sparse factors.
The dtype of each ks_values is either 'float16', 'float32', 'float64', 'complex64' or 'complex128'. See code above for details on the expected indexing of ks_values.
- backend: optional
The available backends depend on the namespace (see array-api-compat for more details) of the ks_values. By default, a namespace-based implementation is used.
For the torch namespace:
backend='ksmm' runs the algorithm of [1]. It uses a CUDA device determined by ks_values and relies on cp.RawModule.
For the cupy namespace:
backend='ksmm' runs the algorithm of [1]. It uses a CUDA device determined by ks_values (which must be on GPU) and relies on cp.RawModule, __cuda_array_interface__ and DLPack.
backend='cupyx' uses the cupyx.scipy.sparse.csr_matrix function.
For the numpy namespace:
backend='scipy' uses the SciPy sparse functions scipy.sparse.block_diag and scipy.sparse.csr_matrix.
backend=(platform, device) uses OpenCL to run the algorithm of [1].
(None, None) uses the first platform and device.
(None, 'cpu') uses the first platform and CPU device.
(None, 'gpu') uses the first platform and GPU device.
Please consider the following piece of code for advanced choices:
import pyopencl as cl
# Get your favorite platform.
platform = cl.get_platforms()[pl_id]
# To get your favorite CPU device.
device = platform.get_devices(device_type=cl.device_type.CPU)[dev_id]
# To get your favorite GPU device.
device = platform.get_devices(device_type=cl.device_type.GPU)[dev_id]
To check the platforms and devices of your system you can also run the command line clinfo -a. See the PyOpenCL documentation for more details.
backend=pycuda.driver.Device(id) uses a CUDA device to run the algorithm of [1]. See the PyCUDA documentation for more details.
- params: tuple or list of tuple, optional
The argument params only works for the OpenCL and CUDA backends. It can be:
A tuple params of tuples where params[0] and params[1] expect a tuple of six elements (TILEX, TILEK, TILEY, TX, TY, VSIZE) (see [1] for more details). If (None, None) (default), the choice of hyper-parameters for the multiplication L @ X and the multiplication L.H @ X is automatic. Because we did not run a fine-tuning for all the possible \(\left(a,~b,~c,~d\right)\) and \(\left(a,~c,~b,~d\right)\) tuples, automatic does not always correspond to the best choice.
A list of tuples whose length is the number of factors. params[i][0] and params[i][1] expect a tuple of six elements (TILEX, TILEK, TILEY, TX, TY, VSIZE) (see [1] for more details). If None (default), the choice of hyper-parameters for the multiplication L @ X and the multiplication L.H @ X is automatic. Because we did not run a fine-tuning for all the possible \(\left(a,~b,~c,~d\right)\) and \(\left(a,~c,~b,~d\right)\) tuples, automatic does not always correspond to the best choice.
The tuple (TILEX, TILEK, TILEY, TX, TY, VSIZE) must satisfy the following assertions:
TILEX = X * TX
TILEY = Y * TY
batch size % TILEX == 0 for performance reasons. Consider zero-padding of the batch.
TILEX < batch size
TILEK <= c and c % TILEK == 0 for performance reasons.
TILEX > TILEK and TILEY > TILEK
(VSIZE * X * Y) % TILEX == 0
TILEK % strideInput == 0
(VSIZE * X * Y) % TILEK == 0
TILEY % strideValues == 0
TILEY <= b
(b * d) % (d * TILEY) == 0
ks_values.dtype.itemsize * 2 * (TILEY * TILEK + TILEK * TILEX) < smem
where smem is the shared memory of the hardware used to compute, VSIZE ranges from \(1\) to \(4\), strideValues = VSIZE * X * Y / TILEK and strideInput = VSIZE * X * Y / TILEX.
- Returns:
A LazyLinOp instance L for Kronecker Sparse Matrix Multiplication (KSMM). You can access the ks_values of L = ksm(...) using L.ks_values.
- Examples:
>>> from lazylinop.butterfly.ksm import ksm
>>> import numpy as np
>>> a, b, c, d = 2, 4, 4, 2
>>> ks_values = np.full((a, b, c, d), 1.0, dtype=np.float32)
>>> L = ksm(ks_values)
>>> L.toarray(dtype='float')
array([[1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1.]])
>>> # List of Kronecker-Sparse factors.
>>> a1, b1, c1, d1 = 2, 4, 3, 3
>>> ks_values1 = np.full((a1, b1, c1, d1), 1.0, dtype=np.float32)
>>> a2, b2, c2, d2 = 3, 3, 5, 2
>>> ks_values2 = np.full((a2, b2, c2, d2), 1.0, dtype=np.float32)
>>> L = ksm(ks_values1) @ ksm(ks_values2)
>>> M = ksm([ks_values1, ks_values2])
>>> np.allclose(L.toarray(dtype='float'), M.toarray(dtype='float'))
True
References:
[1] Fast inference with Kronecker-sparse matrices. Antoine Gonon, Léon Zheng, Pascal Carrivain, and Quoc-Tung Le. https://arxiv.org/abs/2405.15013
- lazylinop.butterfly.fuse(t1, t2)#
Fuse two 4D arrays of ks_values t1, t2 of shape \(\left(a_1,~b_1,~c_1,~d_1\right)\) and \(\left(a_2,~b_2,~c_2,~d_2\right)\). The resulting 4D array of ks_values t is of shape \(\left(a_1,~\frac{b_1d_1}{d_2},~\frac{a_2c_2}{a_1},~d_2\right)\) and satisfies (ksm(t1) @ ksm(t2)).toarray() == ksm(t).toarray().
- Args:
- t1, t2: 4D arrays
First ks_values of shape \(\left(a_1,~b_1,~c_1,~d_1\right)\).
Second ks_values of shape \(\left(a_2,~b_2,~c_2,~d_2\right)\).
The dimensions must satisfy:
\(a_1c_1d_1=a_2b_2d_2\)
\(a_1\) divides \(a_2\)
\(d_2\) divides \(d_1\)
otherwise an Exception is raised.
- Returns:
The resulting ks_values t is a 4D array of shape \(\left(a_1,~\frac{b_1d_1}{d_2},~\frac{a_2c_2}{a_1},~d_2\right)\).
- Examples:
>>> import numpy as np
>>> from lazylinop.butterfly import ksm, fuse
>>> a1, b1, c1, d1 = 2, 2, 2, 4
>>> a2, b2, c2, d2 = 4, 2, 2, 2
>>> v1 = np.random.randn(a1, b1, c1, d1)
>>> v2 = np.random.randn(a2, b2, c2, d2)
>>> v = fuse(v1, v2)
>>> v.shape
(2, 4, 4, 2)
>>> L = ksm(v)
>>> L1 = ksm(v1)
>>> L2 = ksm(v2)
>>> x = np.random.randn(L.shape[1])
>>> np.allclose(L @ x, L1 @ L2 @ x)
True
- lazylinop.butterfly.dft(N, backend='numpy', dtype='complex64', device='cpu')#
Return a LazyLinOp L with the Butterfly structure corresponding to the Discrete-Fourier-Transform (DFT).
The shape of L is \(\left(N,~N\right)\) where \(N=2^n\) must be a power of two. The number of factors \(n\) of the square-dyadic decomposition is given by \(n=\log_2\left(N\right)\).
The namespace (see array-api-compat for more details) of L.ks_values is inferred from the dtype and device arguments. By default, the namespace of L.ks_values is numpy and the dtype is 'complex64'.
- Args:
- N: int
Size of the DFT. \(N\) must be a power of two.
- backend: str, tuple or pycuda.driver.Device, optional
See ksm() for more details.
- dtype: str, optional
It can be either 'complex64' (default) or 'complex128'.
- device: optional
The device of L.ks_values. The default value is 'cpu'.
- Returns:
LazyLinOp L corresponding to the DFT.
- Examples:
>>> import numpy as np
>>> from lazylinop.butterfly import dft as bdft
>>> from lazylinop.signal import dft as sdft
>>> N = 2 ** 7
>>> x = np.random.randn(N).astype('float32')
>>> y = bdft(N) @ x
>>> z = sdft(N) @ x
>>> np.allclose(y, z)
True
References:
[1] Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations. Dao T, Gu A, Eichhorn M, Rudra A, Re C. Proc Mach Learn Res. 2019 Jun;97:1517-1527. PMID: 31777847; PMCID: PMC6879380.
- lazylinop.butterfly.fwht(N, backend='numpy', dtype='float64', device='cpu')#
Return a LazyLinOp L with the Butterfly structure corresponding to the Fast-Walsh-Hadamard-Transform (FWHT).
The shape of L is \(\left(N,~N\right)\) where \(N=2^n\) must be a power of two. L is orthogonal and symmetric, so the inverse WHT operator is L.T = L. The number of factors \(n\) of the square-dyadic decomposition is given by \(n=\log_2\left(N\right)\).
The namespace (see array-api-compat for more details) of L.ks_values is inferred from the dtype and device arguments. By default, the namespace of L.ks_values is numpy and the dtype is 'float64'.
- Args:
- N: int
Size of the FWHT. \(N\) must be a power of two.
- backend: str, tuple or pycuda.driver.Device, optional
See ksm() for more details.
- dtype: str, optional
The dtype of the Hadamard matrix elements. The default value is 'float64'.
- device: optional
The device of the Hadamard matrix elements. The default value is 'cpu'.
- Returns:
LazyLinOp L corresponding to the FWHT. L is equivalent to hadamard(N, backend, dtype) / sqrt(N).
- Examples:
>>> import numpy as np
>>> from lazylinop.butterfly import fwht as bfwht
>>> from lazylinop.signal import fwht as sfwht
>>> N = 2 ** 5
>>> x = np.random.randn(N).astype('float64')
>>> y = bfwht(N) @ x
>>> z = sfwht(N) @ x
>>> np.allclose(y, z)
True
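Since L is stated to be orthogonal and symmetric, a quick round-trip check can be done as follows (a sketch reusing N and x from the example above):
>>> L = bfwht(N)
>>> np.allclose(L.T @ (L @ x), x)
True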
- lazylinop.butterfly.hadamard(N, backend='numpy', dtype='float64', device='cpu')#
Return a LazyLinOp L with the Butterfly structure corresponding to the Hadamard matrix.
The shape of L is \(\left(N,~N\right)\) where \(N=2^n\) must be a power of two. L is symmetric but not orthogonal, and its inverse is L.T / N. The number of factors \(n\) of the square-dyadic decomposition is given by \(n=\log_2\left(N\right)\).
The namespace (see array-api-compat for more details) of L.ks_values is inferred from the dtype and device arguments. By default, the namespace of L.ks_values is numpy and the dtype is 'float64'.
- Args:
- N: int
Size of the Hadamard matrix. \(N\) must be a power of two.
- backend: str, tuple or pycuda.driver.Device, optional
See ksm() for more details.
- dtype: str, optional
The dtype of the Hadamard matrix elements. The default value is 'float64'.
- device: optional
The device of the Hadamard matrix elements. The default value is 'cpu'.
- Returns:
LazyLinOp L corresponding to the Hadamard matrix.
- Examples:
>>> import numpy as np
>>> import scipy as sp
>>> from lazylinop.butterfly import hadamard
>>> N = 2 ** 5
>>> x = np.random.randn(N).astype('float64')
>>> y = hadamard(N) @ x
>>> z = sp.linalg.hadamard(N) @ x
>>> np.allclose(y, z)
True
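The stated inverse L.T / N can be checked in the same way (a sketch reusing N and x from the example above):
>>> L = hadamard(N)
>>> np.allclose((L.T @ (L @ x)) / N, x)
True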
Chain class#
- class lazylinop.butterfly.Chain(ks_patterns)#
A Chain instance is built by calling the constructor Chain(ks_patterns).
- Args:
- ks_patterns: list
List of tuples \(((a_l,~b_l,~c_l,~d_l))_{l=1}^n\), each being called a pattern. The tuples must satisfy \(a_lc_ld_l=a_{l+1}b_{l+1}d_{l+1}\).
- Attributes:
- ks_patterns: list
List of patterns \((a_l,~b_l,~c_l,~d_l)\).
- n_patterns: int
Equal to len(ks_patterns).
- shape: tuple
shape is equal to \((a_1b_1d_1,~a_nc_nd_n)\) with n = n_patterns.
- chainable: bool
True if for each \(l\): \(a_l\) divides \(a_{l+1}\) and \(d_{l+1}\) divides \(d_l\). See [1] for more details.
- Return:
chain with ks_patterns, n_patterns, shape and chainable attributes.
- Examples:
>>> from lazylinop.butterfly import Chain
>>> chain = Chain([(2, 1, 1, 2), (2, 1, 1, 2)])
>>> chain.ks_patterns
[(2, 1, 1, 2), (2, 1, 1, 2)]
>>> # Concatenation of two chains.
>>> chain1 = Chain([(1, 2, 2, 4)])
>>> chain2 = Chain([(4, 2, 2, 1)])
>>> chain = chain1 @ chain2
>>> chain.shape
(8, 8)
>>> chain.n_patterns
2
>>> chain.ks_patterns
[(1, 2, 2, 4), (4, 2, 2, 1)]
References:
[1] Butterfly Factorization with Error Guarantees. Léon Zheng, Quoc-Tung Le, Elisa Riccietti, and Rémi Gribonval. https://hal.science/hal-04763712v1/document
- lazylinop.butterfly.Chain.square_dyadic(shape)#
Build a square-dyadic chain from shape.
shape must satisfy shape[0] = shape[1] = N with \(N=2^n\) a power of two. The number of ks_patterns is equal to \(n\). The l-th pattern is given by (2 ** (l - 1), 2, 2, shape[0] // 2 ** l) where 1 <= l <= n.
We can draw the square-dyadic decomposition for \(N=16\):
using:
sq_chain = Chain.square_dyadic((16, 16))
sq_chain.plot()
- Args:
- shape: tuple
Shape of the input matrix; must be \((N,~N)\) with \(N=2^n\).
- lazylinop.butterfly.Chain.monarch(shape, p=None, q=None)#
Build a Monarch chain \(((1,~p,~q,~\frac{M}{p}),~(q,~\frac{M}{p},~\frac{N}{q},~1))\) from shape. \(p\) must divide \(M\) and \(q\) must divide \(N\). See [1] for more details.
- Args:
- shape: tuple
Shape of the input matrix \((M,~N)\). \(M\) and \(N\) must not be prime numbers.
- p, q: int, optional
\(p\) must divide \(M\) and \(q\) must divide \(N\). If p (resp. q) is None, p (resp. q) is chosen such that \(p\simeq\sqrt{M}\) (resp. \(q\simeq\sqrt{N}\)).
- Examples:
>>> from lazylinop.butterfly.chain import Chain
>>> M, N = 21, 16
>>> chain = Chain.monarch((M, N))
>>> chain.ks_patterns
[(1, 3, 4, 7), (4, 7, 4, 1)]
>>> M, N = 12, 25
>>> chain = Chain.monarch((M, N))
>>> chain.ks_patterns
[(1, 4, 5, 3), (5, 3, 5, 1)]
>>> M, N = 12, 16
>>> chain = Chain.monarch((M, N))
>>> chain.ks_patterns
[(1, 4, 4, 3), (4, 3, 4, 1)]
>>> chain = Chain.monarch((M, N), p=2, q=4)
>>> chain.ks_patterns
[(1, 2, 4, 6), (4, 6, 4, 1)]
References:
[1] Monarch: Expressive structured matrices for efficient and accurate training. T. Dao, B. Chen, N. S. Sohoni, A. D. Desai, M. Poli, J. Grogan, A. Liu, A. Rao, A. Rudra, and C. Ré. In International Conference on Machine Learning, pages 4690-4721. PMLR, 2022.
- lazylinop.butterfly.Chain.wip_non_redundant(shape)#
This class method is still work-in-progress.
Build a non-redundant chain from shape using Lemma 4.25 from [1]. The function uses a prime factorization \((q_l)_{l=1}^L\) of M = shape[0] and a prime factorization \((p_l)_{l=1}^L\) of N = shape[1] where \(L\) is the length of the prime factorization. If the two factorizations do not have the same length, the smallest elements of the larger factorization are merged. Raises an Exception if shape[0] > 2 and shape[1] > 2 are both prime numbers.
- Args:
- shape: tuple
Shape of the input matrix, expected to be a tuple \((M,~N)\). M is equal to the number of rows \(a_1b_1d_1\) of the first factor while N is equal to the number of columns \(a_nc_nd_n\) of the last factor.
- Examples:
>>> from lazylinop.butterfly import Chain
>>> sq_chain = Chain.square_dyadic((8, 8))
>>> sq_chain.ks_patterns
[(1, 2, 2, 4), (2, 2, 2, 2), (4, 2, 2, 1)]
>>> nr_chain = Chain.wip_non_redundant((8, 8))
>>> nr_chain.ks_patterns
[(1, 2, 2, 4), (2, 2, 2, 2), (4, 2, 2, 1)]
Chain methods#
- lazylinop.butterfly.Chain.mem(self, dtype)#
Return the memory footprint in bytes of the ks_values of dtype dtype corresponding to self.
- Args:
- dtype: str
dtype of the ks_values.
- Examples:
>>> import numpy as np
>>> from lazylinop.butterfly import Chain
>>> sd_chain = Chain.square_dyadic((32, 32))
>>> sd_chain.ks_patterns
[(1, 2, 2, 16), (2, 2, 2, 8), (4, 2, 2, 4), (8, 2, 2, 2), (16, 2, 2, 1)]
>>> sd_chain.mem(np.float32)
1280
>>> chain = Chain.monarch((32, 32))
>>> chain.ks_patterns
[(1, 4, 4, 8), (4, 8, 8, 1)]
>>> chain.mem(np.float32)
1536
- lazylinop.butterfly.Chain.plot(self, name=None)#
Plot self.ks_patterns. The colors are randomly chosen. The Matplotlib package must be installed.
- Args:
- name: str
Save the plot in both the PNG file name + '.png' and the SVG file name + '.svg'. Default value is None (it only draws the self.ks_patterns).
- Examples:
>>> from lazylinop.butterfly import Chain
>>> sq_chain = Chain.square_dyadic((32, 32))
>>> sq_chain.plot("square_dyadic")
>>> chain = Chain.monarch((32, 32))
>>> chain.plot("monarch")
I/O (load/save)#
- lazylinop.butterfly.load(name)#
Load the LazyLinOp L from the file name.json. The file name.json has been created by save().
- Args:
- name: str
Name of the .json file from which to load L.
- Returns:
L is a LazyLinOp that corresponds to the product of n_patterns LazyLinOp, each one returned by ksm(). If the file does not exist, returns None.
- Examples:
>>> import scipy as sp
>>> import numpy as np
>>> from lazylinop.butterfly import Chain, ksd, load, save
>>> H = sp.linalg.hadamard(8)
>>> x = np.random.randn(8)
>>> chain = Chain.square_dyadic(H.shape)
>>> A = ksd(H, chain)
>>> save(A, "hadamard_8x8")
>>> A_ = load("hadamard_8x8")
>>> y = A @ x
>>> y_ = A_ @ x
>>> np.allclose(y, y_)
True
- lazylinop.butterfly.save(L, name)#
Save the LazyLinOp instance L returned by L = ksm(...) or L = ksd(...). The result of the factorization is saved in the JSON file name + '.json'.
- Args:
- L: LazyLinOp
The LazyLinOp L to save.
- name: str
Name of the file.
- Examples:
>>> import scipy as sp
>>> import numpy as np
>>> from lazylinop.butterfly import Chain, ksd, load, save
>>> H = sp.linalg.hadamard(8)
>>> x = np.random.randn(8)
>>> chain = Chain.square_dyadic(H.shape)
>>> L = ksd(H, chain)
>>> save(L, "hadamard_8x8")
>>> L_ = load("hadamard_8x8")
>>> y = L @ x
>>> y_ = L_ @ x
>>> np.allclose(y, y_)
True
Plot#
- lazylinop.butterfly.plot(L, name=None)#
Plot L.ks_values on a logarithmic scale. If ks_values is complex, plot xp.sqrt(ks_values * xp.conj(ks_values)). The Matplotlib package must be installed.
- Args:
- name: str
Save the plot in both the PNG file name + '.png' and the SVG file name + '.svg'. Default value is None (it only draws the L.ks_values).
- Examples:
>>> from lazylinop.butterfly import dft
>>> from lazylinop.butterfly import plot
>>> N = 16
>>> L = dft(N)
>>> plot(L)