Butterfly module
Butterfly construction
lazylinop.wip.butterfly.dft_helper()
- lazylinop.wip.butterfly.dft.dft_helper(N, n_factors, backend='numpy', strategy='memory', dtype='complex64', device='cpu')
Return a LazyLinOp L corresponding to the Discrete Fourier Transform (DFT).
Shape of L is \(\left(N,~N\right)\), where \(N=2^n\) must be a power of two.
- Args:
- N:
int DFT of size \(N\). \(N\) must be a power of two.
- n_factors:
int Number of factors, with n_factors <= n. If n_factors = n, return the square-dyadic decomposition. The performance of the algorithm depends on the number of factors, the size of the DFT, and the strategy. Our experiments show that the square-dyadic decomposition is always the worst choice. The best choice is two, three or four factors.
- backend:
str, tuple or pycuda.driver.Device, optional See ksm() for more details.
- strategy:
str, optional It could be:
'balanced' fuse from left to right and right to left (\(n>3\)).
Case n = 6 and n_factors = 2:
step 0: 0 1 2 3 4 5
step 1: 01 2 3 45
step 2: 012 345
Case n = 7 and n_factors = 2:
step 0: 0 1 2 3 4 5 6
step 1: 01 2 3 4 56
step 2: 012 3 456
step 3: 0123 456
Case n = 7 and n_factors = 3:
step 0: 0 1 2 3 4 5 6
step 1: 01 2 3 4 56
step 2: 012 3 456
'memory' find the two consecutive ks_values that minimize the memory of the fused ks_values. It is the default value.
- dtype:
str, optional It could be either 'complex64' (default) or 'complex128'.
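The 'balanced' fusion order listed above can be sketched in plain Python. This is a minimal illustration inferred from the three documented cases only, not the library's implementation: the names `balanced_schedule` and `show` are hypothetical, and the rule used (fuse the two leftmost groups, then the two rightmost groups, until `n_factors` groups remain) is an assumption that may differ from what lazylinop does for other values of n and n_factors. The sketch does not enforce the \(n>3\) condition.

```python
def balanced_schedule(n, n_factors):
    """Return the grouping of factor indices after each fusion step.

    Hypothetical helper: the rule is inferred from the documented cases.
    """
    if not 1 <= n_factors <= n:
        raise ValueError("expected 1 <= n_factors <= n")
    groups = [[i] for i in range(n)]  # step 0: one group per factor
    steps = [groups]
    while len(groups) > n_factors:
        # Fuse the two leftmost groups.
        groups = [groups[0] + groups[1]] + groups[2:]
        # Fuse the two rightmost groups if the target is not reached yet.
        if len(groups) > n_factors:
            groups = groups[:-2] + [groups[-2] + groups[-1]]
        steps.append(groups)
    return steps


def show(groups):
    """Format a grouping the way the cases above are written."""
    return " ".join("".join(str(i) for i in g) for g in groups)


for step, groups in enumerate(balanced_schedule(7, 2)):
    print(f"step {step}: {show(groups)}")
```

With n = 7 and n_factors = 2 the loop prints the four steps of the second documented case; calling it with (6, 2) or (7, 3) reproduces the other two.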
Benchmark of our DFT implementation (using the default hyper-parameters): [benchmark figure]
- Returns:
LazyLinOp L corresponding to the DFT.
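To make the role of the number of factors concrete, the sketch below builds a small DFT matrix as a product of sparse factors in plain Python. It assumes that the square-dyadic decomposition corresponds to the classical radix-2 butterfly factorization, in which the \(N \times N\) DFT matrix with \(N=2^n\) equals n butterfly factors (two nonzeros per row each) times a bit-reversal permutation. That correspondence, the permutation, and every helper name here are assumptions made for illustration; none of this is lazylinop code.

```python
import cmath


def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def dft_matrix(N):
    """Reference DFT matrix with entries exp(-2*pi*i*j*k/N)."""
    w = cmath.exp(-2j * cmath.pi / N)
    return [[w ** (j * k) for k in range(N)] for j in range(N)]


def butterfly_factor(N, block):
    """Block-diagonal factor made of N/block copies of [[I, D], [I, -D]]."""
    half = block // 2
    w = cmath.exp(-2j * cmath.pi / block)
    F = [[0j] * N for _ in range(N)]
    for start in range(0, N, block):
        for k in range(half):
            top, bot = start + k, start + k + half
            F[top][top], F[top][bot] = 1, w ** k
            F[bot][top], F[bot][bot] = 1, -(w ** k)
    return F


def bit_reversal(N):
    """Permutation matrix sending x[i] to position bit-reverse(i)."""
    n = N.bit_length() - 1
    P = [[0j] * N for _ in range(N)]
    for i in range(N):
        P[i][int(format(i, f"0{n}b")[::-1], 2)] = 1
    return P


def nnz(A):
    """Number of entries that are not (numerically) zero."""
    return sum(abs(a) > 1e-12 for row in A for a in row)


N = 8                      # N = 2**n with n = 3
n = N.bit_length() - 1
# n factors with block sizes N, N/2, ..., 2 (left to right in the product).
factors = [butterfly_factor(N, N >> level) for level in range(n)]

product = bit_reversal(N)
for F in reversed(factors):
    product = matmul(F, product)

error = max(abs(p - d)
            for row_p, row_d in zip(product, dft_matrix(N))
            for p, d in zip(row_p, row_d))
fused = matmul(factors[0], factors[1])  # fusing two factors gives a denser one
print(error < 1e-9, [nnz(F) for F in factors], nnz(fused))
```

Fusing adjacent factors, as n_factors < n does, leaves fewer but denser factors. That is the trade-off the n_factors and strategy arguments control.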
Utilities
- lazylinop.wip.butterfly.utils.clean(L)
Release (OpenCL) or free (CUDA) the device pointers of the LazyLinOp L returned by either ksm(...) or ksd(...). Once you have computed y = L @ x and no longer need L, use clean(L) to free the memory and delete L.
- Args:
- L:
LazyLinOp Clean device pointers from L and delete it.
- lazylinop.wip.butterfly.utils.del_all_contexts()
Delete all contexts.
version 1.20.1 documentation