masspcf#
Core library for piecewise constant functions, tensors, and computations.
pcf#
- class masspcf.functional.pcf.Pcf(arr: ndarray | Pcf | list[list[float | int] | tuple[float | int, ...]], dtype=None)#
Bases: object
A piecewise constant function (PCF).
A PCF is defined by a sequence of (time, value) pairs \((t_0, v_0), (t_1, v_1), \ldots, (t_{n-1}, v_{n-1})\) with \(t_0 = 0\) and \(t_0 < t_1 < \cdots < t_{n-1}\). The function takes the value \(v_i\) on the interval \([t_i, t_{i+1})\) for \(0 \leq i < n-1\), and \(v_{n-1}\) on \([t_{n-1}, \infty)\).
The first breakpoint must have \(t_0 = 0\). An InvalidArgument error is raised if the first time coordinate is not zero.
- Parameters:
arr (numpy.ndarray or Pcf or list) – Input data. If an ndarray or list, it should have shape (n, 2), where each row is a (time, value) pair. Can also be an existing Pcf to copy.
dtype (type, optional) – Data type for the PCF (pcf32, pcf64, pcf32i, or pcf64i). If None, the dtype is inferred from the input array (e.g. a numpy.float32 array produces a 32-bit PCF, a numpy.int32 array produces a 32-bit integer PCF).
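To make the definition above concrete, here is a minimal pure-Python sketch of how a PCF given as (time, value) rows can be evaluated at a time t. The helper name eval_pcf is hypothetical and not part of masspcf; it only illustrates the right-continuous step semantics described above.

```python
from bisect import bisect_right

def eval_pcf(points, t):
    """Evaluate a PCF given as (time, value) rows at a time t >= 0.

    Hypothetical helper for illustration only; not a masspcf function.
    """
    times = [row[0] for row in points]
    if times[0] != 0:
        raise ValueError("the first breakpoint must be at t = 0")
    if t < 0:
        raise ValueError("t must be non-negative")
    # Index of the last breakpoint whose time is <= t, so the value v_i
    # applies on the half-open interval [t_i, t_{i+1}).
    i = bisect_right(times, t) - 1
    return points[i][1]

points = [(0.0, 1.0), (1.0, 2.0), (3.0, 0.0)]
values = [eval_pcf(points, t) for t in (0.0, 0.5, 1.0, 2.9, 3.0, 10.0)]
print(values)
```

The value changes exactly at each breakpoint, and the last value extends to infinity.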
Examples
>>> import numpy as np
>>> import masspcf as mpcf
>>> f = mpcf.Pcf(np.array([[0.0, 1.0], [1.0, 2.0], [3.0, 0.0]], dtype=np.float32))
>>> f.size
3
- astype(dtype)#
Return a copy of the PCF cast to the given dtype (pcf32, pcf64, pcf32i, or pcf64i).
- property size#
Number of breakpoints (time-value pairs) in this PCF.
- to_numpy()#
Convert the PCF to a numpy array of shape (n, 2) with (time, value) rows.
- class masspcf.functional.pcf.Rectangle(data)#
Bases: object
A rectangle produced by iterating over a pair of PCFs.
- property f_value#
Value of the first PCF on this interval.
- property g_value#
Value of the second PCF on this interval.
- property left#
Left time boundary.
- property right#
Right time boundary.
tensor#
- class masspcf.tensor.BoolTensor(data: _Mock())#
Bases: Tensor
Tensor of boolean values, typically produced by elementwise comparisons.
- class masspcf.tensor.FloatTensor(data: _Mock(), dtype=None)#
Bases: NumericTensor
- class masspcf.tensor.IntPcfTensor(data)#
Bases: _PcfTensorBase
- class masspcf.tensor.IntTensor(data: _Mock(), dtype=None)#
Bases: NumericTensor
- class masspcf.tensor.NumericTensor#
Bases: Tensor, ArithmeticTensorMixin
- array_equal(other) bool#
Test whether two tensors have the same shape and all elements equal.
- Parameters:
other (Tensor) – The tensor to compare with.
- Returns:
True if the tensors are elementwise equal, False otherwise.
- Return type:
bool
- class masspcf.tensor.PcfTensor(data)#
Bases: _PcfTensorBase
- class masspcf.tensor.PointCloudTensor(data: _Mock())#
Bases: Tensor
tensor_create#
- masspcf.tensor_create.array_split(tensor, indices_or_sections, axis=0)#
Split a tensor into sub-tensors, allowing uneven splits.
Like split, but when indices_or_sections is an integer and the axis size is not evenly divisible, the first sections are one element larger.
- Parameters:
tensor (Tensor) – The tensor to split.
indices_or_sections (int or list of int) – If an int, the tensor is split into that many parts (uneven allowed). If a list, it gives the indices where splits occur (same as split).
axis (int) – The axis along which to split (default 0).
- Returns:
A list of tensor views sharing data with the original.
- Return type:
list of Tensor
See also
split – Split requiring equal divisions.
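As a rough illustration of the uneven-split rule for array_split, the sketch below computes the section sizes for an integer number of sections. The helper array_split_sizes is hypothetical and assumes the same convention as numpy.array_split (the remainder is spread over the leading sections), which matches the description above but has not been checked against the masspcf source.

```python
def array_split_sizes(axis_size, sections):
    """Sizes of the parts when an axis is split into `sections` parts.

    Hypothetical helper for illustration; assumes the first
    `axis_size % sections` parts are one element larger.
    """
    if sections <= 0:
        raise ValueError("sections must be a positive integer")
    base, remainder = divmod(axis_size, sections)
    return [base + 1] * remainder + [base] * (sections - remainder)

print(array_split_sizes(10, 3))  # uneven: the first part is larger
print(array_split_sizes(9, 3))   # even: all parts equal
```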
- masspcf.tensor_create.concatenate(tensors, axis=0)#
Concatenate tensors along an existing axis (outer indexing).
- masspcf.tensor_create.split(tensor, indices_or_sections, axis=0)#
Split a tensor into sub-tensors along an axis.
- Parameters:
tensor (Tensor) – The tensor to split.
indices_or_sections (int or list of int) – If an int, the tensor is split into that many equal parts. If a list, it gives the indices where splits occur.
axis (int) – The axis along which to split (default 0).
- Returns:
A list of tensor views sharing data with the original.
- Return type:
list of Tensor
See also
array_split – Split allowing uneven divisions.
- masspcf.tensor_create.stack(tensors, axis=0)#
Stack tensors along a new axis. All tensors must have the same shape.
- masspcf.tensor_create.zeros(shape: _Mock(), dtype: dtype = masspcf.pcf32)#
Creates a new Tensor of the specified shape and dtype whose entries are “zero.” What “zero” means depends on the dtype:
dtype=pcf32/64: A PCF that takes the value 0 for all times.
dtype=pcf32i/64i: An integer PCF that takes the value 0 for all times.
dtype=float32/float64: The number 0.
dtype=pcloud32/64: An empty point cloud.
dtype=barcode32/64: An empty barcode.
dtype=symmat32/64: A 0×0 symmetric matrix.
dtype=distmat32/64: A 0×0 distance matrix.
- Parameters:
shape (ShapeLike) – Shape of the returned tensor
dtype – The data type of the elements
- Returns:
The newly created tensor
- Return type:
Tensor
reductions#
- masspcf.reductions.max_time(fs: Tensor | list[Pcf] | Pcf, dim: int = 0)#
Compute the maximum breakpoint time along the given dimension.
For each PCF \(f_i\) with breakpoints \((t_0^{(i)}, t_1^{(i)}, \ldots, t_{n_i-1}^{(i)})\), let \(T_i = t_{n_i-1}^{(i)}\) be the last breakpoint. For functions \(f_1, f_2, \ldots, f_n\) being reduced, this returns
\[\max(T_1, T_2, \ldots, T_n).\]
The result is numeric, not a PCF.
See How dim works for a detailed explanation of dimension reduction semantics.
- Parameters:
fs (PcfContainerLike) – A PcfTensor with dtype pcf32 or pcf64.
dim (int, optional) – Dimension along which to reduce, by default 0.
- Returns:
A numeric tensor with the reduced dimension removed.
- Return type:
- masspcf.reductions.mean(fs: Tensor | list[Pcf] | Pcf, dim: int = 0)#
Compute the pointwise mean of a PCF tensor along the given dimension.
The mean is computed pointwise in time: for functions \(f_1, f_2, \ldots, f_n\) being reduced, the resulting function \(\bar{f}\) satisfies
\[\bar{f}(t) = \frac{1}{n} \sum_{i=1}^{n} f_i(t)\]
for all \(t\).
See How dim works for a detailed explanation of dimension reduction semantics.
- Parameters:
fs (PcfContainerLike) – A PcfTensor with dtype pcf32 or pcf64.
dim (int, optional) – Dimension along which to reduce, by default 0.
- Returns:
A PcfTensor with the reduced dimension removed.
- Return type:
PcfTensor
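The pointwise mean defined for masspcf.reductions.mean can be illustrated with plain lists of (time, value) pairs. This is a sketch of the mathematical definition only; the helpers eval_pcf and pointwise_mean are hypothetical and say nothing about how masspcf computes the reduction internally.

```python
from bisect import bisect_right

def eval_pcf(points, t):
    """Value of a PCF (list of (time, value) pairs) at time t >= 0."""
    times = [p[0] for p in points]
    return points[bisect_right(times, t) - 1][1]

def pointwise_mean(pcfs):
    """Pointwise mean of several PCFs, returned as (time, value) pairs.

    Hypothetical helper: the mean can only change where one of the
    inputs changes, so it is enough to evaluate at the union of all
    breakpoints.
    """
    times = sorted({t for f in pcfs for t, _ in f})
    return [(t, sum(eval_pcf(f, t) for f in pcfs) / len(pcfs)) for t in times]

f1 = [(0.0, 1.0), (2.0, 0.0)]
f2 = [(0.0, 3.0), (1.0, 1.0), (3.0, 0.0)]
mean_points = pointwise_mean([f1, f2])
print(mean_points)
```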
distance#
- masspcf.distance.cdist(X: Tensor | list[Pcf] | Pcf, Y: Tensor | list[Pcf] | Pcf, p=1, verbose=False) FloatTensor#
Compute the pairwise \(L_p\) distances between two tensors of PCFs.
For tensors \(X\) of shape \((m_1, \ldots, m_n)\) and \(Y\) of shape \((k_1, \ldots, k_l)\), returns a tensor of shape \((m_1, \ldots, m_n, k_1, \ldots, k_l)\) where
\[D_{i_1, \ldots, i_n, j_1, \ldots, j_l} = \Vert X_{i_1, \ldots, i_n} - Y_{j_1, \ldots, j_l} \Vert_p.\]
- Parameters:
X (PcfContainerLike) – A tensor of PCFs (any shape).
Y (PcfContainerLike) – A tensor of PCFs (any shape, same dtype as X).
p (float, optional) – The \(p\) parameter in the \(L_p\) distance (must be \(\geq 1\)), by default 1.
verbose (bool, optional) – Show progress information during computation, by default False.
- Returns:
A tensor of shape (*X.shape, *Y.shape) containing pairwise distances.
- Return type:
FloatTensor
- masspcf.distance.lp_distance(f: Pcf, g: Pcf, p=1) float#
Compute the \(L_p\) distance between two PCFs.
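\[\Vert f - g \Vert_p = \left(\int_0^\infty |f(t) - g(t)|^p\, dt\right)^{1/p}\]
- Parameters:
f (Pcf) – The first PCF.
g (Pcf) – The second PCF (same dtype as f).
p (float, optional) – The \(p\) parameter in the \(L_p\) distance (must be \(\geq 1\)), by default 1.
- Returns: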
The \(L_p\) distance between f and g.
- Return type:
float
- Raises:
ValueError – If p < 1.
TypeError – If f and g have different dtypes, or are integer PCFs.
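Because both functions are piecewise constant, the integral in the \(L_p\) distance reduces to a finite sum over the intervals between merged breakpoints (the left/right/f_value/g_value rectangles described for the Rectangle class). The following pure-Python sketch illustrates this. The function lp_distance_sketch is hypothetical, and returning infinity when the final values differ is an assumption of this sketch, not documented masspcf behaviour.

```python
import math

def lp_distance_sketch(f, g, p=1):
    """L_p distance between two PCFs given as lists of (time, value) pairs.

    Illustrative only; not the masspcf implementation.
    """
    if p < 1:
        raise ValueError("p must be >= 1")
    # Merge the breakpoints of both functions.
    times = sorted({t for t, _ in f} | {t for t, _ in g})
    fi = gi = 0
    total = 0.0
    for left, right in zip(times, times[1:]):
        # Advance each index to the piece that is active on [left, right).
        while fi + 1 < len(f) and f[fi + 1][0] <= left:
            fi += 1
        while gi + 1 < len(g) and g[gi + 1][0] <= left:
            gi += 1
        total += abs(f[fi][1] - g[gi][1]) ** p * (right - left)
    # Assumption: if the final values differ, the integral over the
    # unbounded tail diverges.
    if f[-1][1] != g[-1][1]:
        return math.inf
    return total ** (1.0 / p)

f = [(0.0, 1.0), (2.0, 0.0)]
g = [(0.0, 3.0), (1.0, 1.0), (3.0, 0.0)]
d1 = lp_distance_sketch(f, g, p=1)
d2 = lp_distance_sketch(f, g, p=2)
print(d1, d2)
```

Here the merged intervals are [0, 1), [1, 2) and [2, 3), with absolute differences 2, 0 and 1.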
- masspcf.distance.pdist(fs: Tensor | list[Pcf] | Pcf, p=1, verbose=False) DistanceMatrix#
Compute the pairwise \(L_p\) distance matrix for a 1-D tensor of PCFs.
For a tensor \((f_0, f_1, \ldots, f_{n-1})\), returns an \(n \times n\) matrix \(D\) where
\[D_{ij} = \Vert f_i - f_j \Vert_p.\]
- Parameters:
fs (PcfContainerLike) – A 1-D tensor of PCFs.
p (float, optional) – The \(p\) parameter in the \(L_p\) distance (must be \(\geq 1\)), by default 1.
verbose (bool, optional) – Show progress information during computation, by default False.
- Returns:
A compressed symmetric distance matrix.
- Return type:
DistanceMatrix
- Raises:
ValueError – If fs is not 1-dimensional.
symmetric_matrix#
- class masspcf.symmetric_matrix.SymmetricMatrix(n_or_data: int | SymmetricMatrix | CppSymmetricMatrix, dtype: float32 | float64 | None = None)#
Bases: object
Compressed symmetric matrix using lower-triangular storage.
Stores only n*(n+1)/2 elements for an n×n symmetric matrix. Supports subscript access with matrix[i, j].
- Parameters:
n_or_data (int | SymmetricMatrix | CppSymmetricMatrix) – If an int, creates a zero-initialized matrix of that size. If a SymmetricMatrix or C++ symmetric matrix, wraps it directly.
dtype (float32 | float64 | None, optional) – Element precision. float32 stores entries as 32-bit floats, float64 as 64-bit floats. Defaults to float64 when n_or_data is an int. Ignored otherwise.
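To see why n*(n+1)/2 entries suffice, the sketch below maps a pair (i, j) to a position in a packed lower triangle. The row-major layout and the helper names storage_count and packed_index are assumptions made for illustration; the actual memory layout used by SymmetricMatrix is not specified here.

```python
def storage_count(n):
    """Number of stored entries for an n x n symmetric matrix."""
    return n * (n + 1) // 2

def packed_index(i, j):
    """Position of entry (i, j) in a row-major packed lower triangle.

    Hypothetical layout: row i of the lower triangle holds i + 1
    entries, so it starts at offset i * (i + 1) / 2.
    """
    if i < j:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one slot
    return i * (i + 1) // 2 + j

n = 4
slots = sorted({packed_index(i, j) for i in range(n) for j in range(n)})
print(storage_count(n), slots)
```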
- property dtype#
Element precision (float32 or float64).
- classmethod from_dense(array)#
Create a SymmetricMatrix from a dense n×n numpy array.
- property size: int#
- property storage_count: int#
- to_dense() ndarray#
Return the full n×n symmetric matrix as a numpy array.
- class masspcf.symmetric_matrix.SymmetricMatrixTensor(data: _Mock())#
Bases: Tensor
norms#
- masspcf.norms.lp_norm(fs: Tensor | list[Pcf] | Pcf, p=1, verbose=False) FloatTensor#
Computes the \(L_p\) norm of each PCF in fs. For example, if fs is an \(m \times n\) array with elements indexed as \(f_{ij}\), \(0 \leq i < m\), \(0 \leq j < n\), we compute
\[\begin{split}\begin{pmatrix} \Vert f_{0,0} \Vert_p & \Vert f_{0,1} \Vert_p & \cdots & \Vert f_{0,n-1} \Vert_p \\ \Vert f_{1,0} \Vert_p & \Vert f_{1,1} \Vert_p & \cdots & \Vert f_{1,n-1} \Vert_p \\ \vdots & \vdots & \ddots & \vdots \\ \Vert f_{m-1,0} \Vert_p & \Vert f_{m-1,1} \Vert_p & \cdots & \Vert f_{m-1,n-1} \Vert_p \end{pmatrix},\end{split}\]
where
\[\Vert f_{ij} \Vert_p = \left(\int_0^\infty |f_{ij}(t)|^p\, dt\right)^{1/p}.\]
- Parameters:
fs (PcfContainerLike) – PCFs whose norms are to be computed.
p (int, optional) – \(p\) parameter in the \(L_p\) norm, by default 1
verbose (bool, optional) – Print additional information during the computation, by default False
- Returns:
Tensor of the same shape as fs with \(L_p\) norms of the input functions.
- Return type:
FloatTensor
inner_product#
- masspcf.inner_product.l2_kernel(fs: Tensor | list[Pcf] | Pcf, verbose=False) SymmetricMatrix#
Compute the pairwise \(L_2\) kernel matrix for a 1-D tensor of PCFs.
For a tensor \((f_0, f_1, \ldots, f_{n-1})\), returns an \(n \times n\) matrix \(K\) where
\[K_{ij} = \langle f_i, f_j \rangle_{L_2} = \int_0^\infty f_i(t) \, f_j(t) \, dt.\]
- Parameters:
fs (PcfContainerLike) – A 1-D tensor of PCFs.
verbose (bool, optional) – Show progress information during computation, by default False.
- Returns:
A compressed symmetric kernel matrix.
- Return type:
SymmetricMatrix
- Raises:
ValueError – If fs is not 1-dimensional.
comparison#
- masspcf.comparison.allclose(a, b, atol=1e-08, rtol=1e-05) bool#
Test whether two objects are element-wise equal within a tolerance.
Returns True when, for every pair of corresponding elements \(a_i\) and \(b_i\),
\[|a_i - b_i| \leq \texttt{atol} + \texttt{rtol} \cdot |b_i|.\]
- Parameters:
a (FloatTensor | DistanceMatrix | SymmetricMatrix) – First object.
b (FloatTensor | DistanceMatrix | SymmetricMatrix) – Second object (must be the same type as a).
atol (float, optional) – Absolute tolerance, by default 1e-8.
rtol (float, optional) – Relative tolerance, by default 1e-5.
- Return type:
bool
- Raises:
TypeError – If the inputs are not a supported type or are not the same type.
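The tolerance test used by allclose can be written out for flat sequences of numbers as follows. The helper allclose_flat is hypothetical and only mirrors the inequality above; note that the relative term scales with |b_i|, so the test is not symmetric in a and b.

```python
def allclose_flat(a, b, atol=1e-8, rtol=1e-5):
    """Apply |a_i - b_i| <= atol + rtol * |b_i| to two flat sequences.

    Illustrative helper; not the masspcf implementation.
    """
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))

close = allclose_flat([1.0, 2.0], [1.0 + 1e-6, 2.0])
not_close = allclose_flat([1.0], [1.001])
# Asymmetry: the relative tolerance is taken from the second argument.
forward = allclose_flat([1.0], [0.0], atol=0.5, rtol=1.0)
backward = allclose_flat([0.0], [1.0], atol=0.5, rtol=1.0)
print(close, not_close, forward, backward)
```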
io#
- masspcf.io.load(file)#
Load a tensor or object from a file in masspcf’s binary format.
The returned item will have the same type and dtype as what was saved.
- Parameters:
file (str or file-like) – A file path or an open file object in binary read mode.
- Returns:
The loaded item.
- Return type:
Tensor or Pcf or Barcode or DistanceMatrix or SymmetricMatrix
- masspcf.io.save(item, file)#
Save a tensor or object to a file in masspcf’s binary format.
All tensor types and standalone objects (Pcf, Barcode, DistanceMatrix, SymmetricMatrix) are supported.
- Parameters:
item (Tensor or Pcf or Barcode or DistanceMatrix or SymmetricMatrix) – The item to save.
file (str or file-like) – A file path or an open file object in binary write mode.
serialize#
- masspcf.serialize.from_serial_content(content: ndarray, enumeration: ndarray, dtype=None) PcfTensor#
Creates a Tensor of PCFs from serial numpy data.
- Parameters:
content (np.ndarray) – (m, 2) array of points, where m is the sum of the lengths of the individual PCFs.
enumeration (np.ndarray) – (n_1, n_2, ..., n_k, 2) array of (start, end) pointers into the content array.
dtype (datatype) – Sets the dtype of the resulting PcfTensor. If None, uses the dtype of the supplied content array. By default, None.
- Returns:
PcfTensor of shape (n_1, n_2, ..., n_k), where element (i_1, i_2, ..., i_k) is a Pcf with points content[start, stop] with start=enumeration[i_1,...,i_k, 0] and stop=enumeration[i_1,...,i_k, 1].
- Return type:
PcfTensor
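The serial layout taken by from_serial_content can be pictured with plain Python lists for a 1-D collection of PCFs. The helpers pack and unpack are hypothetical, and the sketch assumes that each (start, end) pair denotes a half-open row range, i.e. rows start up to but not including end; the exact convention should be checked against the library.

```python
def pack(pcfs):
    """Flatten a list of PCFs into (content, enumeration).

    Hypothetical helper: `content` holds all (time, value) rows back to
    back, and `enumeration` holds one (start, end) pair per PCF.
    """
    content, enumeration = [], []
    for points in pcfs:
        start = len(content)
        content.extend(points)
        enumeration.append((start, len(content)))
    return content, enumeration

def unpack(content, enumeration):
    """Recover the individual PCFs (assumes half-open row ranges)."""
    return [content[start:end] for start, end in enumeration]

pcfs = [
    [(0.0, 1.0), (1.0, 0.0)],
    [(0.0, 2.0), (0.5, 1.0), (2.0, 0.0)],
    [(0.0, 5.0)],
]
content, enumeration = pack(pcfs)
print(enumeration)
```

The total number of rows in content equals the sum of the individual PCF lengths, matching the (m, 2) shape described above.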
plotting#
- masspcf.plotting.plot(f: Tensor | list[Pcf] | Pcf, fmt='', ax=None, auto_label=False, max_time=None, **kwargs)#
Plot one or more PCFs using matplotlib’s step function.
- Parameters:
f (PcfContainerLike) – A single Pcf or a 1-D PcfTensor.
fmt (str, optional) – A matplotlib format string (e.g. 'r--'), by default ''.
ax (matplotlib axes, optional) – Axes to plot on. If None, uses matplotlib.pyplot directly.
auto_label (bool, optional) – If True and f is a tensor, label each PCF as f0, f1, etc. By default False.
max_time (float, optional) – Extend the plot so the final constant segment reaches this time. If None, single PCFs are not extended and tensors extend to the latest breakpoint across all elements.
**kwargs – Additional keyword arguments passed to matplotlib.pyplot.step (e.g. color, linewidth, alpha, label).
- Raises:
ValueError – If f is a tensor with more than one dimension.
- masspcf.plotting.plot_barcode(bc, ax=None, y_offset=0, **kwargs)#
Plot a persistence barcode as horizontal line segments.
Each bar is drawn as a horizontal segment from birth to death. Bars with infinite death are drawn as arrows extending to the right edge of the plot.
- Parameters:
bc (Barcode or BarcodeTensor) – A single Barcode or a 1-D BarcodeTensor. For a tensor, the barcodes are stacked vertically in order.
ax (matplotlib axes, optional) – Axes to plot on. If None, uses matplotlib.pyplot directly.
y_offset (int, optional) – Starting y position for the first bar. Useful when stacking multiple barcodes on the same axes.
**kwargs – Additional keyword arguments passed to matplotlib.collections.LineCollection (e.g. color, linewidth, alpha, label).
- Returns:
The next available y position (for stacking).
- Return type:
int
random#
- class masspcf.random.Generator(seed=None)#
Bases: object
Seedable random number generator for masspcf.
- Parameters:
seed (int, optional) – Seed for deterministic generation. If None, a non-deterministic seed is used.
- seed(seed)#
Re-seed the generator.
- masspcf.random.noisy_cos(shape, n_points=20, dtype=masspcf.pcf32, generator=None)#
Generate a tensor of noisy \(\cos(2\pi t)\) PCFs.
Each generated PCF has the form
\[f(t) = \cos(2\pi t) + \varepsilon(t)\]
where \(\varepsilon(t) \sim \mathcal{N}(0, 0.1)\) is sampled independently at each breakpoint. The breakpoints are drawn uniformly from \([0, 1]\) and sorted, with the first breakpoint fixed at \(t = 0\) and the last value set to \(0\).
- Parameters:
shape (tuple of int) – Shape of the output tensor.
n_points (int, optional) – Number of breakpoints per PCF, by default 20.
dtype (type, optional) – pcf32 or pcf64, by default pcf32.
generator (Generator, optional) – Random number generator. If None, the global generator is used.
- Returns:
Tensor of noisy cosine PCFs with the given shape.
- Return type:
- masspcf.random.noisy_sin(shape, n_points=20, dtype=masspcf.pcf32, generator=None)#
Generate a tensor of noisy \(\sin(2\pi t)\) PCFs.
Each generated PCF has the form
\[f(t) = \sin(2\pi t) + \varepsilon(t)\]
where \(\varepsilon(t) \sim \mathcal{N}(0, 0.1)\) is sampled independently at each breakpoint. The breakpoints are drawn uniformly from \([0, 1]\) and sorted, with the first breakpoint fixed at \(t = 0\) and the last value set to \(0\).
- Parameters:
shape (tuple of int) – Shape of the output tensor.
n_points (int, optional) – Number of breakpoints per PCF, by default 20.
dtype (type, optional) – pcf32 or pcf64, by default pcf32.
generator (Generator, optional) – Random number generator. If None, the global generator is used.
- Returns:
Tensor of noisy sine PCFs with the given shape.
- Return type:
- masspcf.random.seed(s)#
Seed the global random number generator.
- Parameters:
s (int) – Seed value.
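The construction described for noisy_cos above can be sketched with the standard library for a single PCF. Everything here is an approximation for illustration: the helper noisy_cos_points is hypothetical, the 0.1 in \(\mathcal{N}(0, 0.1)\) is treated as a standard deviation (the text does not say whether it is a standard deviation or a variance), and the way masspcf draws its breakpoints may differ in detail.

```python
import math
import random

def noisy_cos_points(n_points=20, sigma=0.1, rng=None):
    """One noisy cos(2*pi*t) PCF as a list of (time, value) pairs.

    Hypothetical helper; `sigma` is assumed to be a standard deviation.
    """
    if rng is None:
        rng = random.Random()
    # First breakpoint fixed at t = 0, the rest uniform on [0, 1], sorted.
    times = [0.0] + sorted(rng.uniform(0.0, 1.0) for _ in range(n_points - 1))
    # Independent Gaussian noise at every breakpoint.
    values = [math.cos(2 * math.pi * t) + rng.gauss(0.0, sigma) for t in times]
    values[-1] = 0.0  # the last value is set to 0
    return list(zip(times, values))

points = noisy_cos_points(20, rng=random.Random(0))
print(len(points), points[0][0], points[-1][1])
```

Passing a seeded random.Random instance plays the same role here as passing a seeded Generator to the library functions: the same seed reproduces the same PCF.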
system#
The masspcf.system module provides access to system-wide library settings. Note that these settings are per session and must be reconfigured for each Python kernel run.
Most users should not need to make any changes but we do provide the capability for advanced/expert users. No core functionality in the package requires manual modification of any of these options.
- masspcf.system.build_type() str#
Return the build type of the masspcf backend.
- Returns:
"CUDA" if built with GPU support, "CPU" otherwise.
- Return type:
str
- masspcf.system.force_cpu(on: bool)#
Set forced execution on CPU. By default, execution may happen on either CPU or GPU (if using a GPU-enabled build of masspcf).
- Parameters:
on (bool) – If True, force execution on CPU for all operations. If False, execution may happen on either CPU or GPU (if using a GPU-enabled build of masspcf).
- masspcf.system.get_parallel_eval_threshold() int#
Return the current parallel evaluation threshold.
- masspcf.system.limit_cpus(n: int)#
Sets the upper limit on the number of CPU threads that can be used for computations.
Typically, the default corresponding to the number of hardware CPU threads is a good choice but it can be warranted to limit the number of threads in, e.g., multi-user environments. For normal use, we recommend using the default.
- Parameters:
n (int) – Number of CPU threads to use
- masspcf.system.limit_gpus(n: int)#
Sets the number of GPUs that can be used by masspcf. By default, all available GPUs are used.
This option only has an effect if masspcf is compiled with GPU support.
- Parameters:
n (int) – Number of GPUs to use
- masspcf.system.set_block_size(x: int, y: int)#
Set CUDA block size for (GPU) matrix computations. This is an advanced option that should only be modified by expert users.
- Parameters:
x (int) – Horizontal block size
y (int) – Vertical block size
- masspcf.system.set_cuda_threshold(n: int)#
Sets how many PCFs are required in a matrix computation before computations are moved from CPU to GPU. By default, the threshold is set to 500 PCFs.
- Parameters:
n (int) – Number of PCFs required before (supported) matrix computations are moved to GPU
- masspcf.system.set_device_verbose(on: bool)#
Enable verbose device output. In this mode, when operations that may occur on GPU are invoked, a message is logged stating whether the operation will be performed on CPU or GPU.
- Parameters:
on (bool) – Enable verbose device logging
- masspcf.system.set_min_block_side(n: int)#
Set the minimum block side length for the CUDA block scheduler.
This controls the minimum number of threads per GPU kernel launch, ensuring good GPU occupancy. A value of 0 (the default) auto-detects from the GPU hardware (SM count), targeting ~50% max occupancy.
This is an advanced option that should only be modified by expert users.
- Parameters:
n (int) – Minimum block side length. 0 = auto-detect from GPU hardware.
- masspcf.system.set_parallel_eval_threshold(n: int)#
Set the minimum tensor size for parallel tensor evaluation.
When a tensor has at least n elements, tensor_eval distributes the work across threads. Below this threshold, evaluation is sequential. The default is 500.
- Parameters:
n (int) – Minimum number of elements to trigger parallel evaluation.
gpu#
Detect CUDA-capable NVIDIA GPUs without requiring CUDA libraries.
Uses the C++ _gpu_detect module (direct OS API calls) when available, falling back to a pure-Python implementation using subprocess.
- masspcf.gpu.detect_nvidia_gpus()#
Detect NVIDIA GPUs present on the system.
Uses OS-level tools (lspci, sysfs, PowerShell, system_profiler). Does not require CUDA or any NVIDIA drivers/libraries.
- Returns:
A list of dicts, each with a "name" key describing the GPU. An empty list means no NVIDIA GPUs were found.
- Return type:
list[dict]
- masspcf.gpu.has_nvidia_gpu()#
Check whether the system has at least one NVIDIA GPU.
- Returns:
True if at least one NVIDIA GPU is detected.
- Return type:
bool
- masspcf.gpu.nvidia_gpu_count()#
Return the number of NVIDIA GPUs detected.
- Returns:
Number of NVIDIA GPUs found.
- Return type:
int
typing#
- class masspcf.typing.dtype(name: str, doc: str = '')#
Describes the element type of a masspcf tensor.
Each dtype is a singleton instance (e.g. masspcf.pcf32, masspcf.float64). Use isinstance(x, masspcf.dtype) to check whether a value is a masspcf dtype.
- property name: str#
Short name of this dtype (e.g. 'pcf32').