Saving and loading#
masspcf provides a binary format for efficiently saving and loading tensors. All tensor types are supported, including PCF, numeric, point cloud, barcode, symmetric matrix tensors, etc.
Saving#
Use save() to write a tensor to a file:
import masspcf as mpcf
from masspcf.random import noisy_sin
X = noisy_sin((100,), n_points=50)
mpcf.save(X, 'my_pcfs.mpcf')
You can also pass an open file object in binary write mode:
with open('my_pcfs.mpcf', 'wb') as f:
mpcf.save(X, f)
Pickle support#
All tensor types and standalone objects (Pcf, Barcode, DistanceMatrix,
SymmetricMatrix) support Python’s pickle protocol. This means they work
with pickle.dumps/pickle.loads, copy.deepcopy, and multiprocessing:
import pickle
data = pickle.dumps(X)
X_restored = pickle.loads(data)
Pickling uses masspcf’s binary format internally, so it is efficient and preserves dtype and shape.
Note
Many masspcf operations (distance matrices, reductions, etc.) are already
parallelized internally using multithreading and GPU acceleration. Layering
Python multiprocessing on top will most likely decrease performance in
these cases due to process overhead and memory duplication.
Loading#
Use load() to read a tensor back:
X = mpcf.load('my_pcfs.mpcf')
The returned tensor will be of the same type and dtype as what was saved. As with save, you can also pass an open file object:
with open('my_pcfs.mpcf', 'rb') as f:
X = mpcf.load(f)