gsd.fl module
GSD file layer API.
Low-level access to GSD files. gsd.fl allows direct access to create, read, and write GSD files. The module is implemented in C and is optimized.
See File layer examples for detailed example code.
- class gsd.fl.GSDFile
GSD file access interface.
- Parameters:
name (str) – Name of the open file.
mode (str) – Mode of the open file.
gsd_version (tuple[int, int]) – GSD file layer version number (major, minor).
application (str) – Name of the generating application.
schema (str) – Name of the data schema.
schema_version (tuple[int, int]) – Schema version number (major, minor).
nframes (int) – Number of frames.
GSDFile implements an object-oriented class interface to the GSD file layer. Use open() to open a GSD file and obtain a GSDFile instance. GSDFile can be used as a context manager.
- chunk_exists(frame, name)
Test if a chunk exists.
- Parameters:
frame (int) – Index of the frame to check.
name (str) – Name of the chunk.
- Returns:
True if the chunk exists in the file at the given frame, False if it does not.
- Return type:
bool
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:

In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...:

In [3]: f.chunk_exists(frame=0, name='chunk1')
Out[3]: True

In [4]: f.chunk_exists(frame=0, name='chunk2')
Out[4]: True

In [5]: f.chunk_exists(frame=0, name='chunk3')
Out[5]: False

In [6]: f.chunk_exists(frame=10, name='chunk1')
Out[6]: False

In [7]: f.close()
- close()
Close the file.
Once closed, any other operation on the file object will result in a ValueError. close() may be called more than once. The file is automatically closed when garbage collected or when the context manager exits.
Example
In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...:

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:

In [3]: f.end_frame()

In [4]: data = f.read_chunk(frame=0, name='chunk1')

In [5]: f.close()

# Read fails because the file is closed
In [6]: data = f.read_chunk(frame=0, name='chunk1')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 data = f.read_chunk(frame=0, name='chunk1')

File ~/checkouts/readthedocs.org/user_builds/gsd/envs/stable/lib/python3.11/site-packages/gsd/fl.pyx:770, in gsd.fl.GSDFile.read_chunk()

ValueError: File is not open
- end_frame()
Complete writing the current frame. After calling end_frame(), future calls to write_chunk() will write to the next frame in the file.
Danger
Call end_frame() to complete the current frame before closing the file. If you fail to call end_frame(), the last frame will not be written to disk.
Example
In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...:

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:

In [3]: f.end_frame()

In [4]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([9,10,11,12],
   ...:                                dtype=numpy.float32))
   ...:

In [5]: f.end_frame()

In [6]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([13,14],
   ...:                                dtype=numpy.float32))
   ...:

In [7]: f.end_frame()

In [8]: f.nframes
Out[8]: 3

In [9]: f.close()
- find_matching_chunk_names(match)
Find all the chunk names in the file that start with the string match.
- Parameters:
match (str) – Start of the chunk name to match
- Returns:
Matching chunk names
- Return type:
list[str]
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='data/chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='data/chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='input/chunk3',
   ...:                   data=numpy.array([9, 10],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='input/chunk4',
   ...:                   data=numpy.array([11, 12, 13, 14],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:

In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...:

In [3]: f.find_matching_chunk_names('')
Out[3]: ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']

In [4]: f.find_matching_chunk_names('data')
Out[4]: ['data/chunk1', 'data/chunk2']

In [5]: f.find_matching_chunk_names('input')
Out[5]: ['input/chunk3', 'input/chunk4']

In [6]: f.find_matching_chunk_names('other')
Out[6]: []

In [7]: f.close()
- read_chunk(frame, name)
Read a data chunk from the file and return it as a numpy array.
- Parameters:
frame (int) – Index of the frame to read.
name (str) – Name of the chunk.
- Returns:
Data read from file. N, M, and type are determined by the chunk metadata. If the data is NxM in the file and M > 1, return a 2D array. If the data is Nx1, return a 1D array.
- Return type:
(N,M) or (N,) numpy.ndarray of type
Tip
Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead, call read_chunk() on the same chunk only once.
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:

In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...:

In [3]: f.read_chunk(frame=0, name='chunk1')
Out[3]: array([1., 2., 3., 4.], dtype=float32)

In [4]: f.read_chunk(frame=1, name='chunk1')
Out[4]: array([ 9., 10., 11., 12.], dtype=float32)

In [5]: f.read_chunk(frame=2, name='chunk1')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[5], line 1
----> 1 f.read_chunk(frame=2, name='chunk1')

File ~/checkouts/readthedocs.org/user_builds/gsd/envs/stable/lib/python3.11/site-packages/gsd/fl.pyx:783, in gsd.fl.GSDFile.read_chunk()

KeyError: 'frame 2 / chunk chunk1 not found in: file.gsd'

In [6]: f.close()
- truncate()
Truncate all data from the file. After truncation, the file has no frames and no data chunks. The application, schema, and schema version remain the same.
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     for i in range(10):
   ...:         f.write_chunk(name='chunk1',
   ...:                       data=numpy.array([1,2,3,4],
   ...:                                        dtype=numpy.float32))
   ...:         f.end_frame()
   ...:

In [2]: f = gsd.fl.open(name='file.gsd', mode='r+',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...:

In [3]: f.nframes
Out[3]: 10

In [4]: f.schema, f.schema_version, f.application
Out[4]: ('My Schema', (1, 0), 'My application')

In [5]: f.truncate()

In [6]: f.nframes
Out[6]: 0

In [7]: f.schema, f.schema_version, f.application
Out[7]: ('My Schema', (1, 0), 'My application')

In [8]: f.close()
- upgrade()
Upgrade a GSD file to the v2 specification in place. The file must be open in a writable mode.
- write_chunk(name, data)
Write a data chunk to the file. After writing all chunks in the current frame, call end_frame().
- Parameters:
name (str) – Name of the chunk
data – Data to write into the chunk. Must be a numpy array, or array-like, with 2 or fewer dimensions.
Warning
write_chunk() implicitly converts array-like and non-contiguous numpy arrays to contiguous numpy arrays with numpy.ascontiguousarray(data). This may or may not produce desired data types in the output file and incurs overhead.
Example
In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...:

In [2]: f.write_chunk(name='float1d',
   ...:               data=numpy.array([1,2,3,4],
   ...:                                dtype=numpy.float32))
   ...:

In [3]: f.write_chunk(name='float2d',
   ...:               data=numpy.array([[13,14],[15,16],[17,19]],
   ...:                                dtype=numpy.float32))
   ...:

In [4]: f.write_chunk(name='double2d',
   ...:               data=numpy.array([[1,4],[5,6],[7,9]],
   ...:                                dtype=numpy.float64))
   ...:

In [5]: f.write_chunk(name='int1d',
   ...:               data=numpy.array([70,80,90],
   ...:                                dtype=numpy.int64))
   ...:

In [6]: f.end_frame()

In [7]: f.nframes
Out[7]: 1

In [8]: f.close()
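To see why the implicit conversion in write_chunk() may not produce the desired data type, it can help to call numpy.ascontiguousarray directly. The sketch below uses only numpy and does not touch a GSD file; the variable names are illustrative.

```python
import numpy

# A plain Python list of floats converts to float64, even if float32 was intended.
from_list = numpy.ascontiguousarray([1.0, 2.0, 3.0])

# A strided view of a float32 array is not contiguous, so conversion makes a copy.
base = numpy.arange(8, dtype=numpy.float32)
view = base[::2]
converted = numpy.ascontiguousarray(view)

# Converting explicitly before writing pins the dtype and makes the copy visible.
explicit = numpy.ascontiguousarray([1.0, 2.0, 3.0], dtype=numpy.float32)

print(from_list.dtype, view.flags['C_CONTIGUOUS'], converted.dtype, explicit.dtype)
```

Passing an array that is already contiguous and has the intended dtype avoids both the surprise and the extra copy.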
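- gsd.fl.open(name, mode, application=None, schema=None, schema_version=None)
open() opens a GSD file and returns a GSDFile instance. The return value of open() can be used as a context manager.
- Parameters:
name (str) – File name to open.
mode (str) – File access mode.
application (str) – Name of the application creating the file.
schema (str) – Name of the data schema.
schema_version (tuple[int, int]) – Schema version number (major, minor).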
Valid values for mode:
mode    description
'r'     Open an existing file for reading.
'r+'    Open an existing file for reading and writing.
'w'     Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.
'x'     Create a GSD file exclusively and open it for reading and writing. Raises FileExistsError if it already exists.
'a'     Open a file for reading and writing. Creates the file if it doesn’t exist.
When opening a file for reading ('r' and 'r+' modes): application and schema_version are ignored and may be None. When schema is not None, open() throws an exception if the file’s schema does not match schema.
When opening a file for writing ('w', 'x', or 'a' modes): The given application, schema, and schema_version must not be None.
Deprecated since version 2.9.0: The following values to mode are deprecated:
mode     description
'rb'     Equivalent to 'r'
'rb+'    Equivalent to 'r+'
'wb'     Equivalent to 'w'
'wb+'    Equivalent to 'w'
'xb'     Equivalent to 'x'
'xb+'    Equivalent to 'x'
'ab'     Equivalent to 'r+'
Example
In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application", schema="My Schema",
   ...:                  schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:

In [2]: f = gsd.fl.open(name='file.gsd', mode='r')

In [3]: if f.chunk_exists(frame=0, name='chunk1'):
   ...:     data = f.read_chunk(frame=0, name='chunk1')
   ...:

In [4]: data
Out[4]: array([1., 2., 3., 4.], dtype=float32)

In [5]: f.close()