gsd.fl module

GSD file layer API.

Low-level access to GSD files. gsd.fl provides direct access to create, read, and write GSD files. The module is implemented in C and optimized for performance. See File layer examples for detailed example code.

  • GSDFile - Class interface to read and write GSD files.

  • open() - Open a GSD file.

class gsd.fl.GSDFile

GSD file access interface.

Parameters:
  • name (str) – Name of the open file.

  • mode (str) – Mode of the open file.

  • gsd_version (tuple[int, int]) – GSD file layer version number (major, minor).

  • application (str) – Name of the generating application.

  • schema (str) – Name of the data schema.

  • schema_version (tuple[int, int]) – Schema version number (major, minor).

  • nframes (int) – Number of frames.

GSDFile implements an object oriented class interface to the GSD file layer. Use open() to open a GSD file and obtain a GSDFile instance. GSDFile can be used as a context manager.

name

Name of the open file.

Type:

str

mode

Mode of the open file.

Type:

str

gsd_version

GSD file layer version number (major, minor).

Type:

tuple[int, int]

application

Name of the generating application.

Type:

str

schema

Name of the data schema.

Type:

str

schema_version

Schema version number (major, minor).

Type:

tuple[int, int]

nframes

Number of frames.

Type:

int

maximum_write_buffer_size

Maximum write buffer size (bytes).

Type:

int

index_entries_to_buffer

Number of index entries to buffer before flushing.

Type:

int

__reduce__()

Allow file handles to be pickled when in read-only mode.

chunk_exists(frame, name)

Test if a chunk exists.

Parameters:
  • frame (int) – Index of the frame to check

  • name (str) – Name of the chunk

Returns:

True if the chunk exists in the file at the given frame. False if it does not.

Return type:

bool

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.chunk_exists(frame=0, name='chunk1')
Out[3]: True

In [4]: f.chunk_exists(frame=0, name='chunk2')
Out[4]: True

In [5]: f.chunk_exists(frame=0, name='chunk3')
Out[5]: False

In [6]: f.chunk_exists(frame=10, name='chunk1')
Out[6]: False

In [7]: f.close()

close()

Close the file.

Once closed, any other operation on the file object will result in a ValueError. close() may be called more than once. The file is automatically closed when garbage collected or when the context manager exits.

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...: 

In [3]: f.end_frame()

In [4]: data = f.read_chunk(frame=0, name='chunk1')

In [5]: f.close()

# Read fails because the file is closed
In [6]: data = f.read_chunk(frame=0, name='chunk1')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 data = f.read_chunk(frame=0, name='chunk1')

File ~/checkouts/readthedocs.org/user_builds/gsd/envs/stable/lib/python3.12/site-packages/gsd/fl.pyx:740, in gsd.fl.GSDFile.read_chunk()

ValueError: File is not open

end_frame()

Complete writing the current frame. After calling end_frame(), future calls to write_chunk() write to the next frame in the file.

Danger

Call end_frame() to complete the current frame before closing the file. If you fail to call end_frame(), the last frame will not be written to disk.

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...: 

In [3]: f.end_frame()

In [4]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([9,10,11,12],
   ...:                                dtype=numpy.float32))
   ...: 

In [5]: f.end_frame()

In [6]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([13,14],
   ...:                                dtype=numpy.float32))
   ...: 

In [7]: f.end_frame()

In [8]: f.nframes
Out[8]: 3

In [9]: f.close()

find_matching_chunk_names(match)

Find all the chunk names in the file that start with the string match.

Parameters:

match (str) – Start of the chunk name to match

Returns:

Matching chunk names

Return type:

list[str]

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='data/chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='data/chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='input/chunk3',
   ...:                   data=numpy.array([9, 10],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='input/chunk4',
   ...:                   data=numpy.array([11, 12, 13, 14],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.find_matching_chunk_names('')
Out[3]: ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']

In [4]: f.find_matching_chunk_names('data')
Out[4]: ['data/chunk1', 'data/chunk2']

In [5]: f.find_matching_chunk_names('input')
Out[5]: ['input/chunk3', 'input/chunk4']

In [6]: f.find_matching_chunk_names('other')
Out[6]: []

In [7]: f.close()

flush()

Flush all buffered frames to the file.

read_chunk(frame, name)

Read a data chunk from the file and return it as a numpy array.

Parameters:
  • frame (int) – Index of the frame to read

  • name (str) – Name of the chunk

Returns:

Data read from file. N, M, and type are determined by the chunk metadata. If the data is NxM in the file and M > 1, return a 2D array. If the data is Nx1, return a 1D array.

Return type:

(N,M) or (N,) numpy.ndarray of type

Tip

Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead, call read_chunk() on the same chunk only once.

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='r',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.read_chunk(frame=0, name='chunk1')
Out[3]: array([1., 2., 3., 4.], dtype=float32)

In [4]: f.read_chunk(frame=1, name='chunk1')
Out[4]: array([ 9., 10., 11., 12.], dtype=float32)

In [5]: f.read_chunk(frame=2, name='chunk1')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[5], line 1
----> 1 f.read_chunk(frame=2, name='chunk1')

File ~/checkouts/readthedocs.org/user_builds/gsd/envs/stable/lib/python3.12/site-packages/gsd/fl.pyx:753, in gsd.fl.GSDFile.read_chunk()

KeyError: 'frame 2 / chunk chunk1 not found in: file.gsd'

In [6]: f.close()

truncate()

Truncate all data from the file. After truncation, the file has no frames and no data chunks. The application, schema, and schema version remain the same.

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     for i in range(10):
   ...:         f.write_chunk(name='chunk1',
   ...:                       data=numpy.array([1,2,3,4],
   ...:                                        dtype=numpy.float32))
   ...:         f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='r+',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.nframes
Out[3]: 10

In [4]: f.schema, f.schema_version, f.application
Out[4]: ('My Schema', (1, 0), 'My application')

In [5]: f.truncate()

In [6]: f.nframes
Out[6]: 0

In [7]: f.schema, f.schema_version, f.application
Out[7]: ('My Schema', (1, 0), 'My application')

In [8]: f.close()

upgrade()

Upgrade a GSD file to the v2 specification in place. The file must be open in a writable mode.

write_chunk(name, data)

Write a data chunk to the file. After writing all chunks in the current frame, call end_frame().

Parameters:
  • name (str) – Name of the chunk

  • data – Data to write into the chunk. Must be a numpy array, or array-like, with 2 or fewer dimensions.

Warning

write_chunk() implicitly converts array-like and non-contiguous numpy arrays to contiguous numpy arrays with numpy.ascontiguousarray(data). This may or may not produce the desired data types in the output file and incurs overhead.
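
The conversion named in this warning can be examined with numpy alone. The sketch below shows what numpy.ascontiguousarray() does to a plain Python list and to a non-contiguous view; it does not call write_chunk(), and the variable names are illustrative.

```python
import numpy

# A list of Python floats carries no dtype, so numpy chooses float64.
from_list = numpy.ascontiguousarray([1.0, 2.0, 3.0])

# A column slice of a 2D array is a non-contiguous view of its parent.
table = numpy.arange(6, dtype=numpy.float32).reshape(3, 2)
column_view = table[:, 0]

# The conversion copies the view into a new contiguous array.
column_copy = numpy.ascontiguousarray(column_view)

# Building a contiguous array with an explicit dtype avoids both the
# dtype guess and the extra copy.
explicit = numpy.array([1.0, 2.0, 3.0], dtype=numpy.float32)

print(from_list.dtype)
print(column_view.flags["C_CONTIGUOUS"], column_copy.flags["C_CONTIGUOUS"])
print(explicit.dtype, explicit.flags["C_CONTIGUOUS"])
```

Passing an array such as explicit, with the dtype chosen up front, keeps the stored type predictable.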

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='w',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='float1d',
   ...:               data=numpy.array([1,2,3,4],
   ...:                                dtype=numpy.float32))
   ...: 

In [3]: f.write_chunk(name='float2d',
   ...:               data=numpy.array([[13,14],[15,16],[17,19]],
   ...:                                dtype=numpy.float32))
   ...: 

In [4]: f.write_chunk(name='double2d',
   ...:               data=numpy.array([[1,4],[5,6],[7,9]],
   ...:                                dtype=numpy.float64))
   ...: 

In [5]: f.write_chunk(name='int1d',
   ...:               data=numpy.array([70,80,90],
   ...:                                dtype=numpy.int64))
   ...: 

In [6]: f.end_frame()

In [7]: f.nframes
Out[7]: 1

In [8]: f.close()

gsd.fl.open(name, mode, application=None, schema=None, schema_version=None)

open() opens a GSD file and returns a GSDFile instance. The return value of open() can be used as a context manager.

Parameters:
  • name (str) – File name to open.

  • mode (str) – File access mode.

  • application (str) – Name of the application creating the file.

  • schema (str) – Name of the data schema.

  • schema_version (tuple[int, int]) – Schema version number (major, minor).

Valid values for mode:

mode    description
'r'     Open an existing file for reading.
'r+'    Open an existing file for reading and writing.
'w'     Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.
'x'     Create a GSD file exclusively and open it for reading and writing. Raises FileExistsError if the file already exists.
'a'     Open a file for reading and writing. Creates the file if it doesn’t exist.

When opening a file for reading ('r' and 'r+' modes): application and schema_version are ignored and may be None. When schema is not None, open() raises an exception if the file’s schema does not match schema.

When opening a file for writing ('w', 'x', or 'a' modes): The given application, schema, and schema_version must not be None.

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='w',
   ...:                  application="My application", schema="My Schema",
   ...:                  schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='r')

In [3]: if f.chunk_exists(frame=0, name='chunk1'):
   ...:     data = f.read_chunk(frame=0, name='chunk1')
   ...: 

In [4]: data
Out[4]: array([1., 2., 3., 4.], dtype=float32)

In [5]: f.close()