gsd.fl module

GSD file layer API.

Low-level access to GSD files. gsd.fl provides direct access to create, read, and write GSD files. The module is implemented in C and optimized for performance. See File layer examples for detailed example code.

  • GSDFile - Class interface to read and write gsd files.

  • open() - Open a gsd file.

class gsd.fl.GSDFile

GSD file access interface.

Parameters:
  • name (str) – Name of the open file.

  • mode (str) – Mode of the open file.

  • gsd_version (tuple[int, int]) – GSD file layer version number (major, minor).

  • application (str) – Name of the generating application.

  • schema (str) – Name of the data schema.

  • schema_version (tuple[int, int]) – Schema version number (major, minor).

  • nframes (int) – Number of frames.

GSDFile implements an object oriented class interface to the GSD file layer. Use open() to open a GSD file and obtain a GSDFile instance. GSDFile can be used as a context manager.

name

Name of the open file.

Type:

str

mode

Mode of the open file.

Type:

str

gsd_version

GSD file layer version number (major, minor).

Type:

tuple[int, int]

application

Name of the generating application.

Type:

str

schema

Name of the data schema.

Type:

str

schema_version

Schema version number (major, minor).

Type:

tuple[int, int]

nframes

Number of frames.

Type:

int

chunk_exists(frame, name)

Test if a chunk exists.

Parameters:
  • frame (int) – Index of the frame to check

  • name (str) – Name of the chunk

Returns:

True if the chunk exists in the file at the given frame. False if it does not.

Return type:

bool

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.chunk_exists(frame=0, name='chunk1')
Out[3]: True

In [4]: f.chunk_exists(frame=0, name='chunk2')
Out[4]: True

In [5]: f.chunk_exists(frame=0, name='chunk3')
Out[5]: False

In [6]: f.chunk_exists(frame=10, name='chunk1')
Out[6]: False

In [7]: f.close()

close()

Close the file.

Once closed, any other operation on the file object will result in a ValueError. close() may be called more than once. The file is automatically closed when garbage collected or when the context manager exits.

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb+',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...: 

In [3]: f.end_frame()

In [4]: data = f.read_chunk(frame=0, name='chunk1')

In [5]: f.close()

# Read fails because the file is closed
In [6]: data = f.read_chunk(frame=0, name='chunk1')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 data = f.read_chunk(frame=0, name='chunk1')

File gsd/fl.pyx:734, in gsd.fl.GSDFile.read_chunk()

ValueError: File is not open

end_frame()

Complete writing the current frame. After calling end_frame(), future calls to write_chunk() write to the next frame in the file.

Danger

Call end_frame() to complete the current frame before closing the file. If you fail to call end_frame(), the last frame will not be written to disk.

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...: 

In [3]: f.end_frame()

In [4]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([9,10,11,12],
   ...:                                dtype=numpy.float32))
   ...: 

In [5]: f.end_frame()

In [6]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([13,14],
   ...:                                dtype=numpy.float32))
   ...: 

In [7]: f.end_frame()

In [8]: f.nframes
Out[8]: 3

In [9]: f.close()

find_matching_chunk_names(match)

Find all the chunk names in the file that start with the string match.

Parameters:

match (str) – Start of the chunk name to match

Returns:

Matching chunk names

Return type:

list[str]
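The matching rule is a plain prefix test. The sketch below reproduces that rule on an ordinary Python list to show what to expect. It is not the gsd implementation, and `matching_names` is a hypothetical helper introduced only for illustration.

```python
def matching_names(names, match):
    """Return the names that start with ``match`` (hypothetical helper)."""
    return [name for name in names if name.startswith(match)]


chunk_names = ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']

# An empty prefix matches every name.
all_names = matching_names(chunk_names, '')

# A prefix selects one group of chunks.
data_names = matching_names(chunk_names, 'data')

# A prefix that no name starts with gives an empty list.
no_names = matching_names(chunk_names, 'other')
```

Naming chunks with a common prefix such as 'data/' therefore makes it easy to list related chunks with a single call.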

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='data/chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='data/chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='input/chunk3',
   ...:                   data=numpy.array([9, 10],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='input/chunk4',
   ...:                   data=numpy.array([11, 12, 13, 14],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.find_matching_chunk_names('')
Out[3]: ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']

In [4]: f.find_matching_chunk_names('data')
Out[4]: ['data/chunk1', 'data/chunk2']

In [5]: f.find_matching_chunk_names('input')
Out[5]: ['input/chunk3', 'input/chunk4']

In [6]: f.find_matching_chunk_names('other')
Out[6]: []

In [7]: f.close()

read_chunk(frame, name)

Read a data chunk from the file and return it as a numpy array.

Parameters:
  • frame (int) – Index of the frame to read

  • name (str) – Name of the chunk

Returns:

Data read from file. N, M, and type are determined by the chunk metadata. If the data is NxM in the file and M > 1, return a 2D array. If the data is Nx1, return a 1D array.

Return type:

(N,M) or (N,) numpy.ndarray with the data type recorded in the chunk metadata

Tip

Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead, call read_chunk() on the same chunk only once.
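The return-shape rule described under Returns can be sketched with plain numpy. This is an illustration of the rule, not the code gsd runs; `shape_like_read_chunk` is a hypothetical helper, and row-major ordering of the stored values is an assumption.

```python
import numpy


def shape_like_read_chunk(flat, N, M):
    """Arrange N*M stored values following the documented rule (hypothetical helper)."""
    values = numpy.asarray(flat)
    if M > 1:
        # N x M data with M > 1 comes back as a 2D array.
        return values.reshape(N, M)
    # N x 1 data comes back as a 1D array of length N.
    return values.reshape(N)


two_d = shape_like_read_chunk([5, 6, 7, 8], N=2, M=2)
one_d = shape_like_read_chunk([1, 2, 3, 4], N=4, M=1)
```

Code that needs a column vector should therefore reshape the 1D result itself rather than expect an (N, 1) array.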

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.read_chunk(frame=0, name='chunk1')
Out[3]: array([1., 2., 3., 4.], dtype=float32)

In [4]: f.read_chunk(frame=1, name='chunk1')
Out[4]: array([ 9., 10., 11., 12.], dtype=float32)

In [5]: f.read_chunk(frame=2, name='chunk1')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[5], line 1
----> 1 f.read_chunk(frame=2, name='chunk1')

File gsd/fl.pyx:747, in gsd.fl.GSDFile.read_chunk()

KeyError: 'frame 2 / chunk chunk1 not found in: file.gsd'

In [6]: f.close()

truncate()

Truncate all data from the file. After truncation, the file has no frames and no data chunks. The application, schema, and schema version remain the same.

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     for i in range(10):
   ...:         f.write_chunk(name='chunk1',
   ...:                       data=numpy.array([1,2,3,4],
   ...:                                        dtype=numpy.float32))
   ...:         f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='ab',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.nframes
Out[3]: 10

In [4]: f.schema, f.schema_version, f.application
Out[4]: ('My Schema', (1, 0), 'My application')

In [5]: f.truncate()

In [6]: f.nframes
Out[6]: 0

In [7]: f.schema, f.schema_version, f.application
Out[7]: ('My Schema', (1, 0), 'My application')

In [8]: f.close()

upgrade()

Upgrade a GSD file to the v2 specification in place. The file must be open in a writable mode.

write_chunk(name, data)

Write a data chunk to the file. After writing all chunks in the current frame, call end_frame().

Parameters:
  • name (str) – Name of the chunk

  • data – Data to write into the chunk. Must be a numpy array, or array-like, with 2 or fewer dimensions.

Warning

write_chunk() implicitly converts array-like and non-contiguous numpy arrays to contiguous numpy arrays with numpy.ascontiguousarray(data). This may or may not produce the desired data types in the output file and incurs overhead.
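To see what this conversion does, the sketch below calls numpy.ascontiguousarray() directly on an array-like input and on a non-contiguous view. It uses only numpy and does not write a file; the variable names are illustrative.

```python
import numpy

# A Python list of floats is array-like: numpy infers float64 for it,
# which may not be the type intended for the file.
from_list = numpy.ascontiguousarray([1.0, 2.0, 3.0])

# A strided view of a float32 array is not contiguous, so the
# conversion makes a contiguous copy (this is the overhead).
base = numpy.arange(8, dtype=numpy.float32)
view = base[::2]
converted = numpy.ascontiguousarray(view)

# Building a contiguous array with an explicit dtype up front avoids
# relying on the inferred type.
explicit = numpy.array([1.0, 2.0, 3.0], dtype=numpy.float32)
```

Passing arrays like `explicit` to write_chunk() keeps the stored data type under the caller's control.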

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='float1d',
   ...:               data=numpy.array([1,2,3,4],
   ...:                                dtype=numpy.float32))
   ...: 

In [3]: f.write_chunk(name='float2d',
   ...:               data=numpy.array([[13,14],[15,16],[17,19]],
   ...:                                dtype=numpy.float32))
   ...: 

In [4]: f.write_chunk(name='double2d',
   ...:               data=numpy.array([[1,4],[5,6],[7,9]],
   ...:                                dtype=numpy.float64))
   ...: 

In [5]: f.write_chunk(name='int1d',
   ...:               data=numpy.array([70,80,90],
   ...:                                dtype=numpy.int64))
   ...: 

In [6]: f.end_frame()

In [7]: f.nframes
Out[7]: 1

In [8]: f.close()

gsd.fl.open(name, mode, application=None, schema=None, schema_version=None)

open() opens a GSD file and returns a GSDFile instance. The return value of open() can be used as a context manager.

Parameters:
  • name (str) – File name to open.

  • mode (str) – File access mode.

  • application (str) – Name of the application creating the file.

  • schema (str) – Name of the data schema.

  • schema_version (tuple[int, int]) – Schema version number (major, minor).

Valid values for mode:

  • 'rb' – Open an existing file for reading.

  • 'rb+' – Open an existing file for reading and writing.

  • 'wb' – Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.

  • 'wb+' – Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.

  • 'xb' – Create a GSD file exclusively and open it for reading and writing. Raises FileExistsError if it already exists.

  • 'xb+' – Create a GSD file exclusively and open it for reading and writing. Raises FileExistsError if it already exists.

  • 'ab' – Open an existing file for reading and writing. Does not create or overwrite existing files.

When opening a file for reading ('r' and 'a' modes), application and schema_version are ignored and may be None. When schema is not None, open() raises an exception if the file’s schema does not match schema.

When opening a file for writing ('w' or 'x' modes), the given application, schema, and schema_version are saved in the file and must not be None.
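One way to apply the mode descriptions above is to pick the mode string from the state of the target path before calling open(). The sketch below is pure Python and does not call gsd; `pick_mode` and `needs_metadata` are hypothetical helpers, and the placeholder file it creates is not a valid GSD file.

```python
import os
import tempfile


def pick_mode(path, overwrite=False):
    """Choose a mode string from the documented modes (hypothetical helper)."""
    if overwrite:
        return 'wb'  # creates the file if needed, or overwrites it
    if os.path.exists(path):
        return 'ab'  # existing file, never created or overwritten
    return 'xb'      # new file, FileExistsError if it already exists


def needs_metadata(mode):
    """True for 'w' and 'x' modes, which require application, schema, and schema_version."""
    return mode[0] in ('w', 'x')


with tempfile.TemporaryDirectory() as tmp:
    new_path = os.path.join(tmp, 'new.gsd')
    mode_for_new = pick_mode(new_path)

    # Create an empty placeholder with the built-in open(), only to
    # exercise the "file exists" branch.
    open(new_path, 'wb').close()
    mode_for_existing = pick_mode(new_path)
    mode_for_overwrite = pick_mode(new_path, overwrite=True)
```

Checking `needs_metadata(mode)` before the call gives an early reminder to pass application, schema, and schema_version when creating a file.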

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application", schema="My Schema",
   ...:                  schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb')

In [3]: if f.chunk_exists(frame=0, name='chunk1'):
   ...:     data = f.read_chunk(frame=0, name='chunk1')
   ...: 

In [4]: data
Out[4]: array([1., 2., 3., 4.], dtype=float32)

In [5]: f.close()