gsd.fl module

GSD file layer API.

Low-level access to GSD files. gsd.fl lets you create, read, and write GSD files directly. The module is implemented in C and is optimized. See File layer examples for detailed example code.

  • GSDFile - Class interface to read and write gsd files.

  • open() - Open a gsd file.

class gsd.fl.GSDFile

GSD file access interface.

Parameters
  • name (str) – Name of the open file.

  • mode (str) – Mode of the open file.

  • gsd_version (tuple[int, int]) – GSD file layer version number (major, minor).

  • application (str) – Name of the generating application.

  • schema (str) – Name of the data schema.

  • schema_version (tuple[int, int]) – Schema version number (major, minor).

  • nframes (int) – Number of frames.

GSDFile implements an object oriented class interface to the GSD file layer. Use open() to open a GSD file and obtain a GSDFile instance. GSDFile can be used as a context manager.

name

Name of the open file.

Type

str

mode

Mode of the open file.

Type

str

gsd_version

GSD file layer version number (major, minor).

Type

tuple[int, int]

application

Name of the generating application.

Type

str

schema

Name of the data schema.

Type

str

schema_version

Schema version number (major, minor).

Type

tuple[int, int]

nframes

Number of frames.

Type

int

chunk_exists(frame, name)

Test if a chunk exists.

Parameters
  • frame (int) – Index of the frame to check

  • name (str) – Name of the chunk

Returns

True if the chunk exists in the file at the given frame. False if it does not.

Return type

bool

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.chunk_exists(frame=0, name='chunk1')
Out[3]: True

In [4]: f.chunk_exists(frame=0, name='chunk2')
Out[4]: True

In [5]: f.chunk_exists(frame=0, name='chunk3')
Out[5]: False

In [6]: f.chunk_exists(frame=10, name='chunk1')
Out[6]: False

In [7]: f.close()

close()

Close the file.

Once closed, any other operation on the file object will result in a ValueError. close() may be called more than once. The file is automatically closed when garbage collected or when the context manager exits.

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb+',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...: 

In [3]: f.end_frame()

In [4]: data = f.read_chunk(frame=0, name='chunk1')

In [5]: f.close()

# Read fails because the file is closed
In [6]: data = f.read_chunk(frame=0, name='chunk1')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-235a7eaf209c> in <module>
----> 1 data = f.read_chunk(frame=0, name='chunk1')

gsd/fl.pyx in gsd.fl.GSDFile.read_chunk()

ValueError: File is not open

end_frame()

Complete writing the current frame. After end_frame() is called, subsequent calls to write_chunk() write to the next frame in the file.

Danger

Call end_frame() to complete the current frame before closing the file. If you fail to call end_frame(), the last frame will not be written to disk.

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...: 

In [3]: f.end_frame()

In [4]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([9,10,11,12],
   ...:                                dtype=numpy.float32))
   ...: 

In [5]: f.end_frame()

In [6]: f.write_chunk(name='chunk1',
   ...:               data=numpy.array([13,14],
   ...:                                dtype=numpy.float32))
   ...: 

In [7]: f.end_frame()

In [8]: f.nframes
Out[8]: 3

In [9]: f.close()

find_matching_chunk_names(match)

Find all the chunk names in the file that start with the string match.

Parameters
  • match (str) – Start of the chunk name to match

Returns

Matching chunk names

Return type

list[str]
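The matching rule is a plain prefix test. The sketch below reproduces that rule on an ordinary Python list so the behavior is easy to see without a file. It does not call gsd.fl: names is sample data and matching is a hypothetical helper.

```python
# Sample chunk names; in practice these would come from the file itself.
names = ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']


def matching(names, match):
    """Return the names that start with ``match`` (hypothetical helper)."""
    return [name for name in names if name.startswith(match)]


all_names = matching(names, '')         # the empty string is a prefix of every name
data_names = matching(names, 'data')    # only names that begin with 'data'
no_names = matching(names, 'other')     # no name begins with 'other'
mid_names = matching(names, 'chunk1')   # present in a name, but not at its start

print(all_names, data_names, no_names, mid_names)
```

Because the test is a prefix match, 'chunk1' selects nothing here even though it occurs inside 'data/chunk1'.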

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='data/chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='data/chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='input/chunk3',
   ...:                   data=numpy.array([9, 10],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='input/chunk4',
   ...:                   data=numpy.array([11, 12, 13, 14],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.find_matching_chunk_names('')
Out[3]: ['data/chunk1', 'data/chunk2', 'input/chunk3', 'input/chunk4']

In [4]: f.find_matching_chunk_names('data')
Out[4]: ['data/chunk1', 'data/chunk2']

In [5]: f.find_matching_chunk_names('input')
Out[5]: ['input/chunk3', 'input/chunk4']

In [6]: f.find_matching_chunk_names('other')
Out[6]: []

In [7]: f.close()

read_chunk(frame, name)

Read a data chunk from the file and return it as a numpy array.

Parameters
  • frame (int) – Index of the frame to read

  • name (str) – Name of the chunk

Returns

Data read from file. N, M, and type are determined by the chunk metadata. If the data is NxM in the file and M > 1, return a 2D array. If the data is Nx1, return a 1D array.

Return type

(N,M) or (N,) numpy.ndarray of type

Tip

Each call invokes a disk read and allocation of a new numpy array for storage. To avoid overhead, call read_chunk() on the same chunk only once.
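One way to follow this tip is to keep the arrays that have already been read. The sketch below wraps read_chunk() in functools.lru_cache. make_cached_reader and the CountingFile stand-in are hypothetical names, and the stand-in exists only so the snippet runs without a GSD file; any object with a read_chunk(frame, name) method, such as an open GSDFile, could presumably be passed instead.

```python
import functools


def make_cached_reader(f):
    """Return a reader that calls f.read_chunk() once per (frame, name)."""
    @functools.lru_cache(maxsize=None)
    def read(frame, name):
        return f.read_chunk(frame=frame, name=name)
    return read


class CountingFile:
    """Hypothetical stand-in for an open file; counts how often it is read."""

    def __init__(self):
        self.reads = 0

    def read_chunk(self, frame, name):
        self.reads += 1
        return [frame, name]  # placeholder for the numpy array


f = CountingFile()
read = make_cached_reader(f)
first = read(0, 'chunk1')
second = read(0, 'chunk1')  # served from the cache, no second read
print(f.reads)
```

The cached object is shared between callers, so copy it before modifying it in place. The cache also keeps every array in memory for as long as the reader is alive.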

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.read_chunk(frame=0, name='chunk1')
Out[3]: array([1., 2., 3., 4.], dtype=float32)

In [4]: f.read_chunk(frame=1, name='chunk1')
Out[4]: array([ 9., 10., 11., 12.], dtype=float32)

In [5]: f.read_chunk(frame=2, name='chunk1')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-5-f2a5b71c0390> in <module>
----> 1 f.read_chunk(frame=2, name='chunk1')

gsd/fl.pyx in gsd.fl.GSDFile.read_chunk()

KeyError: 'frame 2 / chunk chunk1 not found in: file.gsd'

In [6]: f.close()

truncate()

Truncate all data from the file. After truncation, the file has no frames and no data chunks. The application, schema, and schema version remain the same.

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application",
   ...:                  schema="My Schema", schema_version=[1,0]) as f:
   ...:     for i in range(10):
   ...:         f.write_chunk(name='chunk1',
   ...:                       data=numpy.array([1,2,3,4],
   ...:                                        dtype=numpy.float32))
   ...:         f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='ab',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [3]: f.nframes
Out[3]: 10

In [4]: f.schema, f.schema_version, f.application
Out[4]: ('My Schema', (1, 0), 'My application')

In [5]: f.truncate()

In [6]: f.nframes
Out[6]: 0

In [7]: f.schema, f.schema_version, f.application
Out[7]: ('My Schema', (1, 0), 'My application')

In [8]: f.close()

upgrade()

Upgrade a GSD file to the v2 specification in place. The file must be open in a writable mode.

write_chunk(name, data)

Write a data chunk to the file. After writing all chunks in the current frame, call end_frame().

Parameters
  • name (str) – Name of the chunk

  • data – Data to write into the chunk. Must be a numpy array, or array-like, with 2 or fewer dimensions.

Warning

write_chunk() implicitly converts array-like objects and non-contiguous numpy arrays to contiguous numpy arrays with numpy.ascontiguousarray(data). This conversion incurs overhead and may not produce the desired data type in the output file.
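To sidestep the implicit conversion, build a contiguous array with an explicit dtype before calling write_chunk(). The sketch below uses only numpy to show what numpy.ascontiguousarray does to two typical inputs; the variable names are illustrative.

```python
import numpy

# A plain Python list is array-like. The conversion picks numpy's default
# integer type for these values, which is platform dependent and may not be
# the type you want stored in the file.
implicit = numpy.ascontiguousarray([1, 2, 3, 4])

# Building the array yourself fixes the dtype.
explicit = numpy.array([1, 2, 3, 4], dtype=numpy.float32)

# A transposed 2D view is not C-contiguous, so the conversion must copy it.
view = numpy.arange(6, dtype=numpy.float32).reshape(2, 3).T
converted = numpy.ascontiguousarray(view)

print(implicit.dtype, explicit.dtype)
print(view.flags['C_CONTIGUOUS'], converted.flags['C_CONTIGUOUS'])
```

Passing explicit or converted avoids both the dtype surprise and a hidden copy inside the write call.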

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb',
   ...:                 application="My application",
   ...:                 schema="My Schema", schema_version=[1,0])
   ...: 

In [2]: f.write_chunk(name='float1d',
   ...:               data=numpy.array([1,2,3,4],
   ...:                                dtype=numpy.float32))
   ...: 

In [3]: f.write_chunk(name='float2d',
   ...:               data=numpy.array([[13,14],[15,16],[17,19]],
   ...:                                dtype=numpy.float32))
   ...: 

In [4]: f.write_chunk(name='double2d',
   ...:               data=numpy.array([[1,4],[5,6],[7,9]],
   ...:                                dtype=numpy.float64))
   ...: 

In [5]: f.write_chunk(name='int1d',
   ...:               data=numpy.array([70,80,90],
   ...:                                dtype=numpy.int64))
   ...: 

In [6]: f.end_frame()

In [7]: f.nframes
Out[7]: 1

In [8]: f.close()

gsd.fl.open(name, mode, application=None, schema=None, schema_version=None)

open() opens a GSD file and returns a GSDFile instance. The return value of open() can be used as a context manager.

Parameters
  • name (str) – File name to open.

  • mode (str) – File access mode.

  • application (str) – Name of the application creating the file.

  • schema (str) – Name of the data schema.

  • schema_version (tuple[int, int]) – Schema version number (major, minor).

Valid values for mode:

mode    description
'rb'    Open an existing file for reading.
'rb+'   Open an existing file for reading and writing.
'wb'    Open a file for writing. Creates the file if needed, or overwrites an existing file.
'wb+'   Open a file for reading and writing. Creates the file if needed, or overwrites an existing file.
'xb'    Create a gsd file exclusively and open it for writing. Raises FileExistsError if the file already exists.
'xb+'   Create a gsd file exclusively and open it for reading and writing. Raises FileExistsError if the file already exists.
'ab'    Open an existing file for writing. Does not create or overwrite existing files.

When opening an existing file ('r' and 'a' modes): application and schema_version are ignored and may be None. When schema is not None, open() raises an exception if the file’s schema does not match schema.

When opening a file for writing ('w' or 'x' modes): The given application, schema, and schema_version are saved in the file and must not be None.
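These rules can be checked before touching the disk. The function below is a hypothetical pre-check, not part of gsd.fl. It encodes only the mode list and the not-None requirement stated above, and it says nothing about what open() itself raises.

```python
# Mode strings accepted by open(), as listed above.
VALID_MODES = ('rb', 'rb+', 'wb', 'wb+', 'xb', 'xb+', 'ab')


def check_open_args(mode, application=None, schema=None, schema_version=None):
    """Raise ValueError if the arguments break the rules described above."""
    if mode not in VALID_MODES:
        raise ValueError(f'invalid mode: {mode!r}')
    if mode[0] in ('w', 'x'):
        # 'w' and 'x' modes store all three values in the new file.
        given = {'application': application,
                 'schema': schema,
                 'schema_version': schema_version}
        missing = [key for key, value in given.items() if value is None]
        if missing:
            raise ValueError(f'{mode!r} requires: ' + ', '.join(missing))


# Nothing beyond the mode is required for an existing file.
check_open_args('rb')

# A new file needs all three values.
check_open_args('wb', application='My application',
                schema='My Schema', schema_version=(1, 0))
```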

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb',
   ...:                  application="My application", schema="My Schema",
   ...:                  schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[5,6],[7,8]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...:     f.write_chunk(name='chunk1',
   ...:                   data=numpy.array([9,10,11,12],
   ...:                                    dtype=numpy.float32))
   ...:     f.write_chunk(name='chunk2',
   ...:                   data=numpy.array([[13,14],[15,16]],
   ...:                                    dtype=numpy.float32))
   ...:     f.end_frame()
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb')

In [3]: if f.chunk_exists(frame=0, name='chunk1'):
   ...:     data = f.read_chunk(frame=0, name='chunk1')
   ...: 

In [4]: data
Out[4]: array([1., 2., 3., 4.], dtype=float32)

In [5]: f.close()
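The same chunk_exists() and read_chunk() pair extends to every frame in a file. The helper below is a sketch, not part of gsd.fl: read_all_frames and the TwoFrameFile stand-in are hypothetical names, and the stand-in exists only so the snippet runs without a file on disk. The helper relies only on nframes, chunk_exists(), and read_chunk() as documented above.

```python
def read_all_frames(f, name):
    """Return one entry per frame: the chunk data, or None if absent."""
    result = []
    for frame in range(f.nframes):
        if f.chunk_exists(frame=frame, name=name):
            result.append(f.read_chunk(frame=frame, name=name))
        else:
            result.append(None)
    return result


class TwoFrameFile:
    """Hypothetical stand-in exposing the attribute and methods used above."""

    def __init__(self):
        # (frame, chunk name) -> data; plain lists stand in for numpy arrays.
        self._chunks = {(0, 'chunk1'): [1, 2, 3, 4],
                        (1, 'chunk1'): [9, 10, 11, 12],
                        (1, 'chunk2'): [13, 14]}
        self.nframes = 2

    def chunk_exists(self, frame, name):
        return (frame, name) in self._chunks

    def read_chunk(self, frame, name):
        return self._chunks[(frame, name)]


f = TwoFrameFile()
chunk1 = read_all_frames(f, 'chunk1')  # present in both frames
chunk2 = read_all_frames(f, 'chunk2')  # missing from frame 0
print(chunk1, chunk2)
```

Checking chunk_exists() first avoids the KeyError that read_chunk() raises for a missing frame or chunk.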