gsd.fl module

GSD file layer API.

Low-level access to gsd files. gsd.fl provides direct access to create, read, and write gsd files. The module is implemented in C and optimized for performance. See File layer examples for detailed example code.

  • GSDFile - Class interface to read and write gsd files.
  • create() - Create a gsd file (deprecated).
  • open() - Open a gsd file.

class gsd.fl.GSDFile(name, mode, application, schema, schema_version)

GSD file access interface.

GSDFile implements an object oriented class interface to the GSD file layer. Use open() to open a GSD file and obtain a GSDFile instance. GSDFile can be used as a context manager.

Changed in version 1.2: For new code, use open() instead of constructing GSDFile directly. GSDFile.__init__ is backwards compatible with the old open syntax used in GSD versions 1.0.x and 1.1.x.

name

str – Name of the open file (read only).

mode

str – Mode of the open file (read only).

gsd_version

tuple[int] – GSD file layer version number [major, minor] (read only).

application

str – Name of the generating application (read only).

schema

str – Name of the data schema (read only).

schema_version

tuple[int] – Schema version number [major, minor] (read only).

nframes

int – Number of frames (read only).

chunk_exists(frame, name)

Test if a chunk exists.

Parameters:
  • frame (int) – Index of the frame to check
  • name (str) – Name of the chunk
Returns:

True if the chunk exists in the file. False if it does not.

Return type:

bool
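
Because chunk_exists() returns False for a missing frame or chunk name, while read_chunk() raises a KeyError, it can serve as a guard before reading. The following is a minimal sketch that assumes only the chunk_exists() and read_chunk() methods documented here; the helper name read_chunk_or_default and its default parameter are hypothetical and not part of gsd.fl.

```python
def read_chunk_or_default(f, frame, name, default=None):
    """Return the chunk's data, or `default` if the chunk is absent.

    `f` is assumed to be an open GSDFile, or any object that exposes the
    same chunk_exists() and read_chunk() methods.
    """
    if f.chunk_exists(frame=frame, name=name):
        return f.read_chunk(frame=frame, name=name)
    # Hypothetical fallback: the caller chooses what a missing chunk means.
    return default
```

This costs one extra lookup per read, so it is best suited to optional chunks rather than chunks that are known to be present in every frame.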

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb', application="My application", schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32));
   ...:     f.write_chunk(name='chunk2', data=numpy.array([[5,6],[7,8]], dtype=numpy.float32));
   ...:     f.end_frame();
   ...:     f.write_chunk(name='chunk1', data=numpy.array([9,10,11,12], dtype=numpy.float32));
   ...:     f.write_chunk(name='chunk2', data=numpy.array([[13,14],[15,16]], dtype=numpy.float32));
   ...:     f.end_frame();
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb', application="My application", schema="My Schema", schema_version=[1,0])

In [3]: f.chunk_exists(frame=0, name='chunk1')
Out[3]: True

In [4]: f.chunk_exists(frame=0, name='chunk2')
Out[4]: True

In [5]: f.chunk_exists(frame=0, name='chunk3')
Out[5]: False

In [6]: f.chunk_exists(frame=10, name='chunk1')
Out[6]: False

close()

Close the file.

Once closed, any other operation on the file object will result in a ValueError. close() may be called more than once. The file is automatically closed when garbage collected or when the context manager exits.

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb+', application="My application", schema="My Schema", schema_version=[1,0])

In [2]: f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32))

In [3]: f.end_frame();

In [4]: data = f.read_chunk(frame=0, name='chunk1')

In [5]: f.close()

# Read fails because the file is closed
In [6]: data = f.read_chunk(frame=0, name='chunk1')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-a11a1951c42f> in <module>()
----> 1 data = f.read_chunk(frame=0, name='chunk1')

fl.pyx in gsd.fl.GSDFile.read_chunk()

ValueError: File is not open

end_frame()

Complete writing the current frame. After calling end_frame(), future calls to write_chunk() write to the next frame in the file.

Danger

Call end_frame() to complete the current frame before closing the file. If you fail to call end_frame(), the last frame may not be written to disk.
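
One way to avoid forgetting the final end_frame() is to pair it with the chunk writes in a single loop. The sketch below assumes only the write_chunk() and end_frame() methods documented here; the helper name write_frames and the layout of its frames argument (an iterable of dicts mapping chunk names to data) are hypothetical.

```python
def write_frames(f, frames):
    """Write each mapping in `frames` as one frame and return the frame count.

    `f` is assumed to be a GSDFile opened in a writable mode. `frames` is an
    iterable of dicts that map chunk names to array-like data.
    """
    count = 0
    for chunks in frames:
        for name, data in chunks.items():
            f.write_chunk(name=name, data=data)
        # Complete the frame before starting the next one, so that no
        # frame is left unfinished when the file is closed.
        f.end_frame()
        count += 1
    return count
```

Calling the helper inside a with block that opens the file keeps the end_frame() calls and the close in the right order.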

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb', application="My application", schema="My Schema", schema_version=[1,0])

In [2]: f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32));

In [3]: f.end_frame();

In [4]: f.write_chunk(name='chunk1', data=numpy.array([9,10,11,12], dtype=numpy.float32));

In [5]: f.end_frame();

In [6]: f.write_chunk(name='chunk1', data=numpy.array([13,14], dtype=numpy.float32));

In [7]: f.end_frame();

In [8]: f.nframes
Out[8]: 3

read_chunk(frame, name)

Read a data chunk from the file and return it as a numpy array.

Parameters:
  • frame (int) – Index of the frame to read
  • name (str) – Name of the chunk
Returns:

Data read from the file. The data type is determined by the chunk metadata. If the data is NxM in the file and M > 1, a 2D array is returned. If the data is Nx1, a 1D array is returned.

Return type:

numpy.ndarray[type, ndim=?, mode='c']

Tip

Each call to read_chunk() invokes a disk read and allocates a new numpy array for storage. To avoid this overhead, don’t call read_chunk() on the same chunk repeatedly. Cache the arrays instead.
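
A minimal sketch of such a cache follows. It assumes only the read_chunk() method documented here; the class name ChunkCache is hypothetical and not part of gsd.fl.

```python
class ChunkCache:
    """Cache arrays returned by read_chunk(), keyed by (frame, name).

    `f` is assumed to be an open GSDFile, or any object with a compatible
    read_chunk(frame, name) method.
    """

    def __init__(self, f):
        self._f = f
        self._cache = {}

    def read_chunk(self, frame, name):
        key = (frame, name)
        if key not in self._cache:
            # Only the first request for a given chunk reaches the file.
            self._cache[key] = self._f.read_chunk(frame=frame, name=name)
        return self._cache[key]
```

Repeated requests return the same array object, so callers that modify the data in place should work on a copy. The cache grows without bound, which is acceptable for a handful of chunks but not for scanning a large file.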

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb', application="My application", schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32));
   ...:     f.write_chunk(name='chunk2', data=numpy.array([[5,6],[7,8]], dtype=numpy.float32));
   ...:     f.end_frame();
   ...:     f.write_chunk(name='chunk1', data=numpy.array([9,10,11,12], dtype=numpy.float32));
   ...:     f.write_chunk(name='chunk2', data=numpy.array([[13,14],[15,16]], dtype=numpy.float32));
   ...:     f.end_frame();
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb', application="My application", schema="My Schema", schema_version=[1,0])

In [3]: f.read_chunk(frame=0, name='chunk1')
Out[3]: array([ 1.,  2.,  3.,  4.], dtype=float32)

In [4]: f.read_chunk(frame=1, name='chunk1')
Out[4]: array([  9.,  10.,  11.,  12.], dtype=float32)

In [5]: f.read_chunk(frame=2, name='chunk1')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-5-ba3db1173c96> in <module>()
----> 1 f.read_chunk(frame=2, name='chunk1')

fl.pyx in gsd.fl.GSDFile.read_chunk()

KeyError: 'frame 2 / chunk chunk1 not found in: file.gsd'

truncate()

Truncate all data from the file. After truncation, the file has no frames and no data chunks. The application, schema, and schema version remain the same.

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb', application="My application", schema="My Schema", schema_version=[1,0]) as f:
   ...:     for i in range(10):
   ...:         f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32))
   ...:         f.end_frame();
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='ab', application="My application", schema="My Schema", schema_version=[1,0])

In [3]: f.nframes
Out[3]: 10

In [4]: f.schema, f.schema_version, f.application
Out[4]: ('My Schema', (1, 0), 'My application')

In [5]: f.truncate()

In [6]: f.nframes
Out[6]: 0

In [7]: f.schema, f.schema_version, f.application
Out[7]: ('My Schema', (1, 0), 'My application')

write_chunk(name, data)

Write a data chunk to the file. After writing all chunks in the current frame, call end_frame().

Parameters:
  • name (str) – Name of the chunk
  • data – Data to write into the chunk. Must be a numpy array, or array-like, with 2 or fewer dimensions.

Warning

write_chunk() implicitly converts array-like and non-contiguous numpy arrays to contiguous numpy arrays with numpy.ascontiguousarray(data). This may or may not produce the desired data types in the output file, and it incurs overhead.
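
The effect of this conversion can be inspected with numpy alone, before anything is written. The sketch below does not call gsd.fl; the helper name prepare_chunk is hypothetical, and it simply applies numpy.ascontiguousarray() with an explicit dtype so that the stored type is chosen by the caller rather than inferred.

```python
import numpy


def prepare_chunk(data, dtype):
    """Return `data` as a C-contiguous numpy array with the requested dtype."""
    return numpy.ascontiguousarray(data, dtype=dtype)


# A plain list of Python floats becomes float64 when no dtype is given.
implicit = numpy.ascontiguousarray([1.0, 2.0, 3.0])
explicit = prepare_chunk([1.0, 2.0, 3.0], numpy.float32)

# A column slice of a 2D array is a non-contiguous view; converting it
# allocates a new, contiguous copy.
grid = numpy.arange(12, dtype=numpy.float32).reshape(3, 4)
view = grid[:, :2]
copied = prepare_chunk(view, numpy.float32)

print(implicit.dtype, explicit.dtype)
print(view.flags['C_CONTIGUOUS'], copied.flags['C_CONTIGUOUS'])
```

Passing an array that is already contiguous and already has the intended dtype avoids both the surprise and the extra copy.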

Example

In [1]: f = gsd.fl.open(name='file.gsd', mode='wb', application="My application", schema="My Schema", schema_version=[1,0])

In [2]: f.write_chunk(name='float1d', data=numpy.array([1,2,3,4], dtype=numpy.float32));

In [3]: f.write_chunk(name='float2d', data=numpy.array([[13,14],[15,16],[17,19]], dtype=numpy.float32));

In [4]: f.write_chunk(name='double2d', data=numpy.array([[1,4],[5,6],[7,9]], dtype=numpy.float64));

In [5]: f.write_chunk(name='int1d', data=numpy.array([70,80,90], dtype=numpy.int64));

In [6]: f.end_frame();

In [7]: f.nframes
Out[7]: 1

In [8]: f.close()

gsd.fl.create(name, application, schema, schema_version)

Create an empty GSD file on the filesystem.

Deprecated since version 1.2: As of version 1.2, you can create and open GSD files in the same call to open(). create() is kept for backwards compatibility.

Parameters:
  • name (str) – Name of the file to create.
  • application (str) – Name of the application creating the file.
  • schema (str) – Name of the data schema.
  • schema_version (list[int]) – Schema version number [major, minor].

Example

Create a gsd file:

In [1]: gsd.fl.create(name="file.gsd",
   ...:               application="My application",
   ...:               schema="My Schema",
   ...:               schema_version=[1,0]);
   ...: 

Danger

The file is overwritten if it already exists.

gsd.fl.open(name, mode, application, schema, schema_version)

open() opens a GSD file and returns a GSDFile instance. The return value of open() can be used as a context manager.

Parameters:
  • name (str) – File name to open.
  • mode (str) – File access mode.
  • application (str) – Name of the application creating the file.
  • schema (str) – Name of the data schema.
  • schema_version (list[int]) – Schema version number [major, minor].

Valid values for mode:

mode     description
'rb'     Open an existing file for reading.
'rb+'    Open an existing file for reading and writing. Inefficient for large files.
'wb'     Open a file for writing. Creates the file if needed, or overwrites an existing file.
'wb+'    Open a file for reading and writing. Creates the file if needed, or overwrites an existing file. Inefficient for large files.
'xb'     Create a gsd file exclusively and open it for writing. Raises a FileExistsError exception if the file already exists.
'xb+'    Create a gsd file exclusively and open it for reading and writing. Raises a FileExistsError exception if the file already exists. Inefficient for large files.
'ab'     Open an existing file for writing. Does not create or overwrite existing files.

The ‘+’ read/write modes are inefficient at handling large files, as they read the entire file index into memory. Prefer the appropriate read-only or write-only mode.

When opening an existing file ('r' or 'a' modes), application and schema_version are ignored. open() raises an exception if the file’s schema does not match schema.

When creating or overwriting a file ('w' or 'x' modes), the given application, schema, and schema_version are saved in the file.
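
Since 'ab' does not create files and 'wb' overwrites them, code that wants to append to a file when it exists and create it otherwise has to choose between two modes. The sketch below shows one way to make that choice using only the standard library; the helper name append_or_create_mode is hypothetical and not part of gsd.fl.

```python
import os


def append_or_create_mode(path):
    """Return 'ab' if `path` already exists, otherwise 'xb'.

    'ab' only opens existing files, so a missing file needs a creating
    mode. 'xb' is chosen over 'wb' because it raises FileExistsError
    instead of overwriting if the file appears in the meantime.
    """
    return 'ab' if os.path.exists(path) else 'xb'
```

The returned string could then be passed as the mode argument of open(), together with the application, schema, and schema_version arguments.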

New in version 1.2.

Example

In [1]: with gsd.fl.open(name='file.gsd', mode='wb', application="My application", schema="My Schema", schema_version=[1,0]) as f:
   ...:     f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32));
   ...:     f.write_chunk(name='chunk2', data=numpy.array([[5,6],[7,8]], dtype=numpy.float32));
   ...:     f.end_frame();
   ...:     f.write_chunk(name='chunk1', data=numpy.array([9,10,11,12], dtype=numpy.float32));
   ...:     f.write_chunk(name='chunk2', data=numpy.array([[13,14],[15,16]], dtype=numpy.float32));
   ...:     f.end_frame();
   ...: 

In [2]: f = gsd.fl.open(name='file.gsd', mode='rb', application="My application", schema="My Schema", schema_version=[1,0])

In [3]: if f.chunk_exists(frame=0, name='chunk1'):
   ...:     data = f.read_chunk(frame=0, name='chunk1')
   ...: 

In [4]: data
Out[4]: array([ 1.,  2.,  3.,  4.], dtype=float32)