File layer

The file layer Python module gsd.fl allows direct, low-level access to read and write GSD files of any schema. The HOOMD reader (gsd.hoomd) provides higher-level access to HOOMD schema files; see HOOMD.

The examples below are IPython sessions and assume that gsd.fl and numpy have already been imported.

Open a gsd file

In [1]: f = gsd.fl.open(name="file.gsd",
   ...:                 mode='wb',
   ...:                 application="My application",
   ...:                 schema="My Schema",
   ...:                 schema_version=[1,0])
   ...: 

In [2]: f.close()

Warning

Opening a gsd file with a ‘w’ mode overwrites any existing file with the given name. An ‘x’ mode creates the file exclusively and fails if a file with that name already exists.

Write data

In [3]: f = gsd.fl.open(name="file.gsd",
   ...:                 mode='wb',
   ...:                 application="My application",
   ...:                 schema="My Schema",
   ...:                 schema_version=[1,0]);
   ...: 

In [4]: f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32))

In [5]: f.write_chunk(name='chunk2', data=numpy.array([[5,6],[7,8]], dtype=numpy.float32))

In [6]: f.end_frame()

In [7]: f.write_chunk(name='chunk1', data=numpy.array([9,10,11,12], dtype=numpy.float32))

In [8]: f.write_chunk(name='chunk2', data=numpy.array([[13,14],[15,16]], dtype=numpy.float32))

In [9]: f.end_frame()

In [10]: f.close()

Call gsd.fl.open() to access gsd files on disk. Add any number of named data chunks to each frame in the file with gsd.fl.GSDFile.write_chunk(). The data must be a 1 or 2 dimensional numpy array of a simple numeric type (or a data type that converts automatically when passed to numpy.array(data)). Call gsd.fl.GSDFile.end_frame() to end the frame and start the next one.
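The shape and type constraints above can be checked before calling write_chunk(). The following is a minimal sketch that uses only numpy. The helper name as_chunk_array and the set of accepted dtypes are assumptions made for illustration; they are not part of the gsd API.

```python
import numpy

# Assumed set of "simple numeric types". This list is an assumption;
# consult the gsd documentation for the authoritative one.
_ASSUMED_DTYPES = {numpy.dtype(t) for t in (
    numpy.int8, numpy.int16, numpy.int32, numpy.int64,
    numpy.uint8, numpy.uint16, numpy.uint32, numpy.uint64,
    numpy.float32, numpy.float64)}


def as_chunk_array(data):
    """Hypothetical helper: convert data to an array and check it."""
    arr = numpy.asarray(data)
    if arr.ndim not in (1, 2):
        raise ValueError(
            f"expected a 1 or 2 dimensional array, got {arr.ndim} dimensions")
    if arr.dtype not in _ASSUMED_DTYPES:
        raise TypeError(f"unsupported dtype: {arr.dtype}")
    return arr
```

A check like this reports a bad shape or type at the call site, before any data reaches the file.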

Note

While supported, implicit conversion to numpy arrays creates a copy of the data in memory and adds conversion overhead.
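The copy mentioned in the note can be observed with numpy alone. This sketch does not call gsd and makes no claim about how gsd converts its input. It only shows that converting a Python list allocates a new array, while an existing array of the right dtype passes through numpy.asarray() unchanged.

```python
import numpy

values = [1.0, 2.0, 3.0, 4.0]

# Converting a list allocates a new array and copies every element.
converted = numpy.array(values, dtype=numpy.float32)

# An array that already has the requested dtype is returned as-is.
same = numpy.asarray(converted, dtype=numpy.float32)

print(same is converted)                     # True
print(numpy.shares_memory(same, converted))  # True
```

Building the numpy array once and reusing it avoids paying the conversion cost on every write.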

Warning

Make sure to call end_frame() before closing the file, or the last frame will be lost.
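The effect described in this warning can be pictured with a toy in-memory model. ToyFrameWriter is a hypothetical class written for this illustration. It is not how gsd is implemented; it only mimics the documented behavior that chunks belong to a frame once end_frame() is called.

```python
class ToyFrameWriter:
    """Toy model of frame buffering; NOT the gsd implementation."""

    def __init__(self):
        self.frames = []    # frames committed by end_frame()
        self._pending = {}  # chunks written since the last end_frame()

    def write_chunk(self, name, data):
        self._pending[name] = data

    def end_frame(self):
        # Commit the pending chunks as one frame and start a new one.
        self.frames.append(self._pending)
        self._pending = {}

    def close(self):
        # Chunks that were never committed to a frame are discarded.
        self._pending = {}


w = ToyFrameWriter()
w.write_chunk('chunk1', [1, 2, 3, 4])
w.end_frame()
w.write_chunk('chunk1', [9, 10, 11, 12])  # no end_frame() follows
w.close()
print(len(w.frames))  # 1
```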

Read data

In [11]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='rb',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [12]: f.read_chunk(frame=0, name='chunk1')
Out[12]: array([1., 2., 3., 4.], dtype=float32)

In [13]: f.read_chunk(frame=1, name='chunk2')
Out[13]: 
array([[13., 14.],
       [15., 16.]], dtype=float32)

In [14]: f.close()

gsd.fl.GSDFile.read_chunk() reads the named chunk at the given frame index in the file and returns it as a numpy array.

Test if a chunk exists

In [15]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='rb',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [16]: f.chunk_exists(frame=0, name='chunk1')
Out[16]: True

In [17]: f.chunk_exists(frame=1, name='chunk2')
Out[17]: True

In [18]: f.chunk_exists(frame=2, name='chunk1')
Out[18]: False

In [19]: f.close()

gsd.fl.GSDFile.chunk_exists() tests whether a chunk with the given name exists in the file at the given frame.

Discover chunk names

In [20]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='rb',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [21]: f.find_matching_chunk_names('')
Out[21]: ['chunk1', 'chunk2']

In [22]: f.find_matching_chunk_names('chunk')
Out[22]: ['chunk1', 'chunk2']

In [23]: f.find_matching_chunk_names('chunk1')
Out[23]: ['chunk1']

In [24]: f.find_matching_chunk_names('other')
Out[24]: []

gsd.fl.GSDFile.find_matching_chunk_names() finds all chunk names present in a GSD file that start with the given string.
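The matching rule is a plain prefix test, which is why an empty string matches every name. The sketch below models that rule on an ordinary Python list. The function matching_names is a hypothetical stand-in and does not read a GSD file.

```python
def matching_names(names, match):
    """Model of prefix matching on an in-memory list of chunk names."""
    return [name for name in names if name.startswith(match)]


names = ['chunk1', 'chunk2']
print(matching_names(names, ''))        # ['chunk1', 'chunk2']
print(matching_names(names, 'chunk'))   # ['chunk1', 'chunk2']
print(matching_names(names, 'chunk1'))  # ['chunk1']
print(matching_names(names, 'other'))   # []
```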

Read-only access

In [25]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='rb',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [26]: if f.chunk_exists(frame=0, name='chunk1'):
   ....:     data = f.read_chunk(frame=0, name='chunk1')
   ....: 

In [27]: data
Out[27]: array([1., 2., 3., 4.], dtype=float32)

# Fails because the file is open read only
In [28]: f.write_chunk(name='error', data=numpy.array([1]))
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-28-c9aabea2641a> in <module>
----> 1 f.write_chunk(name='error', data=numpy.array([1]))

gsd/fl.pyx in gsd.fl.GSDFile.write_chunk()

gsd/fl.pyx in gsd.fl.__raise_on_error()

RuntimeError: File must be writable: file.gsd

In [29]: f.close()

Writes fail when a file is opened in a read-only mode.

Access file metadata

In [30]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='rb',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [31]: f.name
Out[31]: 'file.gsd'

In [32]: f.mode
Out[32]: 'rb'

In [33]: f.gsd_version
Out[33]: (2, 0)

In [34]: f.application
Out[34]: 'My application'

In [35]: f.schema
Out[35]: 'My Schema'

In [36]: f.schema_version
Out[36]: (1, 0)

In [37]: f.nframes
Out[37]: 2

In [38]: f.close()

File metadata are available as properties.

Open a file in read/write mode

In [39]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='wb+',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [40]: f.write_chunk(name='double', data=numpy.array([1,2,3,4], dtype=numpy.float64));

In [41]: f.end_frame()

In [42]: f.nframes
Out[42]: 1

In [43]: f.read_chunk(frame=0, name='double')
Out[43]: array([1., 2., 3., 4.])

Open a file in read/write mode to allow both reading and writing.

Write a file in append mode

In [44]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='ab',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [45]: f.write_chunk(name='int', data=numpy.array([10,20], dtype=numpy.int16));

In [46]: f.end_frame()

In [47]: f.nframes
Out[47]: 2

# Reads fail in append mode
In [48]: f.read_chunk(frame=2, name='double')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-48-cab5b10fd02b> in <module>
----> 1 f.read_chunk(frame=2, name='double')

gsd/fl.pyx in gsd.fl.GSDFile.read_chunk()

KeyError: 'frame 2 / chunk double not found in: file.gsd'

In [49]: f.close()

Open a file in append mode to write additional chunks to an existing file. Reading is not possible in this mode.

Use as a context manager

In [50]: with gsd.fl.open(name="file.gsd",
   ....:                  mode='rb',
   ....:                  application="My application",
   ....:                  schema="My Schema",
   ....:                  schema_version=[1,0]) as f:
   ....:     data = f.read_chunk(frame=0, name='double')
   ....: 

In [51]: data
Out[51]: array([1., 2., 3., 4.])

gsd.fl.GSDFile works as a context manager for guaranteed file closure and cleanup when exceptions occur.

Store string chunks

In [52]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='wb+',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [53]: f.mode
Out[53]: 'wb+'

In [54]: s = "This is a string"

In [55]: b = numpy.array([s], dtype=numpy.dtype((bytes, len(s)+1)))

In [56]: b = b.view(dtype=numpy.int8)

In [57]: b
Out[57]: 
array([ 84, 104, 105, 115,  32, 105, 115,  32,  97,  32, 115, 116, 114,
       105, 110, 103,   0], dtype=int8)

In [58]: f.write_chunk(name='string', data=b)

In [59]: f.end_frame()

In [60]: r = f.read_chunk(frame=0, name='string')

In [61]: r
Out[61]: 
array([ 84, 104, 105, 115,  32, 105, 115,  32,  97,  32, 115, 116, 114,
       105, 110, 103,   0], dtype=int8)

In [62]: r = r.view(dtype=numpy.dtype((bytes, r.shape[0])));

In [63]: r[0].decode('UTF-8')
Out[63]: 'This is a string'

In [64]: f.close()

To store a string in a gsd file, convert it to a numpy array of bytes and store that data in the file. Decode the byte sequence to get back a string.
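The conversion in both directions can be wrapped in two small functions. This is a minimal sketch that uses only numpy. The names string_to_chunk and chunk_to_string are hypothetical helpers, not part of gsd, and the sketch encodes to UTF-8 explicitly so that non-ASCII text also round-trips.

```python
import numpy


def string_to_chunk(s):
    """Hypothetical helper: UTF-8 encode s and append a null terminator."""
    raw = s.encode('UTF-8') + b'\0'
    # frombuffer gives a read-only view of the bytes; copy to own the data.
    return numpy.frombuffer(raw, dtype=numpy.int8).copy()


def chunk_to_string(arr):
    """Hypothetical helper: inverse of string_to_chunk."""
    # Drop the trailing null terminator(s) before decoding.
    return arr.tobytes().rstrip(b'\0').decode('UTF-8')


b = string_to_chunk("This is a string")
print(b.shape, b.dtype)    # (17,) int8
print(chunk_to_string(b))  # This is a string
```

The array returned by string_to_chunk is 1 dimensional with dtype int8, so it meets the shape and type requirements stated for chunk data.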

Truncate

In [65]: f = gsd.fl.open(name="file.gsd",
   ....:                 mode='ab',
   ....:                 application="My application",
   ....:                 schema="My Schema",
   ....:                 schema_version=[1,0])
   ....: 

In [66]: f.nframes
Out[66]: 1

In [67]: f.schema, f.schema_version, f.application
Out[67]: ('My Schema', (1, 0), 'My application')

In [68]: f.truncate()

In [69]: f.nframes
Out[69]: 0

In [70]: f.schema, f.schema_version, f.application
Out[70]: ('My Schema', (1, 0), 'My application')

Truncating a gsd file removes all data chunks from it, but retains the same schema, schema version, and application name. The file is not closed during this process. This is useful when writing restart files on a Lustre file system, where file open operations need to be kept to a minimum.