File layer¶
The file layer Python module gsd.fl allows direct low-level access to read and write GSD files of any schema. The HOOMD reader (gsd.hoomd) provides higher-level access to HOOMD schema files; see HOOMD.
View the page source to find unformatted example code.
The examples on this page assume that the modules have already been imported with import gsd.fl and import numpy.
Open a gsd file¶
In [1]: f = gsd.fl.open(name="file.gsd",
...: mode='wb',
...: application="My application",
...: schema="My Schema",
...: schema_version=[1,0])
...:
In [2]: f.close()
Warning
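Opening a gsd file with a ‘w’ mode overwrites any existing file with the given name. An ‘x’ mode creates the file exclusively and raises an exception when a file with that name already exists.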
Write data¶
In [3]: f = gsd.fl.open(name="file.gsd",
...: mode='wb',
...: application="My application",
...: schema="My Schema",
...: schema_version=[1,0]);
...:
In [4]: f.write_chunk(name='chunk1', data=numpy.array([1,2,3,4], dtype=numpy.float32))
In [5]: f.write_chunk(name='chunk2', data=numpy.array([[5,6],[7,8]], dtype=numpy.float32))
In [6]: f.end_frame()
In [7]: f.write_chunk(name='chunk1', data=numpy.array([9,10,11,12], dtype=numpy.float32))
In [8]: f.write_chunk(name='chunk2', data=numpy.array([[13,14],[15,16]], dtype=numpy.float32))
In [9]: f.end_frame()
In [10]: f.close()
Call gsd.fl.open() to access gsd files on disk. Add any number of named data chunks to each frame in the file with gsd.fl.GSDFile.write_chunk(). The data must be a 1 or 2 dimensional numpy array of a simple numeric type (or a data type that converts automatically when passed to numpy.array(data)). Call gsd.fl.GSDFile.end_frame() to end the frame and start the next one.
Note
While supported, implicit conversion to numpy arrays creates a copy of the data in memory and adds conversion overhead.
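As a rough illustration of the data requirements described above, the following sketch checks an input before it is handed to write_chunk(). The helper name as_chunk_array is hypothetical and not part of gsd. It assumes that "simple numeric type" means an integer or floating point dtype; the exact set of types gsd accepts may be narrower. The sketch also shows that an existing numpy array passes through without a copy, while a Python list is converted to a new array.

```python
import numpy


def as_chunk_array(data):
    """Return data as a numpy array suitable for a chunk (hypothetical helper)."""
    # numpy.asarray returns the input object itself when it is already an
    # ndarray, so no copy is made; lists and tuples are converted to a new array.
    array = numpy.asarray(data)
    if array.ndim not in (1, 2):
        raise ValueError(f"expected a 1 or 2 dimensional array, got {array.ndim}")
    # Assumption: integer ('i', 'u') and floating point ('f') kinds only.
    if array.dtype.kind not in "iuf":
        raise ValueError(f"expected an integer or float dtype, got {array.dtype}")
    return array


positions = numpy.zeros((4, 3), dtype=numpy.float32)
same = as_chunk_array(positions)       # already an ndarray: same object, no copy
converted = as_chunk_array([1, 2, 3])  # list: converted to a new array
```

Converting once, up front, and reusing the resulting array avoids repeating the conversion cost on every write.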
Warning
Make sure to call end_frame() before closing the file, or the last frame will be lost.
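One way to make the end_frame() call harder to forget is to wrap the writes for a frame in a small function. The sketch below is only a suggestion: write_frame is a hypothetical name, not part of gsd, and f is assumed to be a file object returned by gsd.fl.open() in a writable mode.

```python
def write_frame(f, chunks):
    """Write every chunk in the mapping, then end the frame (hypothetical helper).

    f      -- an open, writable gsd file object (assumed)
    chunks -- mapping of chunk name to array data
    """
    for name, data in chunks.items():
        f.write_chunk(name=name, data=data)
    # Ending the frame here means the caller cannot forget it before close().
    f.end_frame()
```

With this helper, each frame in the example above becomes a single call that takes a dictionary mapping 'chunk1' and 'chunk2' to their arrays.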
Read data¶
In [11]: f = gsd.fl.open(name="file.gsd",
....: mode='rb',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [12]: f.read_chunk(frame=0, name='chunk1')
Out[12]: array([1., 2., 3., 4.], dtype=float32)
In [13]: f.read_chunk(frame=1, name='chunk2')
Out[13]:
array([[13., 14.],
[15., 16.]], dtype=float32)
In [14]: f.close()
gsd.fl.GSDFile.read_chunk() reads the named chunk at the given frame index in the file and returns it as a numpy array.
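Reading the same chunk from every frame is a common pattern. A minimal sketch follows, assuming that the chunk is present in every frame and that f is a file object opened in a readable mode; the function name read_chunk_per_frame is hypothetical. It uses only read_chunk() and the nframes property.

```python
def read_chunk_per_frame(f, name):
    """Return a list with the named chunk from each frame (hypothetical helper).

    Assumes the chunk exists in every frame; a missing chunk makes
    read_chunk() raise an exception, which is not caught here.
    """
    return [f.read_chunk(frame=i, name=name) for i in range(f.nframes)]
```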
Test if a chunk exists¶
In [15]: f = gsd.fl.open(name="file.gsd",
....: mode='rb',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [16]: f.chunk_exists(frame=0, name='chunk1')
Out[16]: True
In [17]: f.chunk_exists(frame=1, name='chunk2')
Out[17]: True
In [18]: f.chunk_exists(frame=2, name='chunk1')
Out[18]: False
In [19]: f.close()
gsd.fl.GSDFile.chunk_exists() tests whether a chunk with the given name exists in the file at the given frame.
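The existence test combines naturally with a read that falls back to a default value. The following is a sketch only: read_chunk_or is a hypothetical helper, and f is assumed to be a file object opened in a readable mode.

```python
def read_chunk_or(f, frame, name, default=None):
    """Read a chunk if it exists, otherwise return default (hypothetical helper)."""
    if f.chunk_exists(frame=frame, name=name):
        return f.read_chunk(frame=frame, name=name)
    return default
```

Checking first avoids handling the exception that read_chunk() raises for a missing chunk, at the cost of one extra lookup per read.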
Discover chunk names¶
In [20]: f = gsd.fl.open(name="file.gsd",
....: mode='rb',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [21]: f.find_matching_chunk_names('')
Out[21]: ['chunk1', 'chunk2']
In [22]: f.find_matching_chunk_names('chunk')
Out[22]: ['chunk1', 'chunk2']
In [23]: f.find_matching_chunk_names('chunk1')
Out[23]: ['chunk1']
In [24]: f.find_matching_chunk_names('other')
Out[24]: []
gsd.fl.GSDFile.find_matching_chunk_names() finds all chunk names present in a GSD file that start with the given string.
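The matching rule can be modelled in plain Python to make the results above easier to predict. This is an illustrative model of "starts with" matching, not gsd's implementation, and matching_names is a hypothetical name. Note that the empty string is a prefix of every name, which is why it returns all chunk names.

```python
def matching_names(names, match):
    """Model of prefix matching: keep the names that start with match."""
    return [name for name in names if name.startswith(match)]


names = ['chunk1', 'chunk2']
print(matching_names(names, ''))        # ['chunk1', 'chunk2']
print(matching_names(names, 'chunk1'))  # ['chunk1']
print(matching_names(names, 'other'))   # []
```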
Read-only access¶
In [25]: f = gsd.fl.open(name="file.gsd",
....: mode='rb',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [26]: if f.chunk_exists(frame=0, name='chunk1'):
....: data = f.read_chunk(frame=0, name='chunk1')
....:
In [27]: data
Out[27]: array([1., 2., 3., 4.], dtype=float32)
# Fails because the file is open read only
In [28]: f.write_chunk(name='error', data=numpy.array([1]))
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-28-c9aabea2641a> in <module>
----> 1 f.write_chunk(name='error', data=numpy.array([1]))
gsd/fl.pyx in gsd.fl.GSDFile.write_chunk()
gsd/fl.pyx in gsd.fl.__raise_on_error()
RuntimeError: File must be writable: file.gsd
In [29]: f.close()
Writes fail with a RuntimeError when a file is opened in a read-only mode.
Access file metadata¶
In [30]: f = gsd.fl.open(name="file.gsd",
....: mode='rb',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [31]: f.name
Out[31]: 'file.gsd'
In [32]: f.mode
Out[32]: 'rb'
In [33]: f.gsd_version
Out[33]: (2, 0)
In [34]: f.application
Out[34]: 'My application'
In [35]: f.schema
Out[35]: 'My Schema'
In [36]: f.schema_version
Out[36]: (1, 0)
In [37]: f.nframes
Out[37]: 2
In [38]: f.close()
File metadata are available as properties.
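Because the metadata are plain properties, they can be gathered into a dictionary for logging or comparison. A small sketch, using only the property names shown above; file_metadata is a hypothetical helper and f is assumed to be an open gsd file object.

```python
def file_metadata(f):
    """Collect the metadata properties of an open file (hypothetical helper)."""
    fields = ("name", "mode", "gsd_version", "application",
              "schema", "schema_version", "nframes")
    return {field: getattr(f, field) for field in fields}
```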
Open a file in read/write mode¶
In [39]: f = gsd.fl.open(name="file.gsd",
....: mode='wb+',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [40]: f.write_chunk(name='double', data=numpy.array([1,2,3,4], dtype=numpy.float64));
In [41]: f.end_frame()
In [42]: f.nframes
Out[42]: 1
In [43]: f.read_chunk(frame=0, name='double')
Out[43]: array([1., 2., 3., 4.])
Open a file in read/write mode to allow both reading and writing.
Write a file in append mode¶
In [44]: f = gsd.fl.open(name="file.gsd",
....: mode='ab',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [45]: f.write_chunk(name='int', data=numpy.array([10,20], dtype=numpy.int16));
In [46]: f.end_frame()
In [47]: f.nframes
Out[47]: 2
# Reads fail in append mode
In [48]: f.read_chunk(frame=2, name='double')
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-48-cab5b10fd02b> in <module>
----> 1 f.read_chunk(frame=2, name='double')
gsd/fl.pyx in gsd.fl.GSDFile.read_chunk()
KeyError: 'frame 2 / chunk double not found in: file.gsd'
In [49]: f.close()
Open a file in append mode to write additional chunks to an existing file, but prevent reading.
Use as a context manager¶
In [50]: with gsd.fl.open(name="file.gsd",
....: mode='rb',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0]) as f:
....: data = f.read_chunk(frame=0, name='double');
....:
In [51]: data
Out[51]: array([1., 2., 3., 4.])
gsd.fl.GSDFile works as a context manager for guaranteed file closure and cleanup when exceptions occur.
Store string chunks¶
In [52]: f = gsd.fl.open(name="file.gsd",
....: mode='wb+',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [53]: f.mode
Out[53]: 'wb+'
In [54]: s = "This is a string"
In [55]: b = numpy.array([s], dtype=numpy.dtype((bytes, len(s)+1)))
In [56]: b = b.view(dtype=numpy.int8)
In [57]: b
Out[57]:
array([ 84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 116, 114,
105, 110, 103, 0], dtype=int8)
In [58]: f.write_chunk(name='string', data=b)
In [59]: f.end_frame()
In [60]: r = f.read_chunk(frame=0, name='string')
In [61]: r
Out[61]:
array([ 84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 116, 114,
105, 110, 103, 0], dtype=int8)
In [62]: r = r.view(dtype=numpy.dtype((bytes, r.shape[0])));
In [63]: r[0].decode('UTF-8')
Out[63]: 'This is a string'
In [64]: f.close()
To store a string in a gsd file, convert it to a numpy array of bytes and store that data in the file. Decode the byte sequence to get back a string.
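The conversion in both directions can be wrapped in a pair of helpers. The sketch below is a variation on the session above rather than the same code: it encodes the string to UTF-8 bytes first, so that non-ASCII text is handled, and then appends the terminating null byte. The names string_to_chunk and chunk_to_string are hypothetical.

```python
import numpy


def string_to_chunk(s):
    """Encode a string as a null-terminated int8 array (hypothetical helper)."""
    raw = s.encode("UTF-8") + b"\x00"
    # frombuffer yields a read-only view of the bytes; copy to own the data.
    return numpy.frombuffer(raw, dtype=numpy.int8).copy()


def chunk_to_string(chunk):
    """Decode an int8 array written by string_to_chunk (hypothetical helper)."""
    raw = numpy.asarray(chunk, dtype=numpy.int8).tobytes()
    # Trailing null bytes are stripped before decoding.
    return raw.rstrip(b"\x00").decode("UTF-8")


chunk = string_to_chunk("This is a string")
print(chunk_to_string(chunk))  # This is a string
```

The array returned by string_to_chunk() can be passed as the data argument of write_chunk(), and the array returned by read_chunk() can be passed to chunk_to_string().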
Truncate¶
In [65]: f = gsd.fl.open(name="file.gsd",
....: mode='ab',
....: application="My application",
....: schema="My Schema",
....: schema_version=[1,0])
....:
In [66]: f.nframes
Out[66]: 1
In [67]: f.schema, f.schema_version, f.application
Out[67]: ('My Schema', (1, 0), 'My application')
In [68]: f.truncate()
In [69]: f.nframes
Out[69]: 0
In [70]: f.schema, f.schema_version, f.application
Out[70]: ('My Schema', (1, 0), 'My application')
Truncating a gsd file removes all data chunks from it, but retains the same schema, schema version, and application name. The file is not closed during this process. This is useful when writing restart files on a Lustre file system, where file open operations need to be kept to a minimum.
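The restart-file use case might look like the following sketch, which keeps one file object open and replaces its contents on each call. The name write_restart is hypothetical, and f is assumed to be a gsd file object opened once in a writable mode and reused for every restart write.

```python
def write_restart(f, chunks):
    """Replace the file's contents with a single frame (hypothetical helper).

    f      -- an open, writable gsd file object (assumed), kept open between calls
    chunks -- mapping of chunk name to array data
    """
    # Remove all existing frames without closing and reopening the file.
    f.truncate()
    for name, data in chunks.items():
        f.write_chunk(name=name, data=data)
    f.end_frame()
```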