HOOMD

gsd.hoomd provides high-level access to HOOMD schema GSD files.

View the page source to find unformatted example code.

Define a snapshot

In [1]: s = gsd.hoomd.Snapshot()

In [2]: s.particles.N = 4

In [3]: s.particles.types = ['A', 'B']

In [4]: s.particles.typeid = [0,0,1,1]

In [5]: s.particles.position = [[0,0,0],[1,1,1], [-1,-1,-1], [1,-1,-1]]

In [6]: s.configuration.box = [3, 3, 3, 0, 0, 0]

gsd.hoomd represents the state of a single frame with an instance of the class gsd.hoomd.Snapshot. Instantiate this class to create a system configuration. All fields default to None and are only written into the file if not None and do not match the data in the first frame, or defaults specified in the schema.

Create a hoomd gsd file

In [7]: gsd.hoomd.open(name='test.gsd', mode='wb')
Out[7]: <gsd.hoomd.HOOMDTrajectory at 0x7f13481897b8>

Append frames to a gsd file

In [8]: def create_frame(i):
   ...:     s = gsd.hoomd.Snapshot()
   ...:     s.configuration.step = i
   ...:     s.particles.N = 4+i
   ...:     s.particles.position = numpy.random.random(size=(4+i,3))
   ...:     return s
   ...: 

In [9]: t = gsd.hoomd.open(name='test.gsd', mode='wb')

In [10]: t.extend( (create_frame(i) for i in range(10)) )

In [11]: t.append( create_frame(10) )

In [12]: len(t)
Out[12]: 11

Use gsd.hoomd.open() to open a GSD file with the high level interface gsd.hoomd.HOOMDTrajectory. It behaves like a python list, with gsd.hoomd.HOOMDTrajectory.append() and gsd.hoomd.HOOMDTrajectory.extend() methods.

Note

gsd.hoomd.HOOMDTrajectory currently doesn’t support files opened in append mode.

Tip

When using gsd.hoomd.HOOMDTrajectory.extend(), pass in a generator or generator expression to avoid storing the entire trajectory in memory before writing it out.

Randomly index frames

In [13]: t = gsd.hoomd.open(name='test.gsd', mode='rb')

In [14]: snap = t[5]

In [15]: snap.configuration.step
Out[15]: 5

In [16]: snap.particles.N
Out[16]: 9

In [17]: snap.particles.position
Out[17]: 
array([[0.05551339, 0.7118579 , 0.55684733],
       [0.6320372 , 0.52013403, 0.35986692],
       [0.8490573 , 0.45080644, 0.30773836],
       [0.5693308 , 0.5857744 , 0.4568267 ],
       [0.72886753, 0.6121499 , 0.9813543 ],
       [0.5911686 , 0.26352417, 0.48863783],
       [0.417278  , 0.8825609 , 0.12694064],
       [0.9012692 , 0.30470046, 0.62210697],
       [0.61231   , 0.13405989, 0.8273897 ]], dtype=float32)

gsd.hoomd.HOOMDTrajectory supports random indexing of frames in the file. Indexing into a trajectory returns a gsd.hoomd.Snapshot.

Slicing and selection

Use the slicing operator to select individual frames or a subset of a trajectory.

In [18]: t = gsd.hoomd.open(name='test.gsd', mode='rb')

In [19]: for s in t[5:-2]:
   ....:     print(s.configuration.step, end=' ')
   ....: 
5 6 7 8 
In [20]: every_2nd_frame = t[::2]  # create a view of a trajectory subset

In [21]: for s in every_2nd_frame[:4]:
   ....:     print(s.configuration.step, end=' ')
   ....: 
0 2 4 6 

Slicing a trajectory creates a trajectory view, which can then be queried for length or sliced again. Selecting individual frames from a view works exactly like selecting individual frames from the original trajectory object.

Pure python reader

In [22]: f = gsd.pygsd.GSDFile(open('test.gsd', 'rb'))

In [23]: t = gsd.hoomd.HOOMDTrajectory(f);

In [24]: t[3].particles.position
Out[24]: 
array([[0.5692003 , 0.35402223, 0.04927382],
       [0.03480713, 0.97779214, 0.32133165],
       [0.8875883 , 0.71083546, 0.3328835 ],
       [0.3320165 , 0.18332727, 0.1802465 ],
       [0.28748494, 0.0110937 , 0.09906646],
       [0.4513102 , 0.4319952 , 0.8737437 ],
       [0.81227154, 0.05937914, 0.79250705]], dtype=float32)

You can use GSD without needing to compile C code to read GSD files using gsd.pygsd.GSDFile in combination with gsd.hoomd.HOOMDTrajectory. It only supports the rb mode and does not read files as fast as the C implementation. It takes in a python file-like object, so it can be used with in-memory IO classes, and grid file classes that access data over the internet.

Access state data

In [25]: with gsd.hoomd.open(name='test2.gsd', mode='wb') as t:
   ....:     s = gsd.hoomd.Snapshot()
   ....:     s.particles.types = ['A', 'B']
   ....:     s.state['hpmc/convex_polygon/N'] = [3, 4]
   ....:     s.state['hpmc/convex_polygon/vertices'] = [[-1, -1],
   ....:                                                [1, -1],
   ....:                                                [1, 1],
   ....:                                                [-2, -2],
   ....:                                                [2, -2],
   ....:                                                [2, 2],
   ....:                                                [-2, 2]]
   ....:     t.append(s)
   ....: 

State data is stored in the state dictionary as numpy arrays. Place data into this dictionary directly without the ‘state/’ prefix and gsd will include it in the output. Shape vertices are stored in a packed format. In this example, type ‘A’ has 3 vertices (the first 3 in the list) and type ‘B’ has 4 (the next 4).

In [26]: with gsd.hoomd.open(name='test2.gsd', mode='rb') as t:
   ....:     s = t[0]
   ....:     print(s.state['hpmc/convex_polygon/N'])
   ....:     print(s.state['hpmc/convex_polygon/vertices'])
   ....: 
[3 4]
[[-1. -1.]
 [ 1. -1.]
 [ 1.  1.]
 [-2. -2.]
 [ 2. -2.]
 [ 2.  2.]
 [-2.  2.]]

Access read state data in the same way.

Access logged data

In [27]: with gsd.hoomd.open(name='example.gsd', mode='wb') as t:
   ....:     s = gsd.hoomd.Snapshot()
   ....:     s.particles.N = 4
   ....:     s.log['particles/net_force'] = numpy.array([[-1,2,-3],
   ....:                                     [0,2,-4],
   ....:                                     [-3,2,1],
   ....:                                     [1,2,3]], dtype=numpy.float32)
   ....:     s.log['value/potential_energy'] = [1.5]
   ....:     t.append(s)
   ....: 

Logged data is stored in the log dictionary as numpy arrays. Place data into this dictionary directly without the ‘log/’ prefix and gsd will include it in the output. Store per-particle quantities with the prefix particles/. Choose another prefix for other quantities.

In [28]: t = gsd.hoomd.open(name='example.gsd', mode='rb')

In [29]: s = t[0]

In [30]: s.log['particles/net_force']
Out[30]: 
array([[-1.,  2., -3.],
       [ 0.,  2., -4.],
       [-3.,  2.,  1.],
       [ 1.,  2.,  3.]], dtype=float32)

In [31]: s.log['value/potential_energy']
Out[31]: array([1.5])

Read logged data from the log dictionary.

Logged data must be a convertible to a numpy array of a supported type:

In [32]: with gsd.hoomd.open(name='example.gsd', mode='wb') as t:
   ....:     s = gsd.hoomd.Snapshot()
   ....:     s.particles.N = 4
   ....:     s.log['invalid'] = dict(a=1, b=5)
   ....:     t.append(s)
   ....: 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-32-e8566430e2ad> in <module>
      3     s.particles.N = 4
      4     s.log['invalid'] = dict(a=1, b=5)
----> 5     t.append(s)

~/checkouts/readthedocs.org/user_builds/gsd/conda/v2.0.0/lib/python3.7/site-packages/gsd-2.0.0-py3.7-linux-x86_64.egg/gsd/hoomd.py in append(self, snapshot)
    720         # write log data
    721         for log, data in snapshot.log.items():
--> 722             self.file.write_chunk('log/' + log, data)
    723 
    724         self.file.end_frame()

gsd/fl.pyx in gsd.fl.GSDFile.write_chunk()

ValueError: invalid type for chunk: log/invalid

Use multiprocessing with HOOMDTrajectory

In [33]: import multiprocessing as mp

In [34]: def cnt_part(args):
   ....:    t, frame = args
   ....:    return len(t[frame].particles.position)
   ....: 

In [35]: with gsd.hoomd.open(name='test.gsd', mode='rb') as t:
   ....:    with mp.Pool(processes=mp.cpu_count()) as pool:
   ....:       result = pool.map(cnt_part, [(t, frame) for frame in range(len(t))])
   ....: 

gsd.hoomd.HOOMDTrajectory and both gsd.fl.GSDFile and gsd.pygsd.GSDFile can be pickled when in read mode to allow for multiprocessing through pythons native multiprocessing library. Here cnt_part finds the number of particles in each frame and appends it to a list. This code would result in a list of all particle numbers throughout the trajectory file.