Format overview

MDAnalysis can read topology or coordinate information from a wide variety of file formats. The emphasis is on formats used in popular simulation packages. By default, MDAnalysis detects the format from the file extension, unless the format is specified explicitly with the format keyword (for the coordinate file) or the topology_format keyword (for the topology file).
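As a rough illustration of the two paths described above, the sketch below mimics extension-based detection in plain Python and shows where the explicit keywords would be passed. The helper names extension_format and load_with_explicit_formats are hypothetical, the file names are placeholders, and the extension logic is a simplification rather than MDAnalysis's own detection code.

```python
import os


def extension_format(filename):
    """Hypothetical helper: derive a format label from the file extension.

    This only illustrates the idea of extension-based detection; it is not
    how MDAnalysis implements it internally.
    """
    ext = os.path.splitext(filename)[1]
    return ext.lstrip(".").upper()


def load_with_explicit_formats(topology, trajectory, topology_format, format):
    """Override extension-based detection with explicit keywords."""
    import MDAnalysis as mda  # lazy import; requires MDAnalysis to be installed

    return mda.Universe(
        topology,
        trajectory,
        topology_format=topology_format,  # how to parse the topology file
        format=format,                    # how to read the coordinate file
    )


print(extension_format("protein.pdb"))  # PDB
print(extension_format("traj.xtc"))     # XTC
```

Passing the keywords explicitly is mainly useful when a file has a missing or misleading extension, for example load_with_explicit_formats("system.dat", "run.out", "PDB", "XTC").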

Below is a table of formats in MDAnalysis, and which information can be read from them. A topology file supplies the list of atoms in the system, their connectivity and possibly additional information such as B-factors, partial charges, etc. The details depend on the file format and not every topology file provides all (or even any) additional data.

Important

File formats are complicated and not always well defined. MDAnalysis tries to follow published standards but this can sometimes surprise users. It is highly recommended that you read the page for your data file format instead of assuming certain behaviour. If you encounter problems with a file format, please get in touch with us.

As a minimum, all topology parsers will provide atom ids, atom types, masses, resids, resnums, and segids. They will also assign all Atoms to Residues and all Residues to Segments. For systems without residues and segments, this results in there being a single Residue and Segment to which all Atoms belong. See Topology attributes for more topology attributes.

When a file does not provide certain data, MDAnalysis will often guess it from other data in the file. In this scenario, MDAnalysis will issue a warning. See Guessing for more information.

If a trajectory is loaded without time information, MDAnalysis will set a default timestep of 1.0 ps, with the first frame at 0.0 ps. To change these defaults, pass the following optional arguments to Universe:

  • dt: the timestep

  • time_offset: the starting time from which to calculate the time of each frame
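The effect of these two arguments can be sketched in a few lines. The helper below assumes the frame time is computed as time_offset + frame_index * dt, which matches the defaults described above; the function names frame_times and load_with_times are hypothetical.

```python
def frame_times(n_frames, dt=1.0, time_offset=0.0):
    """Return the time in ps assigned to each frame index.

    Assumed formula: time = time_offset + frame_index * dt.
    """
    return [time_offset + frame * dt for frame in range(n_frames)]


def load_with_times(topology, trajectory, dt, time_offset):
    """Pass dt and time_offset through to Universe (needs MDAnalysis)."""
    import MDAnalysis as mda  # lazy import so frame_times works without it

    return mda.Universe(topology, trajectory, dt=dt, time_offset=time_offset)


print(frame_times(3))                             # [0.0, 1.0, 2.0]
print(frame_times(3, dt=2.0, time_offset=10.0))   # [10.0, 12.0, 14.0]
```

With the defaults, three frames fall at 0.0, 1.0 and 2.0 ps; with dt=2.0 and time_offset=10.0 they fall at 10.0, 12.0 and 14.0 ps.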

Topology

Coordinates