Supported File Formats

CHARMM crd file format (charmm)

CHARMM coordinate files contain the location of each atom in Cartesian space. An ASCII (CARD) CHARMM coordinate file consists of the title line(s), the number of atoms in the file, and the coordinate lines (one for each atom in the file).

The coordinate lines contain specific information about each atom. Each line has the following fields, in order: atom number (sequential), residue number (specified relative to the first residue in the PSF), residue name, atom type, x-coordinate, y-coordinate, z-coordinate, segment identifier, residue identifier and a weighting array value.
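The ten fields listed above can be illustrated with a small parser. This is only a sketch: the CrdAtom container, the parse_crd_line helper and the sample line are made up for illustration, and the line is split on whitespace rather than on the fixed column widths a real reader would likely use.

```python
from typing import NamedTuple


class CrdAtom(NamedTuple):
    """One coordinate line of a CARD file (hypothetical container)."""

    atom_number: int
    residue_number: int
    residue_name: str
    atom_type: str
    x: float
    y: float
    z: float
    segment_id: str
    residue_id: str
    weight: float


def parse_crd_line(line: str) -> CrdAtom:
    """Split one coordinate line on whitespace into its ten fields."""
    words = line.split()
    if len(words) != 10:
        raise ValueError(f"Expected 10 fields, found {len(words)}")
    return CrdAtom(
        int(words[0]), int(words[1]), words[2], words[3],
        float(words[4]), float(words[5]), float(words[6]),
        words[7], words[8], float(words[9]),
    )


# A made-up coordinate line, for illustration only.
example = "    1    1 ALA  N      -0.48400   1.36800   0.00000 PEP  1      0.00000"
atom = parse_crd_line(example)
print(atom.residue_name, atom.atom_type, (atom.x, atom.y, atom.z))
```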

Filename patterns: *.crd

iodata.formats.charmm.load_one()

  • Always loads atcoords, atffparams, atmasses, extra

  • May load title

VASP 5 CHGCAR file format (chgcar)

This format is used by VASP 5.X and VESTA.

Note that even though the CHGCAR and LOCPOT files look very similar, they require different conversions to atomic units.

Filename patterns: CHGCAR*, AECCAR*
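The filename patterns in this document are shell-style wildcards. The sketch below shows how such patterns match file names, using Python's standard fnmatch module. The PATTERNS table is a hand-copied subset of the patterns listed in this document and guess_formats is a hypothetical helper; how IOData itself selects a format is not shown here.

```python
import os
from fnmatch import fnmatchcase

# Hand-copied subset of the filename patterns listed in this document.
PATTERNS = {
    "charmm": ["*.crd"],
    "chgcar": ["CHGCAR*", "AECCAR*"],
    "cube": ["*.cube", "*.cub"],
    "fcidump": ["*FCIDUMP*"],
    "poscar": ["POSCAR*"],
}


def guess_formats(filename):
    """Return the names of all formats whose patterns match the base name."""
    basename = os.path.basename(filename)
    return [
        name
        for name, patterns in PATTERNS.items()
        if any(fnmatchcase(basename, pattern) for pattern in patterns)
    ]


for name in ["run1/CHGCAR.gz", "AECCAR0", "water.cube", "h2o.FCIDUMP", "notes.txt"]:
    print(name, guess_formats(name))
```

Matching is done on the base name only, so directory names do not influence the result.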

iodata.formats.chgcar.load_one()

  • Always loads atcoords, atnums, cellvecs, cube, title

CP2K ATOM output file format (cp2klog)

Filename patterns: *.cp2k.out

iodata.formats.cp2klog.load_one()

  • Always loads atcoords, atcorenums, atnums, energy, mo, obasis

This function assumes that the following subsections are present in the CP2K ATOM input file, in the section ATOM%PRINT:

&PRINT
  &POTENTIAL
  &END POTENTIAL
  &BASIS_SET
  &END BASIS_SET
  &ORBITALS
  &END ORBITALS
&END PRINT

Gaussian Cube file format (cube)

Cube files are generated by various QC codes these days, including Gaussian, CP2K, GPAW, Q-Chem, …

Note that the second column in the geometry specification of the cube file is interpreted as the effective core charges.
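As an illustration of this interpretation, the following sketch parses a single atom line from the geometry block of a cube file. The helper parse_cube_atom_line and the sample line are hypothetical; the only point is that the first column is read as the atomic number and the second as the effective core charge.

```python
def parse_cube_atom_line(line):
    """Return (atnum, atcorenum, coords) from one geometry line of a cube file."""
    words = line.split()
    atnum = int(words[0])  # first column: atomic number
    atcorenum = float(words[1])  # second column: effective core charge
    coords = tuple(float(word) for word in words[2:5])
    return atnum, atcorenum, coords


# Made-up line: an oxygen atom whose effective core charge is 6 instead of 8.
atnum, atcorenum, coords = parse_cube_atom_line(
    "    8    6.000000    0.000000    0.000000    0.224100"
)
print(atnum, atcorenum, coords)
```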

Filename patterns: *.cube, *.cub

iodata.formats.cube.load_one()

  • Always loads atcoords, atcorenums, atnums, cellvecs, cube

iodata.formats.cube.dump_one()

  • Requires atcoords, atnums, cube

  • May dump title, atcorenums

Extended XYZ file format (extxyz)

The extended XYZ file format is defined in the ASE documentation.

Usually, the different frames in a trajectory describe different geometries of the same molecule, with atoms in the same order. The load_many function below can also handle an extended XYZ file with different molecules, e.g. a molecular database.

Filename patterns: *.extxyz

iodata.formats.extxyz.load_one()

  • Always loads title

  • May load atcoords, atgradient, atmasses, atnums, cellvecs, charge, energy, extra

iodata.formats.extxyz.load_many()

  • Always loads title

  • May load atcoords, atgradient, atmasses, atnums, cellvecs, charge, energy, extra

Gaussian FCHK file format (fchk)

Filename patterns: *.fchk, *.fch

iodata.formats.fchk.load_one()

  • Always loads atcharges, atcoords, atnums, atcorenums, lot, mo, obasis, obasis_name, run_type, title

  • May load energy, atfrozen, atgradient, athessian, atmasses, one_rdms, extra, moments

iodata.formats.fchk.dump_one()

  • Requires atnums, atcorenums

  • May dump atcharges, atcoords, atfrozen, atgradient, athessian, atmasses, charge, energy, lot, mo, one_rdms, obasis_name, extra, moments

iodata.formats.fchk.load_many()

  • Always loads atcoords, atgradient, atnums, atcorenums, energy, extra, title

Trajectories from a Gaussian optimization, relaxed scan or IRC calculation are written in groups of frames, called “points” in the Gaussian world, e.g. to discriminate between different values of the constraint in a relaxed scan. In most cases, e.g. IRC or conventional optimization, there is only one “point”. Within one “point”, one can have multiple geometries and their properties. This information is stored in the extra attribute:

  • ipoint is the counter for a point

  • npoint is the total number of points.

  • istep is the counter within one “point”

  • nstep is the total number of geometries within a “point”.

  • reaction_coordinate is only present in case of an IRC calculation.
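The counters above can be used to pick, for example, the last geometry of every “point”. The sketch below works on plain dictionaries standing in for the extra attribute of each loaded frame; the values are made up and last_frame_per_point is a hypothetical helper. It does not depend on whether the counters start at zero or one.

```python
# Hypothetical "extra" dictionaries of three frames: two points, the first
# with two geometries and the second with one.
frames_extra = [
    {"ipoint": 0, "npoint": 2, "istep": 0, "nstep": 2},
    {"ipoint": 0, "npoint": 2, "istep": 1, "nstep": 2},
    {"ipoint": 1, "npoint": 2, "istep": 0, "nstep": 1},
]


def last_frame_per_point(extras):
    """Map each ipoint to the index of its last frame in the trajectory."""
    result = {}
    for index, extra in enumerate(extras):
        # Later frames of the same point overwrite earlier ones.
        result[extra["ipoint"]] = index
    return result


print(last_frame_per_point(frames_extra))
```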

Molpro 2012 FCIDUMP file format (fcidump)

Notes

  1. This function works only for restricted wave-functions.

  2. One- and two-electron integrals are stored in chemists’ notation in an FCIDUMP file, while IOData internally uses physicists’ notation.

  3. Keep in mind that the FCIDUMP format changed in MOLPRO 2012, so files generated with older versions are not supported.
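The difference between the two notations in note 2 is an index permutation: for real orbitals, the chemists’ integral (ij|kl) equals the physicists’ integral <ik|jl>. The following sketch applies that permutation to a four-index array stored as nested lists. It is a generic illustration with a hypothetical helper name, not the code IOData uses.

```python
def chemist_to_physicist(two_chem):
    """Reorder a four-index array from chemists' to physicists' notation."""
    n = len(two_chem)
    two_phys = [
        [[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)
    ]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    # (ij|kl) in chemists' notation is <ik|jl> in physicists'.
                    two_phys[i][k][j][l] = two_chem[i][j][k][l]
    return two_phys


# Toy array whose values encode their own indices, to make the swap visible.
n = 2
two_chem = [
    [[[1000 * i + 100 * j + 10 * k + l for l in range(n)] for k in range(n)]
     for j in range(n)]
    for i in range(n)
]
two_phys = chemist_to_physicist(two_chem)
print(two_phys[0][1][0][1])
```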

Filename patterns: *FCIDUMP*

iodata.formats.fcidump.load_one()

  • Always loads core_energy, one_ints, nelec, spinpol, two_ints

iodata.formats.fcidump.dump_one()

  • Requires one_ints, two_ints

  • May dump core_energy, nelec, spinpol

The dictionary one_ints must contain a field core_mo. Similarly, two_ints must contain two_mo.

GAMESS punch file format (gamess)

Filename patterns: *.dat

iodata.formats.gamess.load_one()

  • Always loads title, energy, grot, atgradient, athessian, atmasses, atnums, atcoords

Gaussian input format (gaussianinput)

Filename patterns: *.com, *.gjf

iodata.formats.gaussianinput.load_one()

  • Always loads atcoords, atnums, title

Gaussian Log file format (gaussianlog)

To write out the integrals in a Gaussian log file, which can be loaded with this module, you need to use the following Gaussian command line:

scf(conventional) iop(3/33=5) extralinks=l316 iop(3/27=999)

Filename patterns: *.log

iodata.formats.gaussianlog.load_one()

  • Always loads

  • May load one_ints, two_ints

GROMACS gro file format (gromacs)

Files with the gro file extension contain a molecular structure in Gromos87 format. GROMACS gro files can be used as a trajectory by simply concatenating files.

http://manual.gromacs.org/current/reference-manual/file-formats.html#gro
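Because each frame in a gro file consists of a title line, a line with the number of atoms, one line per atom and a final line with the box vectors, a concatenated trajectory can be split without parsing the atom lines. The sketch below shows this with a hypothetical helper split_gro_frames and made-up frames; it is not the reader used by IOData.

```python
def split_gro_frames(text):
    """Split concatenated gro frames into lists of lines, one list per frame."""
    lines = text.splitlines()
    # Ignore blank lines at the end of the file.
    while lines and not lines[-1].strip():
        lines.pop()
    frames = []
    start = 0
    while start < len(lines):
        natom = int(lines[start + 1])
        end = start + natom + 3  # title + atom count + atom lines + box line
        frames.append(lines[start:end])
        start = end
    return frames


def make_frame(title):
    """Build a made-up two-atom frame with the given title."""
    return "\n".join([
        title,
        "    2",
        "    1SOL     OW    1   0.126   1.624   1.679",
        "    1SOL    HW1    2   0.190   1.661   1.747",
        "   1.86206   1.86206   1.86206",
    ])


trajectory = make_frame("Water, t= 0.0") + "\n" + make_frame("Water, t= 1.0") + "\n"
frames = split_gro_frames(trajectory)
print(len(frames), [frame[0] for frame in frames])
```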

Filename patterns: *.gro

iodata.formats.gromacs.load_one()

  • Always loads atcoords, atffparams, cellvecs, extra, title

iodata.formats.gromacs.load_many()

  • Always loads atcoords, atffparams, cellvecs, extra, title

QCSchema JSON file format (json)

QCSchema defines four different subschemas:

  • Molecule: specifying a molecular system

  • Input: specifying QC program input for a specific Molecule

  • Output: specifying QC program output for a specific Molecule

  • Basis: specifying a basis set for a specific Molecule

General Usage

The QCSchema format is intended to be a catch-all file format for storing and sharing QC calculation data. Because a single file can contain a wide variety of data, not every field in a QCSchema file corresponds directly to an IOData attribute. For example, qcschema_output files allow for many fields capturing different energy contributions, especially for coupled-cluster calculations. To accommodate this, IOData does not guess the intent of the user; instead, it ensures that every field in the file is stored in a structured manner. When a QCSchema field does not correspond to an IOData attribute, that data is stored in the extra dict, in a dictionary corresponding to the subschema where that data was found. In cases where multiple subschemas contain the relevant field (e.g. the Output subschema contains the entirety of the Input subschema), the data will be found in the smallest subschema (for the example above, in IOData.extra["input"], not IOData.extra["output"]).

Dumping an IOData instance to a QCSchema file involves adding relevant required (and optional, if needed) fields to the necessary dictionaries in the extra dict. One exception is the provenance field: if the only desired provenance data is the creation of the file by IOData, that data will be added automatically.

The following sections will describe the requirements of each subschema and the behaviour to expect from IOData when loading in or dumping out a QCSchema file.

Schema Definitions

Provenance Information

The provenance field contains information about how the associated QCSchema object and its attributes were generated, provided, and manipulated. A provenance entry expects these fields:

| Field | Description |
|-------|-------------|
| creator | Required. The program that generated, provided, or manipulated this file. |
| version | The version of the creator. |
| routine | The routine of the creator. |

In QCElemental, only a single provenance entry is permitted. When generating a QCSchema file for use with QCElemental, the easiest way to ensure compliance is to leave the provenance field blank, to allow the dump_one function to generate the correct provenance information. However, allowing only one entry for provenance information limits the ability to properly trace a file through several operations during complex workflows. With this in mind, IOData supports an enhanced provenance field, in the form of a list of provenance entries, with new entries appended to the end of the list.
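The enhanced provenance field can be pictured as follows. This is a sketch with a hypothetical helper, append_provenance, and made-up creator names; it only illustrates turning a single entry into a list and appending new entries at the end, and is not IOData's own implementation.

```python
def append_provenance(existing, creator, version=None, routine=None):
    """Return a provenance list with a new entry appended at the end."""
    entry = {"creator": creator}  # "creator" is the only required field
    if version is not None:
        entry["version"] = version
    if routine is not None:
        entry["routine"] = routine
    if existing is None:
        history = []
    elif isinstance(existing, dict):
        history = [existing]  # a single entry becomes a one-item list
    else:
        history = list(existing)
    return history + [entry]


single = {"creator": "HORTON3", "routine": "Manual validation"}
# "MyWorkflow" and its version and routine are made-up values.
history = append_provenance(single, "MyWorkflow", version="0.1", routine="convert")
print(history)
```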

Molecule Schema

The qcschema_molecule subschema describes a molecular system, and contains the data necessary to specify a molecular system and support I/O and manipulation processes.

The following is an example of a minimal qcschema_molecule file:

{
  "schema_name": "qcschema_molecule",
  "schema_version": 2,
  "symbols":  ["Li", "Cl"],
  "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
  "molecular_charge": 0,
  "molecular_multiplicity": 1,
  "provenance": {
    "creator": "HORTON3",
    "routine": "Manual validation"
  }
}

The required fields and corresponding types for a qcschema_molecule file are:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| schema_name | str | N/A | The name of the QCSchema subschema. Fixed as qcschema_molecule. |
| schema_version | str | N/A | The version of the subschema specification. 2.0 is the current version. |
| symbols | list(N_at) | atnums | An array of the atomic symbols for the system. |
| geometry | list(3*N_at) | atcoords | An ordered array of XYZ atomic coordinates, corresponding to the order of symbols. The first three elements correspond to atom one, the second three to atom two, etc. |
| molecular_charge | float | charge | The net electrostatic charge of the molecule. Some writers assume a default of 0. |
| molecular_multiplicity | int | spinpol | The total multiplicity of this molecule. Some writers assume a default of 1. |
| provenance | dict or list | N/A | Information about how the file was generated, provided, and manipulated. See the Provenance section above for more details. |

Note: N_at corresponds to the number of atoms in the molecule, as defined by the length of symbols.
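The relation between symbols and the flat geometry array can be made concrete with the standard json module. The sketch below checks that the required fields are present and regroups geometry into one triple per atom. The read_molecule helper is hypothetical, and the embedded file is the minimal example shown earlier in this section.

```python
import json

REQUIRED_FIELDS = (
    "schema_name", "schema_version", "symbols", "geometry",
    "molecular_charge", "molecular_multiplicity", "provenance",
)

TEXT = """{
  "schema_name": "qcschema_molecule",
  "schema_version": 2,
  "symbols":  ["Li", "Cl"],
  "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
  "molecular_charge": 0,
  "molecular_multiplicity": 1,
  "provenance": {"creator": "HORTON3", "routine": "Manual validation"}
}"""


def read_molecule(text):
    """Return (symbols, coords), with coords as one [x, y, z] list per atom."""
    data = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")
    symbols = data["symbols"]
    geometry = data["geometry"]
    if len(geometry) != 3 * len(symbols):
        raise ValueError("geometry must contain three numbers per symbol")
    coords = [geometry[3 * i:3 * i + 3] for i in range(len(symbols))]
    return symbols, coords


symbols, coords = read_molecule(TEXT)
print(list(zip(symbols, coords)))
```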

The optional fields and corresponding types for a qcschema_molecule file are:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| atom_labels | list(N_at) | N/A | Additional per-atom labels. Typically used for model conversions, not user assignment. The indices of this array correspond to the symbols ordering. |
| atomic_numbers | list(N_at) | atnums | An array of atomic numbers for each atom. Typically inferred from symbols. |
| comment | str | N/A | Additional comments for this molecule. These comments are intended for user information, not any computational tasks. |
| connectivity | list | bonds | The connectivity information between each atom in the symbols array. Each entry in this array is a 3-item array, [index_a, index_b, bond_order], where the indices correspond to the atom indices in symbols. |
| extras | dict | N/A | Extra information to associate with this molecule. |
| fix_symmetry | str | g_rot | Maximal point group symmetry with which the molecule should be treated. |
| fragments | list(N_fr) | N/A | An array that designates which sets of atoms are fragments within the molecule. This is a nested array, with the indices of the base array corresponding to the values in fragment_charges and fragment_multiplicities and the values in the nested arrays corresponding to the indices of symbols. |
| fragment_charges | list(N_fr) | N/A | The total charge of each fragment in fragments. The indices of this array correspond to the fragments ordering. |
| fragment_multiplicities | list(N_fr) | N/A | The multiplicity of each fragment in fragments. The indices of this array correspond to the fragments ordering. |
| id | str | N/A | A unique identifier for this molecule. |
| identifiers | dict | N/A | Additional identifiers by which this molecule can be referenced, such as INCHI, SMILES, etc. |
| real | list(N_at) | atcorenums | An array indicating whether each atom is real (true) or a ghost/virtual atom (false). The indices of this array correspond to the symbols ordering. |
| mass_numbers | list(N_at) | atmasses | An array of atomic mass numbers for each atom. The indices of this array correspond to the symbols ordering. |
| masses | list(N_at) | atmasses | An array of atomic masses [u] for each atom. Typically inferred from symbols. The indices of this array correspond to the symbols ordering. |
| name | str | title | An arbitrary, common, or human-readable name to assign to this molecule. |

Note: N_at corresponds to the number of atoms in the molecule, as defined by the length of symbols; N_fr corresponds to the number of fragments in the molecule, as defined by the length of fragments. Fragment data is stored in a sub-dictionary, fragments.
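The nested layout of fragments, and how fragment_charges and fragment_multiplicities line up with it, can be illustrated with a made-up water dimer. The helper describe_fragments is hypothetical and only demonstrates the index relationships described in the table.

```python
# Made-up water dimer; all values are for illustration only.
symbols = ["O", "H", "H", "O", "H", "H"]
fragments = [[0, 1, 2], [3, 4, 5]]  # values are indices into symbols
fragment_charges = [0.0, 0.0]  # one entry per fragment
fragment_multiplicities = [1, 1]  # one entry per fragment


def describe_fragments(symbols, fragments, charges, multiplicities):
    """Return one (symbols, charge, multiplicity) tuple per fragment."""
    if not len(fragments) == len(charges) == len(multiplicities):
        raise ValueError("Need one charge and one multiplicity per fragment")
    result = []
    for atom_indices, charge, mult in zip(fragments, charges, multiplicities):
        result.append(([symbols[i] for i in atom_indices], charge, mult))
    return result


described = describe_fragments(
    symbols, fragments, fragment_charges, fragment_multiplicities
)
for fragment in described:
    print(fragment)
```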

The following are additional optional keywords used in QCElemental’s QCSchema implementation. These keywords mostly correspond to specific QCElemental functionality, and may not necessarily produce similar results in other QCSchema parsers.

| Field | Type | Description |
|-------|------|-------------|
| fix_com | bool | An indicator to prevent pre-processing the molecule by translating the COM to (0,0,0) in Euclidean coordinate space. |
| fix_orientation | bool | An indicator to prevent pre-processing the molecule by orienting via the inertia tensor. |
| validated | bool | An indicator that the input molecule data has been previously checked for schema and physics (e.g. non-overlapping atoms, feasible multiplicity) compliance. Generally should only be true when set by a trusted validator. |

Input Schema

The qcschema_input subschema describes all data necessary to generate and parse a QC program input file for a given molecule.

The following is an example of a minimal qcschema_input file:

{
  "schema_name": "qcschema_input",
  "schema_version": 2.0,
  "molecule": {
    "schema_name": "qcschema_molecule",
    "schema_version": 2.0,
    "symbols":  ["Li", "Cl"],
    "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
    "molecular_charge": 0.0,
    "molecular_multiplicity": 1,
    "provenance": {
      "creator": "HORTON3",
      "routine": "Manual validation"
    }
  },
  "driver": "energy",
  "model": {
    "method": "B3LYP",
    "basis": "Def2TZVP"
  }
}

The required fields and corresponding types for a qcschema_input file are:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| schema_name | str | N/A | The QCSchema specification to which this model conforms. Fixed as qcschema_input. |
| schema_version | float | N/A | The version number of schema_name to which this model conforms, currently 2. |
| molecule | dict | N/A | QCSchema Molecule instance. |
| driver | str | N/A | The type of calculation being performed. One of energy, gradient, hessian, or properties. |
| model | dict | N/A | The quantum chemistry model specification for a given operation to compute against. See Model section below. |

The optional fields and corresponding types for a qcschema_input file are:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| extras | dict | N/A | Extra information associated with the input. |
| id | str | N/A | An identifier for the input object. |
| keywords | dict | N/A | QC program-specific keywords to be used for a computation. See details below for IOData-specific usages. |
| protocols | dict | N/A | Protocols regarding the manipulation of the output that results from this input. See Protocols section below. |
| provenance | dict or list | N/A | Information about how the file was generated, provided, and manipulated. See Provenance section above for more information. |

IOData currently supports the following keywords for qcschema_input files:

| Keyword | Type | IOData attr. | Description |
|---------|------|--------------|-------------|
| run_type | str | run_type | The type of calculation that led to the results stored in IOData, which must be one of the following: energy, energy_force, opt, scan, freq or None. |

Model Subschema

The model dict contains the following fields:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| method | str | lot | The level of theory used for the computation (e.g. B3LYP, PBE, CCSD(T), etc.) |
| basis | str or dict | N/A | The quantum chemistry basis set to evaluate (e.g. 6-31G, cc-pVDZ, etc.) Can be ‘none’ for methods without basis sets. Must be either a string specifying the basis set name (the same as its name in the Basis Set Exchange, when possible) or a qcschema_basis instance. |

Protocols Subschema

The protocols dict contains the following fields:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| wavefunction | str | N/A | Specification of the wavefunction properties to keep from the resulting output. One of all, orbitals_and_eigenvalues, return_results, or none. |
| keep_stdout | bool | N/A | An indicator to keep the output file from the resulting output. |

Output Schema

The qcschema_output subschema describes all data necessary to generate and parse a QC program’s output file for a given molecule.

The following is an example of a minimal qcschema_output file:

{
  "schema_name": "qcschema_output",
  "schema_version": 2.0,
  "molecule": {
    "schema_name": "qcschema_molecule",
    "schema_version": 2.0,
    "symbols":  ["Li", "Cl"],
    "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
    "molecular_charge": 0.0,
    "molecular_multiplicity": 1,
    "provenance": {
      "creator": "HORTON3",
      "routine": "Manual validation"
    }
  },
  "driver": "energy",
  "model": {
    "method": "HF",
    "basis": "STO-4G"
  },
  "properties": {},
  "return_result": -464.626219879,
  "success": true
}

The required fields and corresponding types for a qcschema_output file are:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| schema_name | str | N/A | The QCSchema specification to which this model conforms. Fixed as qcschema_output. |
| schema_version | float | N/A | The version number of schema_name to which this model conforms, currently 2. |
| molecule | dict | N/A | QCSchema Molecule instance. |
| driver | str | N/A | The type of calculation being performed. One of energy, gradient, hessian, or properties. |
| model | dict | N/A | The quantum chemistry model specification for a given operation to compute against. |
| properties | dict | N/A | Named properties of quantum chemistry computations. See Properties section below. |
| return_result | varies | N/A | The result requested by the driver. The type depends on the driver. |
| success | bool | N/A | An indicator for the success of the QC program’s execution. |

The optional fields and corresponding types for a qcschema_output file are:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| error | dict | N/A | A complete description of an error-terminated computation. See Error section below. |
| extras | dict | N/A | Extra information associated with the input. Also specified for qcschema_input. |
| id | str | N/A | An identifier for the input object. Also specified for qcschema_input. |
| keywords | dict | N/A | QC program-specific keywords to be used for a computation. See details below for IOData-specific usages. Also specified for qcschema_input. |
| protocols | dict | N/A | Protocols regarding the manipulation of the output that results from this input. See Protocols section above. Also specified for qcschema_input. |
| provenance | dict or list | N/A | Information about how the file was generated, provided, and manipulated. See Provenance section above for more information. Also specified for qcschema_input. |
| stderr | str | N/A | The standard error (stderr) of the associated computation. |
| stdout | str | N/A | The standard output (stdout) of the associated computation. |
| wavefunction | dict | N/A | The wavefunction properties of a QC computation. All matrices appear in column-major order. See Wavefunction section below. |

Properties Subschema

The properties dict contains named properties of quantum chemistry computations. Because the contents of an output file can vary widely, IOData does not guess which properties the user wants, and stores all properties in the extra["output"]["properties"] dict for easy retrieval. The current QCSchema standard provides names for the following properties:

| Field | Description |
|-------|-------------|
| calcinfo_nbasis | The number of basis functions for the computation. |
| calcinfo_nmo | The number of molecular orbitals for the computation. |
| calcinfo_nalpha | The number of alpha electrons in the computation. |
| calcinfo_nbeta | The number of beta electrons in the computation. |
| calcinfo_natom | The number of atoms in the computation. |
| nuclear_repulsion_energy | The nuclear repulsion energy term. |
| return_energy | The energy of the requested method, identical to return_result for energy computations. |
| scf_one_electron_energy | The one-electron (core Hamiltonian) energy contribution to the total SCF energy. |
| scf_two_electron_energy | The two-electron energy contribution to the total SCF energy. |
| scf_vv10_energy | The VV10 functional energy contribution to the total SCF energy. |
| scf_xc_energy | The functional (XC) energy contribution to the total SCF energy. |
| scf_dispersion_correction_energy | The dispersion correction appended to an underlying functional when a DFT-D method is requested. |
| scf_dipole_moment | The X, Y, and Z dipole components. |
| scf_total_energy | The total electronic energy of the SCF stage of the calculation. |
| scf_iterations | The number of SCF iterations taken before convergence. |
| mp2_same_spin_correlation_energy | The portion of MP2 doubles correlation energy from same-spin (i.e. triplet) correlations. |
| mp2_opposite_spin_correlation_energy | The portion of MP2 doubles correlation energy from opposite-spin (i.e. singlet) correlations. |
| mp2_singles_energy | The singles portion of the MP2 correlation energy. Zero except in ROHF. |
| mp2_doubles_energy | The doubles portion of the MP2 correlation energy including same-spin and opposite-spin correlations. |
| mp2_total_correlation_energy | The MP2 correlation energy. |
| mp2_correlation_energy | The MP2 correlation energy. |
| mp2_total_energy | The total MP2 energy (MP2 correlation energy + HF energy). |
| mp2_dipole_moment | The MP2 X, Y, and Z dipole components. |
| ccsd_same_spin_correlation_energy | The portion of CCSD doubles correlation energy from same-spin (i.e. triplet) correlations. |
| ccsd_opposite_spin_correlation_energy | The portion of CCSD doubles correlation energy from opposite-spin (i.e. singlet) correlations. |
| ccsd_singles_energy | The singles portion of the CCSD correlation energy. Zero except in ROHF. |
| ccsd_doubles_energy | The doubles portion of the CCSD correlation energy including same-spin and opposite-spin correlations. |
| ccsd_correlation_energy | The CCSD correlation energy. |
| ccsd_total_energy | The total CCSD energy (CCSD correlation energy + HF energy). |
| ccsd_dipole_moment | The CCSD X, Y, and Z dipole components. |
| ccsd_iterations | The number of CCSD iterations taken before convergence. |
| ccsd_prt_pr_correlation_energy | The CCSD(T) correlation energy. |
| ccsd_prt_pr_total_energy | The total CCSD(T) energy (CCSD(T) correlation energy + HF energy). |
| ccsd_prt_pr_dipole_moment | The CCSD(T) X, Y, and Z dipole components. |
| ccsd_prt_pr_iterations | The number of CCSD(T) iterations taken before convergence. |
| ccsdt_correlation_energy | The CCSDT correlation energy. |
| ccsdt_total_energy | The total CCSDT energy (CCSDT correlation energy + HF energy). |
| ccsdt_dipole_moment | The CCSDT X, Y, and Z dipole components. |
| ccsdt_iterations | The number of CCSDT iterations taken before convergence. |
| ccsdtq_correlation_energy | The CCSDTQ correlation energy. |
| ccsdtq_total_energy | The total CCSDTQ energy (CCSDTQ correlation energy + HF energy). |
| ccsdtq_dipole_moment | The CCSDTQ X, Y, and Z dipole components. |
| ccsdtq_iterations | The number of CCSDTQ iterations taken before convergence. |
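Several of the properties listed above are related by simple sums, e.g. the total MP2 energy is the MP2 correlation energy plus the HF energy. The sketch below checks that relation on a properties dict. The numbers are made up, the helper mp2_total_is_consistent is hypothetical, and it assumes a Hartree-Fock reference so that scf_total_energy can serve as the HF energy.

```python
import math


def mp2_total_is_consistent(properties, hf_energy, abs_tol=1e-8):
    """Check that mp2_total_energy = mp2_correlation_energy + HF energy."""
    expected = properties["mp2_correlation_energy"] + hf_energy
    return math.isclose(properties["mp2_total_energy"], expected, abs_tol=abs_tol)


# Made-up values, chosen only to be internally consistent.
properties = {
    "scf_total_energy": -76.026,
    "mp2_correlation_energy": -0.204,
    "mp2_total_energy": -76.230,
}
# Assumption: HF reference, so the SCF total energy is the HF energy.
consistent = mp2_total_is_consistent(properties, properties["scf_total_energy"])
print(consistent)
```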

Error Subschema

The error dict contains the following fields:

| Field | Type | IOData attr. | Description |
|-------|------|--------------|-------------|
| error_type | str | N/A | The type of error raised during the computation. |
| error_message | str | N/A | Additional information related to the error, such as the backtrace. |
| extras | dict | N/A | Additional data associated with the error. |

Wavefunction Subschema

The wavefunction subschema contains the wavefunction properties of a QC computation. All matrices appear in column-major order. The current QCSchema standard provides names for the following wavefunction properties:

https://github.com/evaleev/libint/wiki/using-modern-CPlusPlus-API#solid-harmonic-gaussians-ordering-and-normalization

| Field | Description |
|-------|-------------|
| basis | A qcschema_basis instance for the one-electron AO basis set. AO basis functions are ordered according to the CCA standard as implemented in libint. |
| restricted | An indicator for a restricted calculation (alpha == beta). When true, all beta quantities are omitted, since quantity_b == quantity_a. |
| h_core_a | Alpha-spin core (one-electron) Hamiltonian. |
| h_core_b | Beta-spin core (one-electron) Hamiltonian. |
| h_effective_a | Alpha-spin effective core (one-electron) Hamiltonian. |
| h_effective_b | Beta-spin effective core (one-electron) Hamiltonian. |
| scf_orbitals_a | Alpha-spin SCF orbitals. |
| scf_orbitals_b | Beta-spin SCF orbitals. |
| scf_density_a | Alpha-spin SCF density matrix. |
| scf_density_b | Beta-spin SCF density matrix. |
| scf_fock_a | Alpha-spin SCF Fock matrix. |
| scf_fock_b | Beta-spin SCF Fock matrix. |
| scf_eigenvalues_a | Alpha-spin SCF eigenvalues. |
| scf_eigenvalues_b | Beta-spin SCF eigenvalues. |
| scf_occupations_a | Alpha-spin SCF orbital occupations. |
| scf_occupations_b | Beta-spin SCF orbital occupations. |
| orbitals_a | Keyword for the primary return alpha-spin orbitals. |
| orbitals_b | Keyword for the primary return beta-spin orbitals. |
| density_a | Keyword for the primary return alpha-spin density. |
| density_b | Keyword for the primary return beta-spin density. |
| fock_a | Keyword for the primary return alpha-spin Fock matrix. |
| fock_b | Keyword for the primary return beta-spin Fock matrix. |
| eigenvalues_a | Keyword for the primary return alpha-spin eigenvalues. |
| eigenvalues_b | Keyword for the primary return beta-spin eigenvalues. |
| occupations_a | Keyword for the primary return alpha-spin orbital occupations. |
| occupations_b | Keyword for the primary return beta-spin orbital occupations. |
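Column-major order means that the elements of each column are stored contiguously. The sketch below rebuilds a matrix from a flat list under that convention. It assumes the matrix arrives as a flat list with known dimensions, which may not hold for every file, and from_column_major is a hypothetical helper.

```python
def from_column_major(flat, nrow, ncol):
    """Rebuild a list of rows from a flat list in column-major order."""
    if len(flat) != nrow * ncol:
        raise ValueError("Length of flat list does not match nrow * ncol")
    # Element (row, col) is stored at position col * nrow + row.
    return [[flat[col * nrow + row] for col in range(ncol)] for row in range(nrow)]


# Columns are [1, 2], [3, 4] and [5, 6].
matrix = from_column_major([1, 2, 3, 4, 5, 6], nrow=2, ncol=3)
print(matrix)
```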

Filename patterns: *.json

iodata.formats.json.load_one()

  • Always loads atnums, atcorenums, atcoords, charge, nelec, spinpol

  • May load atmasses, bonds, energy, g_rot, lot, obasis, obasis_name, title, extra

iodata.formats.json.dump_one()

  • Requires atnums, atcoords, charge, spinpol

  • May dump title, atcorenums, atmasses, bonds, g_rot, extra

VASP 5 LOCPOT file format (locpot)

This format is used by VASP 5.X and VESTA.

Note that even though the CHGCAR and LOCPOT files look very similar, they require different conversions to atomic units.

Filename patterns: LOCPOT*

iodata.formats.locpot.load_one()

  • Always loads atcoords, atnums, cellvecs, cube, title

MOL2 file format (mol2)

There are different formats of mol2 files. This implementation targets compatibility with the AMBER software: it writes out files with atomic charges as used by antechamber.

Filename patterns: *.mol2

iodata.formats.mol2.load_one()

  • Always loads atcoords, atnums, atcharges, atffparams

  • May load title

iodata.formats.mol2.dump_one()

  • Requires atcoords, atnums

  • May dump atcharges, atffparams, title

iodata.formats.mol2.load_many()

  • Always loads atcoords, atnums, atcharges, atffparams

  • May load title

iodata.formats.mol2.dump_many()

  • Requires atcoords, atnums, atcharges

  • May dump title

Molden file format (molden)

Many QC codes can write out Molden files, e.g. Molpro, Orca, PSI4, Molden, Turbomole. Keep in mind that several of these write incorrect versions of the file format, but these errors are corrected when loading them with IOData.

Filename patterns: *.molden.input, *.molden

iodata.formats.molden.load_one()

  • Always loads atcoords, atnums, atcorenums, mo, obasis

  • May load title

  • Keyword arguments norm_threshold

iodata.formats.molden.dump_one()

  • Requires atcoords, atnums, mo, obasis

  • May dump atcorenums, title

Molekel file format (molekel)

This format is used by two programs: Molekel and Orca.

Filename patterns: *.mkl

iodata.formats.molekel.load_one()

  • Always loads atcoords, atnums, mo, obasis

  • May load atcharges

  • Keyword arguments norm_threshold

iodata.formats.molekel.dump_one()

  • Requires atcoords, atnums, mo, obasis

  • May dump atcharges

Multiwfn MWFN file format (mwfn)

Filename patterns: *.mwfn

iodata.formats.mwfn.load_one()

  • Always loads atcoords, atnums, atcorenums, energy, mo, obasis, extra, title

Orca output file format (orcalog)

Filename patterns: *.out

iodata.formats.orcalog.load_one()

  • Always loads atcoords, atnums, energy, moments, extra

PDB file format (pdb)

There are different formats of pdb files. The convention used here is the most recently updated one, described at: http://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html

Filename patterns: *.pdb

iodata.formats.pdb.load_one()

  • Always loads atcoords, atnums, atffparams, extra

  • May load title, bonds

iodata.formats.pdb.dump_one()

  • Requires atcoords, atnums, extra

  • May dump atffparams, title, bonds

iodata.formats.pdb.load_many()

  • Always loads atcoords, atnums, atffparams, extra

  • May load title

iodata.formats.pdb.dump_many()

  • Requires atcoords, atnums, extra

  • May dump atffparams, title

VASP 5 POSCAR file format (poscar)

This format is used by VASP 5.X and VESTA.

Filename patterns: POSCAR*

iodata.formats.poscar.load_one()

  • Always loads atcoords, atnums, cellvecs, title

iodata.formats.poscar.dump_one()

  • Requires atcoords, atnums, cellvecs

  • May dump title

Q-Chem Log file format (qchemlog)

This module loads a Q-Chem log file into IOData.

Filename patterns: *.qchemlog

iodata.formats.qchemlog.load_one()

  • Always loads atcoords, atmasses, atnums, energy, g_rot, mo, lot, obasis_name, run_type, extra

  • May load athessian

SDF file format (sdf)

Usually, the different frames in a trajectory describe different geometries of the same molecule, with atoms in the same order. The load_many and dump_many functions below can also handle an SDF file with different molecules, e.g. a molecular database.

The SDF format is somewhat documented on the following page: http://www.nonlinear.com/progenesis/sdf-studio/v0.9/faq/sdf-file-format-guidance.aspx

This format is one of the chemical table file formats: https://en.wikipedia.org/wiki/Chemical_table_file

Filename patterns: *.sdf

iodata.formats.sdf.load_one()

  • Always loads atcoords, atnums, bonds, title

iodata.formats.sdf.dump_one()

  • Requires atcoords, atnums

  • May dump title, bonds

iodata.formats.sdf.load_many()

  • Always loads atcoords, atnums, bonds, title

iodata.formats.sdf.dump_many()

  • Requires atcoords, atnums

  • May dump title, bonds

Gaussian/GAMESS-US WFN file format (wfn)

Only use this format if the program that generated it does not offer any alternatives that HORTON can load. The WFN format has the disadvantage that it cannot represent contractions and therefore expands all orbitals into a decontracted basis. This makes the post-processing less efficient compared to formats that do support contractions of Gaussian functions.

Filename patterns: *.wfn

iodata.formats.wfn.load_one()

  • Always loads atcoords, atnums, energy, mo, obasis, title, extra

iodata.formats.wfn.dump_one()

  • Requires atcoords, atnums, energy, mo, obasis, title, extra

AIM/AIMAll WFX file format (wfx)

See http://aim.tkgristmill.com/wfxformat.html

Filename patterns: *.wfx

iodata.formats.wfx.load_one()

  • Always loads atcoords, atgradient, atnums, energy, extra, mo, obasis, title

iodata.formats.wfx.dump_one()

  • Requires atcoords, atnums, atcorenums, mo, obasis, charge

  • May dump title, energy, spinpol, lot, atgradient, extra

XYZ file format (xyz)

Usually, the different frames in a trajectory describe different geometries of the same molecule, with atoms in the same order. The load_many and dump_many functions below can also handle an XYZ file with different molecules, e.g. a molecular database.

The load_* and dump_* functions all accept the optional argument atom_columns. This argument fixes the meaning of the columns to be loaded from or dumped to an XYZ file. The following example defines, in addition to the conventional columns, a column with atomic charges and three columns with atomic forces.

import iodata.formats.xyz
from iodata import load_one

atom_columns = iodata.formats.xyz.DEFAULT_ATOM_COLUMNS + [
    # Atomic charges are stored in a dictionary atcharges and the key
    # refers to the name of the partitioning method.
    ("atcharges", "mulliken", (), float, float, "{:10.5f}".format),
    # Note that in IOData, the energy gradient is stored, which contains the
    # negative forces.
    ("atgradient", None, (3,), float,
     (lambda word: -float(word)),
     (lambda value: "{:15.10f}".format(-value)))
]

mol = load_one("test.xyz", atom_columns=atom_columns)
# The following attributes are present:
print(mol.atnums)
print(mol.atcoords)
print(mol.atcharges["mulliken"])
print(mol.atgradient)

When defining atom_columns, no columns can be skipped, so that all information loaded from a file can also be written back out when dumping it.
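The two converters in the force column of the example above are each other's inverse: forces in the file become gradients in memory and are turned back into forces on output. The sketch below isolates them as named functions (load_force and dump_force are names introduced here for illustration) and checks the round trip without needing IOData.

```python
def load_force(word):
    """Convert a force read from the file into a gradient component."""
    return -float(word)


def dump_force(value):
    """Convert a gradient component back into a formatted force."""
    return "{:15.10f}".format(-value)


word = "   0.1250000000"  # a made-up force component as it appears in a file
gradient = load_force(word)
print(repr(gradient), repr(dump_force(gradient)))
```

Because the dump converter reproduces the original text, a file that is loaded and dumped with the same atom_columns keeps its force columns unchanged, up to the precision of the format string.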

Filename patterns: *.xyz

iodata.formats.xyz.load_one()

  • Always loads atcoords, atnums, title

  • Keyword arguments atom_columns

iodata.formats.xyz.dump_one()

  • Requires atcoords, atnums

  • May dump title

  • Keyword arguments atom_columns

iodata.formats.xyz.load_many()

  • Always loads atcoords, atnums, title

  • Keyword arguments atom_columns

iodata.formats.xyz.dump_many()

  • Requires atcoords, atnums

  • May dump title

  • Keyword arguments atom_columns