Getting Started
IOData can be used to read and write different quantum chemistry file formats.
Script usage
The simplest way to use IOData without writing any code is the iodata-convert script:
iodata-convert in.fchk out.molden
See the --help option for more details on usage.
Code usage
More complex use cases can be implemented in Python, using IOData as a library. The data read from a file is stored in an IOData object.
Reading
To read a file, use something like this:
from iodata import load_one
mol = load_one('water.xyz') # XYZ files contain atomic coordinates in Angstrom
print(mol.atcoords) # print coordinates in Bohr.
Note that IOData will automatically convert units from the file format’s official specification to atomic units (which is the format used throughout HORTON3).
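To make the unit conversion concrete, here is a minimal pure-Python sketch of what reading an XYZ file and converting to atomic units amounts to. It is not IOData's parser: the helper `parse_xyz_to_bohr`, the constant `BOHR_IN_ANGSTROM` and the water coordinates are assumptions introduced only for this illustration.

```python
# Bohr radius in Angstrom (CODATA 2018). Defined locally for this sketch
# instead of being imported from IOData.
BOHR_IN_ANGSTROM = 0.529177210903

# Illustrative XYZ content; coordinates are in Angstrom, as the format requires.
XYZ_TEXT = """\
3
water
O  0.000000  0.000000  0.117790
H  0.000000  0.755453 -0.471161
H  0.000000 -0.755453 -0.471161
"""


def parse_xyz_to_bohr(text):
    """Return (title, symbols, coords), with coords converted to Bohr.

    Hypothetical helper: it mimics the conversion step, not IOData's API.
    """
    lines = text.strip().splitlines()
    natom = int(lines[0])
    title = lines[1].strip()
    symbols, coords = [], []
    for line in lines[2:2 + natom]:
        symbol, x, y, z = line.split()[:4]
        symbols.append(symbol)
        # Angstrom -> Bohr: divide by the Bohr radius expressed in Angstrom.
        coords.append([float(c) / BOHR_IN_ANGSTROM for c in (x, y, z)])
    return title, symbols, coords


title, symbols, coords = parse_xyz_to_bohr(XYZ_TEXT)
print(title, symbols)
print(coords[0])  # oxygen position in Bohr
```

With IOData itself, this conversion happens inside load_one, so mol.atcoords is already in Bohr when you access it.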
The file format is inferred from the extension, but one can override the detection mechanism by manually specifying the format:
from iodata import load_one
mol = load_one('water.foo', 'xyz') # XYZ file with unusual extension
print(mol.atcoords)
IOData also has basic support for loading databases of molecules. For example, the following will iterate over all frames in an XYZ file:
from iodata import load_many
# print the title line from each frame in the trajectory.
for mol in load_many('trajectory.xyz'):
    print(mol.title)
Writing
IOData can also be used to write different file formats:
from iodata import load_one, dump_one
mol = load_one('water.fchk')
# Here you may put some code to manipulate mol before writing the data
# to a different file.
dump_one(mol, 'water.molden')
One could also convert (and manipulate) an entire trajectory. The following example converts a geometry optimization trajectory from a Gaussian FCHK file to an XYZ file:
from iodata import load_many, dump_many
# Conversion without manipulation.
dump_many(load_many('water_opt.fchk'), 'water_opt.xyz')
If you wish to perform some manipulations before writing the trajectory, the simplest way is to load the entire trajectory in a list of IOData objects and dump it later:
from iodata import load_many, dump_many
# Read the trajectory
trj = list(load_many('water_opt.fchk'))
# Manipulate if desired
# ...
# Write the trajectory
dump_many(trj, 'water_opt.xyz')
For very large trajectories, you may want to avoid loading the whole trajectory in memory. To do so, avoid creating the list object in the above example. The following approach is more memory efficient:
from iodata import load_many, dump_many
def itermols():
    for mol in load_many("traj1.xyz"):
        # Do some manipulations on mol here, then pass it on.
        yield mol
dump_many(itermols(), "traj2.xyz")
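The reason the generator approach saves memory is that frames are produced and consumed one at a time. The sketch below shows this interleaving in pure Python. The functions `fake_load_many` and `fake_dump_many` are hypothetical stand-ins that only imitate the iteration pattern of a lazy reader and writer; they are not part of IOData.

```python
events = []  # records the order in which frames are loaded and dumped


def fake_load_many(nframe):
    """Hypothetical stand-in for a lazy reader: yield one frame at a time."""
    for i in range(nframe):
        events.append(("load", i))
        yield {"title": f"frame {i}"}


def itermols(nframe):
    """Manipulate each frame as it passes through, without storing a list."""
    for mol in fake_load_many(nframe):
        mol["title"] = mol["title"].upper()
        yield mol


def fake_dump_many(mols):
    """Hypothetical stand-in for a writer: consume frames one by one."""
    for mol in mols:
        events.append(("dump", mol["title"]))


fake_dump_many(itermols(3))
for event in events:
    print(event)
```

The recorded events alternate between load and dump, so at most one frame is held at any moment. Wrapping the generator in list(...) would instead record all load events before the first dump.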
Input files
IOData can be used to write input files for quantum-chemistry software. By default minimal settings are used, which can be changed if needed. For example, the following will prepare a Gaussian input for a HF/STO-3G calculation from a PDB file:
from iodata import load_one, write_input
write_input(load_one("water.pdb"), "water.com", fmt="gaussian")
The level of theory and other settings can be modified by setting corresponding attributes in the IOData object:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "6-31g*"
mol.run_type = "opt"
write_input(mol, "water.com", fmt="gaussian")
The run type can be any of the following: energy, energy_force, opt, scan or freq. These are translated into program-specific keywords when the file is written.
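As a rough illustration of such a translation, the following sketch validates a run type and looks up a program-specific keyword. The function `translate_run_type` and the table `GAUSSIAN_KEYWORDS` are hypothetical; the mapping uses common Gaussian route keywords and is an assumption, not necessarily the table IOData uses internally.

```python
# Run types listed in this guide.
RUN_TYPES = ("energy", "energy_force", "opt", "scan", "freq")

# Hypothetical mapping to Gaussian route keywords, for illustration only.
# The table IOData actually uses may differ.
GAUSSIAN_KEYWORDS = {
    "energy": "sp",
    "energy_force": "force",
    "opt": "opt",
    "scan": "scan",
    "freq": "freq",
}


def translate_run_type(run_type, table=GAUSSIAN_KEYWORDS):
    """Return the program-specific keyword for a generic run type."""
    key = run_type.lower()
    if key not in RUN_TYPES:
        raise ValueError(
            f"Unknown run type {run_type!r}; expected one of {RUN_TYPES}"
        )
    return table[key]


print(translate_run_type("energy_force"))
print(translate_run_type("opt"))
```

Keeping the generic names in the IOData object and translating only at write time means the same object can, in principle, be written as input for different programs.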
It is possible to define a custom input file template to allow for specialized commands. This is done by passing a template string using the optional template keyword, placing each IOData attribute (or additional keyword, as shown below) in curly brackets:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "Def2QZVP"
mol.run_type = "opt"
custom_template = """\
%NProcShared=4
%mem=16GB
%chk=B3LYP_def2qzvp_H2O
#n {lot}/{obasis_name} scf=(maxcycle=900,verytightlineq,xqc) integral=(grid=ultrafinegrid) pop=(cm5, hlygat, mbs, npa, esp)

{title}

{charge} {spinmult}
{geometry}

"""
write_input(mol, "water.com", fmt="gaussian", template=custom_template)
The input file template may also include keywords that are not part of the IOData object:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "Def2QZVP"
mol.run_type = "opt"
custom_template = """\
%chk={chk_name}
#n {lot}/{obasis_name} {run_type}

{title}

{charge} {spinmult}
{geometry}

"""
# Custom keywords as arguments (best for few extra arguments)
write_input(mol, "water.com", fmt="gaussian", template=custom_template, chk_name="B3LYP_def2qzvp_water")
# Custom keywords from a dict (in cases with many extra arguments)
custom_keywords = {"chk_name": "B3LYP_def2qzvp_waters"}
write_input(mol, "water.com", fmt="gaussian", template=custom_template, **custom_keywords)
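The curly-bracket placeholders behave much like Python's own string formatting. The sketch below fills a similar template with plain str.format, merging attribute-like fields with an extra keyword. It illustrates the substitution idea only and makes no claim about how write_input is implemented; the `fields` and `extra` dictionaries and the geometry text are assumptions made for this example.

```python
template = """\
%chk={chk_name}
#n {lot}/{obasis_name} {run_type}

{title}

{charge} {spinmult}
{geometry}

"""

# Values that would normally come from attributes of the molecule object.
fields = {
    "lot": "B3LYP",
    "obasis_name": "Def2QZVP",
    "run_type": "opt",
    "title": "water",
    "charge": 0,
    "spinmult": 1,
    "geometry": "O 0.0 0.0 0.0\nH 0.0 0.8 0.6\nH 0.0 -0.8 0.6",
}

# Extra keywords that are not molecule attributes.
extra = {"chk_name": "B3LYP_def2qzvp_water"}

# Every placeholder in the template needs a matching key.
text = template.format(**fields, **extra)
print(text)
```

With str.format, a placeholder without a matching key raises a KeyError, which is a useful reminder to pass every custom keyword that the template mentions.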
In some cases, it may be preferable to load the template from file, instead of defining it in the script:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "6-31g*"
mol.run_type = "opt"
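# Read the template with a context manager so the file is closed afterwards.
with open("my_template.com") as fh:
    template = fh.read()
write_input(mol, "water.com", fmt="gaussian", template=template)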
Data storage
IOData can be used to store data in a consistent format for writing at a future point.
import numpy as np
from iodata import IOData
mol = IOData(title="water")
mol.atnums = np.array([8, 1, 1])
mol.atcoords = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]])  # in Bohr
Unit conversion
IOData always represents all quantities in atomic units, and unit conversion constants are defined in iodata.utils. Conversion to atomic units is done by multiplication with a unit constant. This convention can be easily remembered with the following examples:
When you say “this bond length is 1.5 Å”, the IOData equivalent is bond_length = 1.5 * angstrom.
The conversion from atomic units is similar to axis labels in old papers. For example, a bond length in angstrom is printed as “Bond length / Å”. Expressing this with IOData’s conventions gives:
print("Bond length in Angstrom:", bond_length / angstrom)
(This is rather different from the ASE conventions.)
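The multiply-to-enter, divide-to-leave convention can be checked with a short round trip. To keep this sketch self-contained, the constant angstrom is defined locally from the CODATA 2018 Bohr radius instead of being imported from iodata.utils; that local definition is an assumption of this example, and the imported constant should be preferred in real code.

```python
# Bohr radius in Angstrom (CODATA 2018), defined locally for this sketch.
BOHR_IN_ANGSTROM = 0.529177210903

# Value of one Angstrom expressed in atomic units (Bohr).
angstrom = 1.0 / BOHR_IN_ANGSTROM

# "This bond length is 1.5 Angstrom": multiply to convert to atomic units.
bond_length = 1.5 * angstrom
print("Bond length in Bohr:", bond_length)

# "Bond length / Angstrom": divide to convert back for printing.
print("Bond length in Angstrom:", bond_length / angstrom)
```

Storing only atomic units internally and converting at the boundaries keeps every intermediate quantity in one consistent unit system.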