Welcome to IOData’s documentation!¶
IOData is a free and open-source Python library for parsing, storing, and converting various file formats commonly used by quantum chemistry, molecular dynamics, and plane-wave density-functional-theory software programs. It also supports a flexible framework for generating input files for various software packages.
Please use the following citation in any publication using the IOData library:
“IOData: A python library for reading, writing, and converting computational chemistry file formats and generating input files.”, T. Verstraelen, W. Adams, L. Pujal, A. Tehrani, B. D. Kelly, L. Macaya, F. Meng, M. Richer, R. Hernandez‐Esparza, X. D. Yang, M. Chan, T. D. Kim, M. Cools‐Ceuppens, V. Chuiko, E. Vohringer‐Martinez, P. W. Ayers, F. Heidar‐Zadeh, J Comput Chem. 2021; 42: 458–464.
For the list of file formats that can be loaded or dumped by IOData, see Supported File Formats. The two tables below summarize the file formats and features supported by IOData.
| Code | Definition |
|---|---|
| L | loading is supported |
| D | dumping is supported |
| (d) | attribute may be derived from other attributes |
| R | attribute is always read |
| r | attribute is read if present |
| W | attribute is always written |
| w | attribute is written if present |
Attribute |
fchk: LD |
json: LD |
qchemlog: L |
extxyz: L |
wfx: LD |
mwfn: L |
gamess: L |
wfn: LD |
pdb: LD |
molden: LD |
cp2klog: L |
orcalog: L |
molekel: LD |
mol2: LD |
locpot: L |
gromacs: L |
fcidump: LD |
cube: LD |
chgcar: L |
charmm: L |
sdf: LD |
poscar: LD |
xyz: LD |
gaussianlog: L |
|
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Rw |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
rw |
Rw |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
Rw |
RW |
R |
r |
RW |
R |
R |
RW |
RW |
RW |
R |
R |
RW |
RW |
R |
R |
. |
RW |
R |
R |
RW |
RW |
RW |
R |
. |
|
|
RW |
Rw |
. |
. |
W |
R |
. |
. |
. |
Rw |
R |
. |
. |
. |
. |
. |
. |
Rw |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
Rw |
. |
. |
. |
. |
Rw |
. |
R |
. |
. |
. |
R |
. |
. |
. |
. |
. |
|
rw |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
rw |
. |
. |
r |
Rw |
. |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
rw |
. |
r |
. |
. |
. |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
rw |
rw |
R |
r |
. |
. |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
R |
. |
. |
. |
. |
. |
|
RW |
RW |
R |
r |
RW |
R |
R |
RW |
RW |
RW |
R |
R |
RW |
RW |
R |
. |
. |
RW |
R |
. |
RW |
RW |
RW |
R |
. |
|
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
. |
rw |
. |
. |
. |
. |
. |
. |
rw |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
Rw |
. |
. |
. |
. |
|
. |
. |
. |
r |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
R |
R |
. |
R |
R |
. |
. |
RW |
. |
. |
. |
|
|
w |
RW |
. |
r |
W |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
Rw |
. |
. |
. |
. |
. |
. |
. |
. |
|
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
R |
. |
. |
RW |
R |
. |
. |
. |
. |
. |
. |
|
rw |
r |
R |
r |
Rw |
R |
R |
RW |
. |
. |
R |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
rw |
rw |
R |
r |
Rw |
R |
. |
RW |
RW |
. |
. |
R |
. |
. |
. |
R |
. |
. |
. |
R |
. |
. |
. |
. |
. |
|
. |
rw |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
Rw |
r |
R |
. |
w |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
Rw |
. |
R |
. |
RW |
R |
. |
RW |
. |
RW |
R |
. |
RW |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
rw |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
|
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
. |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
Rw |
. |
. |
. |
. |
. |
. |
. |
. |
R |
r |
. |
. |
RW |
R |
. |
RW |
. |
RW |
R |
. |
RW |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
Rw |
r |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
RW |
. |
. |
. |
. |
. |
. |
. |
r |
|
rw |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
R |
. |
R |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
|
|
. |
RW |
. |
. |
w |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
Rw |
. |
. |
. |
. |
. |
. |
. |
. |
R |
rw |
. |
R |
Rw |
R |
R |
RW |
rw |
rw |
. |
. |
. |
rw |
R |
R |
. |
w |
R |
r |
Rw |
Rw |
Rw |
R |
. |
|
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
RW |
. |
. |
. |
. |
. |
. |
. |
r |
|
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
. |
User Documentation¶
Installation¶
Stable releases¶
Warning
We are preparing a 1.0 release. Until then, these instructions for installing a stable release will not work yet. If you enjoy living on the edge, try the development release as explained in the “Latest git revision” section below.
Python 3 (>=3.6) must be installed before you can install IOData. In addition, IOData has the following dependencies:
numpy >= 1.0: https://numpy.org/
scipy: https://scipy.org/
attrs >= 20.1.0: https://www.attrs.org/en/stable/index.html
importlib_resources [only for Python 3.6]: https://gitlab.com/python-devs/importlib_resources
Normally, you don’t need to install these dependencies manually. They will be installed automatically when you follow the instructions below.
Installation with Ana- or Miniconda¶
To install IOData using the conda package management system, install miniconda or anaconda first, and then:
# Activate your main conda environment if it is not loaded in your .bashrc.
# E.g. run the following if you have miniconda installed in e.g. ~/miniconda3
source ~/miniconda3/bin/activate
# Create a horton3 conda environment. (optional, recommended)
conda create -n horton3
source activate horton3
# Install the stable release.
conda install -c theochem qc-iodata
# Unstable releases
# (Only do this if you understand the implications.)
# Install the testing release. (beta)
conda install -c theochem/label/test qc-iodata
# Install the development release. (alpha)
conda install -c theochem/label/dev qc-iodata
Installation with Pip¶
You can work in a virtual environment:
# Create a virtual environment in ~/horton3
# Feel free to change the path.
python3 -m venv ~/horton3
# Activate the virtual environment.
source ~/horton3/bin/activate
# Install the stable release in the venv horton3.
pip3 install qc-iodata
# alternative: python3 -m pip install qc-iodata
# For developers, install a pre-release (alpha or beta).
# (Only do this if you understand the implications.)
pip3 install --pre qc-iodata
# alternative: python3 -m pip install --pre qc-iodata
You can install into your ${HOME} directory, without creating a virtual environment.
# Install the stable release in your home directory.
pip3 install qc-iodata --user
# alternative: python3 -m pip install qc-iodata --user
# For developers, install a pre-release (alpha or beta).
# (Only do this if you understand the implications.)
pip3 install --pre qc-iodata --user
# alternative: python3 -m pip install --pre qc-iodata --user
This is by far the simplest method, ideal to get started, but you have only one home directory. If the installation breaks due to some experimentation, it is harder to make a clean start in comparison to the other options.
In case the pip3 executable is not found, pip may be installed in a directory which is not included in your ${PATH} variable. This seems to be a common issue on macOS. A simple workaround is to replace pip3 by python3 -m pip.
In case Python and your operating system are up to date, you may also use pip instead of pip3 or python instead of python3. The 3 is only used to avoid potential confusion with Python 2. Note that the 3 is only present in names of executables, not names of Python modules.
Latest git revision¶
This section shows how to install the latest revision of IOData from the git repository. This kind of installation comes with some risks (sudden API changes, bugs, …), so be prepared to accept them when following the instructions below.
There are two installation methods:
Quick and dirty. Of this method, there are four variants, depending on the correctness of your PATH variable and the presence of a virtual or conda environment. These different scenarios are explained in more detail in the previous section.
# with env, correct PATH
pip install git+https://github.com/theochem/iodata.git
# with env, broken PATH
python -m pip install git+https://github.com/theochem/iodata.git
# without env, correct PATH
pip install git+https://github.com/theochem/iodata.git --user
# without env, broken PATH
python -m pip install git+https://github.com/theochem/iodata.git --user
Slow and smart. In addition to the four variations in the quick and dirty method, the slow and smart method can be used with pip or just with setup.py. You also have the option to use the SSH or HTTPS protocol to clone the git repository. Pick whichever works best for you.
# A) Clone git repo with https OR ssh:
# The second one only works if you have ssh set up for Github
# A1) https
git clone https://github.com/theochem/iodata.git
# A2) ssh
git clone git@github.com:theochem/iodata.git
# B) Optionally write the version string
pip install roberto # or any of the three other ways of running pip, see above.
rob write-version
# C) Actual install, 6 different methods.
# C1) setup.py, with env
python setup.py install
# C2) pip, with env, correct PATH
pip install .
# C3) pip, with env, broken PATH
python -m pip install .
# C4) setup.py, without env
python setup.py install --user
# C5) pip, without env, correct PATH
pip install . --user
# C6) pip, without env, broken PATH
python -m pip install . --user
Testing¶
The tests are automatically run when we build packages with conda, but you may try them again on your own machine after installation.
With Ana- or Miniconda:
# Install pytest in your conda env.
conda install pytest pytest-xdist
# Then run the tests.
pytest --pyargs iodata -n auto
With Pip:
# Install pytest in your virtual environment ...
pip install pytest pytest-xdist
# ... and refresh the virtual environment.
# This is a venv quirk. Without it, pytest may not find IOData.
deactivate && source ~/horton3/bin/activate
# Alternatively, install pytest in your home directory.
pip install pytest pytest-xdist --user
# Finally, run the tests.
pytest --pyargs iodata -n auto
Getting Started¶
IOData can be used to read and write different quantum chemistry file formats.
Script usage¶
The simplest way to use IOData, without writing any code, is to use the iodata-convert script.
iodata-convert in.fchk out.molden
See the --help option for more details on usage.
Code usage¶
More complex use cases can be implemented in Python, using IOData as a library. IOData stores an object containing the data read from the file.
Reading¶
To read a file, use something like this:
from iodata import load_one
mol = load_one('water.xyz') # XYZ files contain atomic coordinates in Angstrom
print(mol.atcoords) # print coordinates in Bohr.
Note that IOData automatically converts units from the file format’s official specification to atomic units (the unit system used throughout HORTON3).
The file format is inferred from the extension, but one can override the detection mechanism by manually specifying the format:
from iodata import load_one
mol = load_one('water.foo', 'xyz') # XYZ file with unusual extension
print(mol.atcoords)
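The extension-based detection can be pictured as matching the file name against the filename patterns of each format. The following is a minimal sketch of that idea, not IOData's actual implementation: the `PATTERNS` table and the `guess_format` helper are hypothetical names, and only a few of the patterns listed under Supported File Formats are included.

```python
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical subset of the filename patterns listed in this documentation.
PATTERNS = {
    "charmm": ["*.crd"],
    "chgcar": ["CHGCAR*", "AECCAR*"],
    "cube": ["*.cube", "*.cub"],
    "fchk": ["*.fchk", "*.fch"],
    "gromacs": ["*.gro"],
}


def guess_format(filename):
    """Return the first format whose pattern matches the base name, else None."""
    basename = Path(filename).name
    for fmt, patterns in PATTERNS.items():
        if any(fnmatch(basename, pattern) for pattern in patterns):
            return fmt
    return None


print(guess_format("water.fchk"))
print(guess_format("run1/CHGCAR_sum"))
print(guess_format("water.foo"))
```

When no pattern matches, as for `water.foo`, the format has to be given explicitly, which is the situation the second argument of `load_one` covers.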
IOData also has basic support for loading databases of molecules. For example, the following will iterate over all frames in an XYZ file:
from iodata import load_many
# Print the title line from each frame in the trajectory.
for mol in load_many('trajectory.xyz'):
    print(mol.title)
Writing¶
IOData can also be used to write different file formats:
from iodata import load_one, dump_one
mol = load_one('water.fchk')
# Here you may put some code to manipulate mol before writing the data
# to a different file.
dump_one(mol, 'water.molden')
One could also convert (and manipulate) an entire trajectory. The following example converts a geometry optimization trajectory from a Gaussian FCHK file to an XYZ file:
from iodata import load_many, dump_many
# Conversion without manipulation.
dump_many((mol for mol in load_many('water_opt.fchk')), 'water_opt.xyz')
If you wish to perform some manipulations before writing the trajectory, the simplest way is to load the entire trajectory in a list of IOData objects and dump it later:
from iodata import load_many, dump_many
# Read the trajectory
trj = list(load_many('water_opt.fchk'))
# Manipulate if desired
# ...
# Write the trajectory
dump_many(trj, 'water_opt.xyz')
For very large trajectories, you may want to avoid loading the whole trajectory into memory. To do so, avoid creating the list object in the above example. The following approach is more memory efficient:
from iodata import load_many, dump_many
def itermols():
    for mol in load_many("traj1.xyz"):
        # Manipulate mol here before passing it on.
        yield mol
dump_many(itermols(), "traj2.xyz")
Input files¶
IOData can be used to write input files for quantum-chemistry software. By default minimal settings are used, which can be changed if needed. For example, the following will prepare a Gaussian input for a HF/STO-3G calculation from a PDB file:
from iodata import load_one, write_input
write_input(load_one("water.pdb"), "water.com", fmt="gaussian")
The level of theory and other settings can be modified by setting corresponding attributes in the IOData object:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "6-31g*"
mol.run_type = "opt"
write_input(mol, "water.com", fmt="gaussian")
The run types can be any of the following: energy, energy_force, opt, scan or freq. These are translated into program-specific keywords when the file is written.
It is possible to define a custom input file template to allow for specialized commands. This is done by passing a template string using the optional template keyword, placing each IOData attribute (or additional keyword, as shown below) in curly brackets:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "Def2QZVP"
mol.run_type = "opt"
custom_template = """\
%NProcShared=4
%mem=16GB
%chk=B3LYP_def2qzvp_H2O
#n {lot}/{obasis_name} scf=(maxcycle=900,verytightlineq,xqc) integral=(grid=ultrafinegrid) pop=(cm5, hlygat, mbs, npa, esp)

{title}

{charge} {spinmult}
{geometry}

"""
write_input(mol, "water.com", fmt="gaussian", template=custom_template)
The input file template may also include keywords that are not part of the IOData object:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "Def2QZVP"
mol.run_type = "opt"
custom_template = """\
%chk={chk_name}
#n {lot}/{obasis_name} {run_type}

{title}

{charge} {spinmult}
{geometry}

"""
# Custom keywords as arguments (best for few extra arguments)
write_input(mol, "water.com", fmt="gaussian", template=custom_template, chk_name="B3LYP_def2qzvp_water")
# Custom keywords from a dict (in cases with many extra arguments)
custom_keywords = {"chk_name": "B3LYP_def2qzvp_waters"}
write_input(mol, "water.com", fmt="gaussian", template=custom_template, **custom_keywords)
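The curly-bracket placeholders follow the same convention as Python's built-in `str.format`. As a rough illustration of how a template can combine IOData attributes with extra keywords, the sketch below fills a template from plain dictionaries. It is an assumption about the mechanism, written for illustration only, and not IOData's actual code; `fill_template`, `attributes` and the example values are hypothetical.

```python
def fill_template(template, attributes, **extra_keywords):
    """Fill curly-bracket fields from attributes, then from extra keywords."""
    fields = dict(attributes)
    fields.update(extra_keywords)  # extra keywords override attributes
    return template.format(**fields)


template = "%chk={chk_name}\n#n {lot}/{obasis_name} {run_type}\n"
attributes = {"lot": "B3LYP", "obasis_name": "Def2QZVP", "run_type": "opt"}
text = fill_template(template, attributes, chk_name="B3LYP_def2qzvp_water")
print(text)
```

A field that appears in the template but is supplied neither as an attribute nor as a keyword raises a `KeyError` in this sketch, which makes typos in placeholder names easy to spot.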
In some cases, it may be preferable to load the template from file, instead of defining it in the script:
from iodata import load_one, write_input
mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "6-31g*"
mol.run_type = "opt"
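# Read the template with a context manager, so the file is closed afterwards.
with open("my_template.com", "r") as fh:
    template = fh.read()
write_input(mol, "water.com", fmt="gaussian", template=template)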
Data storage¶
IOData can be used to store data in a consistent format for writing at a future point.
import numpy as np
from iodata import IOData
mol = IOData(title="water")
mol.atnums = np.array([8, 1, 1])
mol.atcoords = np.array([[0, 0, 0,], [0, 1, 0,], [0, -1, 0,]]) # in Bohr
Unit conversion¶
IOData always represents all quantities in atomic units, and unit conversion constants are defined in iodata.utils. Conversion to atomic units is done by multiplication with a unit constant. This convention can be easily remembered with the following examples:
When you say “this bond length is 1.5 Å”, the IOData equivalent is bond_length = 1.5 * angstrom.
The conversion from atomic units is similar to axes labels in old papers. For example, a bond length in angstrom is printed as “Bond length / Å”. Expressing this with IOData’s conventions gives
print("Bond length in Angstrom:", bond_length / angstrom)
(This is rather different from the ASE conventions.)
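The multiply-to-store, divide-to-print convention can be tried without IOData installed. In the sketch below, `angstrom` is a local stand-in for the constant in iodata.utils; its value (one angstrom expressed in Bohr, using a Bohr radius of 0.529177210903 Å) is an assumption made for this example.

```python
# Stand-in for the constant defined in iodata.utils (assumed value:
# one angstrom expressed in Bohr, i.e. 1 / 0.529177210903).
angstrom = 1.0 / 0.529177210903

# "This bond length is 1.5 Å": multiply to convert to atomic units.
bond_length = 1.5 * angstrom
print("Bond length in Bohr:", bond_length)

# "Bond length / Å": divide to convert back for printing.
print("Bond length in Angstrom:", bond_length / angstrom)
```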
Supported File Formats¶
CHARMM crd file format (charmm)¶
CHARMM coordinate files contain information about the location of each atom in Cartesian space. The format of the ASCII (CARD) CHARMM coordinate files is: Title line(s), number of atoms in file and the coordinate lines (one for each atom in the file).
The coordinate lines contain specific information about each atom. These have the following structure: Atom number (sequential), residue number (specified relative to first residue in the PSF), residue name, atom type, x-coordinate, y-coordinate, z-coordinate, segment identifier, residue identifier and a weighting array value.
Filename patterns: *.crd
iodata.formats.charmm.load_one()¶
Always loads: atcoords, atffparams, atmasses, extra
May load: title
VASP 5 CHGCAR file format (chgcar)¶
This format is used by VASP 5.X and VESTA.
Note that even though the CHGCAR and LOCPOT files look very similar, they require different conversions to atomic units.
Filename patterns: CHGCAR*, AECCAR*
iodata.formats.chgcar.load_one()¶
Always loads: atcoords, atnums, cellvecs, cube, title
CP2K ATOM output file format (cp2klog)¶
Filename patterns: *.cp2k.out
iodata.formats.cp2klog.load_one()¶
Always loads: atcoords, atcorenums, atnums, energy, mo, obasis
This function assumes that the following subsections are present in the CP2K ATOM input file, in the section ATOM%PRINT:
&PRINT
  &POTENTIAL
  &END POTENTIAL
  &BASIS_SET
  &END BASIS_SET
  &ORBITALS
  &END ORBITALS
&END PRINT
Gaussian Cube file format (cube)¶
Cube files are generated by various QC codes these days, including Gaussian, CP2K, GPAW, Q-Chem, …
Note that the second column in the geometry specification of the cube file is interpreted as the effective core charges.
Filename patterns: *.cube, *.cub
iodata.formats.cube.load_one()¶
Always loads: atcoords, atcorenums, atnums, cellvecs, cube
iodata.formats.cube.dump_one()¶
Requires: atcoords, atnums, cube
May dump: title, atcorenums
Extended XYZ file format (extxyz)¶
The extended XYZ file format is defined in the ASE documentation.
Usually, the different frames in a trajectory describe different geometries of the same molecule, with atoms in the same order. The load_many function below can also handle an XYZ with different molecules, e.g. a molecular database.
Filename patterns: *.extxyz
iodata.formats.extxyz.load_one()¶
Always loads: title
May load: atcoords, atgradient, atmasses, atnums, cellvecs, charge, energy, extra
iodata.formats.extxyz.load_many()¶
Always loads: title
May load: atcoords, atgradient, atmasses, atnums, cellvecs, charge, energy, extra
Gaussian FCHK file format (fchk)¶
Filename patterns: *.fchk, *.fch
iodata.formats.fchk.load_one()¶
Always loads: atcharges, atcoords, atnums, atcorenums, lot, mo, obasis, obasis_name, run_type, title
May load: energy, atfrozen, atgradient, athessian, atmasses, one_rdms, extra, moments
iodata.formats.fchk.dump_one()¶
Requires: atnums, atcorenums
May dump: atcharges, atcoords, atfrozen, atgradient, athessian, atmasses, charge, energy, lot, mo, one_rdms, obasis_name, extra, moments
iodata.formats.fchk.load_many()¶
Always loads: atcoords, atgradient, atnums, atcorenums, energy, extra, title
Trajectories from a Gaussian optimization, relaxed scan or IRC calculation are written in groups of frames, called “points” in the Gaussian world, e.g. to discriminate between different values of the constraint in a relaxed geometry. In most cases, e.g. IRC or conventional optimization, there is only one “point”. Within one “point”, one can have multiple geometries and their properties. This information is stored in the extra attribute:
ipoint is the counter for a point.
npoint is the total number of points.
istep is the counter within one “point”.
nstep is the total number of geometries within a “point”.
reaction_coordinate is only present in case of an IRC calculation.
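As an illustration of how these counters can be used, the sketch below groups frames by point and picks the last recorded step of each point. To stay self-contained it uses plain dictionaries as stand-ins for the `extra` attribute of the frames yielded by `load_many`; the counter values are made up for the example and do not come from a real FCHK file.

```python
from collections import defaultdict

# Stand-ins for the `extra` dictionaries of successive frames
# (made-up values, only the keys follow the description above).
frames_extra = [
    {"ipoint": 0, "npoint": 2, "istep": 0, "nstep": 2},
    {"ipoint": 0, "npoint": 2, "istep": 1, "nstep": 2},
    {"ipoint": 1, "npoint": 2, "istep": 0, "nstep": 1},
]

# Collect the step counters that belong to each point.
steps_per_point = defaultdict(list)
for extra in frames_extra:
    steps_per_point[extra["ipoint"]].append(extra["istep"])

# Keep only the last recorded step of every point.
last_steps = {ipoint: steps[-1] for ipoint, steps in steps_per_point.items()}
print(last_steps)
```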
Molpro 2012 FCIDUMP file format (fcidump)¶
Notes¶
This function works only for restricted wave-functions.
One- and two-electron integrals are stored in chemists’ notation in an FCIDUMP file, while IOData internally uses physicists’ notation.
Keep in mind that the FCIDUMP format changed in MOLPRO 2012, so files generated with older versions are not supported.
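The two notations differ only in the order of the orbital indices: a physicists’ integral <ij|kl> equals the chemists’ integral (ik|jl). The sketch below spells out this reordering for a nested list of two-electron integrals. It is a generic illustration, not IOData's loading code; `chemist_to_physicist` is a hypothetical helper, and the example values merely encode their own indices so the reordering is visible.

```python
def chemist_to_physicist(chem):
    """Reorder (ij|kl) integrals so that result[i][j][k][l] = (ik|jl)."""
    n = len(chem)
    return [
        [[[chem[i][k][j][l] for l in range(n)] for k in range(n)] for j in range(n)]
        for i in range(n)
    ]


# Example values that encode their indices: (ij|kl) -> 1000*i + 100*j + 10*k + l.
n = 2
chem = [
    [[[1000 * i + 100 * j + 10 * k + l for l in range(n)] for k in range(n)]
     for j in range(n)]
    for i in range(n)
]
phys = chemist_to_physicist(chem)
print(phys[0][1][0][1], chem[0][0][1][1])
```

With a NumPy array, the same reordering is a single transpose, `chem.transpose(0, 2, 1, 3)`.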
Filename patterns: *FCIDUMP*
iodata.formats.fcidump.load_one()¶
Always loads: core_energy, one_ints, nelec, spinpol, two_ints
iodata.formats.fcidump.dump_one()¶
Requires: one_ints, two_ints
May dump: core_energy, nelec, spinpol
The dictionary one_ints must contain a field core_mo. Similarly, two_ints must contain two_mo.
GAMESS punch file format (gamess)¶
Filename patterns: *.dat
iodata.formats.gamess.load_one()¶
Always loads: title, energy, grot, atgradient, athessian, atmasses, atnums, atcoords
Gaussian input format (gaussianinput)¶
Filename patterns: *.com, *.gjf
iodata.formats.gaussianinput.load_one()¶
Always loads: atcoords, atnums, title
Gaussian Log file format (gaussianlog)¶
To write out the integrals in a Gaussian log file, which can be loaded with this module, you need to use the following Gaussian command line:
scf(conventional) iop(3/33=5) extralinks=l316 iop(3/27=999)
Filename patterns: *.log
iodata.formats.gaussianlog.load_one()¶
Always loads: (none)
May load: one_ints, two_ints
GROMACS gro file format (gromacs)¶
Files with the gro file extension contain a molecular structure in Gromos87 format. GROMACS gro files can be used as a trajectory by simply concatenating files.
http://manual.gromacs.org/current/reference-manual/file-formats.html#gro
Filename patterns: *.gro
iodata.formats.gromacs.load_one()¶
Always loads: atcoords, atffparams, cellvecs, extra, title
iodata.formats.gromacs.load_many()¶
Always loads: atcoords, atffparams, cellvecs, extra, title
QCSchema JSON file format (json)¶
QCSchema defines four different subschema:
Molecule: specifying a molecular system
Input: specifying QC program input for a specific Molecule
Output: specifying QC program output for a specific Molecule
Basis: specifying a basis set for a specific Molecule
General Usage¶
The QCSchema format is intended to be a catch-all file format for storing and sharing QC calculation data. Because a single file can contain a wide variety of data, not every field in a QCSchema file directly corresponds to an IOData attribute. For example, qcschema_output files allow for many fields capturing different energy contributions, especially for coupled-cluster calculations. To accommodate this fact, IOData does not always assume the intent of the user; instead, IOData ensures that every field in the file is stored in a structured manner. When a QCSchema field does not correspond to an IOData attribute, that data is instead stored in the extra dict, in a dictionary corresponding to the subschema where that data was found. In cases where multiple subschema contain the relevant field (e.g. the Output subschema contains the entirety of the Input subschema), the data will be found in the smallest subschema (for the example above, in IOData.extra["input"], not IOData.extra["output"]).
Dumping an IOData instance to a QCSchema file involves adding relevant required (and optional, if needed) fields to the necessary dictionaries in the extra dict. One exception is the provenance field: if the only desired provenance data is the creation of the file by IOData, that data will be added automatically.
The following sections will describe the requirements of each subschema and the behaviour to expect from IOData when loading in or dumping out a QCSchema file.
Schema Definitions¶
Provenance Information¶
The provenance field contains information about how the associated QCSchema object and its attributes were generated, provided, and manipulated. A provenance entry expects these fields:
| Field | Description |
|---|---|
| creator | Required. The program that generated, provided, or manipulated this file. |
| version | The version of the creator. |
| routine | The routine of the creator. |
In QCElemental, only a single provenance entry is permitted. When generating a QCSchema file for use with QCElemental, the easiest way to ensure compliance is to leave the provenance field blank, to allow the dump_one function to generate the correct provenance information. However, allowing only one entry for provenance information limits the ability to properly trace a file through several operations during complex workflows. With this in mind, IOData supports an enhanced provenance field, in the form of a list of provenance entries, with new entries appended to the end of the list.
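The list form of the provenance field can be sketched with plain dictionaries. The helper below is hypothetical (`append_provenance` is not part of IOData) and only illustrates the idea of promoting a single entry to a list and appending a new entry at the end; the creator and routine values are made up.

```python
def append_provenance(provenance, creator, version=None, routine=None):
    """Return a list of provenance entries with a new entry at the end."""
    entry = {"creator": creator}  # `creator` is the only required field
    if version is not None:
        entry["version"] = version
    if routine is not None:
        entry["routine"] = routine
    # A single-entry provenance (a dict) is promoted to a list first.
    entries = [provenance] if isinstance(provenance, dict) else list(provenance)
    entries.append(entry)
    return entries


original = {"creator": "HORTON3", "routine": "Manual validation"}
updated = append_provenance(original, "my_workflow", routine="my_workflow.step2")
print(updated)
```

Because the original entry is kept in first position, the history of the file can be read from top to bottom.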
Molecule Schema¶
The qcschema_molecule subschema describes a molecular system, and contains the data necessary to specify a molecular system and support I/O and manipulation processes.
The following is an example of a minimal qcschema_molecule file:
{
  "schema_name": "qcschema_molecule",
  "schema_version": 2,
  "symbols": ["Li", "Cl"],
  "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
  "molecular_charge": 0,
  "molecular_multiplicity": 1,
  "provenance": {
    "creator": "HORTON3",
    "routine": "Manual validation"
  }
}
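Since a QCSchema file is ordinary JSON, a minimal molecule like the one above can be inspected with the standard library alone. The sketch below is independent of IOData: it parses the example and regroups the flat geometry list, which holds three numbers per atom, into one (x, y, z) triple per symbol. The variable names are chosen for this example only.

```python
import json

text = """
{
  "schema_name": "qcschema_molecule",
  "schema_version": 2,
  "symbols": ["Li", "Cl"],
  "geometry": [0.0, 0.0, -1.631761, 0.0, 0.0, 0.287958],
  "molecular_charge": 0,
  "molecular_multiplicity": 1,
  "provenance": {"creator": "HORTON3", "routine": "Manual validation"}
}
"""
molecule = json.loads(text)
natom = len(molecule["symbols"])
flat = molecule["geometry"]
if len(flat) != 3 * natom:
    raise ValueError("geometry must contain 3 * N_at numbers")

# Group the flat list into one (x, y, z) triple per atom.
coords = [tuple(flat[3 * i : 3 * i + 3]) for i in range(natom)]
for symbol, xyz in zip(molecule["symbols"], coords):
    print(symbol, xyz)
```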
The required fields and corresponding types for a qcschema_molecule file are:
| Field | Type | IOData attr. | Description |
|---|---|---|---|
| schema_name | str | N/A | The name of the QCSchema subschema. Fixed as |
| schema_version | str | N/A | The version of the subschema specification. 2.0 is the current version. |
| symbols | list(N_at) | | An array of the atomic symbols for the system. |
| geometry | list(3*N_at) | | An ordered array of XYZ atomic coordinates, corresponding to the order of |
| molecular_charge | float | | The net electrostatic charge of the molecule. Some writers assume a default of 0. |
| molecular_multiplicity | int | | The total multiplicity of this molecule. Some writers assume a default of 1. |
| provenance | dict or list | N/A | Information about how the file was generated, provided, and manipulated. See Provenance section above for more details. |
Note: N_at corresponds to the number of atoms in the molecule, as defined by the length of symbols.
The optional fields and corresponding types for a qcschema_molecule file are:
| Field | Type | IOData attr. | Description |
|---|---|---|---|
| atom_labels | list(N_at) | N/A | Additional per-atom labels. Typically used for model conversions, not user assignment. The indices of this array correspond to the |
| atomic_numbers | list(N_at) | | An array of atomic numbers for each atom. Typically inferred from |
| comment | str | N/A | Additional comments for this molecule. These comments are intended for user information, not any computational tasks. |
| connectivity | list | | The connectivity information between each atom in the |
| extras | dict | N/A | Extra information to associate with this molecule. |
| fix_symmetry | str | | Maximal point group symmetry with which the molecule should be treated. |
| fragments | list(N_fr) | N/A | An array that designates which sets of atoms are fragments within the molecule. This is a nested array, with the indices of the base array corresponding to the values in |
| fragment_charges | list(N_fr) | N/A | The total charge of each fragment in |
| fragment_multiplicities | list(N_fr) | N/A | The multiplicity of each fragment in |
| id | str | N/A | A unique identifier for this molecule. |
| identifiers | dict | N/A | Additional identifiers by which this molecule can be referenced, such as INCHI, SMILES, etc. |
| real | list(N_at) | | An array indicating whether each atom is real (true) or a ghost/virtual atom (false). The indices of this array correspond to the |
| mass_numbers | list(N_at) | | An array of atomic mass numbers for each atom. The indices of this array correspond to the |
| masses | list(N_at) | | An array of atomic masses [u] for each atom. Typically inferred from |
| name | str | | An arbitrary, common, or human-readable name to assign to this molecule. |
Note: N_at corresponds to the number of atoms in the molecule, as defined by the length of symbols; N_fr corresponds to the number of fragments in the molecule, as defined by the length of fragments. Fragment data is stored in a sub-dictionary, fragments.
The following are additional optional keywords used in QCElemental’s QCSchema implementation. These keywords mostly correspond to specific QCElemental functionality, and may not necessarily produce similar results in other QCSchema parsers.
| Field | Type | Description |
|---|---|---|
| fix_com | bool | An indicator to prevent pre-processing the molecule by translating the COM to (0,0,0) in Euclidean coordinate space. |
| fix_orientation | bool | An indicator to prevent pre-processing the molecule by orienting via the inertia tensor. |
| validated | bool | An indicator that the input molecule data has been previously checked for schema and physics (e.g. non-overlapping atoms, feasible multiplicity) compliance. Generally should only be true when set by a trusted validator. |
Input Schema¶
The qcschema_input subschema describes all data necessary to generate and parse a QC program input file for a given molecule.
The following is an example of a minimal qcschema_input file:
{
  "schema_name": "qcschema_input",
  "schema_version": 2.0,
  "molecule": {
    "schema_name": "qcschema_molecule",
    "schema_version": 2.0,
    "symbols": ["Li", "Cl"],
    "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
    "molecular_charge": 0.0,
    "molecular_multiplicity": 1,
    "provenance": {
      "creator": "HORTON3",
      "routine": "Manual validation"
    }
  },
  "driver": "energy",
  "model": {
    "method": "B3LYP",
    "basis": "Def2TZVP"
  }
}
The required fields and corresponding types for a qcschema_input file are:
| Field | Type | IOData attr. | Description |
|---|---|---|---|
| schema_name | str | N/A | The QCSchema specification to which this model conforms. Fixed as |
| schema_version | float | N/A | The version number of |
| molecule | dict | N/A | QCSchema Molecule instance. |
| driver | str | N/A | The type of calculation being performed. One of |
| model | dict | N/A | The quantum chemistry model specification for a given operation to compute against. See Model section below. |
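Before handing a dictionary to a QCSchema-aware tool, it can be useful to check that the required fields listed above are present. The sketch below is a plain-Python illustration of such a check; `REQUIRED_INPUT_FIELDS` and `missing_input_fields` are hypothetical names, and the check is far less thorough than a real schema validator.

```python
# Required top-level fields of a qcschema_input file, as listed above.
REQUIRED_INPUT_FIELDS = ("schema_name", "schema_version", "molecule", "driver", "model")


def missing_input_fields(data):
    """Return the required qcschema_input fields that are absent from data."""
    return [field for field in REQUIRED_INPUT_FIELDS if field not in data]


# An input that lacks the model specification.
incomplete = {
    "schema_name": "qcschema_input",
    "schema_version": 2.0,
    "molecule": {"symbols": ["Li", "Cl"]},
    "driver": "energy",
}
print(missing_input_fields(incomplete))
```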
The optional fields and corresponding types for a qcschema_input file are:
| Field | Type | IOData attr. | Description |
|---|---|---|---|
| extras | dict | N/A | Extra information associated with the input. |
| id | str | N/A | An identifier for the input object. |
| keywords | dict | N/A | QC program-specific keywords to be used for a computation. See details below for IOData-specific usages. |
| protocols | dict | N/A | Protocols regarding the manipulation of the output that results from this input. See Protocols section below. |
| provenance | dict or list | N/A | Information about how the file was generated, provided, and manipulated. See Provenance section above for more information. |
IOData currently supports the following keywords for qcschema_input files:
| Keyword | Type | IOData attr. | Description |
|---|---|---|---|
| run_type | str | | The type of calculation that led to the results stored in IOData, which must be one of the following: |
Model Subschema¶
The model dict contains the following fields:
Field | Type | IOData attr. | Description |
---|---|---|---|
method | str | | The level of theory used for the computation (e.g. B3LYP, PBE, CCSD(T), etc.) |
basis | str or dict | N/A | The quantum chemistry basis set to evaluate (e.g. 6-31G, cc-pVDZ, etc.) Can be ‘none’ for methods without basis sets. Must be either a string specifying the basis set name (the same as its name in the Basis Set Exchange, when possible) or a qcschema_basis instance. |
Protocols Subschema¶
The protocols dict contains the following fields:
Field | Type | IOData attr. | Description |
---|---|---|---|
wavefunction | str | N/A | Specification of the wavefunction properties to keep from the resulting output. One of |
keep_stdout | bool | N/A | An indicator to keep the output file from the resulting output. |
Output Schema¶
The qcschema_output subschema describes all data necessary to generate and parse a QC program’s output file for a given molecule.
The following is an example of a minimal qcschema_output file:
{
"schema_name": "qcschema_output",
"schema_version": 2.0,
"molecule": {
"schema_name": "qcschema_molecule",
"schema_version": 2.0,
"symbols": ["Li", "Cl"],
"geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
"molecular_charge": 0.0,
"molecular_multiplicity": 1,
"provenance": {
"creator": "HORTON3",
"routine": "Manual validation"
}
},
"driver": "energy",
"model": {
"method": "HF",
"basis": "STO-4G"
},
"properties": {},
"return_result": -464.626219879,
"success": true
}
The required fields and corresponding types for a qcschema_output file are:
Field | Type | IOData attr. | Description |
---|---|---|---|
schema_name | str | N/A | The QCSchema specification to which this model conforms. Fixed as |
schema_version | float | N/A | The version number of |
molecule | dict | N/A | QCSchema Molecule instance. |
driver | str | N/A | The type of calculation being performed. One of |
model | dict | N/A | The quantum chemistry model specification for a given operation to compute against. |
properties | dict | N/A | Named properties of quantum chemistry computations. See Properties section below. |
return_result | varies | N/A | The result requested by the |
success | bool | N/A | An indicator for the success of the QC program’s execution. |
The optional fields and corresponding types for a qcschema_output file are:
Field | Type | IOData attr. | Description |
---|---|---|---|
error | dict | N/A | A complete description of an error-terminated computation. See Error section below. |
extras | dict | N/A | Extra information associated with the input. Also specified for qcschema_input. |
id | str | N/A | An identifier for the input object. Also specified for qcschema_input. |
keywords | dict | N/A | QC program-specific keywords to be used for a computation. See details below for IOData-specific usages. Also specified for qcschema_input. |
protocols | dict | N/A | Protocols regarding the manipulation of the output that results from this input. See Protocols section above. Also specified for qcschema_input. |
provenance | dict or list | N/A | Information about how the file was generated, provided, and manipulated. See Provenance section above for more information. Also specified for qcschema_input. |
stderr | str | N/A | The standard error (stderr) of the associated computation. |
stdout | str | N/A | The standard output (stdout) of the associated computation. |
wavefunction | dict | N/A | The wavefunction properties of a QC computation. All matrices appear in column-major order. See Wavefunction section below. |
Properties Subschema¶
The properties dict contains named properties of quantum chemistry computations. Due to the variability possible for the contents of an output file, IOData does not guess at which properties are desired by the user, and stores all properties in the extra["output"]["properties"] dict for easy retrieval. The current QCSchema standard provides names for the following properties:
Field | Description |
---|---|
calcinfo_nbasis | The number of basis functions for the computation. |
calcinfo_nmo | The number of molecular orbitals for the computation. |
calcinfo_nalpha | The number of alpha electrons in the computation. |
calcinfo_nbeta | The number of beta electrons in the computation. |
calcinfo_natom | The number of atoms in the computation. |
nuclear_repulsion_energy | The nuclear repulsion energy term. |
return_energy | The energy of the requested method, identical to |
scf_one_electron_energy | The one-electron (core Hamiltonian) energy contribution to the total SCF energy. |
scf_two_electron_energy | The two-electron energy contribution to the total SCF energy. |
scf_vv10_energy | The VV10 functional energy contribution to the total SCF energy. |
scf_xc_energy | The functional (XC) energy contribution to the total SCF energy. |
scf_dispersion_correction_energy | The dispersion correction appended to an underlying functional when a DFT-D method is requested. |
scf_dipole_moment | The X, Y, and Z dipole components. |
scf_total_energy | The total electronic energy of the SCF stage of the calculation. |
scf_iterations | The number of SCF iterations taken before convergence. |
mp2_same_spin_correlation_energy | The portion of MP2 doubles correlation energy from same-spin (i.e. triplet) correlations. |
mp2_opposite_spin_correlation_energy | The portion of MP2 doubles correlation energy from opposite-spin (i.e. singlet) correlations. |
mp2_singles_energy | The singles portion of the MP2 correlation energy. Zero except in ROHF. |
mp2_doubles_energy | |
mp2_total_correlation_energy | The MP2 correlation energy. |
mp2_correlation_energy | The MP2 correlation energy. |
mp2_total_energy | The total MP2 energy (MP2 correlation energy + HF energy). |
mp2_dipole_moment | The MP2 X, Y, and Z dipole components. |
ccsd_same_spin_correlation_energy | The portion of CCSD doubles correlation energy from same-spin (i.e. triplet) correlations. |
ccsd_opposite_spin_correlation_energy | The portion of CCSD doubles correlation energy from opposite-spin (i.e. singlet) correlations. |
ccsd_singles_energy | The singles portion of the CCSD correlation energy. Zero except in ROHF. |
ccsd_doubles_energy | The doubles portion of the CCSD correlation energy including same-spin and opposite-spin correlations. |
ccsd_correlation_energy | The CCSD correlation energy. |
ccsd_total_energy | The total CCSD energy (CCSD correlation energy + HF energy). |
ccsd_dipole_moment | The CCSD X, Y, and Z dipole components. |
ccsd_iterations | The number of CCSD iterations taken before convergence. |
ccsd_prt_pr_correlation_energy | The CCSD(T) correlation energy. |
ccsd_prt_pr_total_energy | The total CCSD(T) energy (CCSD(T) correlation energy + HF energy). |
ccsd_prt_pr_dipole_moment | The CCSD(T) X, Y, and Z dipole components. |
ccsd_prt_pr_iterations | The number of CCSD(T) iterations taken before convergence. |
ccsdt_correlation_energy | The CCSDT correlation energy. |
ccsdt_total_energy | The total CCSDT energy (CCSDT correlation energy + HF energy). |
ccsdt_dipole_moment | The CCSDT X, Y, and Z dipole components. |
ccsdt_iterations | The number of CCSDT iterations taken before convergence. |
ccsdtq_correlation_energy | The CCSDTQ correlation energy. |
ccsdtq_total_energy | The total CCSDTQ energy (CCSDTQ correlation energy + HF energy). |
ccsdtq_dipole_moment | The CCSDTQ X, Y, and Z dipole components. |
ccsdtq_iterations | The number of CCSDTQ iterations taken before convergence. |
Error Subschema¶
The error dict contains the following fields:
Field | Type | IOData attr. | Description |
---|---|---|---|
error_type | str | N/A | The type of error raised during the computation. |
error_message | str | N/A | Additional information related to the error, such as the backtrace. |
extras | dict | N/A | Additional data associated with the error. |
Wavefunction Subschema¶
The wavefunction subschema contains the wavefunction properties of a QC computation. All matrices appear in column-major order. The current QCSchema standard provides names for the following wavefunction properties:
Field | Description |
---|---|
basis | A |
restricted | An indicator for a restricted calculation (alpha == beta). When true, all beta quantities are omitted, since quantity_b == quantity_a. |
h_core_a | Alpha-spin core (one-electron) Hamiltonian. |
h_core_b | Beta-spin core (one-electron) Hamiltonian. |
h_effective_a | Alpha-spin effective core (one-electron) Hamiltonian. |
h_effective_b | Beta-spin effective core (one-electron) Hamiltonian. |
scf_orbitals_a | Alpha-spin SCF orbitals. |
scf_orbitals_b | Beta-spin SCF orbitals. |
scf_density_a | Alpha-spin SCF density matrix. |
scf_density_b | Beta-spin SCF density matrix. |
scf_fock_a | Alpha-spin SCF Fock matrix. |
scf_fock_b | Beta-spin SCF Fock matrix. |
scf_eigenvalues_a | Alpha-spin SCF eigenvalues. |
scf_eigenvalues_b | Beta-spin SCF eigenvalues. |
scf_occupations_a | Alpha-spin SCF orbital occupations. |
scf_occupations_b | Beta-spin SCF orbital occupations. |
orbitals_a | Keyword for the primary return alpha-spin orbitals. |
orbitals_b | Keyword for the primary return beta-spin orbitals. |
density_a | Keyword for the primary return alpha-spin density. |
density_b | Keyword for the primary return beta-spin density. |
fock_a | Keyword for the primary return alpha-spin Fock matrix. |
fock_b | Keyword for the primary return beta-spin Fock matrix. |
eigenvalues_a | Keyword for the primary return alpha-spin eigenvalues. |
eigenvalues_b | Keyword for the primary return beta-spin eigenvalues. |
occupations_a | Keyword for the primary return alpha-spin orbital occupations. |
occupations_b | Keyword for the primary return beta-spin orbital occupations. |
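Because all wavefunction matrices are stored in column-major order, a reader has to undo that layout when rebuilding a two-dimensional array. The sketch below assumes that a matrix is serialized as a flat list and that its row and column counts are known; that storage layout is an assumption made for illustration. The helper unflatten_column_major is hypothetical and not an IOData function.

```python
def unflatten_column_major(flat: list, nrow: int, ncol: int) -> list:
    """Rebuild a nested list (rows of columns) from a flat column-major list.

    In column-major order, element (row, col) sits at index col * nrow + row.
    """
    if len(flat) != nrow * ncol:
        raise ValueError("The flat list does not match the requested shape.")
    return [[flat[col * nrow + row] for col in range(ncol)] for row in range(nrow)]


# A 2 x 3 matrix whose columns are [1, 2], [3, 4] and [5, 6].
matrix = unflatten_column_major([1, 2, 3, 4, 5, 6], nrow=2, ncol=3)
print(matrix)
```

With NumPy, the same result would typically be obtained by reshaping with order="F".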
Filename patterns: *.json
iodata.formats.json.load_one()¶
Always loads: atnums, atcorenums, atcoords, charge, nelec, spinpol
May load: atmasses, bonds, energy, g_rot, lot, obasis, obasis_name, title, extra
iodata.formats.json.dump_one()¶
Requires: atnums, atcoords, charge, spinpol
May dump: title, atcorenums, atmasses, bonds, g_rot, extra
VASP 5 LOCPOT file format (locpot)¶
This format is used by VASP 5.X and VESTA.
Note that even though the CHGCAR and LOCPOT files look very similar, they require different conversions to atomic units.
Filename patterns: LOCPOT*
iodata.formats.locpot.load_one()¶
Always loads: atcoords, atnums, cellvecs, cube, title
MOL2 file format (mol2)¶
There are different formats of mol2 files. Here, compatibility with the AMBER software was the main objective, so that files can be written with the atomic charges used by antechamber.
Filename patterns: *.mol2
iodata.formats.mol2.load_one()¶
Always loads: atcoords, atnums, atcharges, atffparams
May load: title
iodata.formats.mol2.dump_one()¶
Requires: atcoords, atnums
May dump: atcharges, atffparams, title
iodata.formats.mol2.load_many()¶
Always loads: atcoords, atnums, atcharges, atffparams
May load: title
iodata.formats.mol2.dump_many()¶
Requires: atcoords, atnums, atcharges
May dump: title
Molden file format (molden)¶
Many QC codes can write out Molden files, e.g. Molpro, Orca, PSI4, Molden, Turbomole. Keep in mind that several of these write incorrect versions of the file format, but these errors are corrected when loading them with IOData.
Filename patterns: *.molden.input, *.molden
iodata.formats.molden.load_one()¶
Always loads: atcoords, atnums, atcorenums, mo, obasis
May load: title
Keyword arguments: norm_threshold
iodata.formats.molden.dump_one()¶
Requires: atcoords, atnums, mo, obasis
May dump: atcorenums, title
Molekel file format (molekel)¶
This format is used by two programs: Molekel and Orca.
Filename patterns: *.mkl
iodata.formats.molekel.load_one()¶
Always loads: atcoords, atnums, mo, obasis
May load: atcharges
Keyword arguments: norm_threshold
iodata.formats.molekel.dump_one()¶
Requires: atcoords, atnums, mo, obasis
May dump: atcharges
Multiwfn MWFN file format (mwfn)¶
Filename patterns: *.mwfn
iodata.formats.mwfn.load_one()¶
Always loads: atcoords, atnums, atcorenums, energy, mo, obasis, extra, title
Orca output file format (orcalog)¶
Filename patterns: *.out
iodata.formats.orcalog.load_one()¶
Always loads: atcoords, atnums, energy, moments, extra
PDB file format (pdb)¶
There are different formats of pdb files. The convention used here is the most recently updated one, which is described at: http://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html
Filename patterns: *.pdb
iodata.formats.pdb.load_one()¶
Always loads: atcoords, atnums, atffparams, extra
May load: title, bonds
iodata.formats.pdb.dump_one()¶
Requires: atcoords, atnums, extra
May dump: atffparams, title, bonds
iodata.formats.pdb.load_many()¶
Always loads: atcoords, atnums, atffparams, extra
May load: title
iodata.formats.pdb.dump_many()¶
Requires: atcoords, atnums, extra
May dump: atffparams, title
VASP 5 POSCAR file format (poscar)¶
This format is used by VASP 5.X and VESTA.
Filename patterns: POSCAR*
iodata.formats.poscar.load_one()¶
Always loads: atcoords, atnums, cellvecs, title
iodata.formats.poscar.dump_one()¶
Requires: atcoords, atnums, cellvecs
May dump: title
Q-Chem Log file format (qchemlog)¶
This module loads a Q-Chem log file into IOData.
Filename patterns: *.qchemlog
iodata.formats.qchemlog.load_one()¶
Always loads: atcoords, atmasses, atnums, energy, g_rot, mo, lot, obasis_name, run_type, extra
May load: athessian
SDF file format (sdf)¶
Usually, the different frames in a trajectory describe different geometries of the same molecule, with atoms in the same order. The load_many and dump_many functions below can also handle an SDF file with different molecules, e.g. a molecular database.
The SDF format is somewhat documented on the following page: http://www.nonlinear.com/progenesis/sdf-studio/v0.9/faq/sdf-file-format-guidance.aspx
This format is one of the chemical table file formats: https://en.wikipedia.org/wiki/Chemical_table_file
Filename patterns: *.sdf
iodata.formats.sdf.load_one()¶
Always loads: atcoords, atnums, bonds, title
iodata.formats.sdf.dump_one()¶
Requires: atcoords, atnums
May dump: title, bonds
iodata.formats.sdf.load_many()¶
Always loads: atcoords, atnums, bonds, title
iodata.formats.sdf.dump_many()¶
Requires: atcoords, atnums
May dump: title, bonds
Gaussian/GAMESS-US WFN file format (wfn)¶
Only use this format if the program that generated it does not offer any alternatives that HORTON can load. The WFN format has the disadvantage that it cannot represent contractions and therefore expands all orbitals into a decontracted basis. This makes the post-processing less efficient compared to formats that do support contractions of Gaussian functions.
Filename patterns: *.wfn
iodata.formats.wfn.load_one()¶
Always loads: atcoords, atnums, energy, mo, obasis, title, extra
iodata.formats.wfn.dump_one()¶
Requires: atcoords, atnums, energy, mo, obasis, title, extra
AIM/AIMAll WFX file format (wfx)¶
See http://aim.tkgristmill.com/wfxformat.html
Filename patterns: *.wfx
iodata.formats.wfx.load_one()¶
Always loads: atcoords, atgradient, atnums, energy, extra, mo, obasis, title
iodata.formats.wfx.dump_one()¶
Requires: atcoords, atnums, atcorenums, mo, obasis, charge
May dump: title, energy, spinpol, lot, atgradient, extra
XYZ file format (xyz)¶
Usually, the different frames in a trajectory describe different geometries of the same molecule, with atoms in the same order. The load_many and dump_many functions below can also handle an XYZ file with different molecules, e.g. a molecular database.
The load_* and dump_* functions all accept the optional argument atom_columns. This argument fixes the meaning of the columns to be loaded from or dumped to an XYZ file. The following example defines, in addition to the conventional columns, a column with atomic charges and three columns with atomic forces.
import iodata.formats.xyz
from iodata import load_one

atom_columns = iodata.formats.xyz.DEFAULT_ATOM_COLUMNS + [
    # Atomic charges are stored in a dictionary atcharges and the key
    # refers to the name of the partitioning method.
    ("atcharges", "mulliken", (), float, float, "{:10.5f}".format),
    # Note that in IOData, the energy gradient is stored, which contains the
    # negative forces.
    ("atgradient", None, (3,), float,
     (lambda word: -float(word)),
     (lambda value: "{:15.10f}".format(-value)))
]
mol = load_one("test.xyz", atom_columns=atom_columns)
# The following attributes are present:
print(mol.atnums)
print(mol.atcoords)
print(mol.atcharges["mulliken"])
print(mol.atgradient)
When defining atom_columns, no columns can be skipped, such that all information loaded from a file can also be written back out when dumping it.
Filename patterns: *.xyz
iodata.formats.xyz.load_one()¶
Always loads: atcoords, atnums, title
Keyword arguments: atom_columns
iodata.formats.xyz.dump_one()¶
Requires: atcoords, atnums
May dump: title
Keyword arguments: atom_columns
iodata.formats.xyz.load_many()¶
Always loads: atcoords, atnums, title
Keyword arguments: atom_columns
iodata.formats.xyz.dump_many()¶
Requires: atcoords, atnums
May dump: title
Keyword arguments: atom_columns
Supported Input Formats¶
Gaussian Input Module (gaussian)¶
iodata.formats.gaussian.write_input()¶
Requires: atnums, atcoords
May use: title, run_type, lot, obasis_name, spinmult, charge
Default Template¶
'''\
#n {lot}/{obasis_name} {run_type}
{title}
{charge} {spinmult}
{geometry}
'''
Orca Input Module (orca)¶
iodata.formats.orca.write_input()¶
Requires: atnums, atcoords
May use: title, run_type, lot, obasis_name, spinmult, charge
Default Template¶
'''\
! {lot} {obasis_name} {run_type}
# {title}
*xyz {charge} {spinmult}
{geometry}
*
'''
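The placeholders in the default template appear to follow Python's str.format syntax. As a minimal sketch of how such a template is filled in, the example below copies the Orca template from the listing above and formats it with illustrative values. The field values and the layout of the geometry block are assumptions made for this example; how IOData derives them from an IOData object is not shown here.

```python
# Orca default template, copied from the listing above.
TEMPLATE = """\
! {lot} {obasis_name} {run_type}
# {title}
*xyz {charge} {spinmult}
{geometry}
*
"""

# Illustrative values only (assumptions for this sketch).
fields = {
    "lot": "HF",
    "obasis_name": "STO-3G",
    "run_type": "opt",
    "title": "Hydrogen molecule",
    "charge": 0,
    "spinmult": 1,
    # The geometry layout (symbol followed by x, y, z) is an assumption.
    "geometry": "H 0.000000 0.000000 0.000000\nH 0.000000 0.000000 0.740000",
}

# Every placeholder in the template must have a matching key in ``fields``.
rendered = TEMPLATE.format(**fields)
print(rendered)
```

A user-provided template would work the same way: any text is allowed, as long as the placeholders it uses are among the available fields.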
Basis set conventions¶
IOData can load molecular orbital coefficients, density matrices and atomic orbital basis sets from various file formats, and it can also write orbitals and the basis sets in the Molden format. To achieve an unambiguous numerical representation of these objects, conventions for the ordering of basis functions (within one shell) and the normalization of Gaussian primitives must be fixed.
IOData does not use hard-coded conventions but keeps track of them in attributes of IOData.obasis. This attribute is an instance of the iodata.basis.MolecularBasis class, of which the conventions and primitive_normalization attributes contain all the relevant information. For the time being, primitive_normalization is always set to 'L2', meaning that the contraction coefficients assume L2-normalized Gaussian primitives. However, IOData does not enforce normalized contractions.
The first subsection provides a mathematical definition of the Gaussian basis functions, which is followed by the specification of the conventions attribute of the MolecularBasis class.
Gaussian basis functions¶
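IOData supports contracted Gaussian basis functions, which have in general the following form:
\[b(\mathbf{r}; D_1, \ldots, D_K, P, \alpha_1, \ldots, \alpha_K, \mathbf{r}_A) = \sum_{k=1}^K D_k N(\alpha_k, P) P(\mathbf{r} - \mathbf{r}_A) \exp(-\alpha_k \Vert \mathbf{r} - \mathbf{r}_A \Vert^2)\]
where \(K\) is the contraction length, \(D_k\) is a contraction coefficient, \(N\) is a normalization constant, \(P\) is a Cartesian polynomial, \(\alpha_k\) is an exponent and \(\mathbf{r}_A\) is the center of the basis function. The summation over \(k\) is conventionally called a contraction of primitive Gaussian basis functions. The L2-normalization of each primitive depends on both the polynomial and the exponent and is defined by the following relation:
\[\int \Bigl\vert N(\alpha_k, P) P(\mathbf{r} - \mathbf{r}_A) \exp(-\alpha_k \Vert \mathbf{r} - \mathbf{r}_A \Vert^2) \Bigr\vert^2 d\mathbf{r} = 1\]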
Two types of polynomials will be defined below: Cartesian and pure (harmonic) basis functions.
Cartesian basis functions¶
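When the polynomial consists of a single term as follows:
\[P(x, y, z) = x^{n_x} y^{n_y} z^{n_z}\]
with \(n_x\), \(n_y\), \(n_z\) zero or positive integer powers, one speaks of Cartesian Gaussian basis functions. One refers to the sum of the powers as the angular momentum of the Cartesian Gaussian basis.
The normalization constant of a primitive function is:
\[N(\alpha_k, n_x, n_y, n_z) = \sqrt{\left(\frac{2\alpha_k}{\pi}\right)^{3/2} \frac{(4\alpha_k)^{n_x + n_y + n_z}}{(2n_x - 1)!! (2n_y - 1)!! (2n_z - 1)!!}}\]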
In practice one combines all basis functions of a given angular momentum (or algebraic order) into one shell. A basis specification typically only mentions the total angular momentum, and it is assumed that all polynomials of that order are included in the basis set. The number of basis functions, i.e. the number of polynomials, for a given angular momentum, \(\ell=n_x+n_y+n_z\), is \((\ell+1)(\ell+2)/2\).
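The normalization and the shell size of Cartesian functions can be checked with a few lines of Python. This is a minimal sketch, assuming the standard closed form for the L2 normalization constant of a Cartesian Gaussian primitive; the function names double_factorial, cartesian_norm and cartesian_powers are hypothetical and not part of IOData.

```python
import math


def double_factorial(n: int) -> int:
    """Return n!! for odd n >= -1, using the convention (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def cartesian_norm(alpha: float, nx: int, ny: int, nz: int) -> float:
    """L2 normalization constant of x^nx y^ny z^nz exp(-alpha r^2).

    Assumes the standard closed form for Cartesian Gaussian primitives.
    """
    ell = nx + ny + nz
    denominator = (
        double_factorial(2 * nx - 1)
        * double_factorial(2 * ny - 1)
        * double_factorial(2 * nz - 1)
    )
    return math.sqrt((2 * alpha / math.pi) ** 1.5 * (4 * alpha) ** ell / denominator)


def cartesian_powers(ell: int) -> list:
    """List all (nx, ny, nz) with nx + ny + nz == ell."""
    return [
        (nx, ny, ell - nx - ny)
        for nx in range(ell, -1, -1)
        for ny in range(ell - nx, -1, -1)
    ]


# The number of polynomials in a shell matches (ell + 1)(ell + 2) / 2.
for ell in range(5):
    assert len(cartesian_powers(ell)) == (ell + 1) * (ell + 2) // 2

print(cartesian_powers(2))
print(cartesian_norm(1.0, 0, 0, 0))
```

The order in which cartesian_powers lists the polynomials is an arbitrary choice for this sketch and does not correspond to any particular file format.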
Pure or harmonic basis functions¶
When the polynomial is a real regular solid harmonic, one speaks of pure Gaussian basis functions:
where \(C_{\ell m}\) and \(S_{\ell m}\) are cosine- and sine-like real regular solid harmonics, defined for \(\ell \ge 0\) as follows:
where \(R_\ell^m\) are the regular solid harmonics, which have in general complex function values. The factor \((-1)^m\) undoes the Condon-Shortley phase. In these equations, spherical coordinates are used:
The regular solid harmonics are derived from the standard spherical harmonics, \(Y_\ell^m\), as follows:
where \(P_\ell^m\) are the associated Legendre functions. After substituting this definition of the regular solid harmonics into the real forms, one obtains:
Also here, the factor \((-1)^m\) cancels out the Condon-Shortley phase. These expressions show that cosine-like functions contain a factor \(\cos(m \phi)\), and similarly the sine-like functions contain a factor \(\sin(m \phi)\). The factor \(r^\ell\) causes real regular solid harmonics to be homogeneous Cartesian polynomials, i.e. linear combinations of the Cartesian polynomials defined in the previous subsection.
Real regular solid harmonics are used because the pure s- and p-type functions are consistent with their Cartesian counterparts:
The normalization constant of a pure Gaussian basis function is:
In practical applications, all the basis functions of a given angular momentum are used and grouped into a shell. A basis specification typically only mentions the total angular momentum, and it is assumed that all polynomials of that order are included in the basis set. The number of basis functions, i.e. the number of polynomials, for a given angular momentum, \(\ell\), is \(2\ell+1\).
The conventions
attribute¶
Different file formats supported by IOData have an incompatible ordering of basis functions within one shell. The sign conventions may also differ from the definitions given above. The conventions attribute of iodata.basis.MolecularBasis specifies the ordering and sign flips relative to the above definitions. It is a dictionary,
whose keys are tuples denoting a shell type (angmom, char), where angmom is a non-negative integer denoting the angular momentum and char is either 'c' or 'p' for Cartesian or pure, respectively, and
whose values are lists of basis function strings, where each string denotes one basis function.
A basis function string has a one-to-one correspondence to the Cartesian or pure polynomials defined above.
In case of Cartesian functions, \(x^{n_x} y^{n_y} z^{n_z}\) is represented by the string 'x' * nx + 'y' * ny + 'z' * nz, except for the s-type function, which is represented by '1'.
In case of pure functions, \(C_{\ell m}\) is represented by 'c{}'.format(m) and \(S_{\ell m}\) by 's{}'.format(m). The angular momentum quantum number is not included because it is implied by the key in the conventions dictionary.
Each basis function string can be prefixed with a minus sign, to denote a sign flip with respect to the definitions on this page. The order of the strings in the list defines the order of the corresponding basis functions within one shell.
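The string representation described above can be generated programmatically. The following sketch builds a conventions-like dictionary for s, p and d shells; the helper names cartesian_labels and pure_labels are hypothetical, and the ordering they produce is an arbitrary choice for illustration rather than the convention of any specific file format.

```python
def cartesian_labels(angmom: int) -> list:
    """Basis function strings for a Cartesian shell, e.g. 'xxy' for x^2 y."""
    if angmom == 0:
        # The s-type function is represented by '1'.
        return ["1"]
    return [
        "x" * nx + "y" * ny + "z" * (angmom - nx - ny)
        for nx in range(angmom, -1, -1)
        for ny in range(angmom - nx, -1, -1)
    ]


def pure_labels(angmom: int) -> list:
    """Basis function strings for a pure shell: c0, c1, s1, ..., c{l}, s{l}."""
    labels = ["c0"]
    for m in range(1, angmom + 1):
        labels.extend(["c{}".format(m), "s{}".format(m)])
    return labels


# Keys are (angmom, char) tuples, values are lists of basis function strings.
conventions = {}
for angmom in range(3):
    conventions[(angmom, "c")] = cartesian_labels(angmom)
    conventions[(angmom, "p")] = pure_labels(angmom)

for key, labels in conventions.items():
    print(key, labels)
```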
For example, pure and Cartesian s, p and d functions in Gaussian FCHK files adhere to the following convention:
conventions = {
(0, 'c'): ['1'],
(1, 'c'): ['x', 'y', 'z'],
(2, 'c'): ['xx', 'yy', 'zz', 'xy', 'xz', 'yz'],
(2, 'p'): ['c0', 'c1', 's1', 'c2', 's2'],
}
(Pure s and p functions are never used in a Gaussian FCHK file.)
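Converting coefficients from one convention to another amounts to a permutation combined with sign flips. The sketch below illustrates the idea for a single shell with plain Python lists. The helpers permutation_and_signs and convert are hypothetical and are not IOData's own conversion routine, and the second convention is invented for this example.

```python
def permutation_and_signs(source: list, target: list) -> tuple:
    """Return (permutation, signs) that reorder source-ordered data into target order."""

    def split(label: str) -> tuple:
        # A leading minus sign denotes a sign flip of the basis function.
        return (label[1:], -1) if label.startswith("-") else (label, 1)

    source_info = {}
    for index, label in enumerate(source):
        name, sign = split(label)
        source_info[name] = (index, sign)

    permutation, signs = [], []
    for label in target:
        name, sign = split(label)
        index, source_sign = source_info[name]
        permutation.append(index)
        signs.append(sign * source_sign)
    return permutation, signs


def convert(coefficients: list, source: list, target: list) -> list:
    """Reorder and sign-flip the coefficients of one shell."""
    permutation, signs = permutation_and_signs(source, target)
    return [sign * coefficients[index] for index, sign in zip(permutation, signs)]


fchk_d = ["c0", "c1", "s1", "c2", "s2"]
# Hypothetical convention with a different order and one sign flip.
other_d = ["s2", "s1", "c0", "-c1", "c2"]
converted = convert([0.1, 0.2, 0.3, 0.4, 0.5], fchk_d, other_d)
print(converted)
```

The sign rule follows from the fact that a coefficient c multiplying a function f is equivalent to a coefficient -c multiplying -f.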
Notes on other conventions¶
To avoid confusion, negative magnetic quantum numbers are never used to label pure functions in IOData. The basis strings contain ‘c’ and ‘s’ instead. In the literature, e.g. in the book Molecular Electronic-Structure Theory by Helgaker, Jørgensen and Olsen, negative magnetic quantum numbers for pure functions are usually referring to sine-like functions:
Note that \(\ell\) and \(m\) both appear as subscripts in \(R_{\ell, m}\) and \(R_{\ell, -m}\) to tell them apart from their complex counterparts.
Transformation from Cartesian to pure functions¶
Pure Gaussian primitives can be written as linear combinations of Cartesian ones. Hence, integrals over Cartesian functions can also be transformed into integrals over pure primitives. This transformation is the last step in the calculation of the overlap matrix in IOData:
1. Integrals are first computed for Gaussian primitives without normalization.
2. Normalization constants for Cartesian primitives are multiplied into the integrals.
3. Integrals over primitives are contracted.
4. Optionally, the integrals for Cartesian functions are transformed into integrals for pure functions.
For the last step, pre-computed transformation matrices (generated by tools/harmonics.py) are stored in iodata/overlap_cartpure.py using the HORTON2_CONVENTIONS. The derivation of these transformation matrices is explained below.
Recursive computation of real regular solid harmonics¶
First, we construct two sets of recursion relations for \(\phi\) and \(\theta\) separately. These will be combined to form the final set of recursion relations that directly operate on the real regular solid harmonics. In these two sets, the notation \(\rho = \sqrt{x^2 + y^2}\) is used.
The first set of recursion relations starts from a fairly trivial idea:
Second, recursion relations for associated Legendre functions can be modified to contain \(r\), \(z\) and \(\rho\), such that \(\cos\theta\) does not appear explicitly:
The two sets could be used separately to construct real regular solid harmonics, but they feature \(\rho=\sqrt{x^2+y^2}\), while the regular solid harmonics should be homogeneous polynomials. We can get rid of \(\rho\) by combining the two sets into one:
These equations show that real regular solid harmonics are homogeneous polynomials in \(x\), \(y\) and \(z\). Advantages of this approach are (i) the absence of trigonometric expressions and (ii) the similarity between cosine and sine expressions. (Coefficients can be reused.) These recursion relations should be numerically stable for the computation of real regular solid harmonics as a function of Cartesian coordinates. They can also be used to build a transformation matrix from Cartesian monomials into real regular solid harmonics.
Transformation matrices without normalization¶
The above recursion relations result in the following transformation matrices. These were obtained by running:
python tools/harmonics.py none latex 3
Taking into account normalization¶
For the calculation of the overlap matrix, the transformations need to be modified, to transform normalized Cartesian functions into normalized pure functions. Accounting for normalization yields slightly different matrices shown below. These were obtained by running:
python tools/harmonics.py L2 latex 3
IOData Changelog¶
Version 1.0.0
Originally, IOData was a subpackage of HORTON2. It is currently factored out, modernized and ported to Python 3. In this process, the API was seriously refactored, essentially designed from scratch. Compared to HORTON2, IOData 1.0.0 contains the following API-breaking changes:
The user-facing API is now a set of five functions: iodata.api.load_one(), iodata.api.dump_one(), iodata.api.load_many(), iodata.api.dump_many() and iodata.api.write_input().
The iodata.iodata.IOData object is implemented with the attrs module, which facilitates type hinting and checking.
The load_many and dump_many functions can handle trajectories and database formats. (At the time of writing, only XYZ and FCHK are supported.)
The write_input function can be used to prepare inputs for quantum chemistry software. This function supports user-provided templates.
IOData does not impose a specific ordering of the atomic orbital basis functions (within one shell). Practically all possible conventions are supported and one can easily convert from one to another.
All attributes of IOData are either built-in Python types, Numpy arrays or NamedTuples defined in IOData. It no longer relies on other parts of HORTON2 to define these data types. (This is most relevant for the orbital basis, the molecular orbitals and the cube data.)
Nearly all attributes of the IOData class have been renamed to more systematic terminology.
All file format modules have an identical API (and therefore do not fit into a single namespace).
Ghost atoms are now loaded as atoms with a zero effective core charge (atcorenums).
Spin multiplicity is no longer used. Instead, the spin polarization is stored: spinpol = abs(nalpha - nbeta).
The internal HDF5 file format support has been removed.
Many smaller changes have been made, which would be too tedious to be listed here.
In addition, several new attributes were added to the IOData
class, and
several of them can also be read from file formats we already supported
previously. This work will be expanded upon in future releases.
Acknowledgments¶
This software was developed using funding from a variety of international sources including, but not limited to: Canarie, Canada Research Chairs, Compute Canada, European Union’s Horizon 2020 Marie Sklodowska-Curie Actions (Individual Fellowship No 800130), Foundation of Scientific Research–Flanders (FWO), McMaster University, Queen’s University, Natural Sciences and Engineering Research Council of Canada (NSERC), National Fund for Scientific and Technological Development of Chile (FONDECYT), and Research Board of Ghent University (BOF).
Developer Documentation¶
Contributing¶
We’d love you to contribute. Here are some practical hints to help out.
This document assumes you are familiar with Bash and Python.
General recommendations¶
Please be careful with tools like autopep8, black or yapf. They may result in a massive number of changes, making pull requests harder to review. Also, when using them, use a maximum line length of 100. To avoid confusion, only clean up the code you are working on. A safer option is to use cardboardlint -F -r master. This will only clean code where you have already made changes.
Do not add module-level pylint: disable=... lines, except for the no-member warning in the unit test modules. When adding pylint exceptions, place them as locally as possible and make sure they are justified.
Use type hinting to document the types of function (and method) arguments and return values. This is not yet done consistently throughout IOData, but it would be helpful to do so in future pull requests. Avoid using strings to postpone the evaluation of the type. (See PEP 0563 for more details on postponed type annotation.)
In unit testing, use np.testing.assert_allclose and np.testing.assert_equal for comparing floating-point and integer numpy arrays, respectively. np.testing.assert_allclose can also be used for comparing floating-point scalars. In all other cases (not involving floating-point numbers), the simple assert a == b works equally well and is more readable.
IOData always uses atomic units internally. See Unit conversion for details.
Adding new file formats¶
Each file format is implemented in a module of the package iodata.formats. These modules all follow the same API. Please consult existing formats for some guidance, e.g. iodata.formats.xyz is a simple but complete example. From the following list, PATTERNS and at least one of the functions must be implemented:
PATTERNS = [ ... ]: a list of glob patterns used to recognize file formats from the file names. This is used to select the correct module from iodata.formats in functions in iodata.api.
load_one: load a single IOData object.
dump_one: dump a single IOData object.
load_many: load multiple IOData objects (iterator) from a single file.
dump_many: dump multiple IOData objects (iterator) to a single file.
load_one
function: reading a single IOData object from a file¶
In order to read from a new file format, the module must contain a load_one
function with the following signature:
@document_load_one("format", ['list', 'of', 'guaranteed', 'attributes'],
                   ['list', 'of', 'attributes', 'which', 'may', 'be', 'read'],
                   notes)
def load_one(lit: LineIterator) -> dict:
    """Do not edit this docstring. It will be overwritten."""
    # Actual code to read the file
The LineIterator
instance provides a convenient interface for reading files
and can be found in iodata.utils
. As a rule of thumb, always use
next(lit)
to read a new line from the file. You can use this iterator in
a few ways:
# When you need to read one line.
line = next(lit)

# When sections appear in a file in fixed order, you can use helper functions.
data1 = _load_helper_section1(lit)
data2 = _load_helper_section2(lit)

# When you intend to read everything in a file (not for trajectories).
for line in lit:
    # do something with line.
    pass

# When you just need to read a section.
for line in lit:
    # do something with line
    if done_with_section:
        break

# When you need a fixed number of lines, say 10.
for i in range(10):
    line = next(lit)

# More complex example, in which you detect several sections and call other
# functions to parse those sections. The code is not sensitive to the
# order of the sections.
while True:
    line = next(lit)
    if end_pattern in line:
        break
    elif line == 'section1':
        data1 = _load_helper_section1(lit)
    elif line == 'section2':
        data2 = _load_helper_section2(lit)

# Same as above, but reading till end of file. You cannot use a for loop
# when multiple lines must be read in one iteration.
while True:
    try:
        line = next(lit)
    except StopIteration:
        break
    if end_pattern in line:
        break
    elif line == 'section1':
        data1 = _load_helper_section1(lit)
    elif line == 'section2':
        data2 = _load_helper_section2(lit)
In some cases, one may have to push back a line because it was read too early.
For example, in the Molden format, this is sometimes unavoidable. When needed
you can push back the line for later reading with lit.back(line)
.
# When you just need to read a section
for line in lit:
    # do something with line
    if done_with_section:
        # only now it becomes clear that you've read one line too far
        lit.back(line)
        break
When you encounter a file-format error while reading the file, call
lit.error(msg)
, where msg
is a short message describing the problem.
The error appearing on screen will automatically also contain the filename
and line number.
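The push-back and error-reporting behaviour described above can be illustrated with a small stand-in class. This is only a sketch: PushbackLines, read_section and the bracketed section headers are hypothetical, and the class is not the LineIterator implementation from iodata.utils.

```python
class PushbackLines:
    """Hypothetical stand-in for a line iterator with push-back support."""

    def __init__(self, lines, filename="<memory>"):
        self._iter = iter(lines)
        self._stack = []  # lines pushed back, returned before new ones
        self.filename = filename
        self.lineno = 0

    def __iter__(self):
        return self

    def __next__(self):
        line = self._stack.pop() if self._stack else next(self._iter)
        self.lineno += 1
        return line

    def back(self, line):
        """Push a line back so that the next read returns it again."""
        self._stack.append(line)
        self.lineno -= 1

    def error(self, msg):
        """Raise an error that mentions the file name and line number."""
        raise ValueError(f"{self.filename}:{self.lineno} {msg}")


def read_section(lit):
    """Collect lines until the next header, which is pushed back."""
    rows = []
    for line in lit:
        if line.startswith("["):
            # This line belongs to the next section: it was read too early.
            lit.back(line)
            break
        rows.append(line)
    return rows


lit = PushbackLines(["[atoms]", "H 0 0 0", "H 0 0 1", "[basis]", "sto-3g"])
header = next(lit)
atoms = read_section(lit)
next_header = next(lit)
```

After read_section returns, the header of the following section is still available to the caller, which is the situation the lit.back(line) call is meant to handle.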
dump_one
functions: writing a single IOData object to a file¶
The dump_one
functions are conceptually simpler: they just receive an open
file object and an IOData
instance as arguments, and should write the data
to the open file.
@document_dump_one("format", ['guaranteed', 'attributes'], ['optional', 'attributes'], notes)
def dump_one(f: TextIO, data: IOData):
    """Do not edit this docstring. It will be overwritten."""
    # code to write data to f.
load_many
function: reading multiple IOData objects from a single file¶
This function works essentially in the same way as load_one
, but can load
multiple molecules. For example:
@document_load_many("XYZ", ['atcoords', 'atnums', 'title'])
def load_many(lit: LineIterator) -> Iterator[dict]:
    """Do not edit this docstring. It will be overwritten."""
    # XYZ trajectory files are a simple concatenation of individual XYZ files,
    # making it trivial to load many frames.
    while True:
        try:
            yield load_one(lit)
        except StopIteration:
            return
The XYZ trajectory format is simply a concatenation of individual XYZ files,
such that one can use the load_one function to read a single frame. In some
file formats, more complicated approaches are needed. In any case, one must
use the yield
keyword for every frame read from a file.
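The generator pattern described above can be tried out without IOData. The following sketch uses a plain iterator over strings instead of a LineIterator; load_one_frame and load_many_frames are hypothetical names, and the frame layout (atom count, title, one line per atom) is a simplified XYZ-like format chosen for illustration.

```python
from typing import Iterator


def load_one_frame(lines: Iterator[str]) -> dict:
    """Read one frame: atom count, title, then one line per atom."""
    natom = int(next(lines))  # StopIteration here signals the end of the file
    title = next(lines).strip()
    symbols, coords = [], []
    for _ in range(natom):
        words = next(lines).split()
        symbols.append(words[0])
        coords.append([float(word) for word in words[1:4]])
    return {"title": title, "symbols": symbols, "coords": coords}


def load_many_frames(lines: Iterator[str]) -> Iterator[dict]:
    """Yield frames until the underlying iterator is exhausted."""
    while True:
        try:
            yield load_one_frame(lines)
        except StopIteration:
            return


text = """1
first
H 0.0 0.0 0.0
2
second
H 0.0 0.0 0.0
H 0.0 0.0 0.7
"""
frames = list(load_many_frames(iter(text.splitlines())))
print(len(frames), frames[1]["symbols"])
```

One limitation of this sketch is that a file truncated in the middle of a frame also ends the loop silently; a production loader would normally report that as an error.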
dump_many
function: writing multiple IOData objects to a single file¶
Also dump_many
is very similar to dump_one
, but just takes an iterator
over multiple IOData instances as argument. It is expected to write all of these
to a single open file object. For example:
@document_dump_many("XYZ", ['atcoords', 'atnums'], ['title'])
def dump_many(f: TextIO, datas: Iterator[IOData]):
    """Do not edit this docstring. It will be overwritten."""
    # Similar to load_many, this is relatively easy.
    for data in datas:
        dump_one(f, data)
Also here, we take advantage of the simple structure of the XYZ trajectory format, i.e. the simple concatenation of individual XYZ files. For other formats, this could become more complicated.
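The writing side can be sketched in the same spirit, using an in-memory file object. The names dump_one_frame and dump_many_frames and the dictionary-based frames are hypothetical; the real functions receive IOData instances.

```python
import io
from typing import Iterable, TextIO


def dump_one_frame(f: TextIO, frame: dict):
    """Write one frame: atom count, title, then one line per atom."""
    f.write(f"{len(frame['symbols'])}\n")
    f.write(f"{frame['title']}\n")
    for symbol, (x, y, z) in zip(frame["symbols"], frame["coords"]):
        f.write(f"{symbol:2s} {x:15.10f} {y:15.10f} {z:15.10f}\n")


def dump_many_frames(f: TextIO, frames: Iterable[dict]):
    """Write all frames one after the other to the same open file."""
    for frame in frames:
        dump_one_frame(f, frame)


frames = [
    {"title": "first", "symbols": ["H"], "coords": [[0.0, 0.0, 0.0]]},
    {"title": "second", "symbols": ["H", "H"],
     "coords": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.7]]},
]
buffer = io.StringIO()
dump_many_frames(buffer, frames)
output = buffer.getvalue()
print(output)
```

Because the function only relies on the write method, the same code works for a file opened with open(..., "w") and for an in-memory buffer, which is convenient in unit tests.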
Github work flow¶
Before diving into technicalities: if you intend to make major changes, beyond fixing bugs and small functionality improvements, please open a Github issue first, so we can discuss before coding. Please explain what you intend to accomplish and why. That often saves a lot of time and trouble in the long run.
Use the issue to plan your changes. Try to solve only one problem at a time, instead of fixing several issues and adding different features in a single shot. Small changes are easier to handle, also for the reviewer in the last step below.
Mention in the corresponding issue when you are working on it. “Claim” the issue to avoid duplicate efforts.
Check your GitHub settings and your local git configuration:
If you don’t have an SSH key pair yet, create one with the following terminal command:
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
A suitable name for this key would be
id_rsa_github
. An empty pass phrase is convenient and should be fine. This will generate a private and a public key in
${HOME}/.ssh
.
Upload your public SSH key to https://github.com/settings/keys. This is a single long line in
id_rsa_github.pub
, which you can copy-paste into the browser.
Configure SSH to use this key pair for authentication when pushing branches to Github. Add the following to your
.ssh/config
file:
Host github.com
    Hostname github.com
    ForwardX11 no
    IdentityFile /home/your_user_name/.ssh/id_rsa_github
(Make sure you have the correct path to the private key file.)
Configure git to use the name and e-mail address tied to your Github account:
git config --global user.name "Your Name"
git config --global user.email "youremail@yourdomain.com"
Install Roberto, which is the driver for our CI setup. It can also replicate the continuous integration on your local machine, which makes it easier to prepare a passable pull request. See https://theochem.github.io/roberto/.
Make a fork of the project, using the Github “fork” feature.
Clone the original repository on your local machine and enter the directory:
git clone git@github.com:theochem/iodata.git
cd iodata
Add your fork as a second remote to your local repository, for which we will use the short name
mine
below, but any short name is fine:
git remote add mine git@github.com:<your-github-account>/iodata.git
Make a new branch, with a name that hints at the purpose of your modification:
git checkout -b new-feature
Make changes to the source. Please, make it easy for others to understand your code. Also, add tests that verify your code works as intended. Rules of thumb:
Write transparent code, e.g. self-explaining variable names.
Add comments to passages that are not easy to understand at first glance.
Write docstrings explaining the API.
Add unit tests when feasible.
Commit your changes with a meaningful commit message. The first line is a short summary, written in the imperative mood. Optionally, this can be followed by an empty line and a longer description.
If you feel the summary line is too short to describe what you did, it may be better to split your changes into multiple commits.
Run Roberto and fix all problems it reports. Either one of the following should work:
rob                  # Normal case
python3 -m roberto   # Only if your PATH is not set correctly
Style issues, failing tests and packaging issues should all be detected at this stage.
Push your branch to your forked repository on Github:
git push mine -u new-feature
A link should be printed on screen, which will take the next step for you.
Make a pull request from your branch new-feature in your forked repository to the master branch in the original repository.
Wait for the tests on Travis-CI to complete. These should pass. Also coverage analysis will be shown, but this is merely indicative. Normally, someone should review your pull request in a few days. Ideally, the review results in minor corrections at worst. We’ll do our best to avoid larger problems in step 1.
Notes on attrs¶
IOData uses the attrs library, not to be confused with the attr library,
for classes representing data loaded from files: IOData
, MolecularBasis
,
Shell
, MolecularOrbitals
and Cube
. This enables basic attribute
validation, which eliminates potentially silly bugs.
(See iodata/attrutils.py
and the usage of validate_shape
in all those
classes.)
The following attrs
functions could be convenient when working with these
classes:
The data can be turned into plain Python data types with the
attr.asdict
function. Make sure you add the
retain_collection_types=True
option, to avoid the following issue: https://github.com/python-attrs/attrs/issues/646 For example:
from iodata import load_one
import attr
iodata = load_one("example.xyz")
fields = attr.asdict(iodata, retain_collection_types=True)
A similar
astuple
function works as you would expect.
A shallow copy with a few modified attributes can be created with the evolve method, which is a wrapper for
attr.evolve
:
from iodata import load_one
import attr
iodata1 = load_one("example.xyz")
iodata2 = attr.evolve(iodata1, title="another title")
The usage of evolve becomes mandatory when you want to change two or more attributes whose shapes need to be consistent. For example, the following would fail:
from iodata import IOData
iodata = IOData(atnums=[7, 7], atcoords=[[0, 0, 0], [2, 0, 0]])
# The next line will fail because the size of atnums and atcoords
# becomes inconsistent.
iodata.atnums = [8, 8, 8]
iodata.atcoords = [[0, 0, 0], [2, 0, 1], [4, 0, 0]]
The following code, which has the same intent, does work:
from iodata import IOData
import attr
iodata1 = IOData(atnums=[7, 7], atcoords=[[0, 0, 0], [2, 0, 0]])
iodata2 = attr.evolve(
    iodata1,
    atnums=[8, 8, 8],
    atcoords=[[0, 0, 0], [2, 0, 1], [4, 0, 0]],
)
For brevity, lists (of lists) were used in these examples. These are always converted to arrays by the constructor or when assigning them to attributes.
API Reference¶
iodata¶
iodata package¶
Subpackages¶
iodata.formats package¶
Submodules¶
iodata.formats.charmm module¶
CHARMM crd file format.
CHARMM coordinate files contain information about the location of each atom in Cartesian space. The format of the ASCII (CARD) CHARMM coordinate files is: Title line(s), number of atoms in file and the coordinate lines (one for each atom in the file).
The coordinate lines contain specific information about each atom. These have the following structure: Atom number (sequential), residue number (specified relative to first residue in the PSF), residue name, atom type, x-coordinate, y-coordinate, z-coordinate, segment identifier, residue identifier and a weighting array value.
- load_one(lit)[source]¶
Load a single frame from a CRD file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atffparams, atmasses, extra. The following may be loaded if present in the file: title.
- Return type:
Notes
iodata.formats.chgcar module¶
VASP 5 CHGCAR file format.
This format is used by VASP 5.X and VESTA.
Note that even though the CHGCAR
and LOCPOT
files look very similar, they require
different conversions to atomic units.
- load_one(lit)[source]¶
Load a single frame from a VASP 5 CHGCAR file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atnums, cellvecs, cube, title.
- Return type:
Notes
iodata.formats.cp2klog module¶
CP2K ATOM output file format.
- load_one(lit)[source]¶
Load a single frame from a CP2K ATOM output file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atcorenums, atnums, energy, mo, obasis.
- Return type:
Notes
This function assumes that the following subsections are present in the CP2K ATOM input file, in the section
ATOM%PRINT
:
&PRINT
  &POTENTIAL
  &END POTENTIAL
  &BASIS_SET
  &END BASIS_SET
  &ORBITALS
  &END ORBITALS
&END PRINT
iodata.formats.cube module¶
Gaussian Cube file format.
Cube files are generated by various QC codes these days, including Gaussian, CP2K, GPAW, Q-Chem, …
Note that the second column in the geometry specification of the cube file is interpreted as the effective core charges.
- load_one(lit)[source]¶
Load a single frame from a Gaussian Cube file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atcorenums, atnums, cellvecs, cube.
- Return type:
Notes
iodata.formats.extxyz module¶
Extended XYZ file format.
The extended XYZ file format is defined in the ASE documentation.
Usually, the different frames in a trajectory describe different geometries of the same
molecule, with atoms in the same order. The load_many
function below can also
handle an XYZ with different molecules, e.g. a molecular database.
- load_many(lit)[source]¶
Load multiple frames from an EXTXYZ file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Yields:
result (dict) – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: title. The following may be loaded if present in the file: atcoords, atgradient, atmasses, atnums, cellvecs, charge, energy, extra.
- Return type:
Notes
- load_one(lit)[source]¶
Load a single frame from an EXTXYZ file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: title. The following may be loaded if present in the file: atcoords, atgradient, atmasses, atnums, cellvecs, charge, energy, extra.
- Return type:
Notes
iodata.formats.fchk module¶
Gaussian FCHK file format.
- dump_one(f, data)[source]¶
Dump a single frame into a Gaussian Formatted Checkpoint file.
- Parameters:
f (TextIO) – A writeable file object.
data (IOData) – An IOData instance which must have the following attributes initialized: atnums, atcorenums. If the following attributes are present, they are also dumped into the file: atcharges, atcoords, atfrozen, atgradient, athessian, atmasses, charge, energy, lot, mo, one_rdms, obasis_name, extra, moments.
Notes
- load_many(lit)[source]¶
Load multiple frames from a Gaussian Formatted Checkpoint file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Yields:
result (dict) – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atgradient, atnums, atcorenums, energy, extra, title.
- Return type:
Notes
Trajectories from a Gaussian optimization, relaxed scan or IRC calculation are written in groups of frames, called “points” in the Gaussian world, e.g. to discriminate between different values of the constraint in a relaxed geometry. In most cases, e.g. IRC or conventional optimization, there is only one “point”. Within one “point”, one can have multiple geometries and their properties. This information is stored in the extra attribute:
ipoint is the counter for a point.
npoint is the total number of points.
istep is the counter within one “point”.
nstep is the total number of geometries within a “point”.
reaction_coordinate is only present in case of an IRC calculation.
- load_one(lit)[source]¶
Load a single frame from a Gaussian Formatted Checkpoint file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcharges, atcoords, atnums, atcorenums, lot, mo, obasis, obasis_name, run_type, title. The following may be loaded if present in the file: energy, atfrozen, atgradient, athessian, atmasses, one_rdms, extra, moments.
- Return type:
Notes
iodata.formats.fcidump module¶
Molpro 2012 FCIDUMP file format.
Notes
This function works only for restricted wave-functions.
One- and two-electron integrals are stored in chemists’ notation in an FCIDUMP file, while IOData internally uses physicists’ notation.
Keep in mind that the FCIDUMP format changed in MOLPRO 2012, so files generated with older versions are not supported.
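The relation between the two notations mentioned above is ⟨ij|kl⟩ = (ik|jl), which for a four-index array amounts to swapping the two middle axes. The sketch below shows this with a random array; the variable names are illustrative, and it is not a description of how the IOData loader performs the conversion internally.

```python
import numpy as np

# Random four-index array standing in for integrals in chemists' notation,
# two_chem[i, k, j, l] = (ik|jl).
rng = np.random.default_rng(0)
nbasis = 3
two_chem = rng.normal(size=(nbasis, nbasis, nbasis, nbasis))

# Physicists' notation: two_phys[i, j, k, l] = <ij|kl> = (ik|jl).
two_phys = two_chem.transpose(0, 2, 1, 3)

# Applying the same axis swap again restores the original ordering.
two_back = two_phys.transpose(0, 2, 1, 3)
np.testing.assert_equal(two_back, two_chem)
```

Since the swap is its own inverse, the same operation converts in both directions.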
- dump_one(f, data)[source]¶
Dump a single frame into a Molpro 2012 FCIDUMP file.
- Parameters:
Notes
The dictionary one_ints must contain a field core_mo. Similarly, two_ints must contain two_mo.
- load_one(lit)[source]¶
Load a single frame from a Molpro 2012 FCIDUMP file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: core_energy, one_ints, nelec, spinpol, two_ints.
- Return type:
Notes
iodata.formats.gamess module¶
GAMESS punch file format.
- load_one(lit)[source]¶
Load a single frame from a PUNCH file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: title, energy, grot, atgradient, athessian, atmasses, atnums, atcoords.
- Return type:
Notes
iodata.formats.gaussianinput module¶
Gaussian input format.
- load_one(lit)[source]¶
Load a single frame from a Gaussian input file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atnums, title.
- Return type:
Notes
iodata.formats.gaussianlog module¶
Gaussian Log file format.
To write out the integrals in a Gaussian log file, which can be loaded with this module, you need to use the following Gaussian command line:
scf(conventional) iop(3/33=5) extralinks=l316 iop(3/27=999)
- load_one(lit)[source]¶
Load a single frame from a Gaussian Log file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. No attributes are guaranteed to be loaded. The following may be loaded if present in the file: one_ints, two_ints.
- Return type:
Notes
iodata.formats.gromacs module¶
GROMACS gro file format.
Files with the gro file extension contain a molecular structure in Gromos87 format. GROMACS gro files can be used as a trajectory by simply concatenating files.
http://manual.gromacs.org/current/reference-manual/file-formats.html#gro
- load_many(lit)[source]¶
Load multiple frames from a GRO file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Yields:
result (dict) – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atffparams, cellvecs, extra, title.
- Return type:
Notes
- load_one(lit)[source]¶
Load a single frame from a GRO file.
- Parameters:
lit (LineIterator) – The line iterator to read the data from.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded: atcoords, atffparams, cellvecs, extra, title.
- Return type:
Notes
iodata.formats.json module¶
QCSchema JSON file format.
QCSchema defines four different subschema:
Molecule: specifying a molecular system
Input: specifying QC program input for a specific Molecule
Output: specifying QC program output for a specific Molecule
Basis: specifying a basis set for a specific Molecule
General Usage¶
The QCSchema format is intended to be a catch-all file format for storing and sharing QC calculation
data. Due to the wide number of possibilities of the data contained in a single file, not every
field in a QCSchema file directly corresponds to an IOData attribute. For example,
qcschema_output
files allow for many fields capturing different energy contributions, especially
for coupled-cluster calculations. To accommodate this fact, IOData does not always assume the intent
of the user; instead, IOData ensures that every field in the file is stored in a structured manner.
When a QCSchema field does not correspond to an IOData attribute, that data is instead stored in the
extra
dict, in a dictionary corresponding to the subschema where that data was found. In cases
where multiple subschema contain the relevant field (e.g. the Output subschema contains the entirety
of the Input subschema), the data will be found in the smallest subschema (for the example above, in
IOData.extra["input"]
, not IOData.extra["output"]
).
Dumping an IOData instance to a QCSchema file involves adding relevant required (and optional, if
needed) fields to the necessary dictionaries in the extra
dict. One exception is the
provenance
field: if the only desired provenance data is the creation of the file by IOData,
that data will be added automatically.
The following sections will describe the requirements of each subschema and the behaviour to expect from IOData when loading in or dumping out a QCSchema file.
Schema Definitions¶
Provenance Information¶
The provenance field contains information about how the associated QCSchema object and its attributes were generated, provided, and manipulated. A provenance entry expects these fields:
Field |
Description |
---|---|
creator |
Required. The program that generated, provided, or manipulated this file. |
version |
The version of the creator. |
routine |
The routine of the creator. |
In QCElemental, only a single provenance entry is permitted. When generating a QCSchema file for use
with QCElemental, the easiest way to ensure compliance is to leave the provenance field blank, to
allow the dump_one
function to generate the correct provenance information. However, allowing
only one entry for provenance information limits the ability to properly trace a file through
several operations during complex workflows. With this in mind, IOData supports an enhanced
provenance field, in the form of a list of provenance entries, with new entries appended to the end
of the list.
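The enhanced provenance field described above, a list with new entries appended at the end, could be maintained with a helper along the following lines. This is a sketch under assumptions: append_provenance is a hypothetical function, and the creator, version and routine values in the usage line are placeholders, not what dump_one actually writes.

```python
from typing import List, Optional, Union


def append_provenance(
    existing: Union[None, dict, List[dict]],
    creator: str,
    version: Optional[str] = None,
    routine: Optional[str] = None,
) -> List[dict]:
    """Return a provenance list with a new entry appended at the end."""
    entry = {"creator": creator}  # "creator" is the only required field
    if version is not None:
        entry["version"] = version
    if routine is not None:
        entry["routine"] = routine
    if existing is None:
        history = []
    elif isinstance(existing, dict):
        # A single entry, as permitted by QCElemental, becomes a list.
        history = [existing]
    else:
        history = list(existing)
    history.append(entry)
    return history


old = {"creator": "HORTON3", "routine": "Manual validation"}
new = append_provenance(old, "ExampleProgram", version="0.0", routine="example")
print(new)
```

The original entry is left untouched and always precedes the new one, so the order of the list reflects the order of the operations.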
Molecule Schema¶
The qcschema_molecule
subschema describes a molecular system, and contains the data necessary to
specify a molecular system and support I/O and manipulation processes.
The following is an example of a minimal qcschema_molecule
file:
{
  "schema_name": "qcschema_molecule",
  "schema_version": 2,
  "symbols": ["Li", "Cl"],
  "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
  "molecular_charge": 0,
  "molecular_multiplicity": 1,
  "provenance": {
    "creator": "HORTON3",
    "routine": "Manual validation"
  }
}
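A file like the one above can be checked and unpacked with the standard json module. The sketch below only illustrates the structure of the subschema: check_molecule is a hypothetical helper, the list of required fields is the one given in this section, and nothing here reflects the validation performed by IOData itself.

```python
import json

REQUIRED_FIELDS = (
    "schema_name", "schema_version", "symbols", "geometry",
    "molecular_charge", "molecular_multiplicity", "provenance",
)


def check_molecule(text: str) -> dict:
    """Parse a qcschema_molecule string and reshape the flat geometry."""
    mol = json.loads(text)
    missing = [key for key in REQUIRED_FIELDS if key not in mol]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")
    natom = len(mol["symbols"])
    flat = mol["geometry"]
    if len(flat) != 3 * natom:
        raise ValueError("geometry must contain 3 numbers per atom")
    # Split the flat list into one [x, y, z] triple per atom.
    coords = [flat[3 * i:3 * i + 3] for i in range(natom)]
    return {"symbols": mol["symbols"], "coords": coords}


text = """{
  "schema_name": "qcschema_molecule",
  "schema_version": 2,
  "symbols": ["Li", "Cl"],
  "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
  "molecular_charge": 0,
  "molecular_multiplicity": 1,
  "provenance": {"creator": "HORTON3", "routine": "Manual validation"}
}"""
molecule = check_molecule(text)
print(molecule["coords"])
```

The reshaping step makes explicit that the geometry list holds 3*N_at numbers in the same order as the symbols list.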
The required fields and corresponding types for a qcschema_molecule
file are:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
schema_name |
str |
N/A |
The name of the QCSchema subschema. Fixed as
|
schema_version |
str |
N/A |
The version of the subschema specification. 2.0 is the current version. |
symbols |
list(N_at) |
|
An array of the atomic symbols for the system. |
geometry |
list(3*N_at) |
|
An ordered array of XYZ atomic coordinates,
corresponding to the order of |
molecular_charge |
float |
|
The net electrostatic charge of the molecule. Some writers assume a default of 0. |
molecular_multiplicity |
int |
|
The total multiplicity of this molecule. Some writers assume a default of 1. |
provenance |
dict or list |
N/A |
Information about how the file was generated, provided, and manipulated. See Provenance section above for more details. |
Note: N_at corresponds to the number of atoms in the molecule, as defined by the length of
symbols
.
The optional fields and corresponding types for a qcschema_molecule
file are:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
atom_labels |
list(N_at) |
N/A |
Additional per-atom labels. Typically used for
model conversions, not user assignment. The
indices of this array correspond to the
|
atomic_numbers |
list(N_at) |
|
An array of atomic numbers for each atom.
Typically inferred from |
comment |
str |
N/A |
Additional comments for this molecule. These comments are intended for user information, not any computational tasks. |
connectivity |
list |
|
The connectivity information between each atom
in the |
extras |
dict |
N/A |
Extra information to associate with this molecule. |
fix_symmetry |
str |
|
Maximal point group symmetry with which the molecule should be treated. |
fragments |
list(N_fr) |
N/A |
An array that designates which sets of atoms are
fragments within the molecule. This is a nested
array, with the indices of the base array
corresponding to the values in
|
fragment_charges |
list(N_fr) |
N/A |
The total charge of each fragment in
|
fragment_multiplicities |
list(N_fr) |
N/A |
The multiplicity of each fragment in
|
id |
str |
N/A |
A unique identifier for this molecule. |
identifiers |
dict |
N/A |
Additional identifiers by which this molecule can be referenced, such as INCHI, SMILES, etc. |
real |
list(N_at) |
|
An array indicating whether each atom is real
(true) or a ghost/virtual atom (false). The
indices of this array correspond to the
|
mass_numbers |
list(N_at) |
|
An array of atomic mass numbers for each atom.
The indices of this array correspond to the
|
masses |
list(N_at) |
|
An array of atomic masses [u] for each atom.
Typically inferred from |
name |
str |
|
An arbitrary, common, or human-readable name to assign to this molecule. |
Note: N_at corresponds to the number of atoms in the molecule, as defined by the length of
symbols
; N_fr corresponds to the number of fragments in the molecule, as defined by the length
of fragments
. Fragment data is stored in a sub-dictionary, fragments
.
The following are additional optional keywords used in QCElemental’s QCSchema implementation. These keywords mostly correspond to specific QCElemental functionality, and may not necessarily produce similar results in other QCSchema parsers.
Field |
Type |
Description |
---|---|---|
fix_com |
bool |
An indicator to prevent pre-processing the molecule by translating the COM to (0,0,0) in Euclidean coordinate space. |
fix_orientation |
bool |
An indicator to prevent pre-processing the molecule by orienting via the inertia tensor. |
validated |
bool |
An indicator that the input molecule data has been previously checked for schema and physics (e.g. non-overlapping atoms, feasible multiplicity) compliance. Generally should only be true when set by a trusted validator. |
Input Schema¶
The qcschema_input
subschema describes all data necessary to generate and parse a QC program
input file for a given molecule.
The following is an example of a minimal qcschema_input
file:
{
  "schema_name": "qcschema_input",
  "schema_version": 2.0,
  "molecule": {
    "schema_name": "qcschema_molecule",
    "schema_version": 2.0,
    "symbols": ["Li", "Cl"],
    "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
    "molecular_charge": 0.0,
    "molecular_multiplicity": 1,
    "provenance": {
      "creator": "HORTON3",
      "routine": "Manual validation"
    }
  },
  "driver": "energy",
  "model": {
    "method": "B3LYP",
    "basis": "Def2TZVP"
  }
}
The required fields and corresponding types for a qcschema_input
file are:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
schema_name |
str |
N/A |
The QCSchema specification to which this model
conforms. Fixed as |
schema_version |
float |
N/A |
The version number of |
molecule |
dict |
N/A |
QCSchema Molecule instance. |
driver |
str |
N/A |
The type of calculation being performed. One of
|
model |
dict |
N/A |
The quantum chemistry model specification for a given operation to compute against. See Model section below. |
The optional fields and corresponding types for a qcschema_input file are:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
extras |
dict |
N/A |
Extra information associated with the input. |
id |
str |
N/A |
An identifier for the input object. |
keywords |
dict |
N/A |
QC program-specific keywords to be used for a computation. See details below for IOData-specific usages. |
protocols |
dict |
N/A |
Protocols regarding the manipulation of the output that results from this input. See Protocols section below. |
provenance |
dict or list |
N/A |
Information about how the file was generated, provided, and manipulated. See Provenance section above for more information. |
IOData currently supports the following keywords for qcschema_input
files:
Keyword |
Type |
IOData attr. |
Description |
---|---|---|---|
run_type |
str |
|
The type of calculation that led to the results
stored in IOData, which must be one of the
following: |
Model Subschema¶
The model
dict contains the following fields:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
method |
str |
|
The level of theory used for the computation (e.g. B3LYP, PBE, CCSD(T), etc.) |
basis |
str or dict |
N/A |
The quantum chemistry basis set to evaluate (e.g. 6-31G, cc-pVDZ, etc.) Can be ‘none’ for methods without basis sets. Must be either a string specifying the basis set name (the same as its name in the Basis Set Exchange, when possible) or a qcschema_basis instance. |
Protocols Subschema¶
The protocols
dict contains the following fields:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
wavefunction |
str |
N/A |
Specification of the wavefunction properties to
keep from the resulting output. One of |
keep_stdout |
bool |
N/A |
An indicator to keep the output file from the resulting output. |
Output Schema¶
The qcschema_output
subschema describes all data necessary to generate and parse a QC program’s
output file for a given molecule.
The following is an example of a minimal qcschema_output
file:
{
  "schema_name": "qcschema_output",
  "schema_version": 2.0,
  "molecule": {
    "schema_name": "qcschema_molecule",
    "schema_version": 2.0,
    "symbols": ["Li", "Cl"],
    "geometry": [0.000000, 0.000000, -1.631761, 0.000000, 0.000000, 0.287958],
    "molecular_charge": 0.0,
    "molecular_multiplicity": 1,
    "provenance": {
      "creator": "HORTON3",
      "routine": "Manual validation"
    }
  },
  "driver": "energy",
  "model": {
    "method": "HF",
    "basis": "STO-4G"
  },
  "properties": {},
  "return_result": -464.626219879,
  "success": true
}
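Reading the result from such an output file can be sketched with the standard json module. The helper energy_from_output is hypothetical, and it assumes that return_result holds the energy when driver is "energy", as in the example above; IOData's own loader is not shown here.

```python
import json


def energy_from_output(text: str) -> float:
    """Return the energy stored in a qcschema_output string."""
    out = json.loads(text)
    if out["schema_name"] != "qcschema_output":
        raise ValueError("Not a qcschema_output document")
    if not out["success"]:
        raise RuntimeError("The QC program did not finish successfully")
    if out["driver"] != "energy":
        raise ValueError("return_result is not an energy for this driver")
    return float(out["return_result"])


text = """{
  "schema_name": "qcschema_output",
  "schema_version": 2.0,
  "molecule": {
    "schema_name": "qcschema_molecule",
    "schema_version": 2.0,
    "symbols": ["Li", "Cl"],
    "geometry": [0.0, 0.0, -1.631761, 0.0, 0.0, 0.287958],
    "molecular_charge": 0.0,
    "molecular_multiplicity": 1,
    "provenance": {"creator": "HORTON3", "routine": "Manual validation"}
  },
  "driver": "energy",
  "model": {"method": "HF", "basis": "STO-4G"},
  "properties": {},
  "return_result": -464.626219879,
  "success": true
}"""
energy = energy_from_output(text)
print(energy)
```

Checking success before using return_result avoids treating the output of a failed calculation as a valid energy.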
The required fields and corresponding types for a qcschema_output
file are:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
schema_name |
str |
N/A |
The QCSchema specification to which this model
conforms. Fixed as |
schema_version |
float |
N/A |
The version number of |
molecule |
dict |
N/A |
QCSchema Molecule instance. |
driver |
str |
N/A |
The type of calculation being performed. One of
|
model |
dict |
N/A |
The quantum chemistry model specification for a given operation to compute against. |
properties |
dict |
N/A |
Named properties of quantum chemistry computations. See Properties section below. |
return_result |
varies |
N/A |
The result requested by the |
success |
bool |
N/A |
An indicator for the success of the QC program’s execution. |
The optional fields and corresponding types for a qcschema_output
file are:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
error |
dict |
N/A |
A complete description of an error-terminated computation. See Error section below. |
extras |
dict |
N/A |
Extra information associated with the input. Also specified for qcschema_input. |
id |
str |
N/A |
An identifier for the input object. Also specified for qcschema_input. |
keywords |
dict |
N/A |
QC program-specific keywords to be used for a computation. See details below for IOData-specific usages. Also specified for qcschema_input. |
protocols |
dict |
N/A |
Protocols regarding the manipulation of the output that results from this input. See Protocols section above. Also specified for qcschema_input. |
provenance |
dict or list |
N/A |
Information about how the file was generated, provided, and manipulated. See Provenance section above for more information. Also specified for qcschema_input. |
stderr |
str |
N/A |
The standard error (stderr) of the associated computation. |
stdout |
str |
N/A |
The standard output (stdout) of the associated computation. |
wavefunction |
dict |
N/A |
The wavefunction properties of a QC computation. All matrices appear in column-major order. See Wavefunction section below. |
Properties Subschema¶
The properties
dict contains named properties of quantum chemistry computations. Due to the
variability possible for the contents of an output file, IOData does not guess at which properties
are desired by the user, and stores all properties in the extra["output"]["properties"]
dict for
easy retrieval. The current QCSchema standard provides names for the following properties:
Field |
Description |
---|---|
calcinfo_nbasis |
The number of basis functions for the computation. |
calcinfo_nmo |
The number of molecular orbitals for the computation. |
calcinfo_nalpha |
The number of alpha electrons in the computation. |
calcinfo_nbeta |
The number of beta electrons in the computation. |
calcinfo_natom |
The number of atoms in the computation. |
nuclear_repulsion_energy |
The nuclear repulsion energy term. |
return_energy |
The energy of the requested method, identical to
|
scf_one_electron_energy |
The one-electron (core Hamiltonian) energy contribution to the total SCF energy. |
scf_two_electron_energy |
The two-electron energy contribution to the total SCF energy. |
scf_vv10_energy |
The VV10 functional energy contribution to the total SCF energy. |
scf_xc_energy |
The functional (XC) energy contribution to the total SCF energy. |
scf_dispersion_correction_energy |
The dispersion correction appended to an underlying functional when a DFT-D method is requested. |
scf_dipole_moment |
The X, Y, and Z dipole components. |
scf_total_energy |
The total electronic energy of the SCF stage of the calculation. |
scf_iterations |
The number of SCF iterations taken before convergence. |
mp2_same_spin_correlation_energy |
The portion of MP2 doubles correlation energy from same-spin (i.e. triplet) correlations. |
mp2_opposite_spin_correlation_energy |
The portion of MP2 doubles correlation energy from opposite-spin (i.e. singlet) correlations. |
mp2_singles_energy |
The singles portion of the MP2 correlation energy. Zero except in ROHF. |
mp2_doubles_energy |
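The doubles portion of the MP2 correlation energy including same-spin and opposite-spin correlations. |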
mp2_total_correlation_energy |
The MP2 correlation energy. |
mp2_correlation_energy |
The MP2 correlation energy. |
mp2_total_energy |
The total MP2 energy (MP2 correlation energy + HF energy). |
mp2_dipole_moment |
The MP2 X, Y, and Z dipole components. |
ccsd_same_spin_correlation_energy |
The portion of CCSD doubles correlation energy from same-spin (i.e. triplet) correlations. |
ccsd_opposite_spin_correlation_energy |
The portion of CCSD doubles correlation energy from opposite-spin (i.e. singlet) correlations. |
ccsd_singles_energy |
The singles portion of the CCSD correlation energy. Zero except in ROHF. |
ccsd_doubles_energy |
The doubles portion of the CCSD correlation energy including same-spin and opposite-spin correlations. |
ccsd_correlation_energy |
The CCSD correlation energy. |
ccsd_total_energy |
The total CCSD energy (CCSD correlation energy + HF energy). |
ccsd_dipole_moment |
The CCSD X, Y, and Z dipole components. |
ccsd_iterations |
The number of CCSD iterations taken before convergence. |
ccsd_prt_pr_correlation_energy |
The CCSD(T) correlation energy. |
ccsd_prt_pr_total_energy |
The total CCSD(T) energy (CCSD(T) correlation energy + HF energy). |
ccsd_prt_pr_dipole_moment |
The CCSD(T) X, Y, and Z dipole components. |
ccsd_prt_pr_iterations |
The number of CCSD(T) iterations taken before convergence. |
ccsdt_correlation_energy |
The CCSDT correlation energy. |
ccsdt_total_energy |
The total CCSDT energy (CCSDT correlation energy + HF energy). |
ccsdt_dipole_moment |
The CCSDT X, Y, and Z dipole components. |
ccsdt_iterations |
The number of CCSDT iterations taken before convergence. |
ccsdtq_correlation_energy |
The CCSDTQ correlation energy. |
ccsdtq_total_energy |
The total CCSDTQ energy (CCSDTQ correlation energy + HF energy). |
ccsdtq_dipole_moment |
The CCSDTQ X, Y, and Z dipole components. |
ccsdtq_iterations |
The number of CCSDTQ iterations taken before convergence. |
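The sketch below shows one way to work with the properties once they have been stored. A plain dictionary stands in for the extra attribute, and the numbers are placeholders chosen for illustration, not computed results. The helper mp2_energy_is_consistent is a hypothetical name; it only applies the relation stated in the table (total MP2 energy = MP2 correlation energy + HF energy).

```python
import math

# Stand-in for the ``extra`` attribute after loading a qcschema_output file.
# The values are placeholders for illustration, not results of a calculation.
extra = {
    "output": {
        "properties": {
            "calcinfo_natom": 2,
            "scf_total_energy": -100.0,
            "mp2_correlation_energy": -0.25,
            "mp2_total_energy": -100.25,
        }
    }
}

props = extra["output"]["properties"]


def mp2_energy_is_consistent(props, tol=1e-8):
    """Check that mp2_total_energy equals the SCF energy plus the MP2 correlation energy."""
    expected = props["scf_total_energy"] + props["mp2_correlation_energy"]
    return math.isclose(props["mp2_total_energy"], expected, abs_tol=tol)


# Properties that a file does not provide are simply absent, so use .get().
print(props.get("ccsd_total_energy"))
print(mp2_energy_is_consistent(props))
```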
Error Subschema¶
The error
dict contains the following fields:
Field |
Type |
IOData attr. |
Description |
---|---|---|---|
error_type |
str |
N/A |
The type of error raised during the computation. |
error_message |
str |
N/A |
Additional information related to the error, such as the backtrace. |
extras |
dict |
N/A |
Additional data associated with the error. |
Wavefunction subschema¶
The wavefunction subschema contains the wavefunction properties of a QC computation. All matrices appear in column-major order. The current QCSchema standard provides names for the following wavefunction properties:
Field |
Description |
---|---|
basis |
A |
restricted |
An indicator for a restricted calculation (alpha == beta). When true, all beta quantities are omitted, since quantity_b == quantity_a. |
h_core_a |
Alpha-spin core (one-electron) Hamiltonian. |
h_core_b |
Beta-spin core (one-electron) Hamiltonian. |
h_effective_a |
Alpha-spin effective core (one-electron) Hamiltonian. |
h_effective_b |
Beta-spin effective core (one-electron) Hamiltonian. |
scf_orbitals_a |
Alpha-spin SCF orbitals. |
scf_orbitals_b |
Beta-spin SCF orbitals. |
scf_density_a |
Alpha-spin SCF density matrix. |
scf_density_b |
Beta-spin SCF density matrix. |
scf_fock_a |
Alpha-spin SCF Fock matrix. |
scf_fock_b |
Beta-spin SCF Fock matrix. |
scf_eigenvalues_a |
Alpha-spin SCF eigenvalues. |
scf_eigenvalues_b |
Beta-spin SCF eigenvalues. |
scf_occupations_a |
Alpha-spin SCF orbital occupations. |
scf_occupations_b |
Beta-spin SCF orbital occupations. |
orbitals_a |
Keyword for the primary return alpha-spin orbitals. |
orbitals_b |
Keyword for the primary return beta-spin orbitals. |
density_a |
Keyword for the primary return alpha-spin density. |
density_b |
Keyword for the primary return beta-spin density. |
fock_a |
Keyword for the primary return alpha-spin Fock matrix. |
fock_b |
Keyword for the primary return beta-spin Fock matrix. |
eigenvalues_a |
Keyword for the primary return alpha-spin eigenvalues. |
eigenvalues_b |
Keyword for the primary return beta-spin eigenvalues. |
occupations_a |
Keyword for the primary return alpha-spin orbital occupations. |
occupations_b |
Keyword for the primary return beta-spin orbital occupations. |
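Because all matrices appear in column-major order, a reader has to take care when rebuilding a two-dimensional matrix. The following sketch assumes that a matrix arrives as a flat list with a known number of rows and columns; the function name unflatten_column_major is hypothetical and used only here.

```python
def unflatten_column_major(flat, nrow, ncol):
    """Rebuild a nested-list matrix from a flat list stored column by column."""
    if len(flat) != nrow * ncol:
        raise ValueError("flat list length does not match the requested shape")
    # In column-major order, element (i, j) sits at flat[i + j * nrow].
    return [[flat[i + j * nrow] for j in range(ncol)] for i in range(nrow)]


flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
matrix = unflatten_column_major(flat, nrow=2, ncol=3)
print(matrix)  # [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
```

With NumPy available, numpy.reshape(flat, (nrow, ncol), order="F") gives the same layout.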
- load_one(lit)[source]¶
Load a single frame from a QCSchema file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atnums
,atcorenums
,atcoords
,charge
,nelec
,spinpol
. The following may be loaded if present in the file:atmasses
,bonds
,energy
,g_rot
,lot
,obasis
,obasis_name
,title
,extra
.- Return type:
Notes
iodata.formats.locpot module¶
VASP 5 LOCPOT file format.
This format is used by VASP 5.X and VESTA.
Note that even though the CHGCAR
and LOCPOT
files look very similar, they require
different conversions to atomic units.
- load_one(lit)[source]¶
Load a single frame from a VASP 5 LOCPOT file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,cellvecs
,cube
,title
.- Return type:
Notes
iodata.formats.mol2 module¶
MOL2 file format.
There are different variants of the MOL2 format. The main objective here is compatibility with the AMBER software, so that files can be written with the atomic charges used by antechamber.
- load_many(lit)[source]¶
Load multiple frames from a MOL2 file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Yields:
result (dict) – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,atcharges
,atffparams
. The following may be loaded if present in the file:title
.- Return type:
Notes
- load_one(lit)[source]¶
Load a single frame from a MOL2 file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,atcharges
,atffparams
. The following may be loaded if present in the file:title
.- Return type:
Notes
iodata.formats.molden module¶
Molden file format.
Many QC codes can write out Molden files, e.g. Molpro, Orca, PSI4, Molden, Turbomole. Keep in mind that several of these write incorrect versions of the file format, but these errors are corrected when loading them with IOData.
- load_one(lit, norm_threshold=0.0001)[source]¶
Load a single frame from a Molden file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.norm_threshold (
float
) – When the normalization of one of the orbitals exceeds norm_threshold, a correction is attempted or an error is raised when no suitable correction can be found.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,atcorenums
,mo
,obasis
. The following may be loaded if present in the file:title
.- Return type:
Notes
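The norm_threshold parameter can be pictured with the sketch below. It assumes that the test compares the deviation of the orbital norm c^T S c from 1 against the threshold; this reading, and the function names, are assumptions for illustration and may differ from what the Molden loader does internally.

```python
def orbital_norm(coeffs, overlap):
    """Return c^T S c for one orbital expanded in a non-orthogonal basis."""
    n = len(coeffs)
    return sum(
        coeffs[i] * overlap[i][j] * coeffs[j] for i in range(n) for j in range(n)
    )


def needs_correction(coeffs, overlap, norm_threshold=1e-4):
    """Flag an orbital whose norm deviates from 1 by more than the threshold."""
    return abs(orbital_norm(coeffs, overlap) - 1.0) > norm_threshold


# Two basis functions with overlap 0.5 (made-up numbers for illustration).
overlap = [[1.0, 0.5], [0.5, 1.0]]
normalized = [3.0 ** -0.5, 3.0 ** -0.5]  # c^T S c = (1 + 1 + 0.5 + 0.5) / 3 = 1
unnormalized = [1.0, 1.0]                # c^T S c = 3

print(needs_correction(normalized, overlap))
print(needs_correction(unnormalized, overlap))
```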
iodata.formats.molekel module¶
Molekel file format.
This format is used by two programs: Molekel and Orca.
- load_one(lit, norm_threshold=0.0001)[source]¶
Load a single frame from a Molekel file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.norm_threshold (
float
) – When the normalization of one of the orbitals exceeds norm_threshold, a correction is attempted or an error is raised when no suitable correction can be found.
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,mo
,obasis
. The following may be loaded if present in the file:atcharges
.- Return type:
Notes
iodata.formats.mwfn module¶
Multiwfn MWFN file format.
- load_one(lit)[source]¶
Load a single frame from an MWFN file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,atcorenums
,energy
,mo
,obasis
,extra
,title
.- Return type:
Notes
iodata.formats.orcalog module¶
Orca output file format.
- load_one(lit)[source]¶
Load a single frame from an Orca output file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,energy
,moments
,extra
.- Return type:
Notes
iodata.formats.pdb module¶
PDB file format.
There are different versions of the PDB format. The convention used here is the most recent one, which is described at: http://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html
- load_many(lit)[source]¶
Load multiple frames from a PDB file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Yields:
result (dict) – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,atffparams
,extra
. The following may be loaded if present in the file:title
.- Return type:
Notes
- load_one(lit)[source]¶
Load a single frame from a PDB file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,atffparams
,extra
. The following may be loaded if present in the file:title
,bonds
.- Return type:
Notes
iodata.formats.poscar module¶
VASP 5 POSCAR file format.
This format is used by VASP 5.X and VESTA.
- load_one(lit)[source]¶
Load a single frame from a VASP 5 POSCAR file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,cellvecs
,title
.- Return type:
Notes
iodata.formats.qchemlog module¶
Q-Chem Log file format.
This module loads a Q-Chem log file into IOData.
- load_one(lit)[source]¶
Load a single frame from a qchemlog file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atmasses
,atnums
,energy
,g_rot
,mo
,lot
,obasis_name
,run_type
,extra
. The following may be loaded if present in the file:athessian
.- Return type:
Notes
iodata.formats.sdf module¶
SDF file format.
Usually, the different frames in a trajectory describe different geometries of the same
molecule, with atoms in the same order. The load_many
and dump_many
functions
below can also handle an SDF file with different molecules, e.g. a molecular database.
The SDF format is somewhat documented on the following page: http://www.nonlinear.com/progenesis/sdf-studio/v0.9/faq/sdf-file-format-guidance.aspx
This format is one of the chemical table file formats: https://en.wikipedia.org/wiki/Chemical_table_file
- load_many(lit)[source]¶
Load multiple frames from an SDF file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Yields:
result (dict) – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,bonds
,title
.- Return type:
Notes
- load_one(lit)[source]¶
Load a single frame from an SDF file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,bonds
,title
.- Return type:
Notes
iodata.formats.wfn module¶
Gaussian/GAMESS-US WFN file format.
Only use this format if the program that generated it does not offer any alternatives that IOData can load. The WFN format has the disadvantage that it cannot represent contractions and therefore expands all orbitals into a decontracted basis. This makes the post-processing less efficient compared to formats that do support contractions of Gaussian functions.
- build_obasis(icenters, type_assignments, exponents, lit)[source]¶
Construct a basis set using the arrays read from a WFN or WFX file.
- Parameters:
icenters (
ndarray
) – The center indices for all basis functions. shape=(nbasis,). Lowest index is zero.type_assignments (
ndarray
) – Integer codes for basis function names. shape=(nbasis,). Lowest index is zero.exponents (
ndarray
) – The Gaussian exponents of all basis functions. shape=(nbasis,)
- Return type:
Tuple
[MolecularBasis
,ndarray
]
- get_mocoeff_scales(obasis)[source]¶
Get the L2-normalization of the un-normalized Cartesian basis functions.
- Parameters:
obasis (
MolecularBasis
) – The molecular orbital basis.- Returns:
Scaling factors to be multiplied into the molecular orbital coefficients.
- Return type:
scales
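For reference, the textbook L2 normalization constant of an un-normalized Cartesian Gaussian primitive x^nx y^ny z^nz exp(-alpha r^2) can be sketched as follows. The function names are hypothetical, and whether get_mocoeff_scales returns this constant, its inverse, or a related quantity is not stated here, so treat the block as background rather than as the library's implementation.

```python
import math


def double_factorial(n):
    """Return n!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def cartesian_primitive_norm(alpha, nx, ny, nz):
    """L2 normalization constant of x^nx y^ny z^nz exp(-alpha r^2)."""
    n = nx + ny + nz
    numerator = (2.0 * alpha / math.pi) ** 1.5 * (4.0 * alpha) ** n
    denominator = (
        double_factorial(2 * nx - 1)
        * double_factorial(2 * ny - 1)
        * double_factorial(2 * nz - 1)
    )
    return math.sqrt(numerator / denominator)


# Constants for an s, a p_x and a d_xx primitive with exponent 1.0.
norm_s = cartesian_primitive_norm(1.0, 0, 0, 0)
norm_px = cartesian_primitive_norm(1.0, 1, 0, 0)
norm_dxx = cartesian_primitive_norm(1.0, 2, 0, 0)
print(norm_s, norm_px, norm_dxx)
```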
- load_one(lit)[source]¶
Load a single frame from a WFN file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,energy
,mo
,obasis
,title
,extra
.- Return type:
Notes
- load_wfn_low(lit)[source]¶
Load data from a WFN file into arrays.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Return type:
iodata.formats.wfx module¶
AIM/AIMAll WFX file format.
See http://aim.tkgristmill.com/wfxformat.html
- dump_one(f, data)[source]¶
Dump a single frame into a WFX file.
- Parameters:
f (
TextIO
) – A writeable file object.data (
IOData
) – An IOData instance which must have the following attributes initialized:atcoords
,atnums
,atcorenums
,mo
,obasis
,charge
. If the following attributes are present, they are also dumped into the file:title
,energy
,spinpol
,lot
,atgradient
,extra
.
Notes
- load_one(lit)[source]¶
Load a single frame from a WFX file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atgradient
,atnums
,energy
,extra
,mo
,obasis
,title
.- Return type:
Notes
iodata.formats.xyz module¶
XYZ file format.
Usually, the different frames in a trajectory describe different geometries of the same
molecule, with atoms in the same order. The load_many
and dump_many
functions
below can also handle an XYZ with different molecules, e.g. a molecular database.
The load_*
and dump_*
functions all accept the optional argument
atom_columns
. This argument fixes the meaning of the columns to be loaded
from or dumped to an XYZ file. The following example defines, in addition to the
conventional columns, also a column with atomic charges and three columns with
atomic forces.
import iodata.formats.xyz
from iodata import load_one

atom_columns = iodata.formats.xyz.DEFAULT_ATOM_COLUMNS + [
    # Atomic charges are stored in a dictionary atcharges and the key
    # refers to the name of the partitioning method.
    ("atcharges", "mulliken", (), float, float, "{:10.5f}".format),
    # Note that in IOData, the energy gradient is stored, which contains the
    # negative forces.
    ("atgradient", None, (3,), float,
     (lambda word: -float(word)),
     (lambda value: "{:15.10f}".format(-value)))
]
mol = load_one("test.xyz", atom_columns=atom_columns)
# The following attributes are present:
print(mol.atnums)
print(mol.atcoords)
print(mol.atcharges["mulliken"])
print(mol.atgradient)
When defining atom_columns
, no columns can be skipped, such that all
information loaded from a file can also be written back out when dumping it.
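To make the roles of load_word and dump_word more concrete, here is a simplified, self-contained sketch that applies column definitions in the six-item layout to the words of a single atom line. It leaves out the element symbol column and does not use IOData; the helpers parse_atom_words and format_atom_words are hypothetical names for this illustration.

```python
# Column definitions in the six-item layout:
# (attribute, key, shape for one atom, dtype, load_word, dump_word).
columns = [
    ("atcoords", None, (3,), float, float, "{:15.10f}".format),
    ("atcharges", "mulliken", (), float, float, "{:10.5f}".format),
    # Forces in the file, gradient (negative forces) in memory.
    ("atgradient", None, (3,), float,
     (lambda word: -float(word)),
     (lambda value: "{:15.10f}".format(-value))),
]


def _name(attribute, key):
    """Build a flat label such as 'atcharges[mulliken]' for this sketch."""
    return attribute if key is None else f"{attribute}[{key}]"


def parse_atom_words(words, columns):
    """Convert the words of one atom line into values, column by column."""
    values = {}
    position = 0
    for attribute, key, shape, _dtype, load_word, _dump_word in columns:
        count = 1
        for size in shape:
            count *= size
        chunk = [load_word(word) for word in words[position:position + count]]
        position += count
        values[_name(attribute, key)] = chunk[0] if shape == () else chunk
    return values


def format_atom_words(values, columns):
    """Convert values back into formatted words, in the same column order."""
    words = []
    for attribute, key, shape, _dtype, _load_word, dump_word in columns:
        value = values[_name(attribute, key)]
        items = [value] if shape == () else value
        words.extend(dump_word(item) for item in items)
    return words


line = "0.0 0.0 1.5 -0.3 0.01 0.0 -0.02"
parsed = parse_atom_words(line.split(), columns)
print(parsed)
print(" ".join(format_atom_words(parsed, columns)))
```

Since every column has both a loader and a dumper, parsing the formatted words again reproduces the same values, which is the round-trip property the text above refers to.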
- dump_many(f, datas, atom_columns=None)[source]¶
Dump multiple frames into an XYZ file.
- Parameters:
f (
TextIO
) – A writeable file object.datas (
Iterator
[IOData
]) – An iterator over IOData instances which must have the following attributes initialized:atcoords
,atnums
. If the following attributes are present, they are also dumped into the file:title
.
atom_columns – A list of atomic fields to be dumped. Each field is a tuple with the following items: attribute (str), key (None or str; when str, the IOData attribute is a dict), shape for one atom (tuple), dtype, load_word (function taking a string and returning a value with the correct type), dump_word (function taking a value and returning a formatted string).
Notes
- dump_one(f, data, atom_columns=None)[source]¶
Dump a single frame into an XYZ file.
- Parameters:
f (
TextIO
) – A writeable file object.data (
IOData
) – An IOData instance which must have the following attributes initialized:atcoords
,atnums
. If the following attributes are present, they are also dumped into the file:title
.
atom_columns – A list of atomic fields to be dumped. Each field is a tuple with the following items: attribute (str), key (None or str; when str, the IOData attribute is a dict), shape for one atom (tuple), dtype, load_word (function taking a string and returning a value with the correct type), dump_word (function taking a value and returning a formatted string).
Notes
- load_many(lit, atom_columns=None)[source]¶
Load multiple frames from an XYZ file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.
atom_columns – A list of atomic fields to be loaded. Each field is a tuple with the following items: attribute (str), key (None or str; when str, the IOData attribute is a dict), shape for one atom (tuple), dtype, load_word (function taking a string and returning a value with the correct type), dump_word (function taking a value and returning a formatted string).
- Yields:
result (dict) – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,title
.- Return type:
Notes
- load_one(lit, atom_columns=None)[source]¶
Load a single frame from an XYZ file.
- Parameters:
lit (
LineIterator
) – The line iterator to read the data from.
atom_columns – A list of atomic fields to be loaded. Each field is a tuple with the following items: attribute (str), key (None or str; when str, the IOData attribute is a dict), shape for one atom (tuple), dtype, load_word (function taking a string and returning a value with the correct type), dump_word (function taking a value and returning a formatted string).
- Returns:
result – A dictionary with IOData attributes. The following attributes are guaranteed to be loaded:
atcoords
,atnums
,title
.- Return type:
Notes
Module contents¶
iodata.inputs package¶
Submodules¶
iodata.inputs.common module¶
Utilities for writing input files.
iodata.inputs.gaussian module¶
Gaussian Input Module.
- write_input(f, data, template=None, **kwargs)[source]¶
Write a GAUSSIAN input file.
- Parameters:
f (
TextIO
) – A writeable file object.data (
IOData
) – An IOData instance which must have the following attributes initialized:atnums
,atcoords
. If the following attributes are present, they are also written into the file:title
,run_type
,lot
,obasis_name
,spinmult
,charge
. If these attributes are not assigned, internal default values are used.
Notes
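The idea of filling a template with attributes and falling back to defaults can be sketched as below. The template text, the default values and the function render_input are all hypothetical and written for this example only; they are not the template or defaults that IOData ships. Coordinates are assumed to be given already in the units the target program expects.

```python
# Hypothetical template and defaults, for illustration only.
TEMPLATE = """\
#n {lot}/{obasis_name} {run_type}

{title}

{charge} {spinmult}
{geometry}

"""

DEFAULTS = {
    "title": "Input generated by a sketch",
    "run_type": "sp",
    "lot": "hf",
    "obasis_name": "sto-3g",
    "charge": 0,
    "spinmult": 1,
}


def render_input(symbols, coords, template=None, **fields):
    """Fill the template; fields that are missing or None fall back to defaults."""
    if template is None:
        template = TEMPLATE
    values = dict(DEFAULTS)
    values.update({key: value for key, value in fields.items() if value is not None})
    values["geometry"] = "\n".join(
        f"{symbol:<3s} {x:12.6f} {y:12.6f} {z:12.6f}"
        for symbol, (x, y, z) in zip(symbols, coords)
    )
    return template.format(**values)


text = render_input(
    ["H", "H"],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)],
    lot="b3lyp",
    title=None,
)
print(text)
```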
iodata.inputs.orca module¶
Orca Input Module.
- write_input(f, data, template=None, **kwargs)[source]¶
Write an ORCA input file.
- Parameters:
f (
TextIO
) – A writeable file object.data (
IOData
) – An IOData instance which must have the following attributes initialized:atnums
,atcoords
. If the following attributes are present, they are also written into the file:title
,run_type
,lot
,obasis_name
,spinmult
,charge
. If these attributes are not assigned, internal default values are used.
Notes
Module contents¶
iodata.test package¶
Subpackages¶
Submodules¶
iodata.test.common module¶
iodata.test.test_attrutils module¶
iodata.test.test_basis module¶
iodata.test.test_charmm module¶
Test iodata.formats.charmm module.
iodata.test.test_chgcar module¶
Test iodata.formats.chgcar module.
iodata.test.test_cli module¶
Unit tests for iodata.__main__.
iodata.test.test_cp2klog module¶
iodata.test.test_cube module¶
Test iodata.formats.cube module.
iodata.test.test_extxyz module¶
Test iodata.formats.extxyz module.
iodata.test.test_fchk module¶
iodata.test.test_fcidump module¶
Test iodata.formats.fcidump module.
iodata.test.test_gamess module¶
Test iodata.formats.gamess module.
iodata.test.test_gaussianinput module¶
Test iodata.formats.gaussianinput module.
iodata.test.test_gaussianlog module¶
Test iodata.formats.gaussianlog module.
iodata.test.test_gromacs module¶
Test iodata.formats.gromacs module.
iodata.test.test_inputs module¶
Test iodata.inputs module.
iodata.test.test_iodata module¶
iodata.test.test_json module¶
iodata.test.test_locpot module¶
Test iodata.formats.locpot module.
iodata.test.test_mol2 module¶
iodata.test.test_molden module¶
iodata.test.test_molekel module¶
iodata.test.test_mwfn module¶
Test iodata.formats.mwfn module.
iodata.test.test_orbitals module¶
iodata.test.test_orcalog module¶
Test iodata.formats.orcalog module.
iodata.test.test_overlap module¶
iodata.test.test_pdb module¶
iodata.test.test_poscar module¶
Test iodata.formats.poscar module.
iodata.test.test_qchemlog module¶
Test iodata.formats.qchemlog module.
iodata.test.test_sdf module¶
iodata.test.test_utils module¶
Unit tests for iodata.utils.
iodata.test.test_wfn module¶
iodata.test.test_wfx module¶
iodata.test.test_xyz module¶
Test iodata.formats.xyz module.
Module contents¶
Submodules¶
iodata.api module¶
Functions to be used by end users.
- dump_many(iodatas, filename, fmt=None, **kwargs)[source]¶
Write multiple IOData instances to a file.
This routine uses the extension or prefix of the filename to determine the file format. For each file format, a specialized function is called that does the real work.
- dump_one(iodata, filename, fmt=None, **kwargs)[source]¶
Write data to a file.
This routine uses the extension or prefix of the filename to determine the file format. For each file format, a specialized function is called that does the real work.
- Parameters:
iodata (
IOData
) – The object containing the data to be written.filename (
str
) – The file to write the data to.fmt (
Optional
[str
]) – The name of the file format module to use. When not given, it is guessed from the filename.**kwargs – Keyword arguments are passed on to the format-specific dump_one function.
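The format detection described above (use fmt when given, otherwise guess from the extension or prefix of the filename) can be pictured with the following sketch. The pattern registry and the function guess_format are hypothetical; the patterns that IOData actually registers for each format may differ.

```python
import os
from fnmatch import fnmatchcase

# Hypothetical registry mapping format names to filename patterns.
FORMAT_PATTERNS = {
    "xyz": ["*.xyz"],
    "fchk": ["*.fchk"],
    "molden": ["*.molden"],
    "poscar": ["POSCAR*"],
    "chgcar": ["CHGCAR*"],
}


def guess_format(filename, fmt=None):
    """Return the format name, either given explicitly or guessed from the filename."""
    if fmt is not None:
        if fmt not in FORMAT_PATTERNS:
            raise ValueError(f"unknown file format: {fmt}")
        return fmt
    basename = os.path.basename(filename)
    for name, patterns in FORMAT_PATTERNS.items():
        # Extensions match at the end ("*.xyz"), prefixes at the start ("POSCAR*").
        if any(fnmatchcase(basename, pattern) for pattern in patterns):
            return name
    raise ValueError(f"cannot determine the file format of {filename}")


print(guess_format("water.xyz"))
print(guess_format("run/POSCAR"))
print(guess_format("data.txt", fmt="fchk"))
```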
- load_many(filename, fmt=None, **kwargs)[source]¶
Load multiple IOData instances from a file.
This function uses the extension or prefix of the filename to determine the file format. When the file format is detected, a specialized load function is called for the heavy lifting.
- Parameters:
- Yields:
out – An instance of IOData with data for one frame loaded for the file.
- Return type:
- load_one(filename, fmt=None, **kwargs)[source]¶
Load data from a file.
This function uses the extension or prefix of the filename to determine the file format. When the file format is detected, a specialized load function is called for the heavy lifting.
- Parameters:
- Returns:
The instance of IOData with data loaded from the input files.
- Return type:
out
- write_input(iodata, filename, fmt, template=None, **kwargs)[source]¶
Write input file using an instance of IOData for the specified software format.
iodata.attrutils module¶
Utilities for building attr classes.
- validate_shape(*shape_requirements)[source]¶
Return a validator for the shape of an array or the length of an iterable.
- Parameters:
shape_requirements (
tuple
) – Specifications for the required shape. Every item of the tuple describes the required size of the corresponding axis of an array. Also the number of items should match the dimensionality of the array. When the validator is used for general iterables, this tuple should contain just one element. Possible values for each item are explained in the “Notes” section below.- Returns:
A validator function for the attr library.
- Return type:
validator
Notes
Every element of shape_requirements defines the expected size of an array along the corresponding axis. An item in this tuple at position (or index) i can be one of the following:
- An integer, which is taken as the expected size along axis i.
- None. In this case, the size of the array along axis i is not checked.
- A string, which should be the name of another integer attribute with the expected size along axis i. The other attribute is always an attribute of the same object as the attribute being checked.
- A 2-tuple containing a name and an integer. In this case, the name refers to another attribute which is an array or an iterable. When the integer is 0, just the length of the other attribute is used. When the integer is non-zero, the other attribute must be an array and the integer selects an axis. The size of the other array along the selected axis is then used as the expected size of the array being checked along axis i.
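The four kinds of shape requirements can be resolved as in the sketch below. It does not use the attr library; the functions expected_shape and shape_matches are hypothetical helpers, and a SimpleNamespace stands in for the object that owns the attributes.

```python
from types import SimpleNamespace


def expected_shape(owner, shape_requirements):
    """Resolve each requirement into an expected size, or None for 'not checked'."""
    expected = []
    for item in shape_requirements:
        if item is None or isinstance(item, int):
            # A fixed size, or an axis that is not checked.
            expected.append(item)
        elif isinstance(item, str):
            # The name of another integer attribute of the same object.
            expected.append(getattr(owner, item))
        elif isinstance(item, tuple) and len(item) == 2:
            # (name, axis): take the size from another array or iterable.
            other_name, axis = item
            other = getattr(owner, other_name)
            expected.append(len(other) if axis == 0 else other.shape[axis])
        else:
            raise TypeError(f"unsupported shape requirement: {item!r}")
    return tuple(expected)


def shape_matches(actual_shape, expected):
    """Compare an actual shape with the resolved requirements."""
    if len(actual_shape) != len(expected):
        return False
    return all(e is None or a == e for a, e in zip(actual_shape, expected))


owner = SimpleNamespace(natom=3, atnums=[8, 1, 1])
requirements = (("atnums", 0), 3)  # e.g. coordinates with shape (natom, 3)
resolved = expected_shape(owner, requirements)
print(resolved)
print(shape_matches((3, 3), resolved))
print(shape_matches((2, 3), resolved))
```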
iodata.basis module¶
Utility functions for working with basis sets.
Notes
Basis set conventions and terminology are documented in Basis set conventions.
- class MolecularBasis(shells, conventions, primitive_normalization)[source]¶
Bases:
object
A complete molecular orbital or density basis set.
- shells¶
A list of objects of type Shell which can support generalized contractions.
- conventions¶
A dictionary specifying the ordered basis functions for a given angular momentum and kind. The key is a tuple of angular momentum integer and kind character (‘c’ for Cartesian and ‘p’ for pure/spherical) and the value is a list of basis function strings. For example,
{
    ### Conventions for Cartesian functions
    # E.g., alphabetically ordered Cartesian functions.
    (0, 'c'): ['1'],
    (1, 'c'): ['x', 'y', 'z'],
    (2, 'c'): ['xx', 'xy', 'xz', 'yy', 'yz', 'zz'],
    ### Conventions for pure functions.
    # The notation is referring to real solid spherical harmonics.
    # See https://en.wikipedia.org/wiki/Solid_harmonics#Real_form
    # 'c{m}' = solid harmonic containing cos(m phi)
    # 's{m}' = solid harmonic containing sin(m phi)
    # where m is the magnetic quantum number and phi is the
    # azimuthal angle.
    # For example, wikipedia-ordered real spherical harmonics,
    # see https://en.wikipedia.org/wiki/Spherical_harmonics#Real_form
    (2, 'p'): ['s2', 's1', 'c0', 'c1', 'c2'],
    # Different quantum-chemistry codes may use incompatible
    # orderings and sign conventions. E.g. Molden files written
    # by ORCA use the following convention for pure f functions:
    (3, 'p'): ['c0', 'c1', 's1', 'c2', 's2', '-c3', '-s3'],
    # Note that the minus sign in the last two basis functions
    # denotes that the signs of these harmonics have been changed.
}
The basis function strings in the conventions dictionary are documented in Basis set conventions.
- primitive_normalization¶
The normalization convention of primitives, which can be ‘L2’ (orbitals) or ‘L1’ (densities) normalized.
- __init__(shells, conventions, primitive_normalization)¶
Method generated by attrs for class MolecularBasis.
- class Shell(icenter, angmoms, kinds, exponents, coeffs)[source]¶
Bases:
object
A shell in a molecular basis representing (generalized) contractions with the same exponents.
- icenter¶
An integer index specifying the row in the atcoords array of IOData object.
- angmoms¶
An integer array of angular momentum quantum numbers, non-negative, with shape (ncon,).
- kinds¶
List of strings describing the kind of contractions: ‘c’ for Cartesian and ‘p’ for pure. Pure functions are only allowed for angmom>1. The length equals the number of contractions: len(angmoms)=ncon.
- exponents¶
The array containing the exponents of the primitives, with shape (nprim,).
- coeffs¶
The array containing the coefficients of the normalized primitives in each contraction; shape = (nprim, ncon). These coefficients assume that the primitives are L2 (orbitals) or L1 (densities) normalized, but contractions are not necessarily normalized. (This depends on the code which generated the contractions.)
- __init__(icenter, angmoms, kinds, exponents, coeffs)¶
Method generated by attrs for class Shell.
-
coeffs:
ndarray
¶
-
exponents:
ndarray
¶
- convert_convention_shell(conv1, conv2, reverse=False)[source]¶
Return a permutation vector and sign changes to convert from 1 to 2.
The transformation from convention 1 to convention 2 can be done applying the results of this function as follows:
vector2 = vector1[permutation]*signs
When using the option reverse=True, one can use the results to convert in the opposite sense:
vector1 = vector2[permutation]*signs
- Parameters:
conv1 (
List
[str
]) – Two lists, with the same strings (in different order), where each string may be prefixed with a ‘-‘.conv2 (
List
[str
]) – Two lists, with the same strings (in different order), where each string may be prefixed with a ‘-‘.
reverse – When true, the conversion from 2 to 1 is returned.
- Return type:
Tuple
[ndarray
,ndarray
]- Returns:
permutation – An integer array that permutes basis function from 1 to 2.
signs – Sign changes when going from 1 to 2, must be applied after permutation
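The relation vector2 = vector1[permutation]*signs can be illustrated with a small pure-Python sketch that works on lists instead of arrays. The function convention_permutation is a hypothetical stand-in written for this example; it follows the description above but is not the IOData implementation.

```python
def convention_permutation(conv1, conv2, reverse=False):
    """Return (permutation, signs) so that vector2[i] = vector1[permutation[i]] * signs[i]."""
    if reverse:
        conv1, conv2 = conv2, conv1

    def split(label):
        # Separate an optional leading '-' from the basis function name.
        return (label[1:], -1) if label.startswith("-") else (label, 1)

    index1 = {}
    for position, label in enumerate(conv1):
        name, sign = split(label)
        index1[name] = (position, sign)
    if len(conv1) != len(conv2) or len(index1) != len(conv1):
        raise ValueError("conventions must contain the same unique strings")

    permutation, signs = [], []
    for label in conv2:
        name, sign2 = split(label)
        if name not in index1:
            raise ValueError(f"basis function {name} missing in first convention")
        position, sign1 = index1[name]
        permutation.append(position)
        signs.append(sign1 * sign2)
    return permutation, signs


conv1 = ["x", "y", "z"]
conv2 = ["z", "-x", "y"]
vector1 = [10.0, 20.0, 30.0]

permutation, signs = convention_permutation(conv1, conv2)
vector2 = [vector1[p] * s for p, s in zip(permutation, signs)]
print(permutation, signs, vector2)

# With reverse=True, the same recipe maps vector2 back to vector1.
rev_permutation, rev_signs = convention_permutation(conv1, conv2, reverse=True)
restored = [vector2[p] * s for p, s in zip(rev_permutation, rev_signs)]
print(restored)
```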
- convert_conventions(molbasis, new_conventions, reverse=False)[source]¶
Return a permutation vector and sign changes to convert from 1 to 2.
The transformation from molbasis.convention to the new convention can be done applying the results of this function as follows:
vector2 = vector1[permutation]*signs
When using the option reverse=True, one can use the results to convert in the opposite sense:
vector1 = vector2[permutation]*signs
- Parameters:
molbasis (
MolecularBasis
) – The description of a molecular basis set.new_conventions (
Dict
[str
,List
[str
]]) – The new conventions for ordering and signs, to which data for the orbital basis needs to be converted.
reverse – When true, the conversion from 2 to 1 is returned.
- Return type:
Tuple
[ndarray
,ndarray
]- Returns:
permutation – An integer array that permutes basis function from 1 to 2.
signs – Sign changes when going from 1 to 2, must be applied after permutation
- get_default_conventions()[source]¶
Produce conventions dictionaries compatible with HORTON2 and CCA.
Do not change this! Both conventions are also used by several file formats from other QC codes.
Common Component Architecture (CCA) conventions are defined in appendix B of the following article:
Kenny, J. P.; Janssen, C. L.; Valeev, E. F.; Windus, T. L. Components for Integral Evaluation in Quantum Chemistry. J. Comput. Chem. 2008, 29 (4), 562–577. https://doi.org/10.1002/jcc.20815.
The ordering of the spherical harmonics within one shell is rather vague in appendix B and a more precise description is given on the LibInt Wiki:
https://github.com/evaleev/libint/wiki/using-modern-CPlusPlus-API
- iter_cart_alphabet(n)[source]¶
Loop over powers of Cartesian basis functions in alphabetical order.
See https://theochem.github.io/horton/2.1.1/tech_ref_gaussian_basis.html for details.
- Parameters:
n (int) – The angular momentum, i.e. sum of Cartesian powers in this case.
- Return type:
ndarray
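As an illustration of alphabetical ordering, the following sketch enumerates the Cartesian powers (nx, ny, nz) with nx + ny + nz = n. The generator name cart_powers_alphabetical is hypothetical and it yields tuples rather than an ndarray, so it should be read as a sketch of the ordering, not as the library routine.

```python
def cart_powers_alphabetical(n):
    """Yield (nx, ny, nz) with nx + ny + nz == n in alphabetical order."""
    # Highest power of x first, then highest power of y: xx, xy, xz, yy, yz, zz.
    for nx in range(n, -1, -1):
        for ny in range(n - nx, -1, -1):
            yield (nx, ny, n - nx - ny)


d_shell = list(cart_powers_alphabetical(2))
f_shell = list(cart_powers_alphabetical(3))
```

For n = 2 this gives the six d-type functions in the order xx, xy, xz, yy, yz, zz.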
iodata.docstrings module¶
Docstring decorators for file format implementations.
- document_dump_many(fmt, required, optional=None, kwdocs={}, notes=None)[source]¶
Decorate a dump_many function to generate a docstring.
- Parameters:
fmt (str) – The name of the file format.
required (List[str]) – A list of mandatory IOData attributes needed to write the file.
optional (Optional[List[str]]) – A list of optional IOData attributes which can be included when writing the file.
kwdocs (Dict[str, str]) – A dictionary with documentation for keyword arguments. Each key is a keyword argument name and the corresponding value is text explaining the argument.
notes (Optional[str]) – Additional information to be added to the docstring.
- Returns:
A decorator function.
- Return type:
decorator
- document_dump_one(fmt, required, optional=None, kwdocs={}, notes=None)[source]¶
Decorate a dump_one function to generate a docstring.
- Parameters:
fmt (str) – The name of the file format.
required (List[str]) – A list of mandatory IOData attributes needed to write the file.
optional (Optional[List[str]]) – A list of optional IOData attributes which can be included when writing the file.
kwdocs (Dict[str, str]) – A dictionary with documentation for keyword arguments. Each key is a keyword argument name and the corresponding value is text explaining the argument.
notes (Optional[str]) – Additional information to be added to the docstring.
- Returns:
A decorator function.
- Return type:
decorator
- document_load_many(fmt, guaranteed, ifpresent=None, kwdocs={}, notes=None)[source]¶
Decorate a load_many function to generate a docstring.
- Parameters:
fmt (str) – The name of the file format.
guaranteed (List[str]) – A list of IOData attributes this format can certainly read.
ifpresent (Optional[List[str]]) – A list of IOData attributes this format reads if present in the file.
kwdocs (Dict[str, str]) – A dictionary with documentation for keyword arguments. Each key is a keyword argument name and the corresponding value is text explaining the argument.
notes (Optional[str]) – Additional information to be added to the docstring.
- Returns:
A decorator function.
- Return type:
decorator
- document_load_one(fmt, guaranteed, ifpresent=None, kwdocs={}, notes=None)[source]¶
Decorate a load_one function to generate a docstring.
- Parameters:
fmt (str) – The name of the file format.
guaranteed (List[str]) – A list of IOData attributes this format can certainly read.
ifpresent (Optional[List[str]]) – A list of IOData attributes this format reads if present in the file.
kwdocs (Dict[str, str]) – A dictionary with documentation for keyword arguments. Each key is a keyword argument name and the corresponding value is text explaining the argument.
notes (Optional[str]) – Additional information to be added to the docstring.
- Returns:
A decorator function.
- Return type:
decorator
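The document_* functions are decorator factories: they take a description of a format and return a decorator that attaches a generated docstring to a load or dump function. A minimal sketch of that pattern follows. The name document_load_sketch and the docstring layout are assumptions made for illustration, not the text IOData actually generates.

```python
def document_load_sketch(fmt, guaranteed, ifpresent=None, notes=None):
    """Return a decorator that writes a docstring for a loader (hypothetical)."""
    ifpresent = ifpresent or []

    def decorator(func):
        # Assemble the docstring from the format description.
        lines = [
            f"Load a single frame from the {fmt} format.",
            "",
            "Always read: " + ", ".join(guaranteed),
        ]
        if ifpresent:
            lines.append("Read if present: " + ", ".join(ifpresent))
        if notes:
            lines.extend(["", notes])
        func.__doc__ = "\n".join(lines)
        return func

    return decorator


@document_load_sketch("XYZ", ["atcoords", "atnums"], ["title"])
def load_one(lit):
    # Placeholder body; a real loader would parse the lines in lit.
    return {}


generated = load_one.__doc__
```

Keeping the attribute lists in the decorator call means the docstring and any overview table can be produced from the same data.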
- document_write_input(fmt, required, optional=None, kwdocs={}, notes=None)[source]¶
Decorate a write_input function to generate a docstring.
- Parameters:
fmt (str) – The name of the file format.
required (List[str]) – A list of mandatory IOData attributes needed to write the file.
optional (Optional[List[str]]) – A list of optional IOData attributes which can be included when writing the file.
kwdocs (Dict[str, str]) – A dictionary with documentation for keyword arguments. Each key is a keyword argument name and the corresponding value is text explaining the argument.
notes (Optional[str]) – Additional information to be added to the docstring.
- Returns:
A decorator function.
- Return type:
decorator
iodata.iodata module¶
Module for handling input/output from different file formats.
- class IOData(atcharges={}, atcoords=None, atcorenums=None, atffparams={}, atfrozen=None, atgradient=None, athessian=None, atmasses=None, atnums=None, basisdef=None, bonds=None, cellvecs=None, charge=None, core_energy=None, cube=None, energy=None, extcharges=None, extra={}, g_rot=None, lot=None, mo=None, moments={}, nelec=None, obasis=None, obasis_name=None, one_ints={}, one_rdms={}, run_type=None, spinpol=None, title=None, two_ints={}, two_rdms={})[source]¶
Bases:
object
A container class for data loaded from (or to be written to) a file.
In principle, the constructor accepts any keyword argument, which is stored as an attribute. All attributes are optional. Attributes can be set or removed after the IOData instance is constructed. The following attributes are supported by at least one of the io formats:
- atcharges¶
A dictionary where keys are names of charge definitions and values are arrays with atomic charges (size N).
- atcoords¶
A (N, 3) float array with Cartesian coordinates of the atoms.
- atcorenums¶
A (N,) float array with pseudo-potential core charges. The matrix elements corresponding to ghost atoms are zero.
- atffparams¶
A dictionary with arrays of atomic force field parameters (typically non-bonded). Keys include ‘charges’, ‘vdw_radii’, ‘sigmas’, ‘epsilons’, ‘alphas’ (atomic polarizabilities), ‘c6s’, ‘c8s’, ‘c10s’, ‘buck_as’, ‘buck_bs’, ‘lj_as’, ‘core_charges’, ‘valence_charges’, ‘valence_widths’, etc. Not all of them have to be present, depending on the use case.
- atfrozen¶
A (N,) bool array with frozen atoms. (All atoms are free if this attribute is not set.)
- atgradient¶
A (N, 3) float array with the first derivatives of the energy w.r.t. Cartesian atomic displacements.
- athessian¶
A (3*N, 3*N) array containing the energy Hessian w.r.t. Cartesian atomic displacements.
- atmasses¶
A (N,) float array with atomic masses
- atnums¶
A (N,) int vector with the atomic numbers.
- basisdef¶
A basis set definition, i.e. a dictionary whose keys are symbols (of chemical elements), atomic numbers (similar to previous, str to make distinction with following) or an atom index (integer referring to a specific atom in a molecule). The format of the values is to be decided when implementing a load function for basis set definitions.
- bonds¶
An (nbond, 3) array with the list of covalent bonds. Each row represents one bond and consists of three integers: first atom index (starting from zero), second atom index, and an optional bond type. Numerical values of bond types are defined in iodata.periodic.
- cellvecs¶
A (NP, 3) array containing the (real-space) cell vectors describing periodic boundary conditions. A single vector corresponds to a 1D cell, e.g. for a wire. Two vectors describe a 2D cell, e.g. for a membrane. Three vectors describe a 3D cell, e.g. a crystalline solid.
- charge¶
The net charge of the system. When possible, this is derived from atcorenums and nelec.
- core_energy¶
The Hartree-Fock energy due to the core orbitals
- cube¶
An instance of Cube, describing the volumetric data from a cube (or similar) file.
- energy¶
The total energy (electronic + nn)
- extcharges¶
Array with values of external charges, with shape (nextcharge, 4). First three columns for Cartesian X, Y and Z coordinates, last column for the actual charge.
- extra¶
A dictionary with additional data loaded from a file. Any data which cannot be assigned to the other attributes belongs here. It may be decided in future to move some of the results from this dictionary to IOData attributes, with a more final name.
- g_rot¶
The rotational symmetry number of the molecule.
- lot¶
The level of theory used to compute the orbitals (and other properties).
- mo¶
An instance of MolecularOrbitals.
- moments¶
A dictionary with electrostatic multipole moments. Keys are (angmom, kind) tuples where angmom is an integer for the angular momentum and kind is ‘c’ for Cartesian or ‘p’ for pure functions (only for angmom >= 2). The corresponding values are 1D numpy arrays. The order of the components of the multipole moments follows the HORTON2_CONVENTIONS from iodata/basis.py
- nelec¶
The number of electrons.
- obasis¶
An OrderedDict containing parameters to instantiate a GOBasis class.
- obasis_name¶
A name or DOI describing the basis set used for the orbitals in the mo attribute (if applicable). Should be consistent with www.basissetexchange.org.
- one_ints¶
Dictionary where keys are names and values are numpy arrays with one-body operators, typically integrals of a one-body operator with a pair of (Gaussian) basis functions. Names can start with olp (overlap), kin (kinetic energy), na (nuclear attraction), core (core hamiltonian), etc., or one (general one-electron integral). When relevant, these names must have a suffix _ao or _mo to clarify in which basis the integrals are computed. _ao is used to denote integrals in a non-orthogonal (atomic orbital) basis. _mo is used to denote an orthogonal (molecular orbital) basis. For the overlap integrals, this suffix can be omitted because it is only useful to compute them in the atomic-orbital basis.
- one_rdms¶
Dictionary where keys are names and values are one-particle density matrices. Names can be scf, post_scf, scf_spin, post_scf_spin. When relevant, these names must have a suffix _ao or _mo to clarify in which basis the RDMs are computed. _ao is used to denote a non-orthogonal (atomic orbital) basis. _mo is used to denote an orthogonal (molecular orbital) basis. For the SCF RDMs, this suffix can be omitted because it is only useful to compute them in the atomic-orbital basis.
- run_type¶
The type of calculation that led to the results stored in IOData, which must be one of the following: ‘energy’, ‘energy_force’, ‘opt’, ‘scan’, ‘freq’ or None.
- spinpol¶
The spin polarization. By default, its value is derived from the molecular orbitals (mo attribute), as abs(nalpha - nbeta). In this case, spinpol cannot be set. When no molecular orbitals are present, this attribute can be set.
- title¶
A suitable name for the data.
- two_ints¶
Dictionary where keys are names and values are numpy arrays with two-body operators, typically integrals of a two-body operator with four (Gaussian) basis functions. Names can start with er (electron repulsion) or two (general pairwise interaction). When relevant, these names must have a suffix _ao or _mo to clarify in which basis the integrals are computed. See one_ints for more details. Array indexes are in physicist’s notation.
- two_rdms¶
Dictionary where keys are names and values are two-particle density matrices. Names can be post_scf or post_scf_spin. When relevant, these names must have a suffix _ao or _mo to clarify in which basis the RDMs are computed. See one_rdms for more details. Array indexes are in physicist’s notation.
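Some attributes in the list above are related to each other. For example, charge is described as derivable from atcorenums and nelec. The sketch below shows the arithmetic that this presumably amounts to (sum of core charges minus number of electrons). The function name derive_charge is hypothetical and plain lists stand in for the numpy arrays.

```python
def derive_charge(atcorenums, nelec):
    """Net charge as total core charge minus number of electrons (sketch)."""
    return sum(atcorenums) - nelec


# Water: O (8) and two H (1 each), all-electron, so atcorenums equals atnums.
water_atcorenums = [8.0, 1.0, 1.0]

neutral_charge = derive_charge(water_atcorenums, nelec=10)
cation_charge = derive_charge(water_atcorenums, nelec=9)
```

With pseudo-potentials, atcorenums would hold the effective core charges instead of the atomic numbers, and the same relation would apply.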
- __init__(atcharges={}, atcoords=None, atcorenums=None, atffparams={}, atfrozen=None, atgradient=None, athessian=None, atmasses=None, atnums=None, basisdef=None, bonds=None, cellvecs=None, charge=None, core_energy=None, cube=None, energy=None, extcharges=None, extra={}, g_rot=None, lot=None, mo=None, moments={}, nelec=None, obasis=None, obasis_name=None, one_ints={}, one_rdms={}, run_type=None, spinpol=None, title=None, two_ints={}, two_rdms={})¶
Method generated by attrs for class IOData.
-
atcoords:
ndarray
¶
- property atcorenums: ndarray¶
Return effective core charges.
-
atfrozen:
ndarray
¶
-
atgradient:
ndarray
¶
-
athessian:
ndarray
¶
-
atmasses:
ndarray
¶
-
atnums:
ndarray
¶
-
bonds:
ndarray
¶
-
cellvecs:
ndarray
¶
-
extcharges:
ndarray
¶
-
obasis:
MolecularBasis
¶
iodata.orbitals module¶
Data structure for molecular orbitals.
- class MolecularOrbitals(kind, norba, norbb, occs=None, coeffs=None, energies=None, irreps=None)[source]¶
Bases:
object
Class of Orthonormal Molecular Orbitals.
- kind¶
Type of molecular orbitals, which can be ‘restricted’, ‘unrestricted’, or ‘generalized’.
- norba¶
Number of (occupied and virtual) alpha molecular orbitals. Set to None in case of type==’generalized’.
- norbb¶
Number of (occupied and virtual) beta molecular orbitals. Set to None in case of type==’generalized’. This is expected to be equal to norba for the restricted kind.
- occs¶
Molecular orbital occupation numbers. The length equals the number of columns of coeffs.
- coeffs¶
Molecular orbital coefficients. In case of restricted: shape = (nbasis, norba) = (nbasis, norbb). In case of unrestricted: shape = (nbasis, norba + norbb). In case of generalized: shape = (2 * nbasis, norb), where norb is the total number of orbitals.
- energies¶
Molecular orbital energies. The length equals the number of columns of coeffs.
- irreps¶
Irreducible representation. The length equals the number of columns of coeffs.
Warning: the interpretation of the occupation numbers may only be suitable for single-reference orbitals (not fractionally occupied natural orbitals). When an occupation number is in ]0, 1], it is assumed that an alpha orbital is (fractionally) occupied. When an occupation number is in ]1, 2], it is assumed that the alpha orbital is fully occupied and the beta orbital is (fractionally) occupied.
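The interpretation of occupation numbers described in the warning can be sketched for restricted orbitals as follows. The helper split_occupations is hypothetical, uses plain lists instead of arrays, and assumes occupation numbers between 0 and 2. The spin polarization is then taken as abs(nalpha - nbeta), as in the description of the spinpol attribute.

```python
def split_occupations(occs):
    """Split restricted occupations into alpha and beta parts (sketch)."""
    # The alpha orbital is filled first, up to one electron per orbital.
    occsa = [min(max(occ, 0.0), 1.0) for occ in occs]
    # Whatever exceeds one electron is assigned to the beta orbital.
    occsb = [occ - occa for occ, occa in zip(occs, occsa)]
    return occsa, occsb


occs = [2.0, 2.0, 1.0, 0.0]
occsa, occsb = split_occupations(occs)
spinpol = abs(sum(occsa) - sum(occsb))
```

For fractionally occupied natural orbitals this split is generally not meaningful, which is the point of the warning.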
- __init__(kind, norba, norbb, occs=None, coeffs=None, energies=None, irreps=None)¶
Method generated by attrs for class MolecularOrbitals.
-
coeffs:
ndarray
¶
- property coeffsa¶
Return alpha orbital coefficients.
- property coeffsb¶
Return beta orbital coefficients.
-
energies:
ndarray
¶
- property energiesa¶
Return alpha orbital energies.
- property energiesb¶
Return beta orbital energies.
-
irreps:
ndarray
¶
- property irrepsa¶
Return alpha irreps.
- property irrepsb¶
Return beta irreps.
- property nbasis¶
Return the number of spatial basis functions.
- property norb¶
Return the number of spatially distinct orbitals.
Notes
In case of restricted wavefunctions, this may be less than just the sum of norba and norbb, because alpha and beta orbitals share the same spatial dependence.
-
occs:
ndarray
¶
- property occsa¶
Return alpha occupation numbers.
- property occsb¶
Return beta occupation numbers.
iodata.overlap module¶
Module for computing overlap of atomic orbital basis functions.
- compute_overlap(obasis0, atcoords0, obasis1=None, atcoords1=None)[source]¶
Compute overlap matrix for the given molecular basis set(s).
\[\braket{\psi_{i}}{\psi_{j}}\]
When only one basis set is given, the overlap matrix of that basis (with itself) is computed. If a second basis set (with its atomic coordinates) is provided, the overlap between the two basis sets is computed.
This function takes into account the requested order of the basis functions in obasis0.conventions (and obasis1.conventions). Note that only L2 normalized primitives are supported at the moment.
- Parameters:
obasis0 (MolecularBasis) – The orbital basis set.
atcoords0 (ndarray) – The atomic Cartesian coordinates (including those of ghost atoms).
obasis1 (Optional[MolecularBasis]) – An optional second orbital basis set.
atcoords1 (Optional[ndarray]) – An optional second array with atomic Cartesian coordinates (including those of ghost atoms).
- Returns:
The matrix with overlap integrals, shape=(obasis0.nbasis, obasis1.nbasis).
- Return type:
overlap
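For intuition, the simplest matrix element of this kind is the overlap of two L2-normalized s-type Gaussian primitives, for which a closed form is well known. The sketch below evaluates that textbook formula with the standard library only. The function s_overlap is a hypothetical helper and does not reproduce how compute_overlap handles contractions, higher angular momenta, or conventions.

```python
import math


def s_overlap(alpha, center_a, beta, center_b):
    """Overlap of two L2-normalized s-type Gaussian primitives (sketch)."""
    dist2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    gamma = alpha + beta
    # L2 normalization constants of exp(-alpha r^2) and exp(-beta r^2).
    norm_a = (2.0 * alpha / math.pi) ** 0.75
    norm_b = (2.0 * beta / math.pi) ** 0.75
    # Gaussian product theorem: prefactor times the integral of one Gaussian.
    prefactor = math.exp(-alpha * beta / gamma * dist2)
    return norm_a * norm_b * prefactor * (math.pi / gamma) ** 1.5


same_center = s_overlap(0.5, (0.0, 0.0, 0.0), 0.5, (0.0, 0.0, 0.0))
displaced = s_overlap(0.5, (0.0, 0.0, 0.0), 1.2, (0.0, 0.0, 1.4))
swapped = s_overlap(1.2, (0.0, 0.0, 1.4), 0.5, (0.0, 0.0, 0.0))
```

A normalized primitive has unit overlap with itself, and the overlap is symmetric under exchange of the two functions, which corresponds to a symmetric overlap matrix when a single basis is used.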
iodata.overlap_cartpure module¶
Transformation matrices from Cartesian to pure basis functions.
Both Cartesian and pure functions are assumed to be normalized. These matrices were generated with:
python tools/harmonics.py L2 python 7
iodata.periodic module¶
Periodic table module.
iodata.utils module¶
Utility functions module.
- class Cube(origin, axes, data)[source]¶
Bases:
object
The volumetric data from a cube (or similar) file.
- origin¶
A 3D vector with the origin of the axes frame.
- axes¶
A (3, 3) array where each row represents the spacing between two neighboring grid points along the first, second and third axis, respectively.
- data¶
A (K, L, M) array of data on a uniform grid
- __init__(origin, axes, data)¶
Method generated by attrs for class Cube.
-
axes:
ndarray
¶
-
data:
ndarray
¶
-
origin:
ndarray
¶
- property shape¶
Shape of the rectangular grid.
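Given the description of origin and axes above, the position of a grid point would follow from adding integer multiples of the rows of axes to origin. The sketch below assumes this usual cube-file interpretation. The function grid_point is hypothetical and is not part of the Cube class.

```python
def grid_point(origin, axes, i, j, k):
    """Cartesian position of grid point (i, j, k) on a uniform grid (sketch)."""
    # Each row of axes is the step between neighboring points along one axis.
    return tuple(
        origin[d] + i * axes[0][d] + j * axes[1][d] + k * axes[2][d]
        for d in range(3)
    )


origin = (-1.0, -1.0, -1.0)
axes = [
    (0.5, 0.0, 0.0),
    (0.0, 0.5, 0.0),
    (0.0, 0.0, 0.25),
]

first_point = grid_point(origin, axes, 0, 0, 0)
other_point = grid_point(origin, axes, 2, 0, 4)
```

Because the rows of axes need not be orthogonal, the same expression also covers skewed grids.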
- exception FileFormatError[source]¶
Bases:
OSError
Raised when incorrect content is encountered when loading files.
- __init__(*args, **kwargs)¶
- args¶
- characters_written¶
- errno¶
POSIX exception code
- filename¶
exception filename
- filename2¶
second exception filename
- strerror¶
exception strerror
- with_traceback()¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- exception FileFormatWarning[source]¶
Bases:
Warning
Raised when incorrect content is encountered and fixed when loading files.
- __init__(*args, **kwargs)¶
- args¶
- with_traceback()¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class LineIterator(filename)[source]¶
Bases:
object
Iterator class for looping over lines and keeping track of the line number.
- __init__(filename)[source]¶
Initialize a LineIterator.
- Parameters:
filename (
str
) – The file that will be read.
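The idea of an iterator that keeps track of the line number can be sketched as below. The class CountingLines is hypothetical: unlike LineIterator it wraps any iterable of lines instead of opening a file by name, and the attribute name lineno is an assumption made for this example.

```python
import io


class CountingLines:
    """Iterate over lines while remembering how many were consumed (sketch)."""

    def __init__(self, lines):
        self._iter = iter(lines)
        self.lineno = 0

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the wrapped iterator ends the loop as usual.
        line = next(self._iter)
        self.lineno += 1
        return line


stream = io.StringIO("3\nwater\nO 0.0 0.0 0.0\n")
lit = CountingLines(stream)
title_lineno = None
for line in lit:
    if line.strip() == "water":
        title_lineno = lit.lineno
```

Knowing the current line number makes it possible to report where in a file a parsing problem occurred.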
- check_dm(dm, overlap, eps=0.0001, occ_max=1.0)[source]¶
Check if the density matrix has eigenvalues in the proper range.
- Parameters:
- Raises:
ValueError – When the density matrix has wrong eigenvalues.
- derive_naturals(dm, overlap)[source]¶
Derive natural orbitals from a given density matrix.
- Parameters:
dm (ndarray) – The density matrix. shape=(nbasis, nbasis)
overlap (ndarray) – The overlap matrix. shape=(nbasis, nbasis)
- Return type:
Tuple[ndarray, ndarray]
- Returns:
coeffs – Orbital coefficients shape=(nbasis, nfn)
occs – Orbital occupations shape=(nfn, )
- set_four_index_element(four_index_object, i, j, k, l, value)[source]¶
Assign values to a four index object, account for 8-fold index symmetry.
This function assumes physicists’ notation.
- Parameters:
four_index_object (ndarray) – The four-index object. It will be written to. shape=(nbasis, nbasis, nbasis, nbasis), dtype=float
i (int) – The indices to assign to.
j (int) – The indices to assign to.
k (int) – The indices to assign to.
l (int) – The indices to assign to.
value (float) – The value of the matrix element to store.
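The 8-fold index symmetry of real two-electron integrals in physicists’ notation can be made explicit with a short sketch. Here a plain dictionary keyed by index tuples stands in for the four-index ndarray, and the function name set_symmetric is hypothetical.

```python
def set_symmetric(store, i, j, k, l, value):
    """Store value at all 8 symmetry-equivalent index tuples (sketch)."""
    equivalent = (
        (i, j, k, l),  # original element <ij|kl>
        (j, i, l, k),  # exchange of the two electrons
        (k, j, i, l),  # swap bra and ket function of electron 1
        (i, l, k, j),  # swap bra and ket function of electron 2
        (k, l, i, j),  # both swaps
        (l, k, j, i),  # both swaps, then exchange of electrons
        (j, k, l, i),  # swap for electron 2, then exchange of electrons
        (l, i, j, k),  # swap for electron 1, then exchange of electrons
    )
    for index in equivalent:
        store[index] = value


eri = {}
set_symmetric(eri, 0, 1, 2, 3, 0.25)

diagonal = {}
set_symmetric(diagonal, 0, 0, 0, 0, 1.0)
```

When some indices coincide, several of the eight tuples are identical, so fewer distinct elements are written.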
- volume(cellvecs)[source]¶
Calculate the (generalized) cell volume.
- Parameters:
cellvecs (ndarray) – A numpy matrix of shape (x,3) where x is in {1,2,3}. Each row is one cell vector.
- Returns:
In case of 3D, the cell volume. In case of 2D, the cell area. In case of 1D, the cell length.
- Return type:
volume
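One way to treat the 1D, 2D and 3D cases uniformly is the Gram determinant: the generalized volume equals the square root of det(A Aᵀ), where the rows of A are the cell vectors. The sketch below uses this formula with nested lists. It is an alternative formulation chosen for illustration and not necessarily how volume is implemented; the names generalized_volume and _det are hypothetical.

```python
import math


def _det(matrix):
    """Determinant by cofactor expansion, adequate for sizes up to 3x3."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, value in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * value * _det(minor)
    return total


def generalized_volume(cellvecs):
    """Length, area or volume spanned by 1, 2 or 3 cell vectors (sketch)."""
    # Gram matrix of all dot products between the cell vectors.
    gram = [[sum(a * b for a, b in zip(u, v)) for v in cellvecs] for u in cellvecs]
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(_det(gram), 0.0))


length = generalized_volume([[2.0, 0.0, 0.0]])
area = generalized_volume([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
vol = generalized_volume([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
```

For three vectors the result coincides with the absolute value of the determinant of the cell matrix.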
Module contents¶
Input and Output Module.