matador.utils package¶
This module contains various utility functions that are used liberally throughout matador.
Submodules¶
matador.utils.ase_utils module¶
This file implements some light wrappers to the Atomic Simulation Environment (ASE).
-
matador.utils.ase_utils.
ase2dict
(atoms, as_model=False) → Union[dict, matador.crystal.crystal.Crystal][source]¶ Return a matador document (dictionary or Crystal) from an ase.Atoms object.
-
matador.utils.ase_utils.
doc2ase
(doc: Union[dict, matador.crystal.crystal.Crystal], add_keys_to_info=True)[source]¶ Convert matador document to simple ASE object.
- Parameters
doc (dict/Crystal) – matador document or Crystal containing the structure.
- Keyword Arguments
add_keys_to_info (bool) – whether or not to add the keys from the matador document to the info section of the Atoms object.
matador.utils.castep_help_utils module¶
This submodule is essentially a script to scrape CASTEP help strings for all possible CASTEP parameters.
matador.utils.castep_params module¶
This file contains a Python list of all CASTEP parameters, automatically generated with file_utils.scrape_castep_params().
matador.utils.cell_utils module¶
This submodule implements some useful functions for real/reciprocal cell manipulation, symmetry checking and sampling (e.g. grids and paths).
-
matador.utils.cell_utils.
abc2cart
(lattice_abc)[source]¶ Converts lattice parameters into Cartesian lattice vectors.
-
matador.utils.cell_utils.
cart2abcstar
(lattice_cart)[source]¶ Convert lattice_cart =[[a1,a2,a3],[b1,b2,b3],[c1,c2,c3]] to the reciprocal of the lattice vectors, NOT the reciprocal lattice vectors.
-
matador.utils.cell_utils.
cart2abc
(lattice_cart)[source]¶ Convert Cartesian lattice vectors to lattice parameters.
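The two conversions above (abc2cart and cart2abc) can be illustrated with a minimal pure-Python sketch. It is not matador's implementation: the function names abc_to_cart and cart_to_abc are hypothetical, and it assumes lattice parameters are given as [[a, b, c], [alpha, beta, gamma]] with angles in degrees, with a placed along x and b in the xy-plane.

```python
import math


def abc_to_cart(lattice_abc):
    """Convert [[a, b, c], [alpha, beta, gamma]] (degrees) to Cartesian vectors."""
    (a, b, c), angles = lattice_abc
    alpha, beta, gamma = (math.radians(angle) for angle in angles)
    # Assumed orientation: a along x, b in the xy-plane; c follows from the angles.
    c_x = c * math.cos(beta)
    c_y = c * (math.cos(alpha) - math.cos(beta) * math.cos(gamma)) / math.sin(gamma)
    c_z = math.sqrt(c**2 - c_x**2 - c_y**2)
    return [
        [a, 0.0, 0.0],
        [b * math.cos(gamma), b * math.sin(gamma), 0.0],
        [c_x, c_y, c_z],
    ]


def cart_to_abc(lattice_cart):
    """Convert Cartesian lattice vectors back to lengths and angles (degrees)."""

    def norm(vec):
        return math.sqrt(sum(x * x for x in vec))

    def angle(u, v):
        cosine = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
        return math.degrees(math.acos(cosine))

    vec_a, vec_b, vec_c = lattice_cart
    lengths = [norm(vec) for vec in lattice_cart]
    # alpha is the angle between b and c, beta between a and c, gamma between a and b.
    return [lengths, [angle(vec_b, vec_c), angle(vec_a, vec_c), angle(vec_a, vec_b)]]


params = [[3.0, 4.0, 5.0], [80.0, 95.0, 110.0]]
round_trip = cart_to_abc(abc_to_cart(params))
```

Converting to Cartesian vectors and back should reproduce the input parameters to within floating-point error.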
-
matador.utils.cell_utils.
frac2cart
(lattice_cart, positions_frac)[source]¶ Convert positions_frac block into positions_abs.
-
matador.utils.cell_utils.
wrap_frac_coords
(positions, remove=False)[source]¶ Wrap the given fractional coordinates back into the cell.
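As a rough illustration of what wrapping fractional coordinates means, the sketch below maps every component into the interval [0, 1) with the modulo operator. The name wrap_frac_coords_sketch is hypothetical, and the behaviour of the remove keyword is not modelled here.

```python
def wrap_frac_coords_sketch(positions):
    """Map each fractional coordinate back into [0, 1) using modulo 1."""
    return [[component % 1.0 for component in position] for position in positions]


wrapped = wrap_frac_coords_sketch([[-0.25, 1.5, 0.3]])
```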
-
matador.utils.cell_utils.
switch_coords
(lattice, pos, norm=None)[source]¶ Act on coordinates with the relevant lattice vectors to switch from fractional to absolute coordinates.
- Parameters
lattice (np.ndarray(3, 3)) – either lattice_cart or reciprocal lattice_cart
pos (np.ndarray(3, :)) – input positions to convert
- Keyword Arguments
norm (float) – divide final coordinates by normalisation factor, e.g. 2*np.pi when lattice is recip and positions are cartesian.
- Returns
converted positions
- Return type
np.ndarray(3, :)
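The fractional-to-absolute conversion amounts to summing the lattice vectors weighted by the fractional components. The following is a minimal sketch under that assumption, with lattice vectors stored as rows; frac_to_cart_sketch is a hypothetical name, not the matador function.

```python
def frac_to_cart_sketch(lattice_cart, positions_frac):
    """Absolute position = sum over i of frac[i] * lattice_cart[i] (row vectors)."""
    return [
        [sum(frac[i] * lattice_cart[i][k] for i in range(3)) for k in range(3)]
        for frac in positions_frac
    ]


orthorhombic = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 4.0]]
sheared = [[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
absolute = frac_to_cart_sketch(orthorhombic, [[0.5, 0.5, 0.25]])
absolute_sheared = frac_to_cart_sketch(sheared, [[1.0, 1.0, 0.0]])
```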
-
matador.utils.cell_utils.
cart2frac
(lattice_cart, positions_abs)[source]¶ Convert positions_abs block into positions_frac (and equivalent in reciprocal space).
-
matador.utils.cell_utils.
real2recip
(real_lat)[source]¶ Convert the real lattice in Cartesian basis to the reciprocal space lattice.
-
matador.utils.cell_utils.
calc_mp_grid
(lattice_cart, spacing)[source]¶ Return correct Monkhorst-Pack grid based on lattice vectors and desired spacing.
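One common way to pick a Monkhorst-Pack grid from a target spacing is to divide each reciprocal lattice vector length by the spacing and round up. The sketch below assumes the spacing is given in units of 2π Å⁻¹ and that the reciprocal lattice includes the factor of 2π; both are assumptions, and the helper names are hypothetical rather than matador's own.

```python
import math


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def real_to_recip_sketch(lattice_cart):
    """Reciprocal lattice vectors b_i = 2*pi * (a_j x a_k) / volume."""
    a1, a2, a3 = lattice_cart
    factor = 2 * math.pi / dot(a1, cross(a2, a3))
    return [[factor * x for x in cross(u, v)] for u, v in ((a2, a3), (a3, a1), (a1, a2))]


def mp_grid_sketch(lattice_cart, spacing):
    """Smallest grid whose k-point spacing does not exceed the requested value."""
    recip = real_to_recip_sketch(lattice_cart)
    return [math.ceil(math.sqrt(dot(b, b)) / (2 * math.pi * spacing)) for b in recip]


grid = mp_grid_sketch([[3.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 5.0]], 0.07)
```

For this orthorhombic cell the longer real-space axes receive fewer k-points, as expected.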
-
matador.utils.cell_utils.
shift_to_include_gamma
(mp_grid)[source]¶ Calculate the shift required to include $\Gamma$ in the Monkhorst-Pack grid.
-
matador.utils.cell_utils.
shift_to_exclude_gamma
(mp_grid)[source]¶ Calculate the shift required to exclude $\Gamma$ in the Monkhorst-Pack grid. Returns the “minimal shift”, i.e. only one direction will be shifted.
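The two shift helpers can be sketched as follows. This assumes the convention that an unshifted grid direction with an odd number of points already samples $\Gamma$, and that a shift of 1/(2n) along a direction with n points toggles this; the function names are hypothetical and the exact rules used by matador may differ.

```python
def shift_to_include_gamma_sketch(mp_grid):
    """Shift every even-numbered direction by half a grid step so Gamma is sampled."""
    return [1.0 / (2 * n) if n % 2 == 0 else 0.0 for n in mp_grid]


def shift_to_exclude_gamma_sketch(mp_grid):
    """Minimal shift: if all directions are odd, shift only the first one."""
    shift = [0.0, 0.0, 0.0]
    if all(n % 2 == 1 for n in mp_grid):
        shift[0] = 1.0 / (2 * mp_grid[0])
    return shift


include_shift = shift_to_include_gamma_sketch([4, 5, 6])
exclude_shift = shift_to_exclude_gamma_sketch([3, 5, 7])
already_excluded = shift_to_exclude_gamma_sketch([4, 5, 7])
```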
-
matador.utils.cell_utils.
get_best_mp_offset_for_cell
(doc)[source]¶ Calculates the “best” kpoint_mp_offset to use for the passed cell. If the crystal has a hexagonal space group, then the offset returned will shift the grid to include the $\Gamma$ point, and vice versa for non-hexagonal cells.
-
matador.utils.cell_utils.
calc_mp_spacing
(real_lat, mp_grid, prec=3)[source]¶ Convert real lattice in Cartesian basis and the kpoint_mp_grid into a grid spacing.
-
matador.utils.cell_utils.
get_seekpath_kpoint_path
(doc, standardize=True, explicit=True, spacing=0.01, threshold=1e-07, debug=False, symmetry_tol=None)[source]¶ Return the conventional kpoint path of the relevant crystal system according to the definitions by “HKPOT” in Comp. Mat. Sci. 128, 2017:
http://dx.doi.org/10.1016/j.commatsci.2016.10.015
- Parameters
doc (dict/tuple) – matador doc or spglib tuple to find kpoint path for.
- Keyword Arguments
- Returns
standardized version of input doc (dict), list of kpoint positions (list), and full dictionary of all seekpath results (dict).
- Return type
-
matador.utils.cell_utils.
doc2spg
(doc, check_occ=True)[source]¶ Return an spglib input tuple from a matador doc.
-
matador.utils.cell_utils.
get_space_group_label_latex
(label)[source]¶ Return the LaTeX format of the passed space group label. Takes any string, leaves the first character upright, italicises the rest, handles subscripts and bars over numbers.
-
matador.utils.cell_utils.
standardize_doc_cell
(doc, primitive=True, symprec=0.01)[source]¶ Return standardized cell data from matador doc.
-
matador.utils.cell_utils.
get_spacegroup_spg
(doc, symprec=0.01, check_occ=True)[source]¶ Return spglib spacegroup for a cell.
-
matador.utils.cell_utils.
add_noise
(doc, amplitude=0.1)[source]¶ Add random noise to the positions of structure contained in doc. Useful for force convergence tests.
-
matador.utils.cell_utils.
calc_pairwise_distances_pbc
(poscart, images, lattice, rmax, poscart_b=None, compress=False, debug=False, filter_zero=False, per_image=False)[source]¶ Calculate PBC distances with SciPy’s cdist, given the image cell vectors.
- Parameters
poscart (numpy.ndarray) – list or array of absolute atomic coordinates.
images – iterable of lattice vector multiples (e.g. [2, -1, 3]) required to obtain the translation to desired image cells.
lattice (list of list) – list of lattice vectors of the real cell.
rmax (float) – maximum value after which to mask the array.
- Keyword Arguments
poscart_b (numpy.ndarray) – absolute positions of another type of atom, where only A-B distances will be calculated.
debug (bool) – print timing data and how many distances were masked.
compress (bool) – whether or not to compress the output array, useful when e.g. creating PDFs but not when atom ID is important.
filter_zero (bool) – whether or not to filter out the “self-interaction” zero distances.
per_image (bool) – return a list of distances per image, as opposed to one large flat array. This preserves atom IDs for use elsewhere.
- Returns
pairwise 2-D d_ij masked array with values, or stripped 1-D array containing just the distances, or a list of numpy arrays if per_image is True.
- Return type
distances (numpy.ndarray)
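The idea behind the periodic distance calculation can be sketched in pure Python: translate the atoms by each image vector, measure all pair distances, and discard those beyond rmax. This simplified version returns a flat list rather than a masked array and does not use SciPy's cdist; the name pairwise_distances_pbc_sketch and the zero-distance tolerance are assumptions made for illustration.

```python
import itertools
import math


def pairwise_distances_pbc_sketch(poscart, images, lattice, rmax, filter_zero=False):
    """Collect all A-A distances up to rmax over the supplied periodic images."""
    distances = []
    for image in images:
        # Translation vector for this image: integer multiples of the lattice vectors.
        shift = [sum(image[i] * lattice[i][k] for i in range(3)) for k in range(3)]
        for pos_a in poscart:
            for pos_b in poscart:
                shifted = [pos_b[k] + shift[k] for k in range(3)]
                distance = math.dist(pos_a, shifted)
                if distance > rmax:
                    continue
                if filter_zero and distance < 1e-12:
                    continue  # skip the "self-interaction" zero distance
                distances.append(distance)
    return distances


cubic = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
images = list(itertools.product(range(-1, 2), repeat=3))
neighbours = pairwise_distances_pbc_sketch([[0.0, 0.0, 0.0]], images, cubic, 3.5, filter_zero=True)
```

For a single atom in a simple cubic cell, only the six nearest periodic copies fall within the cutoff.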
matador.utils.chem_utils module¶
This submodule defines some useful chemical functions and constants, with a focus on battery materials.
-
matador.utils.chem_utils.
get_iupac_ordering
()[source]¶ Stub for implementing IUPAC chemical ordering in formulae.
-
matador.utils.chem_utils.
get_atomic_symbol
(atomic_number)[source]¶ Returns elemental symbol from atomic number.
-
matador.utils.chem_utils.
get_concentration
(doc, elements, include_end=False)[source]¶ Returns x for A_x B_{1-x} or x,y for A_x B_y C_z, (x+y+z=1).
- Parameters
doc (list/dict) – structure to evaluate OR matador-style stoichiometry.
elements (list) – list of element symbols to enforce ordering.
- Keyword Arguments
include_end (bool) – whether or not to return the final value, i.e. [x, y, z] rather than [x, y] in the above.
- Returns
concentrations of elements in given order.
- Return type
list of float
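A minimal sketch of the concentration calculation, assuming a matador-style stoichiometry such as [['Li', 3], ['P', 1]] is passed directly. The name concentration_sketch is hypothetical and the handling of full documents is omitted.

```python
def concentration_sketch(stoichiometry, elements, include_end=False):
    """Return atomic fractions in the order given by `elements`."""
    counts = {symbol: amount for symbol, amount in stoichiometry}
    total = sum(counts.get(element, 0) for element in elements)
    concentrations = [counts.get(element, 0) / total for element in elements]
    # The last value is implied by the others, so it is dropped by default.
    return concentrations if include_end else concentrations[:-1]


binary = concentration_sketch([["Li", 3], ["P", 1]], ["Li", "P"])
binary_full = concentration_sketch([["Li", 3], ["P", 1]], ["Li", "P"], include_end=True)
ternary = concentration_sketch([["Li", 1], ["P", 1]], ["Li", "Sn", "P"])
```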
-
matador.utils.chem_utils.
get_num_intercalated
(cursor)[source]¶ Return array of the number of intercalated atoms per host atom from a list of structures, of type defined by the first entry in the structures’ concentration vectors.
- Parameters
cursor (list of dict) – structures to evaluate.
- Returns
number of intercalated ions in each structure.
- Return type
ndarray
-
matador.utils.chem_utils.
get_binary_grav_capacities
(x, m_B)[source]¶ Returns capacity in mAh/g from x/y in A_x B_y and m_B in a.m.u.
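The gravimetric capacity of a binary A_x B_y electrode follows from Faraday's constant. The sketch below assumes one electron is transferred per intercalated A atom, that x is the number of A atoms per B atom, and that 1 mAh = 3.6 C; the function name and constant names are hypothetical.

```python
FARADAY_C_PER_MOL = 96485.33212  # Faraday constant in C/mol
C_TO_MAH = 1.0 / 3.6  # 1 mAh = 3.6 C


def binary_grav_capacity_sketch(x, m_B):
    """Capacity in mAh/g for x atoms of A per B atom, with m_B in a.m.u. (g/mol)."""
    return x * FARADAY_C_PER_MOL * C_TO_MAH / m_B


# Li15Si4 corresponds to 3.75 Li per Si; silicon has a mass of about 28.0855 a.m.u.
capacity = binary_grav_capacity_sketch(3.75, 28.0855)
```

This gives roughly 3579 mAh/g, in line with the commonly quoted capacity of silicon lithiated to Li15Si4.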
-
matador.utils.chem_utils.
get_generic_grav_capacity
(concs, elements)[source]¶ Returns gravimetric capacity of <elements[0]> in mAh/g of matador doc.
-
matador.utils.chem_utils.
get_binary_volumetric_capacity
(initial_doc, final_doc)[source]¶ For initial (delithiated/sodiated) (single element) structure and final (maximally charged) binary structure, calculate the volumetric capacity.
-
matador.utils.chem_utils.
get_atoms_per_fu
(doc)[source]¶ Calculate and return the number of atoms per formula unit.
- Parameters
doc (list/dict) – structure to evaluate OR matador-style stoichiometry.
-
matador.utils.chem_utils.
get_formation_energy
(chempots, doc, energy_key='enthalpy_per_atom')[source]¶ From given chemical potentials, calculate the simplest formation energy per atom of the desired document.
Note
recursive_get(doc, energy_key) MUST return an energy per atom for the target doc and the chemical potentials.
- Parameters
chempots (list of dict) – list of chempot structures, must be unique.
doc (dict) – structure to evaluate.
- Keyword Arguments
energy_key (str or list) – name of energy field to use to calculate formation energy. Can use a list of keys/subkeys/indices to query nested dicts with matador.utils.cursor_utils.recursive_get.
- Returns
formation energy per atom.
- Return type
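For elemental chemical potentials, the formation energy per atom is the energy per atom of the structure minus the composition-weighted average of the reference energies. The sketch below covers only that elemental case; the name formation_energy_sketch and the use of a 'stoichiometry' field on each document are assumptions for illustration.

```python
def formation_energy_sketch(chempots, doc, energy_key="enthalpy_per_atom"):
    """Formation energy per atom relative to elemental reference structures."""
    # Map each element to its reference energy per atom.
    mu = {chempot["stoichiometry"][0][0]: chempot[energy_key] for chempot in chempots}
    num_atoms = sum(amount for _, amount in doc["stoichiometry"])
    reference = sum(amount * mu[symbol] for symbol, amount in doc["stoichiometry"]) / num_atoms
    return doc[energy_key] - reference


chempots = [
    {"stoichiometry": [["Li", 1]], "enthalpy_per_atom": -1.0},
    {"stoichiometry": [["P", 1]], "enthalpy_per_atom": -2.0},
]
li3p = {"stoichiometry": [["Li", 3], ["P", 1]], "enthalpy_per_atom": -1.5}
eform = formation_energy_sketch(chempots, li3p)
```

With these made-up energies the reference is (3 × -1.0 + 1 × -2.0) / 4 = -1.25, so the formation energy is -0.25 per atom.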
-
matador.utils.chem_utils.
get_number_of_chempots
(stoich, chempot_stoichs, precision=5)[source]¶ Return the required number of each (arbitrary) chemical potential to construct one formula unit of the input stoichiometry. Uses least-squares as implemented by numpy.linalg.lstsq and rounds the output based on the precision kwarg.
- Parameters
stoich (list/dict) – matador-style stoichiometry, e.g. [[‘Li’, 3], [‘P’, 1]], or the full document.
chempot_stoichs (list/dict) – list of stoichiometries of the input chemical potentials, or the full documents.
- Keyword Arguments
precision (int/None) – number of decimal places to round answer to. None maintains the precision from numpy.linalg.lstsq.
- Returns
number of each chemical potential required to create 1 formula unit.
- Return type
- Raises
RuntimeError – if the stoichiometry provided cannot be created with the given chemical potentials.
-
matador.utils.chem_utils.
get_stoich
(atom_types)[source]¶ Return integer stoichiometry from atom_types list.
-
matador.utils.chem_utils.
get_padded_composition
(stoichiometry, elements)[source]¶ Return a list that contains how many of each species in elements exists in the given stoichiometry. e.g. for [[‘Li’, 2], [‘O’, 1]] with elements [‘O’, ‘Li’, ‘Ba’], this function will return [1, 2, 0].
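The padded composition described above can be reproduced with a dictionary lookup. This is a minimal sketch with a hypothetical name; any validation performed by the real function is not modelled.

```python
def padded_composition_sketch(stoichiometry, elements):
    """Count of each species in `elements`, with 0 for species that are absent."""
    counts = {symbol: amount for symbol, amount in stoichiometry}
    return [counts.get(element, 0) for element in elements]


padded = padded_composition_sketch([["Li", 2], ["O", 1]], ["O", "Li", "Ba"])
```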
-
matador.utils.chem_utils.
get_ratios_from_stoichiometry
(stoichiometry)[source]¶ Get a dictionary of pairwise atomic ratios.
-
matador.utils.chem_utils.
get_stoich_from_formula
(formula: str, sort=True)[source]¶ Convert formula string, e.g. Li2TiP4 into a matador-style stoichiometry, e.g. [[‘Li’, 2], [‘Ti’, 1], [‘P’, 4]].
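A formula string can be split into symbol and count pairs with a regular expression. The sketch below handles only integer counts, keeps the order in which elements appear, and ignores the sort keyword; stoich_from_formula_sketch is a hypothetical name.

```python
import re


def stoich_from_formula_sketch(formula):
    """Split e.g. 'Li2TiP4' into [['Li', 2], ['Ti', 1], ['P', 4]]."""
    # An element symbol is a capital letter plus an optional lowercase letter;
    # a missing count means one atom.
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return [[symbol, int(count) if count else 1] for symbol, count in tokens]


stoich = stoich_from_formula_sketch("Li2TiP4")
```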
-
matador.utils.chem_utils.
parse_element_string
(elements_str, stoich=False)[source]¶ Parse element query string with macros. Has to parse braces too, and throw an error if brackets are unmatched.
- e.g.
Parameters: ‘[VII][Fe,Ru,Os][I]’ Returns: [‘[VII]’, ‘[Fe,Ru,Os]’, ‘[I]’]
- e.g.2
Parameters: ‘[VII]2[Fe,Ru,Os][I]’ Returns: [‘[VII]2’, ‘[Fe,Ru,Os]’, ‘[I]’]
- Parameters
elements_str – str, chemical formula, including macros.
- Keyword Arguments
stoich – bool, parse as a stoichiometry, i.e. check for numbers
- Raises
RuntimeError – if the composition contains unmatched brackets.
- Returns
split list of elements contained in input
- Return type
-
matador.utils.chem_utils.
get_root_source
(source)[source]¶ Get the main file source from a doc’s source list.
- Parameters
source (str/list/dict) – contents of doc[‘source’] or the doc itself.
- Returns
“root” filename, e.g. if source = [‘KP.cell’, ‘KP.param’, ‘KP_specific_structure.res’] then root = ‘KP_specific_structure’.
- Return type
-
matador.utils.chem_utils.
get_formula_from_stoich
(stoich, elements=None, tex=False, sort=True, latex_sub_style='')[source]¶ Get the chemical formula of a structure from its matador stoichiometry.
- Parameters
stoich (list) – matador-style stoichiometry.
- Keyword Arguments
- Returns
the string representation of the chemical formula.
- Return type
matador.utils.cursor_utils module¶
This submodule defines some useful generic cursor methods for displaying, extracting and refining results from a Mongo cursor/list.
-
matador.utils.cursor_utils.
recursive_get
(data, keys, _top=True)[source]¶ Recursively slice a nested dictionary by a list of keys.
- Parameters
- Raises
KeyError – if any keys in the chain are missing,
IndexError – if any element of a sublist is missing.
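Slicing a nested structure by a chain of keys can be sketched with functools.reduce. This is an illustration with a hypothetical name, not the matador code, but it raises the same exception types listed above when a key or index is missing.

```python
import operator
from functools import reduce


def recursive_get_sketch(data, keys):
    """Follow `keys` (a single key, or a list of keys/indices) into nested data."""
    if isinstance(keys, (list, tuple)):
        return reduce(operator.getitem, keys, data)
    return data[keys]


doc = {"castep": {"energies": [-10.0, -12.5]}, "enthalpy_per_atom": -3.2}
nested = recursive_get_sketch(doc, ["castep", "energies", 1])
flat = recursive_get_sketch(doc, "enthalpy_per_atom")
```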
-
matador.utils.cursor_utils.
recursive_set
(data, keys, value)[source]¶ Recursively slice a nested dictionary by a list of keys and set the value.
-
matador.utils.cursor_utils.
display_results
(cursor, energy_key='enthalpy_per_atom', summary=False, args=None, argstr=None, additions=None, deletions=None, sort=True, hull=False, markdown=False, latex=False, colour=True, return_str=False, use_source=True, details=False, per_atom=False, eform=False, source=False, **kwargs)[source]¶ Print query results in a table, with many options for customisability.
TODO: this function has gotten out of control and should be rewritten.
- Parameters
cursor (list of dict or pm.cursor.Cursor) – list of matador documents
- Keyword Arguments
summary (bool) – print a summary per stoichiometry, that uses the lowest energy phase (requires sort=True).
argstr (str) – string to store matador initialisation command
eform (bool) – prepend energy key with “formation_”.
sort (bool) – sort input cursor by the value of energy key.
return_str (bool) – return string instead of printing.
details (bool) – print extra details as an extra line per structure.
per_atom (bool) – print quantities per atom, rather than per fu.
source (bool) – print all source files associated with the structure.
use_source (bool) – use the source instead of the text id when displaying a structure.
hull (bool) – whether to print hull-style (True) or query-style (False)
energy_key (str or list) – key (or recursive key) to print as energy (per atom)
markdown (bool) – whether or not to write a markdown file containing results
latex (bool) – whether or not to create a LaTeX table
colour (bool) – colour on-hull structures
additions (list) – list of string text_ids to be coloured green with a (+) or, list of indices referring to those structures in the cursor.
deletions (list) – list of string text_ids to be coloured red with a (-) or, list of indices referring to those structures in the cursor.
kwargs (dict) – any extra args are ignored.
- Returns
markdown or latex string, if markdown or latex is True, else None.
- Return type
-
matador.utils.cursor_utils.
loading_bar
(iterable, width=80, verbosity=0)[source]¶ Checks if tqdm exists and makes a loading bar, otherwise just returns initial iterable.
- Parameters
iterable (iterable) – the thing to be iterated over.
- Keyword Arguments
width (int) – maximum number of columns to use on screen.
- Returns
the decorated iterator.
- Return type
iterable
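The optional-dependency pattern described for loading_bar can be sketched as below: try to import tqdm and fall back to the bare iterable if it is unavailable. The name loading_bar_sketch is hypothetical and the verbosity keyword is not modelled.

```python
def loading_bar_sketch(iterable, width=80):
    """Wrap `iterable` in a tqdm progress bar when tqdm is installed."""
    try:
        import tqdm
    except ImportError:
        # tqdm is optional: without it, iterate silently.
        return iterable
    return tqdm.tqdm(iterable, ncols=width)


squares = [value**2 for value in loading_bar_sketch(range(4))]
```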
-
matador.utils.cursor_utils.
set_cursor_from_array
(cursor, array, key)[source]¶ Updates the key-value pair for documents in internal cursor from a numpy array.
-
matador.utils.cursor_utils.
get_array_from_cursor
(cursor, key, pad_missing=False)[source]¶ Returns a numpy array of the values of a key in a cursor, where the key can be defined as list of keys to use with recursive_get.
- Parameters
- Keyword Arguments
pad_missing (bool) – whether to fill array with NaN’s where data is missing.
- Raises
KeyError – if any document is missing that key, unless pad_missing is True.
- Returns
- numpy array containing results, padded
with np.nan if key is missing and pad_missing is True.
- Return type
np.ndarray
-
matador.utils.cursor_utils.
get_guess_doc_provenance
(sources, icsd=None)[source]¶ Returns a guess at the provenance of a structure from its source list.
Return possibilities are ‘ICSD’, ‘SWAP’, ‘OQMD’, ‘AIRSS’, ‘MP’ or ‘PF’.
-
matador.utils.cursor_utils.
filter_unique_structures
(cursor, quiet=False, **kwargs)[source]¶ Wrapper for matador.fingerprints.similarity.get_uniq_cursor that displays the results and returns the filtered cursor.
-
matador.utils.cursor_utils.
filter_cursor
(cursor, key, vals, verbosity=0)[source]¶ Returns a cursor obeying the filter on the given key. Any documents that are missing the key will not be returned. Any documents with values that cannot be compared to floats will also not be returned.
matador.utils.db_utils module¶
matador.utils.errors module¶
This submodule implements some useful exception types, mostly for use in the compute and calculator submodules.
-
exception
matador.utils.errors.
CalculationError
[source]¶ Bases:
Exception
Raised when a particular calculation fails, for non-fatal reasons.
-
exception
matador.utils.errors.
MaxMemoryEstimateExceeded
[source]¶ Bases:
Exception
Raised when a structure is estimated to exceed the max memory.
-
exception
matador.utils.errors.
CriticalError
[source]¶ Bases:
RuntimeError
Raise this when you don’t want any more jobs to run because something uncorrectable has happened! Plays more nicely with multiprocessing than SystemExit.
-
exception
matador.utils.errors.
InputError
[source]¶ Bases:
RuntimeError
Raise this when there is an issue with the input files.
-
exception
matador.utils.errors.
WalltimeError
[source]¶ Bases:
RuntimeError
Raise this when you don’t want any more jobs to run because they’re about to exceed the max walltime.
-
exception
matador.utils.errors.
NodeCollisionError
[source]¶ Bases:
matador.utils.errors.CalculationError
Dummy exception to raise when one node has tried to run a calculation that another node is performing.
matador.utils.hull_utils module¶
This file implements some useful geometric functions for the construction and manipulation of convex hulls.
-
matador.utils.hull_utils.
vertices2plane
(points)[source]¶ Convert points (xi, yi, zi) for i=1,..,3 into the equation of the plane spanned by the vectors v12, v13. For unit vectors e(i):
v12 x v13 = n = i*e(1) + j*e(2) + k*e(3)
and so the equation of the plane is
i*x + j*y + k*z + d = 0.
- Parameters
points (list of np.ndarray) – list of 3 3D numpy arrays containing the points comprising the vertex.
- Returns
a function which will return the vertical distance between the point and the plane.
- Return type
callable
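The plane construction above can be sketched in pure Python. The normal is the cross product v12 x v13, d follows from requiring the first point to lie on the plane, and the returned callable gives the height of a point above the plane along z. The sign convention and the (x, y, z) argument form of the callable are assumptions, as is the name vertices_to_plane_sketch.

```python
def vertices_to_plane_sketch(points):
    """Return a callable giving the vertical (z) distance of a point from the plane."""
    p1, p2, p3 = points
    v12 = [q - p for p, q in zip(p1, p2)]
    v13 = [q - p for p, q in zip(p1, p3)]
    # Normal vector n = v12 x v13 = (i, j, k).
    i = v12[1] * v13[2] - v12[2] * v13[1]
    j = v12[2] * v13[0] - v12[0] * v13[2]
    k = v12[0] * v13[1] - v12[1] * v13[0]
    # The plane i*x + j*y + k*z + d = 0 must pass through p1.
    d = -(i * p1[0] + j * p1[1] + k * p1[2])

    def height_above_plane(x, y, z):
        # z minus the plane height at (x, y), where z_plane = -(i*x + j*y + d) / k.
        return z + (i * x + j * y + d) / k

    return height_above_plane


tilted = vertices_to_plane_sketch([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
height = tilted(0.5, 0.2, 1.0)
```

Here the plane is z = x, so the point (0.5, 0.2, 1.0) sits 0.5 above it.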
-
matador.utils.hull_utils.
vertices2line
(points)[source]¶ Perform a simple linear interpolation on two points.
-
matador.utils.hull_utils.
is_point_in_triangle
(point, triangle, preprocessed_triangle=False)[source]¶ Check whether a point is inside a triangle.
- Parameters
point (np.ndarray) – 3x1 array containing the coordinates of the point.
triangle (np.ndarray) – 3x3 array specifying the coordinates of the triangle vertices.
- Keyword Arguments
preprocessed_triangle (bool) – if True, treat the input triangle as already processed, i.e. the array contains the inverse of the barycentric coordinate array.
- Returns
whether or not the point is found to lie inside the triangle. If all vertices of the triangle lie on the same line, return False.
- Return type
-
matador.utils.hull_utils.
barycentric2cart
(structures)[source]¶ Convert ternary (x, y) in A_x B_y C_{1-x-y} to positions projected onto 2D plane.
Input structures array is of the form:
- [
[l(1)_0, l(2)_0, Eform_0], [l(1)_n, l(2)_n, Eform_n]
]
where l3 = 1 - l2 - l1 are the barycentric coordinates of the point in the triangle defined by the chemical potentials.
- Parameters
structures (list of np.ndarray) – list of 3D numpy arrays containing input points.
- Returns
list of numpy arrays containing converted coordinates.
- Return type
list of np.ndarray
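One way to project barycentric coordinates onto a 2D plane is to take the weighted sum of the vertices of an equilateral triangle. The vertex placement below is an assumption chosen for illustration and may not match the orientation used by matador; the function name is hypothetical.

```python
import math

# Assumed triangle vertices for the three chemical potentials.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def barycentric_to_cart_sketch(structures):
    """Convert rows of [l1, l2, Eform] into [x, y, Eform] on an equilateral triangle."""
    converted = []
    for l1, l2, energy in structures:
        l3 = 1.0 - l1 - l2
        weights = (l1, l2, l3)
        x = sum(weight * vertex[0] for weight, vertex in zip(weights, VERTICES))
        y = sum(weight * vertex[1] for weight, vertex in zip(weights, VERTICES))
        converted.append([x, y, energy])
    return converted


projected = barycentric_to_cart_sketch([[1.0, 0.0, 0.0], [0.0, 1.0, -0.5], [0.0, 0.0, 0.0]])
```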
matador.utils.pmg_utils module¶
This file implements some light wrappers to pymatgen, via ASE.
-
matador.utils.pmg_utils.
get_chemsys
(elements, dumpfile=None)[source]¶ Scrape the Materials Project for the chemical system specified by elements, e.g. for elements [‘A’, ‘B’, ‘C’] the query performed is chemsys=’A-B-C’ & nelements=3. Requires interactive user input of their MP API key (unless set by environment variable $PMG_MAPI_KEY).
-
matador.utils.pmg_utils.
doc2pmg
(doc: Union[dict, matador.crystal.crystal.Crystal])[source]¶ Converts matador document/Crystal to a pymatgen structure, via ASE.
-
matador.utils.pmg_utils.
pmg2dict
(pmg: pymatgen.core.structure.Structure, as_model=False) → Union[dict, matador.crystal.crystal.Crystal][source]¶ Converts a pymatgen.Structure to a matador document/Crystal.
-
matador.utils.pmg_utils.
mp2dict
(response)[source]¶ Convert a response from pymatgen.MPRester into a matador document, via an ASE atoms object. Expects certain properties to be requested in order to construct a full matador document, e.g. structure & input.
- Parameters
response (dict) – containing one item of the MPRester response.
matador.utils.print_utils module¶
This file implements some useful wrappers to the print function for writing errors and warnings to stderr.
-
matador.utils.print_utils.
dumps
(obj, **kwargs)[source]¶ Mirrors json.dumps whilst handling numpy arrays.
-
class
matador.utils.print_utils.
NumpyEncoder
(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶ Bases:
json.encoder.JSONEncoder
This encoder handles NumPy arrays in JSON, and was taken from StackOverflow (where else) (https://stackoverflow.com/a/47626762).
Constructor for JSONEncoder, with sensible defaults.
If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.
If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.
If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.
If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.
If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.
If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.
If specified, separators should be an (item_separator, key_separator) tuple. The default is (‘, ‘, ‘: ‘) if indent is None and (‘,’, ‘: ‘) otherwise. To get the most compact JSON representation, you should specify (‘,’, ‘:’) to eliminate whitespace.
If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
-
default
(obj)[source]¶ Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
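An encoder for array-like objects can be sketched in the same way. The version below converts any object exposing a tolist() method, which NumPy arrays and scalars do; to keep the example free of third-party imports it is exercised with the standard library's array.array. The class name TolistEncoderSketch is hypothetical and this is not the NumpyEncoder source.

```python
import array
import json


class TolistEncoderSketch(json.JSONEncoder):
    """Serialise objects that provide tolist(), e.g. NumPy arrays, as JSON lists."""

    def default(self, o):
        if hasattr(o, "tolist"):
            return o.tolist()
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(o)


encoded = json.dumps({"positions": array.array("d", [1.0, 2.0])}, cls=TolistEncoderSketch)
```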
-
matador.utils.viz_utils module¶
This submodule contains a dirty wrapper of ase-gui for quick visualisation, an nglview wrapper for Jupyter notebook visualisation, and some colour definitions scraped from VESTA configs.
-
matador.utils.viz_utils.
get_element_colours
()[source]¶ Read element colours from VESTA file. The colours file can be specified in the matadorrc. If unspecified, the default ../config/vesta_elements.ini will be used.