filter_utils
- catlas.filter_utils.filter_best_facet_by_surface_property(bag_partition, name: str, val: dict)#
Parse each facet and pick the one that should be the lowest energy by the broken bond model or the surface density model
- Parameters
bag_partition (Iterable[dict]) – a partition of a dask bag containing enumerated surfaces
name (str) – filter name to be applied (comes from the main catlas config)
val (dict) – values associated with name from the config yaml file, which further specify how the filter should be applied
- Returns
the bag partition with undesired slabs filtered out
- Return type
Iterable[dict]
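As a rough illustration of the selection step, the sketch below keeps one facet per bulk material: the one with the lowest estimated value. The record keys `bulk_id`, `miller_index`, and `estimate` are hypothetical stand-ins rather than the keys catlas actually uses, and the estimate is assumed to have been computed beforehand (for instance by a broken bond or surface density model).

```python
def keep_best_facet(partition):
    """Keep one record per bulk: the facet with the lowest estimate.

    The keys "bulk_id" and "estimate" are hypothetical, for illustration only.
    """
    best = {}
    for record in partition:
        key = record["bulk_id"]
        # Replace the stored record only if this facet has a lower estimate.
        if key not in best or record["estimate"] < best[key]["estimate"]:
            best[key] = record
    return list(best.values())


surfaces = [
    {"bulk_id": "A", "miller_index": (1, 1, 1), "estimate": 0.8},
    {"bulk_id": "A", "miller_index": (1, 0, 0), "estimate": 1.1},
    {"bulk_id": "B", "miller_index": (1, 1, 0), "estimate": 0.5},
]
best_facets = keep_best_facet(surfaces)
```

Here material "A" keeps only its (1, 1, 1) facet, while "B" keeps its single enumerated facet.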
- catlas.filter_utils.filter_by_surface_property(bag_partition, name: str, val: dict)#
Parse all Miller facets per material and pick those that should be lower energy by the broken bond model or the surface density model
- Parameters
bag_partition (Iterable[dict]) – a partition of a dask bag containing enumerated surfaces
name (str) – filter name to be applied (comes from the main catlas config)
val (dict) – values associated with name from the config yaml file, which further specify how the filter should be applied
- Returns
the bag partition with undesired slabs filtered out
- Return type
Iterable[dict]
- catlas.filter_utils.filter_columns_by_type(df, type_kws)#
Filter columns of a dataframe based on what datatype they contain.
If any element of the provided list is present in the string representation of the type of the first non-None element of the column, that column name will be included in the list of returned columns.
Example: if type_kws=['ocp', 'ocdata'], you will filter out any column whose first valid element is an ocp.ocpmodels.preprocessing.atoms_to_graphs.AtomsToGraphs object, or an ocdata.surfaces.Surface object, or a pydocparser.Parser object.
- Parameters
df (pd.core.frame.DataFrame) – a pandas DataFrame
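type_kws (list[str]) – If any element of this list is present in the string representation of the type of the first non-None element of a column, that column meets the filtering criteria.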
- Returns
the index corresponds to the column name, the value is True if the column contains a type corresponding to the filtering criteria.
- Return type
pd.core.series.Series[bool]
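The substring matching described above can be sketched without pandas. This is a simplified stand-in, not the catlas implementation: `flag_columns_by_type` is a hypothetical name, a plain dict of lists takes the place of the DataFrame, and a dict of booleans takes the place of the returned Series.

```python
from decimal import Decimal
from fractions import Fraction


def flag_columns_by_type(columns, type_kws):
    """Return {column name: bool}; True when the type string of the first
    non-None element contains any of the keywords.

    `columns` is a plain dict of lists standing in for a DataFrame.
    """
    flags = {}
    for name, values in columns.items():
        # First non-None element, or None if the column has no valid entry.
        first = next((v for v in values if v is not None), None)
        type_string = str(type(first))  # e.g. "<class 'fractions.Fraction'>"
        flags[name] = any(kw in type_string for kw in type_kws)
    return flags


table = {
    "count": [1, 2, 3],
    "ratio": [None, Fraction(1, 3)],
    "price": [Decimal("1.5")],
}
flags = flag_columns_by_type(table, ["fractions", "decimal"])
```

Because the match is a plain substring test on the type string, a short keyword can also match unrelated modules, which is why the example above lists pydocparser alongside ocp.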
- catlas.filter_utils.get_bond_length(ucell, neighbor_factor)#
Gets the bond lengths of all symmetrically distinct sites and organizes them as a dictionary with the unique Wyckoff symbol as the key and the bond length as a float value.
- Parameters
ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell.
neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length
- Returns
{wyckoff symbol (str): bondlength (float)}
- Return type
dict
- catlas.filter_utils.get_broken_bonds(row: dict, neighbor_factor: float) float #
Estimates surface energy using a broken bond model.
- Parameters
ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell.
slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell.
ecoh (float) – Cohesive energy which correlates to the surface energy
neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length
- Returns
Rough estimate of surface energy
- Return type
(float)
- catlas.filter_utils.get_bulk_cn(ucell, neighbor_factor)#
Gets coordination number of each symmetrically distinct site in the unit cell and organizes it as a dictionary with the unique Wyckoff symbol as key and the coordination number as an int value.
- Parameters
ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell.
neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length
- Returns
{wyckoff symbol (str): coordination number (int)}
- Return type
(dict)
- catlas.filter_utils.get_center_of_mass(pmg_struct)#
Calculates the center of mass of a pmg structure.
- Parameters
pmg_struct (pymatgen.core.structure.Structure) – pymatgen structure to be considered.
- Returns
the center of mass
- Return type
numpy.ndarray
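For reference, the center of mass is the mass-weighted average of the site positions. The sketch below shows that arithmetic in plain Python; `center_of_mass` and its arguments are hypothetical, and it returns a tuple rather than the numpy.ndarray returned by the documented function.

```python
def center_of_mass(masses, coords):
    """Mass-weighted average of Cartesian coordinates.

    masses: one mass per site; coords: one (x, y, z) tuple per site.
    """
    total = sum(masses)
    return tuple(
        sum(m * xyz[axis] for m, xyz in zip(masses, coords)) / total
        for axis in range(3)
    )


# A site of mass 1 at the origin and a site of mass 3 at x = 4.
com = center_of_mass([1.0, 3.0], [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)])
```

The heavier site pulls the result toward x = 4, giving (3.0, 0.0, 0.0).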
- catlas.filter_utils.get_decomposition_bools_from_list(pbx, pbx_entry, conditions)#
Evaluates decomposition energies at specified pH and voltage points.
- Parameters
pbx (pymatgen.analysis.pourbaix_diagram.PourbaixDiagram) – an electrochemical stability diagram for the reference system.
pbx_entry (pymatgen.analysis.pourbaix_diagram.PourbaixEntry) – a Pourbaix entry specific to the material we want to calculate the decomposition energy of.
conditions (list[dict]) – Conditions to evaluate the decomposition energy at. Each dictionary contains a pH and a voltage, both expressed as floats.
- Returns
Whether the input entry is stable under each set of conditions.
- Return type
Iterable[bool]
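The point-by-point evaluation can be pictured as follows. Everything in this sketch is an assumption made for illustration: the dictionary keys `pH` and `voltage`, the `max_energy` threshold, and the toy energy function are hypothetical and are not taken from catlas. In practice, pymatgen's PourbaixDiagram offers a `get_decomposition_energy` method that could play the role of the energy function.

```python
def stability_at_points(decomposition_energy, conditions, max_energy=0.0):
    """Return one bool per condition: True when the decomposition energy
    at that (pH, voltage) point does not exceed max_energy.

    decomposition_energy: any callable taking (pH, voltage).
    """
    return [
        decomposition_energy(c["pH"], c["voltage"]) <= max_energy
        for c in conditions
    ]


def toy_energy(pH, voltage):
    # Made-up energy surface, used only to exercise the function above.
    return abs(voltage - 0.5) - 0.2


conditions = [{"pH": 1.0, "voltage": 0.5}, {"pH": 7.0, "voltage": 1.2}]
stable = stability_at_points(toy_energy, conditions)
```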
- catlas.filter_utils.get_decomposition_bools_from_range(pbx, pbx_entry, conditions)#
Evaluates decomposition energies at regular pH and voltage windows within specified intervals.
- Parameters
pbx (pymatgen.analysis.pourbaix_diagram.PourbaixDiagram) – a Pourbaix diagram object containing information about the reference chemical system.
pbx_entry (pymatgen.analysis.pourbaix_diagram.PourbaixEntry) – a Pourbaix entry specific to the material we want to calculate the decomposition energy of.
conditions (dict) – A dictionary specifying what condition or sets of conditions to evaluate the decomposition energy at.
- Returns
A list corresponding to whether the input entry is stable under each set of conditions.
- Return type
Iterable[bool]
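One way to picture "regular windows within specified intervals" is a grid of evenly spaced pH and voltage points. The sketch below builds such a grid; the (start, stop, step) tuples, the inclusive endpoints, and the output keys `pH` and `voltage` are all assumptions for illustration and may not match the layout of the catlas config.

```python
from itertools import product


def regular_points(start, stop, step):
    """Evenly spaced values from start to stop, both ends included."""
    # Index-based stepping avoids accumulating floating point error.
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


def condition_grid(pH_range, voltage_range):
    """Every (pH, voltage) combination on the two regular axes."""
    pH_points = regular_points(*pH_range)
    voltage_points = regular_points(*voltage_range)
    return [
        {"pH": p, "voltage": v} for p, v in product(pH_points, voltage_points)
    ]


# pH from 0 to 2 in steps of 1, voltage from 0 to 0.5 in steps of 0.5.
grid = condition_grid((0.0, 2.0, 1.0), (0.0, 0.5, 0.5))
```

Each dictionary in the grid could then be evaluated point by point, yielding one boolean per grid point.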
- catlas.filter_utils.get_elements_in_groups(groups: list) list #
Grabs the element symbols of all elements in the specified groups.
- Parameters
groups (list[str]) – Names of groups to include in the output. Any element in any group will be included. Valid groups are listed in the implemented_groups variable.
- Returns
Elements included in the input groups.
- Return type
list[str]
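The lookup amounts to taking the union of the elements in each requested group. The mapping below is a hypothetical, partial stand-in for the `implemented_groups` variable, whose actual contents and group names are not shown here.

```python
# Hypothetical, partial group table for illustration only.
HYPOTHETICAL_GROUPS = {
    "alkali": ["Li", "Na", "K", "Rb", "Cs"],
    "halogen": ["F", "Cl", "Br", "I"],
}


def elements_in_groups(groups):
    """Collect element symbols from every requested group, without repeats."""
    unknown = [g for g in groups if g not in HYPOTHETICAL_GROUPS]
    if unknown:
        raise ValueError(f"Unknown groups: {unknown}")
    elements = []
    for group in groups:
        for symbol in HYPOTHETICAL_GROUPS[group]:
            if symbol not in elements:
                elements.append(symbol)
    return elements


selected = elements_in_groups(["alkali", "halogen"])
```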
- catlas.filter_utils.get_first_type(x)#
Get the type of the input, unpacking lists first if necessary. This is used to discard large objects from the output df of catlas if they are specified as unnecessary in the config yaml by examining the type of objects in a list where applicable.
- Parameters
x (Any) – An object to get the type of
- Returns
The type of the input
- Return type
type
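A minimal sketch of the unpacking behaviour described above, assuming that only non-empty lists are unpacked and that the first element is representative. `first_type` is a hypothetical name, and the real function may treat empty lists or other containers differently.

```python
def first_type(x):
    """Type of x, or of its first element when x is a non-empty list."""
    if isinstance(x, list) and len(x) > 0:
        return type(x[0])
    return type(x)


list_type = first_type([1.5, 2.0])   # looks inside the list
plain_type = first_type("abc")       # not a list, so the type of x itself
empty_type = first_type([])          # nothing to unpack
```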
- catlas.filter_utils.get_pourbaix_info(entry: dict) dict #
Construct a Pourbaix diagram for a material. This currently only supports MP inputs.
- Parameters
entry – bulk structure entry as constructed by catlas.load_bulk_structures.load_bulks_from_db
- Raises
ValueError – The bulk id provided in the entry is not a materials project id
- catlas.filter_utils.get_pourbaix_stability(entry: dict, conditions: dict) list #
Evaluate whether a material will be stable under various electrochemical conditions.
- Parameters
entry – A dictionary containing the bulk entry which will be assessed
conditions – The dictionary of Pourbaix settings set in the config yaml
- Returns
True if the material is stable for each input condition
- Return type
list[bool]
- catlas.filter_utils.get_surface_density(row: dict, neighbor_factor: float) float #
Estimates surface density multiplied by cohesive energy.
- Parameters
ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell.
slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell.
ecoh (float) – Cohesive energy which correlates to the surface energy
neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length
- Returns
Rough estimate of cohesive energy x surface density
- Return type
(float)
- catlas.filter_utils.get_total_bb(ucell, slab, neighbor_factor: float) float #
Calculates the total ratio of broken bonds to bulk coordination number. Often used as a factor in surface energy.
- Parameters
ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell.
slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell.
neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length
- Returns
Sum of undercoordination/full bulk coordination for each surface site
- Return type
(float)
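The returned quantity can be written as the sum over surface sites of (bulk CN - surface CN) / bulk CN. The sketch below applies that formula to coordination numbers that are assumed to be known already; the function name and inputs are hypothetical, and the neighbor search that catlas performs on the pymatgen structures is not reproduced.

```python
def total_broken_bond_ratio(surface_cns, bulk_cns):
    """Sum of undercoordination / bulk coordination over surface sites.

    surface_cns[i] is the coordination number of surface site i, and
    bulk_cns[i] is the coordination number of the matching bulk site.
    """
    return sum(
        (bulk - surf) / bulk for surf, bulk in zip(surface_cns, bulk_cns)
    )


# Two surface sites with 9 and 8 neighbors; the bulk site has 12.
ratio = total_broken_bond_ratio([9, 8], [12, 12])
```

The two sites have lost 3 and 4 of their 12 bulk neighbors, so the total is 3/12 + 4/12 = 7/12.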
- catlas.filter_utils.get_total_nn(ucell, slab, neighbor_factor: float) int #
Calculates the sum of nearest neighbors for each surface site.
- Parameters
ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell.
slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell.
neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length
- Returns
Sum of surface coordination number
- Return type
(int)
- catlas.filter_utils.pb_query_and_write(entry: dict, lmdb_path: str)#
Pull Pourbaix info from MP and write it to the lmdb.
- Parameters
entry (dict) – entry to query
lmdb_path (str) – path of lmdb to write to
- catlas.filter_utils.surface_area(slab)#
Gets the cross-sectional surface area of the slab.
- Parameters
slab (pymatgen.structure.Structure) – PMG Structure representation of a slab.
- Returns
surface area
- Return type
(float)
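For a slab whose surface lies in the plane of its first two lattice vectors a and b, the cross-sectional area is the magnitude of their cross product, |a × b|. The sketch below computes that in plain Python; it assumes the surface plane is spanned by a and b, and `cross_section_area` is a hypothetical helper, not the catlas function.

```python
import math


def cross_section_area(a, b):
    """Area of the parallelogram spanned by lattice vectors a and b."""
    cx = a[1] * b[2] - a[2] * b[1]
    cy = a[2] * b[0] - a[0] * b[2]
    cz = a[0] * b[1] - a[1] * b[0]
    return math.sqrt(cx * cx + cy * cy + cz * cz)


# Orthogonal in-plane vectors of length 3 and 4.
area = cross_section_area((3.0, 0.0, 0.0), (0.0, 4.0, 0.0))
```

For these orthogonal vectors the area is simply 3 × 4 = 12.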
- catlas.filter_utils.write_pourbaix_info(pbx_entry: dict, lmdb_path)#
Write the Pourbaix query info to lmdb for future use.
- Parameters
pbx_entry (dict) – Relevant Pourbaix query info for a single mpid.
lmdb_path (str) – Location where the lmdb will be written, including file name.