filter_utils
- catlas.filter_utils.filter_best_facet_by_surface_property(bag_partition, name: str, val: dict)
- Parse each facet and pick the one that should have the lowest energy according to the broken bond model or the surface density model.
- Parameters
- bag_partition (Iterable[dict]) – a partition of a dask bag containing enumerated surfaces 
- name (str) – filter name to be applied (comes from the main catlas config) 
- val (dict) – values associated with name from the config yaml file, which further specify how the filter should be applied 
 
- Returns
- the bag partition with undesired slabs filtered out 
- Return type
- Iterable[dict] 
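The "best facet" selection described above can be sketched without dask or pymatgen. This is a minimal illustration, not the catlas implementation: the function name `keep_best_facet` and the dictionary keys `bulk_id`, `miller` and `score` are hypothetical stand-ins for whatever the enumerated-surface dictionaries actually contain.

```python
def keep_best_facet(partition, score_key):
    """Keep only the slabs that lie on each bulk's lowest-scoring facet."""
    rows = list(partition)
    # Lowest score seen so far for each bulk, with the facet that produced it.
    best = {}
    for row in rows:
        bulk = row["bulk_id"]
        if bulk not in best or row[score_key] < best[bulk][0]:
            best[bulk] = (row[score_key], row["miller"])
    # Keep every slab cut along the winning facet of its bulk.
    return [row for row in rows if row["miller"] == best[row["bulk_id"]][1]]


# Hypothetical rows: the score stands in for a broken bond or
# surface density estimate (lower means more favorable).
rows = [
    {"bulk_id": "A", "miller": (1, 1, 1), "score": 0.8},
    {"bulk_id": "A", "miller": (1, 0, 0), "score": 1.1},
    {"bulk_id": "B", "miller": (1, 1, 0), "score": 0.5},
    {"bulk_id": "B", "miller": (1, 1, 1), "score": 0.9},
]
kept = keep_best_facet(rows, "score")
```

With these toy rows, one facet per bulk survives; all slabs sharing that facet would be kept.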
 
- catlas.filter_utils.filter_by_surface_property(bag_partition, name: str, val: dict)
- Parse all Miller facets per material and pick those that should be lower in energy according to the broken bond model or the surface density model.
- Parameters
- bag_partition (Iterable[dict]) – a partition of a dask bag containing enumerated surfaces 
- name (str) – filter name to be applied (comes from the main catlas config) 
- val (dict) – values associated with name from the config yaml file, which further specify how the filter should be applied 
 
- Returns
- the bag partition with undesired slabs filtered out 
- Return type
- Iterable[dict] 
 
- catlas.filter_utils.filter_columns_by_type(df, type_kws)
- Filter columns of a dataframe based on what datatype they contain.
- If any element of the provided list is present in the string representation of the type of the first non-None element of the column, that column is marked True in the returned series.
- Example: if type_kws=['ocp', 'ocdata'], you will filter out any column whose first valid element is an ocp.ocpmodels.preprocessing.atoms_to_graphs.AtomsToGraphs object, or an ocdata.surfaces.Surface object, or a pydocparser.Parser object (its module name happens to contain the substring 'ocp'). 
- Parameters
- df (pd.core.frame.DataFrame) – a pandas DataFrame 
- type_kws (list[str]) – If any element of this list is present in the string representation of the type of the first non-None element of a column, that column is selected. 
 
- Returns
- the index corresponds to the column name, the value is True if the column contains a type corresponding to the filtering criteria. 
- Return type
- pd.core.series.Series[bool] 
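The substring matching described above can be sketched in plain Python. This is an illustration under assumptions, not the catlas code: a dict of lists stands in for the DataFrame, the helper names are hypothetical, and the stand-in classes exist only so that their type strings resemble the ones named in the example.

```python
def first_valid(values):
    """Return the first element that is not None, or None if there is none."""
    return next((v for v in values if v is not None), None)


def columns_matching_types(columns, type_kws):
    """Map each column name to True if a keyword occurs in its element type string."""
    flags = {}
    for name, values in columns.items():
        type_repr = str(type(first_valid(values)))
        flags[name] = any(kw in type_repr for kw in type_kws)
    return flags


# Stand-in classes whose module names mimic those in the example above.
Surface = type("Surface", (), {"__module__": "ocdata.surfaces"})
Parser = type("Parser", (), {"__module__": "pydocparser"})

columns = {
    "slab": [None, Surface()],  # type string contains "ocdata"
    "parser": [Parser()],       # "pydocparser" contains "ocp"
    "energy": [None, -1.2],     # plain float, no keyword matches
}
flags = columns_matching_types(columns, ["ocp", "ocdata"])
```

Because the match is a plain substring test, short keywords can select more types than intended, as the `pydocparser` case shows.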
 
- catlas.filter_utils.get_bond_length(ucell, neighbor_factor)
- Gets the bond lengths of all symmetrically distinct sites and organizes them as a dictionary with the unique Wyckoff symbol as key and the bond length as a float value.
- Parameters
- ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell. 
- neighbor_factor – buffer for the radius to look for neighbors in order to calculate bond length 
 
- Returns
- {wyckoff symbol (str): bondlength (float)} 
- Return type
- dict 
 
- catlas.filter_utils.get_broken_bonds(row: dict, neighbor_factor: float) → float
- Estimates surface energy using a broken bond model.
- Parameters
- ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell. 
- slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell. 
- ecoh (float) – Cohesive energy which correlates to the surface energy 
- neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length 
 
- Returns
- Rough estimate of surface energy 
- Return type
- (float) 
 
- catlas.filter_utils.get_bulk_cn(ucell, neighbor_factor)
- Gets the coordination number of each symmetrically distinct site in the unit cell and organizes them as a dictionary with the unique Wyckoff symbol as key and the coordination number as an int value.
- Parameters
- ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell. 
- neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length 
 
- Returns
- {wyckoff symbol (str): coordination number (int)} 
- Return type
- (dict) 
 
- catlas.filter_utils.get_center_of_mass(pmg_struct)
- Calculates the center of mass of a pmg structure.
- Parameters
- pmg_struct (pymatgen.core.structure.Structure) – pymatgen structure to be considered. 
- Returns
- the center of mass 
- Return type
- numpy.ndarray 
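For reference, a center of mass is the mass-weighted average of the site positions. The sketch below works on plain lists of masses and Cartesian coordinates rather than a pymatgen structure, and returns a list instead of a numpy.ndarray; the function name and inputs are assumptions made for illustration.

```python
def center_of_mass(masses, positions):
    """Mass-weighted average of 3D positions."""
    total_mass = sum(masses)
    return [
        sum(m * pos[axis] for m, pos in zip(masses, positions)) / total_mass
        for axis in range(3)
    ]


# Two point masses on the x axis: the heavier one pulls the center toward it.
com = center_of_mass([1.0, 3.0], [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)])
```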
 
- catlas.filter_utils.get_decomposition_bools_from_list(pbx, pbx_entry, conditions)
- Evaluates decomposition energies at the specified pH and voltage points.
- Parameters
- pbx (pymatgen.analysis.pourbaix_diagram.PourbaixDiagram) – an electrochemical stability diagram for the reference system. 
- pbx_entry (pymatgen.analysis.pourbaix_diagram.PourbaixEntry) – a pourbaix entry specific to the material we want to calculate the decomposition energy of. 
- conditions (list[dict]) – Conditions to evaluate the decomposition energy at. Each dictionary contains a pH and a voltage, both expressed as floats. 
 
- Returns
- Whether the input entry is stable under each set of conditions. 
- Return type
- Iterable[bool] 
 
- catlas.filter_utils.get_decomposition_bools_from_range(pbx, pbx_entry, conditions)
- Evaluates decomposition energies at regular pH and voltage windows within specified intervals.
- Parameters
- pbx (pymatgen.analysis.pourbaix_diagram.PourbaixDiagram) – a pourbaix diagram object containing information about the reference chemical system. 
- pbx_entry (pymatgen.analysis.pourbaix_diagram.PourbaixEntry) – a pourbaix entry specific to the material we want to calculate the decomposition energy of. 
- conditions (dict) – A dictionary specifying what condition or sets of conditions to evaluate the decomposition energy at. 
 
- Returns
- A list corresponding to whether the input entry is stable under each set of conditions. 
 
- Return type
- Iterable[bool] 
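Evaluating stability on a regular pH and voltage grid can be sketched as follows. Everything specific here is an assumption for illustration: the condition keys (`pH_lower`, `V_step`, `max_decomposition_energy`, and so on), the cutoff comparison, and the `toy_energy` callable, which stands in for a real decomposition energy lookup on a Pourbaix diagram.

```python
from itertools import product


def regular_points(lower, upper, step):
    """Evenly spaced values from lower to upper, both ends included."""
    count = int(round((upper - lower) / step)) + 1
    return [lower + i * step for i in range(count)]


def stability_over_range(decomposition_energy, conditions):
    """One bool per (pH, V) grid point: True when the energy is within the cutoff."""
    ph_values = regular_points(
        conditions["pH_lower"], conditions["pH_upper"], conditions["pH_step"]
    )
    v_values = regular_points(
        conditions["V_lower"], conditions["V_upper"], conditions["V_step"]
    )
    cutoff = conditions["max_decomposition_energy"]
    return [
        decomposition_energy(ph, v) <= cutoff
        for ph, v in product(ph_values, v_values)
    ]


def toy_energy(ph, v):
    """Made-up energy surface: stable only near zero applied voltage."""
    return 0.5 * abs(v)


conditions = {
    "pH_lower": 0.0, "pH_upper": 2.0, "pH_step": 1.0,
    "V_lower": -1.0, "V_upper": 1.0, "V_step": 1.0,
    "max_decomposition_energy": 0.1,
}
bools = stability_over_range(toy_energy, conditions)
```

The grid here has three pH values and three voltages, so nine booleans are returned, ordered with voltage varying fastest.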
 
- catlas.filter_utils.get_elements_in_groups(groups: list) → list
- Grabs the element symbols of all elements in the specified groups.
- Parameters
- groups (list[str]) – Names of groups to include in the output. Any element in any group will be included. Valid groups are listed in the implemented_groups variable. 
- Returns
- Elements included in the input groups. 
- Return type
- list[str] 
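The group lookup amounts to taking the union of several element lists. In the sketch below, the `GROUPS` mapping and its two group names are illustrative only; they are not the contents of the implemented_groups variable, and the error raised for unknown names is an assumption.

```python
# Illustrative mapping only; the real group names live in implemented_groups.
GROUPS = {
    "alkali metal": ["Li", "Na", "K", "Rb", "Cs"],
    "halogen": ["F", "Cl", "Br", "I"],
}


def elements_in_groups(groups):
    """Collect the symbols of every element in any requested group, without repeats."""
    unknown = [g for g in groups if g not in GROUPS]
    if unknown:
        raise ValueError(f"Unknown groups: {unknown}")
    symbols = []
    for group in groups:
        for symbol in GROUPS[group]:
            if symbol not in symbols:
                symbols.append(symbol)
    return symbols


selected = elements_in_groups(["halogen", "alkali metal"])
```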
 
- catlas.filter_utils.get_first_type(x)
- Get the type of the input, unpacking lists first if necessary. This is used to discard large objects from the output df of catlas if they are specified as unnecessary in the config yaml by examining the type of objects in a list where applicable.
- Parameters
- x (Any) – An object to get the type of 
- Returns
- The type of the input 
- Return type
- type 
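A minimal sketch of the idea, assuming only one level of list unpacking and assuming that an empty list reports its own type (neither detail is confirmed by the description above):

```python
def first_type(x):
    """Type of x, or of its first element when x is a non-empty list."""
    if isinstance(x, list) and x:
        return type(x[0])
    return type(x)


element_type = first_type([1.5, 2.5])  # looks inside the list
scalar_type = first_type("slab")       # non-list input is typed directly
```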
 
- catlas.filter_utils.get_pourbaix_info(entry: dict) → dict
- Construct a Pourbaix diagram for a material. This currently only supports MP inputs.
- Parameters
- entry – bulk structure entry as constructed by catlas.load_bulk_structures.load_bulks_from_db 
- Raises
- ValueError – The bulk id provided in the entry is not a materials project id 
 
- catlas.filter_utils.get_pourbaix_stability(entry: dict, conditions: dict) → list
- Evaluate whether a material will be stable under various electrochemical conditions.
- Parameters
- entry – A dictionary containing the bulk entry which will be assessed 
- conditions – The dictionary of Pourbaix settings set in the config yaml 
 
- Returns
- True if the material is stable for each input condition 
- Return type
- list[bool] 
 
- catlas.filter_utils.get_surface_density(row: dict, neighbor_factor: float) → float
- Estimates surface density multiplied by cohesive energy.
- Parameters
- ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell. 
- slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell. 
- ecoh (float) – Cohesive energy which correlates to the surface energy 
- neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length 
 
- Returns
- Rough estimate of cohesive energy x surface density 
- Return type
- (float) 
 
- catlas.filter_utils.get_total_bb(ucell, slab, neighbor_factor: float) → float
- Calculates the total ratio of broken bonds to bulk coordination number. Often used as a factor in surface energy.
- Parameters
- ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell. 
- slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell. 
- neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length 
 
- Returns
- Sum of undercoordination/full bulk coordination for each surface site 
- Return type
- (float) 
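The returned quantity, the sum over surface sites of undercoordination divided by bulk coordination, can be illustrated with plain numbers. The function name and the idea of passing coordination numbers directly are assumptions for this sketch; the real function derives them from the unit cell and slab structures.

```python
def total_broken_bond_ratio(surface_cn, bulk_cn):
    """Sum of (bulk CN - surface CN) / bulk CN over the surface sites."""
    return sum((bulk - surf) / bulk for surf, bulk in zip(surface_cn, bulk_cn))


# Two fcc(111) surface atoms: each keeps 9 of its 12 bulk neighbors,
# so each contributes 3/12 to the total.
total = total_broken_bond_ratio([9, 9], [12, 12])
```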
 
- catlas.filter_utils.get_total_nn(ucell, slab, neighbor_factor: float) → int
- Calculates the sum of nearest neighbors for each surface site.
- Parameters
- ucell (pymatgen.structure.Structure) – PMG Structure representation of a bulk unit cell. 
- slab (pymatgen.structure.Structure) – PMG Structure representation of a slab cell. 
- neighbor_factor (float) – buffer for the radius to look for neighbors in order to calculate bond length 
 
- Returns
- Sum of surface coordination number 
- Return type
- (int) 
 
- catlas.filter_utils.pb_query_and_write(entry: dict, lmdb_path: str)
- Pull Pourbaix info from MP and write it to the lmdb.
- Parameters
- entry (dict) – entry to query 
- lmdb_path (str) – path of lmdb to write to 
 
 
- catlas.filter_utils.surface_area(slab)
- Gets the cross section surface area of the slab.
- Parameters
- slab (pymatgen.structure.Structure) – PMG Structure representation of a slab. 
 
- Returns
- surface area 
- Return type
- (float) 
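A cross-sectional area of this kind is commonly computed as the magnitude of the cross product of the two in-plane lattice vectors. The sketch below assumes the slab's surface is spanned by lattice vectors a and b, and takes them as plain tuples; it is an illustration, not the catlas code.

```python
import math


def cross_section_area(a, b):
    """Area of the parallelogram spanned by 3D vectors a and b."""
    cx = a[1] * b[2] - a[2] * b[1]
    cy = a[2] * b[0] - a[0] * b[2]
    cz = a[0] * b[1] - a[1] * b[0]
    return math.sqrt(cx * cx + cy * cy + cz * cz)


# In-plane vectors of a hypothetical slab cell, in angstroms.
area = cross_section_area((3.0, 0.0, 0.0), (1.5, 2.0, 0.0))
```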
 
- catlas.filter_utils.write_pourbaix_info(pbx_entry: dict, lmdb_path)
- Write the Pourbaix query info to lmdb for future use.
- Parameters
- pbx_entry (dict) – Relevant pourbaix query info for a single mpid. 
- lmdb_path (str) – Location where the lmdb will be written, including file name. 
 
 
