Adsorption energy calculations using a pre-trained OCP calculator with OC22 referencing#

For the OC22 release, we introduced total energy model predictions. This is more flexible than the adsorption energy predictions from OC20 and allows many other properties to be calculated.

To use this model we’ll have to use it separately for predictions for the gas phase species and bare slab like we did in Adsorption energy calculations using a toy calculator.

For this example we’ll use a checkpoint for a GemNet-OC model that was trained on both OC20 and OC22 total energy data.

Warning

This checkpoint has a few important caveats to keep in mind before using it blindly:

This is trained as a force field to predict both energies and forces for specific configurations
The energies and forces are not internally consistent, since we’ve found that models that predict the energy and force separately tend to do better for OC20

Download pretrained checkpoint#

We have released checkpoints of all the models on the leaderboard here. These trained models can be used as an ASE calculator for various calculations.

!wget -q -nc https://dl.fbaipublicfiles.com/opencatalystproject/models/2022_09/oc22/s2ef/gnoc_oc22_oc20_all_s2ef.pt
checkpoint_path = "gnoc_oc22_oc20_all_s2ef.pt"

Relaxing the adslab#

We can load the pre-trained OCP calculator very easily!

from ocpmodels.common.relaxation.ase_utils import OCPCalculator

ocp_calculator = OCPCalculator(checkpoint=checkpoint_path)

WARNING:root:Unable to identify OCP trainer, defaulting to `forces`. Specify the `trainer` argument into OCPCalculator if otherwise.

WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']

amp: false
cmd:
  checkpoint_dir: /home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ml-catalysis-tutorials/notes/pretrained_models/checkpoints/2022-10-31-18-03-44
  commit: cba9fb6
  identifier: ''
  logs_dir: /home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ml-catalysis-tutorials/notes/pretrained_models/logs/tensorboard/2022-10-31-18-03-44
  print_every: 100
  results_dir: /home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ml-catalysis-tutorials/notes/pretrained_models/results/2022-10-31-18-03-44
  seed: null
  timestamp_id: 2022-10-31-18-03-44
dataset: null
gpus: 0
logger: tensorboard
model: gemnet_oc
model_attributes:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
noddp: false
optim:
  batch_size: 16
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 5000
  factor: 0.8
  force_coefficient: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: atomwisel2
  lr_initial: 0.0005
  max_epochs: 80
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 3
  folder: /checkpoint/abhshkdz/ocp_oct1_logs/57632342
  gpus_per_node: 8
  job_id: '57632342'
  job_name: gnoc_oc22_oc20_all_s2ef
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: ocp,learnaccel
  time: 4320
task:
  dataset: oc22_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  train_on_free_atoms: true
  type: regression
trainer: forces

Now we’ll basically repeat the process from Adsorption energy calculations using a toy calculator using our pre-trained calculator.

import os

import ase.io
import numpy as np
from ase import Atoms
from ase.build import add_adsorbate, fcc100, molecule
from ase.calculators.emt import EMT
from ase.constraints import FixAtoms
from ase.io import extxyz
from ase.io.trajectory import Trajectory
from ase.optimize import LBFGS

# This cell sets up and runs a structural relaxation
# of a Cu(100) surface. It uses ASE scripts to make the surface.
# The actual surfaces in OC20 were prepared slightly differently.

# Make the bare slab using an ASE helper script
adslab = fcc100("Cu", size=(3, 3, 3))

# Now, let's add the adsorbate to the bare slab
adsorbate = molecule("CH3O")
add_adsorbate(adslab, adsorbate, 3, offset=(1, 1))  # adslab = adsorbate + slab

# Tag all slab atoms below surface as 0, surface as 1, adsorbate as 2
tags = np.zeros(len(adslab))
tags[18:27] = 1
tags[27:] = 2
adslab.set_tags(tags)

# Fixed atoms are prevented from moving during a structure relaxation.
# We fix all slab atoms beneath the surface.
cons = FixAtoms(indices=[atom.index for atom in adslab if (atom.tag == 0)])
adslab.set_constraint(cons)
adslab.center(vacuum=13.0, axis=2)
adslab.set_pbc(True)

# Set the toy calculator (EMT) so ASE knows how to get energies/forces
# at each step
adslab.set_calculator(ocp_calculator)

os.makedirs("data", exist_ok=True)

# Define structure optimizer - LBFGS. Run for 100 steps,
# or if the max force on all atoms (fmax) is below 0 ev/A.
# fmax is typically set to 0.01-0.05 eV/A,
# for this demo however we run for the full 100 steps.

dyn = LBFGS(adslab, trajectory="data/gemnetOC_oc22_Cu100+CH3O.traj")
dyn.run(fmax=0.03, steps=100)

adslab_traj = ase.io.read("data/gemnetOC_oc22_Cu100+CH3O.traj", ":")

/home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ocp/ocpmodels/preprocessing/atoms_to_graphs.py:139: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at  /opt/conda/conda-bld/pytorch_1639180507909/work/torch/csrc/utils/tensor_new.cpp:201.)
  cell = torch.Tensor(atoms.get_cell()).view(1, 3, 3)
/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/torch/autocast_mode.py:141: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/torch/functional.py:1069: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /opt/conda/conda-bld/pytorch_1639180507909/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.cartesian_prod(tensors)  # type: ignore[attr-defined]

       Step     Time          Energy         fmax
LBFGS:    0 18:03:13     -102.925842        2.5531

LBFGS:    1 18:03:13     -103.141159        0.9889

LBFGS:    2 18:03:14     -103.250916        0.5346

LBFGS:    3 18:03:14     -103.297417        0.3750

LBFGS:    4 18:03:15     -103.335434        0.3782

LBFGS:    5 18:03:16     -103.381653        0.4258

LBFGS:    6 18:03:16     -103.432320        0.4314

LBFGS:    7 18:03:17     -103.505104        0.4952

LBFGS:    8 18:03:17     -103.558937        0.6570

LBFGS:    9 18:03:18     -103.624023        1.1068

LBFGS:   10 18:03:19     -103.776688        1.6545

LBFGS:   11 18:03:19     -103.837852        2.2376

LBFGS:   12 18:03:20     -103.885040        1.3860

LBFGS:   13 18:03:20     -103.955612        1.1837

LBFGS:   14 18:03:21     -104.188095        1.0432

LBFGS:   15 18:03:22     -104.208069        0.7563

LBFGS:   16 18:03:22     -104.247948        0.6825

LBFGS:   17 18:03:23     -104.316689        0.7840

LBFGS:   18 18:03:24     -104.294258        0.5222

LBFGS:   19 18:03:24     -104.289474        0.3653

LBFGS:   20 18:03:25     -104.295074        0.3320

LBFGS:   21 18:03:25     -104.306656        0.4035

LBFGS:   22 18:03:26     -104.316551        0.4414

LBFGS:   23 18:03:27     -104.327782        0.3486

LBFGS:   24 18:03:27     -104.324432        0.3621

LBFGS:   25 18:03:28     -104.326538        0.4152

LBFGS:   26 18:03:28     -104.326027        0.4467

LBFGS:   27 18:03:29     -104.327187        0.6533

LBFGS:   28 18:03:30     -104.341805        0.6168

LBFGS:   29 18:03:30     -104.365189        0.3099

LBFGS:   30 18:03:31     -104.373894        0.3115

LBFGS:   31 18:03:32     -104.372292        0.2812

LBFGS:   32 18:03:32     -104.371628        0.2143

LBFGS:   33 18:03:33     -104.372612        0.1837

LBFGS:   34 18:03:33     -104.367661        0.1609

LBFGS:   35 18:03:34     -104.363922        0.2064

LBFGS:   36 18:03:35     -104.358955        0.2744

LBFGS:   37 18:03:35     -104.376450        0.3068

LBFGS:   38 18:03:36     -104.410309        0.3493

LBFGS:   39 18:03:36     -104.437386        0.2716

LBFGS:   40 18:03:37     -104.441780        0.2895

LBFGS:   41 18:03:38     -104.453445        0.3642

LBFGS:   42 18:03:38     -104.438026        0.6118

LBFGS:   43 18:03:39     -104.420090        1.1748

LBFGS:   44 18:03:40     -104.476585        0.9399

LBFGS:   45 18:03:40     -104.563576        0.9454

LBFGS:   46 18:03:41     -104.504128        2.2475

LBFGS:   47 18:03:41     -104.601463        0.7750

LBFGS:   48 18:03:42     -104.632332        0.6778

LBFGS:   49 18:03:43     -104.661018        0.8294

LBFGS:   50 18:03:43     -104.786324        0.8348

LBFGS:   51 18:03:44     -104.835426        0.6420

LBFGS:   52 18:03:44     -104.878456        0.3826

LBFGS:   53 18:03:45     -104.911453        0.3141

LBFGS:   54 18:03:46     -104.953659        0.4630

LBFGS:   55 18:03:46     -104.977753        0.5032

LBFGS:   56 18:03:47     -104.987679        0.5333

LBFGS:   57 18:03:47     -104.992859        0.4323

LBFGS:   58 18:03:48     -105.007919        0.3024

LBFGS:   59 18:03:49     -105.016884        0.2702

LBFGS:   60 18:03:49     -105.015167        0.2741

LBFGS:   61 18:03:50     -105.018517        0.2390

LBFGS:   62 18:03:51     -105.020927        0.1419

LBFGS:   63 18:03:51     -105.025475        0.1193

LBFGS:   64 18:03:52     -105.035484        0.1341

LBFGS:   65 18:03:52     -105.047981        0.1222

LBFGS:   66 18:03:53     -105.080688        0.1629

LBFGS:   67 18:03:54     -105.105751        0.1815

LBFGS:   68 18:03:54     -105.122963        0.1548

LBFGS:   69 18:03:55     -105.142929        0.1156

LBFGS:   70 18:03:55     -105.140564        0.1282

LBFGS:   71 18:03:56     -105.138969        0.1167

LBFGS:   72 18:03:57     -105.132248        0.0959

LBFGS:   73 18:03:57     -105.138504        0.0735

LBFGS:   74 18:03:58     -105.134331        0.0728

LBFGS:   75 18:03:59     -105.130386        0.0864

LBFGS:   76 18:03:59     -105.112701        0.1147

LBFGS:   77 18:04:00     -105.091721        0.1018

LBFGS:   78 18:04:00     -105.084084        0.0934

LBFGS:   79 18:04:01     -105.087753        0.0577

LBFGS:   80 18:04:02     -105.092865        0.0529

LBFGS:   81 18:04:02     -105.091103        0.0560

LBFGS:   82 18:04:03     -105.087654        0.0800

LBFGS:   83 18:04:04     -105.083412        0.0895

LBFGS:   84 18:04:04     -105.089378        0.0472

LBFGS:   85 18:04:05     -105.080627        0.0365

LBFGS:   86 18:04:05     -105.087029        0.0341

LBFGS:   87 18:04:06     -105.091805        0.0368

LBFGS:   88 18:04:07     -105.100128        0.0337

LBFGS:   89 18:04:07     -105.104546        0.0247

Notice the energies are much more negative than the results in Adsorption energy calculations using a pre-trained OCP calculator with OC20 referencing since these are total DFT energies, not adsorption energies.

Let’s check to see how the relaxation went to make sure it looks reasonable.

import matplotlib.pyplot as plt
from ase.visualize.plot import animate
from matplotlib import rc

# the `index` argument corresponds to what frame of the trajectory to read in, specifiying ":" reads in the full trajectory.
traj = ase.io.read("data/gemnetOC_oc22_Cu100+CH3O.traj", index=":")
anim = animate(traj, radii=0.8, rotation=("-75x, 45y, 10z"))

rc("animation", html="jshtml")
anim

../../_images/ase_calculator_demo_oc22_8_1.png

Now let’s predict the rest of the energies we need to compute the referenced adsorption energy. Note that the code is basically identical to the EMT example in Adsorption energy calculations using a toy calculator; we just replace the EMT calculator with our new calculator.

We’ll also use the real DFT-calculated per-element reference energies from the OC22 paper.

# Adslab energy
relaxed_adslab = ase.io.read("data/gemnetOC_oc22_Cu100+CH3O.traj")
adslab_energy = relaxed_adslab.get_potential_energy()
print(f"Adsorbate+slab energy = {adslab_energy:.2f} eV")

# Corresponding raw slab used in original adslab (adsorbate+slab) system.
slab = fcc100("Cu", size=(3, 3, 3))
tags = np.zeros(len(slab))
tags[18:27] = 1
slab.set_tags(tags)
cons = FixAtoms(indices=[atom.index for atom in slab if (atom.tag == 0)])
slab.set_constraint(cons)
slab.center(vacuum=13.0, axis=2)
slab.set_calculator(ocp_calculator)
dyn = LBFGS(slab)
dyn.run(fmax=0.03, steps=100)
slab_energy = slab.get_potential_energy()
print(f"Bare slab energy = {slab_energy:.2f} eV")

# These should be fixed!
gas_reference_energies = {
    "H": 1,
    "N": 1,
    "O": 1,
    "C": 1,
}

adsorbate = Atoms("CH3O").get_chemical_symbols()

adsorbate_reference_energy = 0
for ads in adsorbate:
    adsorbate_reference_energy += gas_reference_energies[ads]

print(f"Adsorbate reference energy = {adsorbate_reference_energy:.2f} eV\n")

adsorption_energy = adslab_energy - slab_energy - adsorbate_reference_energy
print(f"Adsorption energy: {adsorption_energy:.2f} eV")

Adsorbate+slab energy = -105.10 eV

/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/torch/autocast_mode.py:141: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')

       Step     Time          Energy         fmax
LBFGS:    0 18:04:38      -78.860931        0.0821

LBFGS:    1 18:04:39      -78.934349        0.0822

LBFGS:    2 18:04:39      -78.732971        0.2116

LBFGS:    3 18:04:40      -78.878304        0.0002

Bare slab energy = -78.88 eV
Adsorbate reference energy = 5.00 eV

Adsorption energy: -31.23 eV

Compare the results here with what the OC20-style calculator predicted in Adsorption energy calculations using a pre-trained OCP calculator with OC20 referencing. Note that the adsorbate rotates so that the O is pointing down on the surface and the adsorbate does not associate, similar to the OC20 example.

import pandas as pd
import plotly.express as px

energies = [
    image.get_potential_energy() - slab_energy - adsorbate_reference_energy
    for image in traj
]

df = pd.DataFrame(
    {
        "Relaxation Step": range(len(energies)),
        "GemNet-OC (OC20+OC22) Adsorption Energy [eV]": energies,
    }
)

fig = px.line(df, x="Relaxation Step", y="GemNet-OC (OC20+OC22) Adsorption Energy [eV]")
fig.show()

Warning

Note that the energy here is not necessarily monotonically decreasing despite this being a local relaxation. This is likely because the potential here uses direct predictions for the forces, rather than calculating them through the gradient, which results in faster model training and allows for larger and more expressive models. Following the gradients downhill might cause the energy to increase.

ML for Catalysis Tutorials

Adsorption energy calculations using a pre-trained OCP calculator with OC22 referencing

Contents

Adsorption energy calculations using a pre-trained OCP calculator with OC22 referencing#

Download pretrained checkpoint#

Relaxing the adslab#