Adsorption energy calculations using a pre-trained OCP calculator with OC22 referencing#

For the OC22 release, we introduced total energy model predictions. This is more flexible than the adsorption energy predictions from OC20 and allows many other properties to be calculated.

To use this model we’ll have to use it separately for predictions for the gas phase species and bare slab like we did in Adsorption energy calculations using a toy calculator.

For this example we’ll use a checkpoint for a GemNet-OC model that was trained on both OC20 and OC22 total energy data.

Warning

This checkpoint has a few important caveats to keep in mind before using it blindly:

  • This is trained as a force field to predict both energies and forces for specific configurations

  • The energies and forces are not internally consistent, since we’ve found that models that predict the energy and force separately tend to do better for OC20

Download pretrained checkpoint#

We have released checkpoints of all the models on the leaderboard here. These trained models can be used as an ASE calculator for various calculations.

!wget -q -nc https://dl.fbaipublicfiles.com/opencatalystproject/models/2022_09/oc22/s2ef/gnoc_oc22_oc20_all_s2ef.pt
checkpoint_path = "gnoc_oc22_oc20_all_s2ef.pt"

Relaxing the adslab#

We can load the pre-trained OCP calculator very easily!

from ocpmodels.common.relaxation.ase_utils import OCPCalculator

ocp_calculator = OCPCalculator(checkpoint=checkpoint_path)
WARNING:root:Unable to identify OCP trainer, defaulting to `forces`. Specify the `trainer` argument into OCPCalculator if otherwise.
WARNING:root:Unrecognized arguments: ['symmetric_edge_symmetrization']
amp: false
cmd:
  checkpoint_dir: /home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ml-catalysis-tutorials/notes/pretrained_models/checkpoints/2022-10-31-18-03-44
  commit: cba9fb6
  identifier: ''
  logs_dir: /home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ml-catalysis-tutorials/notes/pretrained_models/logs/tensorboard/2022-10-31-18-03-44
  print_every: 100
  results_dir: /home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ml-catalysis-tutorials/notes/pretrained_models/results/2022-10-31-18-03-44
  seed: null
  timestamp_id: 2022-10-31-18-03-44
dataset: null
gpus: 0
logger: tensorboard
model: gemnet_oc
model_attributes:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
  symmetric_edge_symmetrization: false
noddp: false
optim:
  batch_size: 16
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 16
  eval_every: 5000
  factor: 0.8
  force_coefficient: 1
  load_balancing: atoms
  loss_energy: mae
  loss_force: atomwisel2
  lr_initial: 0.0005
  max_epochs: 80
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
slurm:
  additional_parameters:
    constraint: volta32gb
  cpus_per_task: 3
  folder: /checkpoint/abhshkdz/ocp_oct1_logs/57632342
  gpus_per_node: 8
  job_id: '57632342'
  job_name: gnoc_oc22_oc20_all_s2ef
  mem: 480GB
  nodes: 8
  ntasks_per_node: 8
  partition: ocp,learnaccel
  time: 4320
task:
  dataset: oc22_lmdb
  description: Regressing to energies and forces for DFT trajectories from OCP
  eval_on_free_atoms: true
  grad_input: atomic forces
  labels:
  - potential energy
  metric: mae
  primary_metric: forces_mae
  train_on_free_atoms: true
  type: regression
trainer: forces

Now we’ll basically repeat the process from Adsorption energy calculations using a toy calculator using our pre-trained calculator.

import os

import ase.io
import numpy as np
from ase import Atoms
from ase.build import add_adsorbate, fcc100, molecule
from ase.calculators.emt import EMT
from ase.constraints import FixAtoms
from ase.io import extxyz
from ase.io.trajectory import Trajectory
from ase.optimize import LBFGS

# This cell sets up and runs a structural relaxation
# of a Cu(100) surface. It uses ASE scripts to make the surface.
# The actual surfaces in OC20 were prepared slightly differently.

# Make the bare slab using an ASE helper script
adslab = fcc100("Cu", size=(3, 3, 3))

# Now, let's add the adsorbate to the bare slab
adsorbate = molecule("CH3O")
add_adsorbate(adslab, adsorbate, 3, offset=(1, 1))  # adslab = adsorbate + slab

# Tag all slab atoms below surface as 0, surface as 1, adsorbate as 2
tags = np.zeros(len(adslab))
tags[18:27] = 1
tags[27:] = 2
adslab.set_tags(tags)

# Fixed atoms are prevented from moving during a structure relaxation.
# We fix all slab atoms beneath the surface.
cons = FixAtoms(indices=[atom.index for atom in adslab if (atom.tag == 0)])
adslab.set_constraint(cons)
adslab.center(vacuum=13.0, axis=2)
adslab.set_pbc(True)

# Set the toy calculator (EMT) so ASE knows how to get energies/forces
# at each step
adslab.set_calculator(ocp_calculator)

os.makedirs("data", exist_ok=True)

# Define structure optimizer - LBFGS. Run for 100 steps,
# or if the max force on all atoms (fmax) is below 0 ev/A.
# fmax is typically set to 0.01-0.05 eV/A,
# for this demo however we run for the full 100 steps.

dyn = LBFGS(adslab, trajectory="data/gemnetOC_oc22_Cu100+CH3O.traj")
dyn.run(fmax=0.03, steps=100)

adslab_traj = ase.io.read("data/gemnetOC_oc22_Cu100+CH3O.traj", ":")
/home/runner/work/ml_catalysis_tutorials/ml_catalysis_tutorials/ocp/ocpmodels/preprocessing/atoms_to_graphs.py:139: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at  /opt/conda/conda-bld/pytorch_1639180507909/work/torch/csrc/utils/tensor_new.cpp:201.)
  cell = torch.Tensor(atoms.get_cell()).view(1, 3, 3)
/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/torch/autocast_mode.py:141: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/torch/functional.py:1069: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /opt/conda/conda-bld/pytorch_1639180507909/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.cartesian_prod(tensors)  # type: ignore[attr-defined]
       Step     Time          Energy         fmax
LBFGS:    0 18:03:13     -102.925842        2.5531
LBFGS:    1 18:03:13     -103.141159        0.9889
LBFGS:    2 18:03:14     -103.250916        0.5346
LBFGS:    3 18:03:14     -103.297417        0.3750
LBFGS:    4 18:03:15     -103.335434        0.3782
LBFGS:    5 18:03:16     -103.381653        0.4258
LBFGS:    6 18:03:16     -103.432320        0.4314
LBFGS:    7 18:03:17     -103.505104        0.4952
LBFGS:    8 18:03:17     -103.558937        0.6570
LBFGS:    9 18:03:18     -103.624023        1.1068
LBFGS:   10 18:03:19     -103.776688        1.6545
LBFGS:   11 18:03:19     -103.837852        2.2376
LBFGS:   12 18:03:20     -103.885040        1.3860
LBFGS:   13 18:03:20     -103.955612        1.1837
LBFGS:   14 18:03:21     -104.188095        1.0432
LBFGS:   15 18:03:22     -104.208069        0.7563
LBFGS:   16 18:03:22     -104.247948        0.6825
LBFGS:   17 18:03:23     -104.316689        0.7840
LBFGS:   18 18:03:24     -104.294258        0.5222
LBFGS:   19 18:03:24     -104.289474        0.3653
LBFGS:   20 18:03:25     -104.295074        0.3320
LBFGS:   21 18:03:25     -104.306656        0.4035
LBFGS:   22 18:03:26     -104.316551        0.4414
LBFGS:   23 18:03:27     -104.327782        0.3486
LBFGS:   24 18:03:27     -104.324432        0.3621
LBFGS:   25 18:03:28     -104.326538        0.4152
LBFGS:   26 18:03:28     -104.326027        0.4467
LBFGS:   27 18:03:29     -104.327187        0.6533
LBFGS:   28 18:03:30     -104.341805        0.6168
LBFGS:   29 18:03:30     -104.365189        0.3099
LBFGS:   30 18:03:31     -104.373894        0.3115
LBFGS:   31 18:03:32     -104.372292        0.2812
LBFGS:   32 18:03:32     -104.371628        0.2143
LBFGS:   33 18:03:33     -104.372612        0.1837
LBFGS:   34 18:03:33     -104.367661        0.1609
LBFGS:   35 18:03:34     -104.363922        0.2064
LBFGS:   36 18:03:35     -104.358955        0.2744
LBFGS:   37 18:03:35     -104.376450        0.3068
LBFGS:   38 18:03:36     -104.410309        0.3493
LBFGS:   39 18:03:36     -104.437386        0.2716
LBFGS:   40 18:03:37     -104.441780        0.2895
LBFGS:   41 18:03:38     -104.453445        0.3642
LBFGS:   42 18:03:38     -104.438026        0.6118
LBFGS:   43 18:03:39     -104.420090        1.1748
LBFGS:   44 18:03:40     -104.476585        0.9399
LBFGS:   45 18:03:40     -104.563576        0.9454
LBFGS:   46 18:03:41     -104.504128        2.2475
LBFGS:   47 18:03:41     -104.601463        0.7750
LBFGS:   48 18:03:42     -104.632332        0.6778
LBFGS:   49 18:03:43     -104.661018        0.8294
LBFGS:   50 18:03:43     -104.786324        0.8348
LBFGS:   51 18:03:44     -104.835426        0.6420
LBFGS:   52 18:03:44     -104.878456        0.3826
LBFGS:   53 18:03:45     -104.911453        0.3141
LBFGS:   54 18:03:46     -104.953659        0.4630
LBFGS:   55 18:03:46     -104.977753        0.5032
LBFGS:   56 18:03:47     -104.987679        0.5333
LBFGS:   57 18:03:47     -104.992859        0.4323
LBFGS:   58 18:03:48     -105.007919        0.3024
LBFGS:   59 18:03:49     -105.016884        0.2702
LBFGS:   60 18:03:49     -105.015167        0.2741
LBFGS:   61 18:03:50     -105.018517        0.2390
LBFGS:   62 18:03:51     -105.020927        0.1419
LBFGS:   63 18:03:51     -105.025475        0.1193
LBFGS:   64 18:03:52     -105.035484        0.1341
LBFGS:   65 18:03:52     -105.047981        0.1222
LBFGS:   66 18:03:53     -105.080688        0.1629
LBFGS:   67 18:03:54     -105.105751        0.1815
LBFGS:   68 18:03:54     -105.122963        0.1548
LBFGS:   69 18:03:55     -105.142929        0.1156
LBFGS:   70 18:03:55     -105.140564        0.1282
LBFGS:   71 18:03:56     -105.138969        0.1167
LBFGS:   72 18:03:57     -105.132248        0.0959
LBFGS:   73 18:03:57     -105.138504        0.0735
LBFGS:   74 18:03:58     -105.134331        0.0728
LBFGS:   75 18:03:59     -105.130386        0.0864
LBFGS:   76 18:03:59     -105.112701        0.1147
LBFGS:   77 18:04:00     -105.091721        0.1018
LBFGS:   78 18:04:00     -105.084084        0.0934
LBFGS:   79 18:04:01     -105.087753        0.0577
LBFGS:   80 18:04:02     -105.092865        0.0529
LBFGS:   81 18:04:02     -105.091103        0.0560
LBFGS:   82 18:04:03     -105.087654        0.0800
LBFGS:   83 18:04:04     -105.083412        0.0895
LBFGS:   84 18:04:04     -105.089378        0.0472
LBFGS:   85 18:04:05     -105.080627        0.0365
LBFGS:   86 18:04:05     -105.087029        0.0341
LBFGS:   87 18:04:06     -105.091805        0.0368
LBFGS:   88 18:04:07     -105.100128        0.0337
LBFGS:   89 18:04:07     -105.104546        0.0247

Notice the energies are much more negative than the results in Adsorption energy calculations using a pre-trained OCP calculator with OC20 referencing since these are total DFT energies, not adsorption energies.

Let’s check to see how the relaxation went to make sure it looks reasonable.

import matplotlib.pyplot as plt
from ase.visualize.plot import animate
from matplotlib import rc

# the `index` argument corresponds to what frame of the trajectory to read in, specifiying ":" reads in the full trajectory.
traj = ase.io.read("data/gemnetOC_oc22_Cu100+CH3O.traj", index=":")
anim = animate(traj, radii=0.8, rotation=("-75x, 45y, 10z"))

rc("animation", html="jshtml")
anim
../../_images/ase_calculator_demo_oc22_8_1.png

Now let’s predict the rest of the energies we need to compute the referenced adsorption energy. Note that the code is basically identical to the EMT example in Adsorption energy calculations using a toy calculator; we just replace the EMT calculator with our new calculator.

We’ll also use the real DFT-calculated per-element reference energies from the OC22 paper.

# Adslab energy
relaxed_adslab = ase.io.read("data/gemnetOC_oc22_Cu100+CH3O.traj")
adslab_energy = relaxed_adslab.get_potential_energy()
print(f"Adsorbate+slab energy = {adslab_energy:.2f} eV")

# Corresponding raw slab used in original adslab (adsorbate+slab) system.
slab = fcc100("Cu", size=(3, 3, 3))
tags = np.zeros(len(slab))
tags[18:27] = 1
slab.set_tags(tags)
cons = FixAtoms(indices=[atom.index for atom in slab if (atom.tag == 0)])
slab.set_constraint(cons)
slab.center(vacuum=13.0, axis=2)
slab.set_calculator(ocp_calculator)
dyn = LBFGS(slab)
dyn.run(fmax=0.03, steps=100)
slab_energy = slab.get_potential_energy()
print(f"Bare slab energy = {slab_energy:.2f} eV")

# These should be fixed!
gas_reference_energies = {
    "H": 1,
    "N": 1,
    "O": 1,
    "C": 1,
}

adsorbate = Atoms("CH3O").get_chemical_symbols()

adsorbate_reference_energy = 0
for ads in adsorbate:
    adsorbate_reference_energy += gas_reference_energies[ads]

print(f"Adsorbate reference energy = {adsorbate_reference_energy:.2f} eV\n")

adsorption_energy = adslab_energy - slab_energy - adsorbate_reference_energy
print(f"Adsorption energy: {adsorption_energy:.2f} eV")
Adsorbate+slab energy = -105.10 eV
/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/torch/autocast_mode.py:141: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
       Step     Time          Energy         fmax
LBFGS:    0 18:04:38      -78.860931        0.0821
LBFGS:    1 18:04:39      -78.934349        0.0822
LBFGS:    2 18:04:39      -78.732971        0.2116
LBFGS:    3 18:04:40      -78.878304        0.0002
Bare slab energy = -78.88 eV
Adsorbate reference energy = 5.00 eV

Adsorption energy: -31.23 eV

Compare the results here with what the OC20-style calculator predicted in Adsorption energy calculations using a pre-trained OCP calculator with OC20 referencing. Note that the adsorbate rotates so that the O is pointing down on the surface and the adsorbate does not associate, similar to the OC20 example.

import pandas as pd
import plotly.express as px

energies = [
    image.get_potential_energy() - slab_energy - adsorbate_reference_energy
    for image in traj
]

df = pd.DataFrame(
    {
        "Relaxation Step": range(len(energies)),
        "GemNet-OC (OC20+OC22) Adsorption Energy [eV]": energies,
    }
)

fig = px.line(df, x="Relaxation Step", y="GemNet-OC (OC20+OC22) Adsorption Energy [eV]")
fig.show()

Warning

Note that the energy here is not necessarily monotonically decreasing despite this being a local relaxation. This is likely because the potential here uses direct predictions for the forces, rather than calculating them through the gradient, which results in faster model training and allows for larger and more expressive models. Following the gradients downhill might cause the energy to increase.