Adsorption energy calculations using a toy calculator#

A structural relaxation or structure optimization is the process of iteratively updating atom positions to find the atom positions that minimize the energy of the structure. Standard optimization methods are used in structural relaxations — below we use the Limited-Memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) algorithm. The step number, time, energy, and force max are printed at each optimization step. Each step is considered one example because it provides all the information we need to train models for the S2EF task and the entire set of steps is referred to as a trajectory. Visualizing intermediate structures or viewing the entire trajectory can be illuminating to understand what is physically happening and to look for problems in the simulation, especially when we run ML-driven relaxations.

Tip

Common problems one may look out for:

  • atoms excessively overlapping/colliding with each other

  • atoms flying off into random directions

  • the adsorbate molecule dissociating

  • the adsorbate desorbing from the surface

  • large changes to the surface indicating the surface is relaxed

Setup and relaxation of a bare slab#

First, let’s set up a bare Cu(100) surface. We’ll use ASE to make the initial structure, and LBFGS to do a quick relaxation to find the lowest energy configuration.

We’ll fix the bottom couple layers of the copper surface to approximate a very thick copper slab and prevent the surface from moving. This is a very common trick in the catalysis community.

For this demonstration, we’ll use the Effective Medium Theory (EMT) calculator from ASE. It works pretty well for simple metal structures like Cu(100) and is very fast. However, it won’t do a good job with the adsorbate later on in the example!

import os

import ase.io
import numpy as np
from ase import Atoms
from ase.build import fcc100
from ase.calculators.emt import EMT
from ase.constraints import FixAtoms
from ase.io import extxyz
from ase.io.trajectory import Trajectory
from ase.optimize import LBFGS

# This cell sets up and runs a structural relaxation
# of a Cu(100) surface. It uses ASE scripts to make the surface.
# The actual surfaces in OC20 were prepared slightly differently.

# Make the bare slab using an ASE helper script
slab = fcc100("Cu", size=(3, 3, 3))

# Tag all slab atoms below surface as 0, surface as 1, adsorbate as 2
tags = np.zeros(len(slab))
tags[18:27] = 1
slab.set_tags(tags)

# Fixed atoms are prevented from moving during a structure relaxation.
# We fix all slab atoms beneath the surface.
cons = FixAtoms(indices=[atom.index for atom in slab if (atom.tag == 0)])
slab.set_constraint(cons)
slab.center(vacuum=13.0, axis=2)
slab.set_pbc(True)

# Set the toy calculator (EMT) so ASE knows how to get energies/forces
# at each step
slab.set_calculator(EMT())

os.makedirs("data", exist_ok=True)

# Define structure optimizer - LBFGS. Run for 100 steps,
# or if the max force on all atoms (fmax) is below 0 ev/A.
# fmax is typically set to 0.01-0.05 eV/A,
# for this demo however we run for the full 100 steps.

dyn = LBFGS(slab, trajectory="data/Cu100.traj")
dyn.run(fmax=0.03, steps=100)

traj = ase.io.read("data/Cu100.traj", ":")

# convert traj format to extxyz format (used by OC20 dataset)
columns = ["symbols", "positions", "move_mask", "tags"]
with open("data/Cu100.extxyz", "w") as f:
    extxyz.write_xyz(f, traj, columns=columns)
       Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGS:    0 18:04:43        8.127167*       0.1136
LBFGS:    1 18:04:43        8.125558*       0.1068
LBFGS:    2 18:04:43        8.113625*       0.0050
/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/ase/io/extxyz.py:301: UserWarning: Skipping unhashable information adsorbate_info
  warnings.warn('Skipping unhashable information '

Reading a trajectory#

identifier = "Cu100.extxyz"

# the `index` argument corresponds to what frame of the trajectory to read in, specifiying ":" reads in the full trajectory.
traj = ase.io.read(f"data/{identifier}", index=":")

Viewing a trajectory#

Below we visualize the initial, middle, and final steps in the structural relaxation trajectory from above. Copper atoms in the surface are colored orange. EMT does a good job here and gets a relaxed structure very quickly. It only takes a few steps, and if you look closely you can see just the top layer of Cu atoms move a bit.

Tip

Visualizations can be used as a quick sanity check to ensure the initial system is set up correctly and there are no major issues with the simulation!

import matplotlib.pyplot as plt
from ase.visualize.plot import animate
from matplotlib import rc

anim = animate(traj, radii=0.8, rotation=("-75x, 45y, 10z"))

rc("animation", html="jshtml")
anim
../../_images/relaxations_6_1.png

Relaxation of a slab and adsorbate (“adslab”)#

Now that we know how to run a simple relaxation of a bare slab with ASE and a toy calculator, let’s do the same thing with a methoxy (CH3O*) intermediate on the surface.

import os

import ase.io
import numpy as np
from ase import Atoms
from ase.build import add_adsorbate, fcc100, molecule
from ase.calculators.emt import EMT
from ase.constraints import FixAtoms
from ase.io import extxyz
from ase.io.trajectory import Trajectory
from ase.optimize import LBFGS

# This cell sets up and runs a structural relaxation
# of a Cu(100) surface. It uses ASE scripts to make the surface.
# The actual surfaces in OC20 were prepared slightly differently.

# Make the bare slab using an ASE helper script
adslab = fcc100("Cu", size=(3, 3, 3))

# Now, let's add the adsorbate to the bare slab
adsorbate = molecule("CH3O")
add_adsorbate(adslab, adsorbate, 3, offset=(1, 1))  # adslab = adsorbate + slab

# Tag all slab atoms below surface as 0, surface as 1, adsorbate as 2
tags = np.zeros(len(adslab))
tags[18:27] = 1
tags[27:] = 2
adslab.set_tags(tags)

# Fixed atoms are prevented from moving during a structure relaxation.
# We fix all slab atoms beneath the surface.
cons = FixAtoms(indices=[atom.index for atom in adslab if (atom.tag == 0)])
adslab.set_constraint(cons)
adslab.center(vacuum=13.0, axis=2)
adslab.set_pbc(True)

# Set the toy calculator (EMT) so ASE knows how to get energies/forces
# at each step
adslab.set_calculator(EMT())

os.makedirs("data", exist_ok=True)

# Define structure optimizer - LBFGS. Run for 100 steps,
# or if the max force on all atoms (fmax) is below 0 ev/A.
# fmax is typically set to 0.01-0.05 eV/A,
# for this demo however we run for the full 100 steps.

dyn = LBFGS(adslab, trajectory="data/Cu100+CH3O.traj")
dyn.run(fmax=0.03, steps=100)

adslab_traj = ase.io.read("data/Cu100+CH3O.traj", ":")

# convert traj format to extxyz format (used by OC20 dataset)
columns = ["symbols", "positions", "move_mask", "tags"]
with open("data/Cu100+CH3O.extxyz", "w") as f:
    extxyz.write_xyz(f, adslab_traj, columns=columns)
       Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGS:    0 18:04:44       10.500636*       6.8252
LBFGS:    1 18:04:44        9.524367*       4.4480
LBFGS:    2 18:04:45        9.182733*       3.3101
LBFGS:    3 18:04:45        9.060680*       0.8095
LBFGS:    4 18:04:45        9.046500*       0.3372
LBFGS:    5 18:04:45        9.040306*       0.2303
LBFGS:    6 18:04:45        9.036356*       0.2314
LBFGS:    7 18:04:45        9.033425*       0.1719
LBFGS:    8 18:04:45        9.031535*       0.1735
LBFGS:    9 18:04:45        9.029567*       0.1543
LBFGS:   10 18:04:45        9.028137*       0.1421
LBFGS:   11 18:04:45        9.027198*       0.1049
LBFGS:   12 18:04:45        9.026569*       0.0827
LBFGS:   13 18:04:45        9.026023*       0.0762
LBFGS:   14 18:04:45        9.025342*       0.1225
LBFGS:   15 18:04:45        9.024138*       0.1710
LBFGS:   16 18:04:45        9.021548*       0.2179
LBFGS:   17 18:04:45        9.015284*       0.3223
LBFGS:   18 18:04:45        8.997724*       0.6160
LBFGS:   19 18:04:45        8.978441*       0.8366
LBFGS:   20 18:04:45        8.961434*       0.9788
LBFGS:   21 18:04:45        8.956747*       1.0044
LBFGS:   22 18:04:45        8.934899*       0.8237
LBFGS:   23 18:04:45        8.907601*       0.6945
LBFGS:   24 18:04:45        8.869711*       0.5644
LBFGS:   25 18:04:45        8.836299*       0.9032
LBFGS:   26 18:04:45        8.802417*       0.9061
LBFGS:   27 18:04:45        8.735919*       0.7183
LBFGS:   28 18:04:45        8.688490*       0.4542
LBFGS:   29 18:04:45        8.659963*       0.6010
LBFGS:   30 18:04:45        8.625970*       0.7673
LBFGS:   31 18:04:45        8.557615*       0.7790
LBFGS:   32 18:04:45        8.498335*       0.3802
LBFGS:   33 18:04:45        8.480958*       0.3947
LBFGS:   34 18:04:45        8.452850*       0.5126
LBFGS:   35 18:04:46        8.444221*       0.2422
LBFGS:   36 18:04:46        8.433414*       0.2667
LBFGS:   37 18:04:46        8.423099*       0.3420
LBFGS:   38 18:04:46        8.416835*       0.2604
LBFGS:   39 18:04:46        8.409765*       0.2216
LBFGS:   40 18:04:46        8.406329*       0.2797
LBFGS:   41 18:04:46        8.401119*       0.3413
LBFGS:   42 18:04:46        8.394984*       0.3213
LBFGS:   43 18:04:46        8.387740*       0.2478
LBFGS:   44 18:04:46        8.381928*       0.2773
LBFGS:   45 18:04:46        8.374093*       0.3373
LBFGS:   46 18:04:46        8.363214*       0.2941
LBFGS:   47 18:04:46        8.349731*       0.3515
LBFGS:   48 18:04:46        8.334672*       0.3548
LBFGS:   49 18:04:46        8.328746*       0.7158
LBFGS:   50 18:04:46        8.300337*       0.4128
LBFGS:   51 18:04:46        8.283111*       0.3885
LBFGS:   52 18:04:46        8.267296*       0.3079
LBFGS:   53 18:04:46        8.254124*       0.2151
LBFGS:   54 18:04:46        8.244564*       0.1852
LBFGS:   55 18:04:46        8.238417*       0.2082
LBFGS:   56 18:04:46        8.233480*       0.2006
LBFGS:   57 18:04:46        8.228717*       0.1600
LBFGS:   58 18:04:46        8.227131*       0.2323
LBFGS:   59 18:04:46        8.222459*       0.1828
LBFGS:   60 18:04:46        8.218809*       0.1926
LBFGS:   61 18:04:46        8.213185*       0.2030
LBFGS:   62 18:04:46        8.206281*       0.1960
LBFGS:   63 18:04:46        8.201200*       0.1653
LBFGS:   64 18:04:46        8.195828*       0.1690
LBFGS:   65 18:04:46        8.189843*       0.1993
LBFGS:   66 18:04:47        8.185875*       0.2181
LBFGS:   67 18:04:47        8.179964*       0.2356
LBFGS:   68 18:04:47        8.168761*       0.2539
LBFGS:   69 18:04:47        8.160249*       0.2119
LBFGS:   70 18:04:47        8.155374*       0.2264
LBFGS:   71 18:04:47        8.145853*       0.2117
LBFGS:   72 18:04:47        8.136208*       0.2954
LBFGS:   73 18:04:47        8.128145*       0.3342
LBFGS:   74 18:04:47        8.114981*       0.4818
LBFGS:   75 18:04:47        8.104857*       0.2629
LBFGS:   76 18:04:47        8.093613*       0.1763
LBFGS:   77 18:04:47        8.089098*       0.1796
LBFGS:   78 18:04:47        8.083050*       0.1809
LBFGS:   79 18:04:47        8.077105*       0.1936
LBFGS:   80 18:04:47        8.072144*       0.2256
LBFGS:   81 18:04:47        8.063516*       0.2448
LBFGS:   82 18:04:47        8.053900*       0.2466
LBFGS:   83 18:04:47        8.042706*       0.2510
LBFGS:   84 18:04:47        8.030584*       0.2420
LBFGS:   85 18:04:47        8.017054*       0.2301
LBFGS:   86 18:04:47        8.007735*       0.1819
LBFGS:   87 18:04:47        7.999556*       0.2605
LBFGS:   88 18:04:47        7.990349*       0.2145
LBFGS:   89 18:04:47        7.981350*       0.1262
LBFGS:   90 18:04:47        7.975561*       0.1240
LBFGS:   91 18:04:47        7.973736*       0.1234
LBFGS:   92 18:04:47        7.971765*       0.1052
LBFGS:   93 18:04:47        7.968929*       0.0984
LBFGS:   94 18:04:47        7.965473*       0.1078
LBFGS:   95 18:04:48        7.963304*       0.0925
LBFGS:   96 18:04:48        7.961798*       0.1090
LBFGS:   97 18:04:48        7.960690*       0.0969
LBFGS:   98 18:04:48        7.959848*       0.0684
LBFGS:   99 18:04:48        7.959151*       0.0609
LBFGS:  100 18:04:48        7.958265*       0.0587
/home/runner/micromamba-root/envs/buildenv/lib/python3.9/site-packages/ase/io/extxyz.py:301: UserWarning: Skipping unhashable information adsorbate_info
  warnings.warn('Skipping unhashable information '

Note that this relaxation isn’t quite finished - we stopped the relaxation at 100 steps but the force is still a bit higher than we wanted. Probably it would have converged if we had waited just a little longer.

Let’s visualize the relaxation with the output and see what happens!

import matplotlib.pyplot as plt
from ase.visualize.plot import animate
from matplotlib import rc

anim = animate(adslab_traj, radii=0.8, rotation=("-75x, 45y, 10z"))

rc("animation", html="jshtml")
anim
../../_images/relaxations_10_1.png

Viewing a trajectory#

Below we visualize the initial, middle, and final steps in the structural relaxation trajectory from above. Copper atoms in the surface are colored orange, the propane adsorbate on the surface has grey colored carbon atoms and white colored hydrogen atoms. The adsorbate’s structure changes during the simulation and you can see how it relaxes on the surface. In this case, the relaxation looks normal; however, there can be instances where the adsorbate flies away (desorbs) from the surface or the adsorbate can break apart (dissociation), which are hard to detect without visualization.

Tip

Visualizations can be used as a quick sanity check to ensure the initial system is set up correctly and there are no major issues with the simulation!

import matplotlib.pyplot as plt
from ase.visualize.plot import animate
from matplotlib import rc

anim = animate(adslab_traj, radii=0.8, rotation=("-75x, 45y, 10z"))

rc("animation", html="jshtml")
anim
../../_images/relaxations_12_1.png

Warning

Notice that this relaxation looks quite strange

  • The oxygen is the most reactive atom here; the carbon atom is happy with four bonds (3 to H atoms, 1 to an O), but the O is clearly missing a bond. Without actually running a calculation, we could probably guess that a better configuration would be O pointing towards the copper surface. This shows that how we place the adsorbate here could be improved!

  • The methoxy intermediate settles down onto the surface and falls apart (dissociates). The final structure is not the energy of a *OCH3 intermediate on a copper surface, but instead the energy of a collection of fragments.

The poor relaxation is because we’re using the EMT calculator, which works ok for some simple metal surfaces but is basically just a toy model for any organic atoms (C, H, O, etc). It’s not a surprise that it fails for this calculation.

ASE trajectory and atoms object contents#

Here we take a closer look at what information is contained within these trajectories.

i_structure = adslab_traj[0]
i_structure
Atoms(symbols='Cu27COH3', pbc=True, cell=[7.65796644025031, 7.65796644025031, 33.513279], initial_magmoms=..., tags=..., constraint=FixAtoms(indices=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]), calculator=SinglePointCalculator(...))

Atomic numbers#

numbers = i_structure.get_atomic_numbers()
print(numbers)
[29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29 29
 29 29 29  6  8  1  1  1]

Atomic symbols#

symbols = np.array(i_structure.get_chemical_symbols())
print(symbols)
['Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu'
 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'Cu' 'C' 'O'
 'H' 'H' 'H']

Unit cell#

The unit cell is the volume containing our system of interest. Express as a 3x3 array representing the directional vectors that make up the volume. Illustrated as the dashed box in the above visuals.

cell = np.array(i_structure.cell)
print(cell)
[[ 7.65796644  0.          0.        ]
 [ 0.          7.65796644  0.        ]
 [ 0.          0.         33.513279  ]]

Periodic boundary conditions (PBC)#

x,y,z boolean representing whether a unit cell repeats in the corresponding directions. The OC20 dataset sets this to [True, True, True], with a large enough vacuum layer above the surface such that a unit cell does not see itself in the z direction. This is necessary since the underlying DFT calculation also requires periodic boundary conditions. However, not all DFT codes do!

Although the original structure shown above is what gets passed into our models, the presence of PBC allows it to effectively repeat infinitely in the x and y directions. Below we visualize the same structure with a periodicity of 2 in all directions, what the model may effectively see.

pbc = i_structure.pbc
print(pbc)
[ True  True  True]
fig, ax = plt.subplots(1, 3)
labels = ["initial", "middle", "final"]
for i in range(3):
    ax[i].axis("off")
    ax[i].set_title(labels[i])

ase.visualize.plot.plot_atoms(
    adslab_traj[0].repeat((2, 2, 1)), ax[0], radii=0.8, rotation=("-75x, 45y, 10z")
)
ase.visualize.plot.plot_atoms(
    adslab_traj[50].repeat((2, 2, 1)), ax[1], radii=0.8, rotation=("-75x, 45y, 10z")
)
ase.visualize.plot.plot_atoms(
    adslab_traj[-1].repeat((2, 2, 1)), ax[2], radii=0.8, rotation=("-75x, 45y, 10z")
)
<AxesSubplot:title={'center':'final'}>
../../_images/relaxations_24_1.png

Tags#

The OC20 dataset consists of systems with several different types of atoms. To help with identifying the index of certain atoms, we tag each atom according to where it is found in the system. There are three categories of atoms:

  • sub-surface slab atoms: these are atoms in the bottom layers of the catalyst, furthest away from the adsorbate

  • surface slab atoms: these are atoms in the top layers of the catalyst, close to where the adsorbate will be placed

  • adsorbate atoms: atoms that make up the adsorbate molecule on top of the catalyst.

Tag definitions in OC20:

  1. Sub-surface slab atoms

  2. Surface slab atoms

  3. Adsorbate atoms

tags = i_structure.get_tags()
print(tags)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 2 2 2 2 2]

Fixed atoms constraint#

In reality, surfaces contain many, many more atoms beneath what we’ve illustrated as the surface. At an infinite depth, these subsurface atoms would look just like the bulk structure. We approximate a true surface by fixing the subsurface atoms into their “bulk” locations. This ensures that they cannot move at the “bottom” of the surface. If they could, this would throw off our calculations. Consistent with the above, we fix all atoms with tags=0, and denote them as “fixed”. All other atoms are considered “free”.

cons = i_structure.constraints[0]
print(cons, "\n")

# indices of fixed atoms
indices = cons.index
print(indices, "\n")

# fixed atoms correspond to tags = 0
print(tags[indices])
FixAtoms(indices=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]) 

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17] 

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

Force#

Forces are another important property of the OC20 dataset. Unlike datasets like QM9 which contain only ground state properties, the OC20 dataset contains per-atom forces necessary to carry out atomistic simulations. Physically, forces are the negative gradient of energy w.r.t atomic positions: \(F = -\frac{dE}{dx}\). Although not mandatory (depending on the application), maintaining this energy-force consistency is important for models that seek to make predictions on both properties.

The “apply_constraint” argument controls whether to apply system constraints to the forces. In the OC20 dataset, this controls whether to return forces for fixed atoms (apply_constraint=False) or return 0s (apply_constraint=True).

# Returning forces for all atoms - regardless of whether "fixed" or "free"
i_structure.get_forces(apply_constraint=False)
array([[-3.82972219e-07, -2.70392614e-07,  1.13582705e-01],
       [ 1.38586758e-09, -1.28050401e-06,  1.13578748e-01],
       [ 3.81586355e-07, -2.66773016e-07,  1.13582713e-01],
       [-1.98774837e-06, -2.60727150e-07,  1.13578640e-01],
       [ 3.84925179e-09, -8.52896492e-07,  1.13546140e-01],
       [ 1.98353941e-06, -2.61523658e-07,  1.13578702e-01],
       [-9.74017372e-07,  5.31119775e-07,  1.13581150e-01],
       [ 5.89358042e-10,  2.13307283e-06,  1.13568041e-01],
       [ 9.73428020e-07,  5.28296676e-07,  1.13581154e-01],
       [-7.59734729e-04, -7.97513312e-04, -2.23034555e-03],
       [ 6.64243288e-04, -5.32448243e-04, -1.02767179e-03],
       [-1.20826110e-05, -4.99951952e-05, -8.86553447e-05],
       [-4.88020248e-04,  5.55808522e-04, -6.48455968e-04],
       [ 4.50942502e-04,  3.56875315e-04, -1.83855449e-04],
       [-1.57894933e-09,  2.77071580e-05, -1.01963172e-04],
       [-1.70281718e-04,  1.34898110e-04, -2.70071321e-04],
       [ 1.67476821e-04,  1.45819854e-04, -2.57564576e-04],
       [-3.67220411e-08,  1.90806516e-05, -3.64555646e-05],
       [-4.20901044e-03, -4.74041750e-03, -1.17576330e-01],
       [ 1.09078706e-02, -4.84327687e-02, -1.58006424e-01],
       [ 1.96660083e-04, -5.20426539e-04, -1.13522809e-01],
       [-4.95789679e-02,  1.38576208e-02, -1.60493847e-01],
       [ 2.54958318e-01,  2.73590135e-01, -1.15039100e+00],
       [ 1.86374052e-03,  3.93815654e-03, -1.15299653e-01],
       [ 4.87777077e-03, -2.04208482e-03, -1.07273510e-01],
       [ 9.92367028e-04, -3.54386086e-02, -1.37820754e-02],
       [-5.33937331e-03, -2.74271920e-03, -1.06811669e-01],
       [ 4.00452535e-01,  6.64677385e+00,  1.49792377e+00],
       [-5.72738271e-03, -3.05457609e+00, -1.57569959e-01],
       [ 2.55908543e+00, -7.51705528e-01,  3.34547392e-02],
       [-1.04840882e+00, -1.00310180e+00,  2.07106095e+00],
       [-2.11992365e+00, -2.03471955e+00, -2.41904514e+00]])
# Applying the fixed atoms constraint to the forces
i_structure.get_forces(apply_constraint=True)
array([[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
       [-4.20901044e-03, -4.74041750e-03, -1.17576330e-01],
       [ 1.09078706e-02, -4.84327687e-02, -1.58006424e-01],
       [ 1.96660083e-04, -5.20426539e-04, -1.13522809e-01],
       [-4.95789679e-02,  1.38576208e-02, -1.60493847e-01],
       [ 2.54958318e-01,  2.73590135e-01, -1.15039100e+00],
       [ 1.86374052e-03,  3.93815654e-03, -1.15299653e-01],
       [ 4.87777077e-03, -2.04208482e-03, -1.07273510e-01],
       [ 9.92367028e-04, -3.54386086e-02, -1.37820754e-02],
       [-5.33937331e-03, -2.74271920e-03, -1.06811669e-01],
       [ 4.00452535e-01,  6.64677385e+00,  1.49792377e+00],
       [-5.72738271e-03, -3.05457609e+00, -1.57569959e-01],
       [ 2.55908543e+00, -7.51705528e-01,  3.34547392e-02],
       [-1.04840882e+00, -1.00310180e+00,  2.07106095e+00],
       [-2.11992365e+00, -2.03471955e+00, -2.41904514e+00]])

Adsorption (reaction) energy referenced to gas phase species#

The energy of the system is one of the properties of interest in the OC20 dataset. It’s important to note that absolute energies provide little value to researchers and must be referenced properly to be useful. Put another way, all physically-meaningful energies in DFT are reaction energies from one state to another.

A common approach in catalysis to maintain consistency in energies is to reference all energies to a linear combination of specific gas phase species. For OC20, these species were N2, H2, H2O, and CO. These were chosen as the DFT method we’re using does a reasonable job of calculating the energies of these species. In effect, we are calculating the energy of a specific reaction:

\[\begin{align*} l\text{CO}+m\text{H2O}+n\text{H2}+p\text{N2} + * -> *OCH3 \end{align*}\]

where \(*\) represents a catalyst surface site, and \(*OCH3\) represents our intermediate bound to the catalyst surface. All chemical reactions must balance, so with a little bit of linear algebra we can figure out what numbers \(l,m,n,p\) are:

\[\begin{align*} \text{CO}+\frac{3}{2}\text{H2} + * -> *OCH3 \end{align*}\]

This is the energy that would be reported or predicted for an *OCH3 adsorbate in OC20. Another way that we could accomplish the same thing is to calculate a per-element energy for C,H,O,N from the reference energies:

\[\begin{align*} E_H = E_{\text{H2}}/2 && E_N = E_{\text{N2}}/2 && E_O = E_{\text{H2O}}-E_{H2} && E_C = E_{\text{CO}}-E_{\text{H2O}}+E_{H2} \end{align*}\]

It is straightforward to convert from one reference energy scheme to another (e.g. a different set of gas-phase species), so the OC20 results are useful even if you are using a different approach!

Danger

It is extremely important to use consistent settings for all steps in your adsorption energy calculation. If you use different settings or codes for different species, you will get wrong numbers!

As a demonstration we’ll do all of this using the EMT potential, but be aware that the results are going to be quite strange since EMT doesn’t work for the gas phase molecules on their own!

# Adslab energy
relaxed_adslab = ase.io.read("data/Cu100+CH3O.traj")
adslab_energy = relaxed_adslab.get_potential_energy()
print(f"Adsorbate+slab energy = {adslab_energy:.2f} eV")

# Corresponding raw slab used in original adslab (adsorbate+slab) system.
slab = fcc100("Cu", size=(3, 3, 3))
tags = np.zeros(len(slab))
tags[18:27] = 1
slab.set_tags(tags)
cons = FixAtoms(indices=[atom.index for atom in slab if (atom.tag == 0)])
slab.set_constraint(cons)
slab.center(vacuum=13.0, axis=2)
slab.set_calculator(EMT())
dyn = LBFGS(slab)
dyn.run(fmax=0.03, steps=100)
slab_energy = slab.get_potential_energy()
print(f"Bare slab energy = {slab_energy:.2f} eV")

# Energy for H2O
H2O_atoms = molecule("H2O")
H2O_atoms.set_pbc(True)
H2O_atoms.set_cell([20, 20, 20])
H2O_atoms.set_calculator(EMT())
dyn = LBFGS(H2O_atoms)
dyn.run(fmax=0.03, steps=100)
H2O_energy = H2O_atoms.get_potential_energy()
print(f"H2O energy = {H2O_energy:.2f} eV")

# Energy for N2
N2_atoms = molecule("N2")
N2_atoms.set_pbc(True)
N2_atoms.set_cell([20, 20, 20])
N2_atoms.set_calculator(EMT())
dyn = LBFGS(N2_atoms)
dyn.run(fmax=0.03, steps=100)
N2_energy = N2_atoms.get_potential_energy()
print(f"N2 energy = {N2_energy:.2f} eV")

# Energy for CO
CO_atoms = molecule("CO")
CO_atoms.set_pbc(True)
CO_atoms.set_cell([20, 20, 20])
CO_atoms.set_calculator(EMT())
dyn = LBFGS(CO_atoms)
dyn.run(fmax=0.03, steps=100)
CO_energy = CO_atoms.get_potential_energy()
print(f"CO energy = {CO_energy:.2f} eV")

# Energy for H2
H2_atoms = molecule("H2")
H2_atoms.set_pbc(True)
H2_atoms.set_cell([20, 20, 20])
H2_atoms.set_calculator(EMT())
dyn = LBFGS(H2_atoms)
dyn.run(fmax=0.03, steps=100)
H2_energy = H2_atoms.get_potential_energy()
print(f"H2 energy = {H2_energy:.2f} eV")

gas_reference_energies = {
    "H": H2_energy / 2,
    "N": N2_energy / 2,
    "O": H2O_energy - H2_energy,
    "C": CO_energy - (H2O_energy - H2_energy),
}

adsorbate = Atoms("CH3O").get_chemical_symbols()

adsorbate_reference_energy = 0
for ads in adsorbate:
    adsorbate_reference_energy += gas_reference_energies[ads]

print(f"Adsorbate reference energy = {adsorbate_reference_energy:.2f} eV\n")

adsorption_energy = adslab_energy - slab_energy - adsorbate_reference_energy
print(f"Adsorption energy: {adsorption_energy:.2f} eV")
Adsorbate+slab energy = 7.96 eV
       Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGS:    0 18:05:55        8.127167*       0.1136
LBFGS:    1 18:05:55        8.125558*       0.1068
LBFGS:    2 18:05:55        8.113625*       0.0050
Bare slab energy = 8.11 eV
       Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGS:    0 18:05:55        2.619811*       7.7384
LBFGS:    1 18:05:55        1.912906*       1.3454
LBFGS:    2 18:05:55        1.882033*       0.4035
LBFGS:    3 18:05:55        1.879275*       0.0317
LBFGS:    4 18:05:55        1.879253*       0.0096
H2O energy = 1.88 eV
       Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGS:    0 18:05:55        0.548765*       3.9648
LBFGS:    1 18:05:55        0.269415*       0.7024
LBFGS:    2 18:05:55        0.263420*       0.2258
LBFGS:    3 18:05:55        0.262778*       0.0084
N2 energy = 0.26 eV
       Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGS:    0 18:05:55        0.786209*       2.3049
LBFGS:    1 18:05:55        0.681828*       0.7945
LBFGS:    2 18:05:55        0.670839*       0.1828
LBFGS:    3 18:05:55        0.670284*       0.0105
CO energy = 0.67 eV
       Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGS:    0 18:05:55        1.158863*       4.4619
LBFGS:    1 18:05:55        1.331933*       5.4158
LBFGS:    2 18:05:55        1.081151*       1.3223
LBFGS:    3 18:05:55        1.072702*       0.6347
LBFGS:    4 18:05:55        1.070550*       0.0402
LBFGS:    5 18:05:55        1.070541*       0.0011
H2 energy = 1.07 eV
Adsorbate reference energy = 2.28 eV

Adsorption energy: -2.43 eV

Danger

These gas-phase energies should not be trusted! Don’t use them for anything more than this demo. Use a real calculation with consistent settings to your adslab to get the C/H/O/N energies!

Plot energy profile of toy trajectory#

Plotting the energy profile of our trajectory is a good way to ensure nothing strange has occured. We expect to see a decreasing monotonic function. If the energy is consistently increasing or there’s multiple large spikes this could be a sign of some issues in the optimization. This is particularly useful for when analyzing ML-driven relaxations and whether they make general physical sense.

import pandas as pd
import plotly.express as px

energies = [
    image.get_potential_energy() - slab_energy - adsorbate_reference_energy
    for image in adslab_traj
]

df = pd.DataFrame(
    {"Relaxation Step": range(len(energies)), "Adsorption Energy [eV]": energies}
)

fig = px.line(df, x="Relaxation Step", y="Adsorption Energy [eV]")
fig.show()