MACE in LAMMPS

Warning

Please be cautious about benchmarking your LAMMPS model against, for example, the equivalent ASE calculator.

Both CPU and GPU evaluation are possible, but GPU evaluation will give MUCH better performance (at least at present).

Preparing your model

Train the model using the develop branch. Afterwards, use the create_lammps_model.py script to prepare a torchscript-compiled LAMMPS_MACE model:

python <mace_repo_dir>/mace/cli/create_lammps_model.py my_mace.model

Instructions for GPU

Installation

These instructions are for Cambridge-relevant machines and should be adapted as needed.

CSD3 Ampere Nodes

First steps:

git clone --branch=mace --depth=1 https://github.com/ACEsuit/lammps
wget https://download.pytorch.org/libtorch/cu121/libtorch-shared-with-deps-2.2.0%2Bcu121.zip
unzip libtorch-shared-with-deps-2.2.0+cu121.zip
rm libtorch-shared-with-deps-2.2.0+cu121.zip
mv libtorch libtorch-gpu

Request an interactive job to obtain a GPU node for the installation:

sintr -A YOUR-ACCOUNT-GPU -p ampere -N 1 --gres=gpu:1 -t 1:00:00

After logging on to the GPU node interactively, prepare the environment:

module purge
module load intel-mkl-2017.4-gcc-5.4.0-2tzpyn7
module load rhel8/slurm rhel8/global gcc/9 openmpi/gcc/9.3/4.0.4 cuda/12.1 cudnn

Compile LAMMPS:

cd lammps
mkdir build-ampere
cd build-ampere
cmake \
    -D CMAKE_BUILD_TYPE=Release \
    -D CMAKE_INSTALL_PREFIX=$(pwd) \
    -D CMAKE_CXX_STANDARD=17 \
    -D CMAKE_CXX_STANDARD_REQUIRED=ON \
    -D BUILD_MPI=ON \
    -D BUILD_SHARED_LIBS=ON \
    -D PKG_KOKKOS=ON \
    -D Kokkos_ENABLE_CUDA=ON \
    -D CMAKE_CXX_COMPILER=$(pwd)/../lib/kokkos/bin/nvcc_wrapper \
    -D Kokkos_ARCH_AMDAVX=ON \
    -D Kokkos_ARCH_AMPERE100=ON \
    -D CMAKE_PREFIX_PATH=$(pwd)/../../libtorch-gpu \
    -D PKG_ML-MACE=ON \
    ../cmake
make -j 20
make install

Using the model in LAMMPS

Warning

At present, only single-GPU evaluation is recommended.

Begin your LAMMPS input with the following commands::

units         metal
atom_style    atomic
atom_modify   map yes
newton        on

Your pair commands should look something like this::

pair_style mace no_domain_decomposition
pair_coeff * * my_mace.model-lammps.pt C H N O

With no_domain_decomposition, LAMMPS builds a periodic graph rather than treating ghost atoms as independent nodes.

Finally, you should initiate Kokkos by calling LAMMPS with something like the following:

lmp -k on g 1 -sf kk -in in.lammps

Instructions for CPU

Installation

These instructions are for Cambridge-relevant machines and should be adapted as needed.

CSD3 Cascade Lake Nodes

This environment is specific to the Cascade Lake compute nodes. If you want to use Ice Lake, you’ll need something slightly different - see the CSD3 documentation. Moreover, you may see MPI errors if you try running on a CSD3 head node; just use the compute nodes.:

module purge
module load rhel7/default-ccl
module load gcc/9

Download libtorch:

wget https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-1.13.0%2Bcpu.zip
unzip libtorch-shared-with-deps-1.13.0+cpu.zip
rm libtorch-shared-with-deps-1.13.0+cpu.zip

Install Lammps:

git clone --branch mace --depth=1 https://github.com/ACEsuit/lammps
cd lammps; mkdir build; cd build
cmake -DCMAKE_INSTALL_PREFIX=$(pwd) \
      -D CMAKE_CXX_STANDARD=17 \
      -D CMAKE_CXX_STANDARD_REQUIRED=ON \
      -D BUILD_MPI=ON \
      -D BUILD_OMP=ON \
      -D PKG_OPENMP=ON \
      -D PKG_ML-MACE=ON \
      -D CMAKE_PREFIX_PATH=$(pwd)/../../libtorch \
      ../cmake
make -j 4
make install

Using the model in LAMMPS

Begin your LAMMPS input with the following commands::

units         metal
atom_style    atomic
atom_modify   map yes
newton        on

Your pair commands should look something like this::

pair_style mace
pair_coeff * * my_mace.model-lammps.pt C H N O

If you are using a single MPI process with threading (recommended for small systems), use the no_domain_decomposition option for speedups::

# add this atom_modify command after your atom_style command
atom_modify map yes

# add the no_domain decomposition option to the pair_style declaration
pair_stye mace no_domain_decomposition

With no_domain_decomposition, LAMMPS builds a periodic graph rather than treating ghost atoms as independent nodes.

Here is an example slurm script (for Cascade Lake). For now, it is best to rely on threading for smaller systems. For larger systems, you’ll need to experiment - multiple-node jobs will work, but it is likely best to use a small number of MPI processes per node and threading for the rest. You may want the –exclusive option to get access to the full-node memory.:

#!/bin/bash

#SBATCH -J lammps-mace
#SBATCH -A MY-ACCOUNT-CPU
#SBATCH -p cclake
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --exclusive
#SBATCH --time=08:00:00
#SBATCH --mail-type=FAIL

. /etc/profile.d/modules.sh
module purge
module load rhel7/default-ccl

export OMP_NUM_THREADS=56
export MKL_NUM_THREADS=56
mpirun -np 1 ../../lammps/build/lmp -in in.lammps