LAMMPS GPU compilation

Compile LAMMPS with GPU package on WSL2 Ubuntu

Compiled version: 2Aug2023 (stable)
(10/12/2023)
Contact: lebedmi2@cvut.cz
 

Official documentation about the GPU package and how to run calculations with it: https://docs.lammps.org/Speed_gpu.html
Official documentation for building the GPU package: https://docs.lammps.org/Build_extras.html#gpu

Steps:

1) Install the CUDA Toolkit. On native Ubuntu, the CUDA bundled with the NVIDIA HPC SDK can be used. On Ubuntu in WSL2, the WSL2-specific CUDA build is needed.
2) Compile LAMMPS with the GPU package.

Tested PC:
  • Intel(R) Core(TM) i5-14600K 3.50 GHz
  • RTX 2060 SUPER
  • Kingston FURY 32GB KIT DDR5 6000MHz CL32 Renegade
  • WSL2 Ubuntu version: 22.04
 
Prerequisites:

Not all of these packages are necessary, but they might be useful:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential cmake cmake-curses-gui libopenmpi-dev openmpi-bin libfftw3-dev libblas-dev liblapack-dev pkg-config ffmpeg python3-dev
sudo apt install python3-pip python3.10-venv python3-venv

 

OpenMPI

Compile OpenMPI with the standard GNU compiler. Download it with wget (or manually from https://www.open-mpi.org/software/ompi/v5.0/) and unpack it:

wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.0.tar.gz
tar xvzf openmpi-5.0.0.tar.gz
cd openmpi-5.0.0
mkdir build

Compile OpenMPI and install it into the 'build' folder (the prefix below assumes the archive was unpacked in $HOME/SOFTWARE/OpenMPI):

./configure --prefix=$HOME/SOFTWARE/OpenMPI/openmpi-5.0.0/build
make all -j
make install -j

To use this version of OpenMPI, export the following variables, either in the current terminal session or permanently by adding them as the last two lines of ~/.bashrc (replace lebedmi2 with your user name):

export PATH=/home/lebedmi2/SOFTWARE/OpenMPI/openmpi-5.0.0/build/bin:$PATH
export LD_LIBRARY_PATH=/home/lebedmi2/SOFTWARE/OpenMPI/openmpi-5.0.0/build/lib:$LD_LIBRARY_PATH
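
If the exports above live in ~/.bashrc, every re-sourcing of the file prepends the same folders again. A minimal sketch of a guard against such duplicates, assuming a hypothetical helper named prepend_path (it is not part of OpenMPI or of the shell):

```shell
# prepend_path DIR LIST: print LIST with DIR in front, unless DIR is already in LIST.
# Hypothetical helper for illustration only.
prepend_path() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;          # already present: leave unchanged
        *) printf '%s\n' "${dir}${list:+:$list}" ;;   # prepend; no trailing colon if LIST is empty
    esac
}

# Install prefix taken from the configure step above; adjust to your own location.
OMPI_ROOT="$HOME/SOFTWARE/OpenMPI/openmpi-5.0.0/build"
PATH=$(prepend_path "$OMPI_ROOT/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_path "$OMPI_ROOT/lib" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

Afterwards, 'which mpirun' should point into the build/bin folder.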
 
CUDA Toolkit

Install CUDA Toolkit for Linux, x86_64, WSL-Ubuntu, 2.0, deb (local) according to the manual here:
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_local,
or use the copied steps below:

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600

wget https://developer.download.nvidia.com/compute/cuda/12.3.1/local_installers/cuda-repo-wsl-ubuntu-12-3-local_12.3.1-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-3-local_12.3.1-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-3-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-3

The installation takes some time. On native Ubuntu, you can also use the CUDA bundled with the NVIDIA HPC SDK; just modify the paths in the following text accordingly. However, when I tried to compile the LAMMPS GPU package on WSL2 with the CUDA from the HPC SDK, I received an error. Only the CUDA build made specifically for WSL-Ubuntu worked for me.

Add the CUDA Toolkit to the PATH environment variable:

nano ~/.bashrc

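Scroll down to the end of the file and add this line:

export PATH=/usr/local/cuda-12.3/bin:$PATH

Save the file and reload it in the current terminal (run this command; do not put it into ~/.bashrc itself):

source ~/.bashrc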

Check if CUDA is properly installed:

nvcc --version
which nvcc

It should give the version of the compiler and the path to it.
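
The same check can be scripted so that a missing tool produces a clear message. This is only a sketch; require_tool is a hypothetical helper name, and bin2c is included because the cmake step later needs its path:

```shell
# require_tool NAME: report where NAME resolves on PATH, or fail with a message.
# Hypothetical helper for illustration only.
require_tool() {
    if path=$(command -v "$1"); then
        printf '%s found at %s\n' "$1" "$path"
    else
        printf '%s not found on PATH\n' "$1" >&2
        return 1
    fi
}

# nvcc and bin2c both live in the CUDA Toolkit bin folder.
for tool in nvcc bin2c; do
    require_tool "$tool" || echo "hint: check the export PATH line in ~/.bashrc"
done
```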

 
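Compile LAMMPS with GPU package

Clone the stable branch of LAMMPS into the folder lammps_gpu and create a 'build' folder inside it:

git clone -b stable https://github.com/lammps/lammps.git lammps_gpu
cd lammps_gpu
mkdir build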

If your user does not own this folder, take ownership of it:

sudo chown -R lebedmi2 .

You can then use 'make' without 'sudo'. Check your GPU architecture and set GPU_ARCH accordingly (see https://docs.lammps.org/Build_extras.html#gpu for the GPU_ARCH options): e.g., the GTX 1060 is built on the Pascal architecture, so GPU_ARCH=sm_61; for the RTX 2060, use sm_75; for the RTX 4090 (Ada Lovelace), use sm_89. Also choose the precision with GPU_PREC (double, mixed (default), or single), and turn on the basic packages you need (e.g., ORIENT, PYTHON, OPENMP, MEAM, MANYBODY).
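
If you are unsure about the architecture, the GPU_ARCH value can be derived from the compute capability reported by the driver. The sketch below assumes that nvidia-smi is available inside WSL2 and that the driver is recent enough to support the compute_cap query field; cap_to_arch is a hypothetical helper name:

```shell
# cap_to_arch CAP: turn a compute capability such as "7.5" into "sm_75".
# Hypothetical helper for illustration only.
cap_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '. ')"
}

# Query the first GPU, if nvidia-smi is present.
# Assumption: the installed driver supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
    echo "GPU_ARCH=$(cap_to_arch "$cap")"
fi
```

For the RTX 2060 SUPER used here, the reported capability 7.5 corresponds to sm_75.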

Enter the 'build' folder:

cd build

When using CUDA compiled for WSL2 (set correct path for bin2c and cuda):

cmake -D PKG_GPU=ON -D GPU_API=CUDA -D GPU_PREC=mixed -D GPU_ARCH=sm_75 -D PKG_ORIENT=ON -D PKG_OPENMP=ON -D PKG_MEAM=ON -D PKG_MANYBODY=ON -D BIN2C=/usr/local/cuda-12.3/bin/bin2c -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.3 ../cmake

When using CUDA within NVIDIA HPC SDK (set correct path to bin2c):

cmake -D PKG_GPU=ON -D GPU_API=CUDA -D GPU_PREC=mixed -D GPU_ARCH=sm_75 -D PKG_ORIENT=ON -D PKG_OPENMP=ON -D PKG_MEAM=ON -D PKG_MANYBODY=ON -D BIN2C=/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/bin/bin2c ../cmake

To select additional packages via cmake GUI:

ccmake . 

Check that the path for BIN2C (first line) is correct (it should be something like /usr/local/cuda-12.3/bin/bin2c). If an error regarding bin2c appears, pass its path explicitly in the previous cmake step with: -D BIN2C=/usr/local/cuda-12.3/bin/bin2c
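
The configured values can also be read without the GUI, because cmake stores them in CMakeCache.txt in the build folder as NAME:TYPE=VALUE entries. A minimal sketch, with cache_value as a hypothetical helper name:

```shell
# cache_value VAR [FILE]: print the value stored for VAR in a CMake cache file.
# Cache entries have the form NAME:TYPE=VALUE. Hypothetical helper for illustration only.
cache_value() {
    sed -n "s/^$1:[A-Z]*=//p" "${2:-CMakeCache.txt}"
}

# Run inside the build folder after cmake has finished.
if [ -f CMakeCache.txt ]; then
    for var in BIN2C GPU_ARCH GPU_PREC; do
        printf '%s = %s\n' "$var" "$(cache_value "$var")"
    done
fi
```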

Compile and then install:

make -j
sudo make install -j

Note: if you want to build LAMMPS with the Python package, you have to turn on the PYTHON package and the shared libraries:

cmake -D PKG_GPU=ON -D GPU_API=CUDA -D GPU_PREC=mixed -D GPU_ARCH=sm_75 -D PKG_ORIENT=ON -D PKG_PYTHON=ON -D PKG_OPENMP=ON -D PKG_MEAM=ON -D PKG_MANYBODY=ON -D BUILD_LIB=ON -D BUILD_SHARED_LIBS=ON -D BIN2C=/usr/local/cuda-12.3/bin/bin2c -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.3 ../cmake
sudo make install-python

 

Note: The MEAM potential is not GPU accelerated; the GPU package does not implement it.

Add the path to the lmp binary to ~/.bashrc (adjust it to your own build folder; the clone above is named lammps_gpu):

export PATH=$PATH:/home/lebedmi2/lammps/build

Optionally rename the lmp binary if you keep several LAMMPS builds:

mv lmp lmp_gpu

Note: When compiling older versions of LAMMPS with the GPU package, -DCMAKE_LIBRARY_PATH=/usr/local/cuda-12.3/targets/x86_64-linux/lib/stubs had to be added to the cmake command to solve the following error:

CMake Error: The following variables are used in this project, but they are set to NOTFOUND
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDA_LIBRARY (ADVANCED)

 
Test

Prepare your input files and check whether the compilation was successful:

Run on CPU:

mpirun -np 8 lmp_gpu -in test.in

Run on GPU:

lmp_gpu -sf gpu -in test.in
lmp_gpu -sf gpu -pk gpu 1 -in test.in

mpirun -np 8 lmp_gpu -sf gpu -pk gpu 1 -in test.in
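
If no input file is at hand, a small Lennard-Jones melt along the lines of the benchmark inputs shipped with LAMMPS can serve as test.in. The sketch below writes such a file into the current folder; the box size and run length are arbitrary choices, not settings used for this guide:

```shell
# Write a small Lennard-Jones melt input to test.in (illustrative settings).
cat > test.in <<'EOF'
# 3d Lennard-Jones melt
units           lj
atom_style      atomic

lattice         fcc 0.8442
region          box block 0 20 0 20 0 20
create_box      1 box
create_atoms    1 box
mass            1 1.0

velocity        all create 1.44 87287 loop geom

pair_style      lj/cut 2.5
pair_coeff      1 1 1.0 1.0 2.5

neighbor        0.3 bin
neigh_modify    delay 0 every 20 check no

fix             1 all nve

thermo          100
run             1000
EOF
```

With -sf gpu, the lj/cut pair style is replaced by its GPU variant, so the loop times at the end of the CPU and GPU runs can be compared directly.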
 

Calculation speed comparison

List of GPU-Accelerated Interatomic Potentials and Functions

amoeba
atom
beck
born_coul_long_cs
born_coul_long
born_coul_wolf_cs
born_coul_wolf
born
buck_coul
buck_coul_long
buck
charmm
charmm_long
colloid
coul
coul_debye
coul_dsf
coul_long_cs
coul_long
device
dipole_lj
dipole_lj_sf
dipole_long_lj
dpd
eam
ellipsoid_nbor
gauss
gayberne
gayberne_lj
hippo
lj96
lj_class2_long
lj_coul
lj_coul_debye
lj_coul_long
lj_coul_msm
lj_cubic
lj
lj_dsf
lj_expand_coul_long
lj_expand
lj_gromacs
lj_smooth
lj_spica
lj_spica_long
lj_tip4p_long
mie
morse
neighbor_cpu
neighbor_gpu
pppm_d
pppm_f
re_squared
re_squared_lj
soft
sw
table
tersoff
tersoff_mod
tersoff_zbl
ufm
vashishta
yukawa_colloid
yukawa
zbl