All requests for technical support from the VASP group must be addressed to: firstname.lastname@example.org
- 1 Requirements
- 2 Build system
- 3 How to make VASP
- 4 Adapting makefile.include
- 4.1 Precompiler variables
- 4.2 Compiler variables
- 4.3 Linking against libraries
- 4.4 The list of objects
- 4.5 Fast-Fourier-Transforms
- 4.6 Special rules
- 4.7 For the VASP library (lib)
- 4.8 For the GPU port
- 4.9 Examples
- 5 Patches
- 6 Validation
- 7 Related Sections
Requirements
For the compilation of the parallel version of VASP, the following software is mandatory:
- Fortran and C compilers.
- An implementation of MPI (Message Passing Interface).
- Numerical libraries like BLAS, LAPACK, ScaLAPACK, and FFTW.
Build system
The build system of VASP (for versions >= 5.4.1) has the following structure:
            vasp.X.X.X (root directory)
                         |
      ---------------------------------------
      |            |            |           |
     arch         bin         build        src
                                            |
                                       ----------
                                       |        |
                                      lib      CUDA
- root/: Holds the high-level makefile and several subdirectories.
- root/src: Holds the source files of VASP and a low-level makefile.
- root/src/lib: Holds the source of the VASP library (used to be vasp.X.lib) and a low-level makefile.
- root/src/CUDA: Holds the source of the CUDA code that will be executed on the GPU by the GPU port of VASP.
- root/arch: Holds a collection of makefile.include.arch files.
- root/build: The different versions of VASP, i.e., the standard, gamma-only, and non-collinear versions, will be built in separate subdirectories of this directory.
- root/bin: Here make will store the binaries.
How to make VASP
Copy one of the makefile.include.arch files in /arch to makefile.include in the root directory.
Take the one that most closely reflects your system.
For instance, on a Linux box with the Intel Composer suite:
cp arch/makefile.include.linux_intel ./makefile.include
In many cases these makefile.include files will have to be adapted to the particulars of your system (see below).
When you've finished setting up makefile.include, build VASP:
make all
This will build the standard, gamma-only, and non-collinear versions of VASP one after the other. Alternatively, one may build these versions individually:
make std
make gam
make ncl
To compile the GPU port of VASP:
cp arch/makefile.include.linux_intel_cuda ./makefile.include
and adapt it to the particulars of your system (see below), followed by:
make gpu
make gpu_ncl
to build the GPU ports of the standard and non-collinear versions, respectively.
N.B.: At this time we do not yet offer a GPU port of the gamma-only version.
Adapting makefile.include
Precompiler variables
- Specify the precompiler flags:
CPP_OPTIONS=[-Dflag1 [-Dflag2] ... ]
- Take a lead from the makefile.include.arch files in /arch.
- N.B.I: -DNGZhalf, -DwNGZhalf, -DNGXhalf, -DwNGXhalf are deprecated options. Building the standard, gamma-only, or non-collinear version of the code is specified through an additional argument to the make command (see the make section).
- N.B.II: CPP_OPTIONS is only used in this file, where it should be added to CPP (see next item).
- The command to invoke the precompiler you want to use, for instance:
- Using Intel's Fortran precompiler:
CPP=fpp -f_com=no -free -w0 $*$(FUFFIX) $*$(SUFFIX) $(CPP_OPTIONS)
- Using cpp:
CPP=/usr/bin/cpp -P -C -traditional $*$(FUFFIX) >$*$(SUFFIX) $(CPP_OPTIONS)
- N.B.: This variable has to include $(CPP_OPTIONS)! If not, CPP_OPTIONS will be ignored.
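Put together, the two precompiler variables might look as follows in makefile.include. This is a minimal sketch that only combines settings quoted on this page: -DMPI and -DscaLAPACK are the flags mentioned here in connection with MPI and ScaLAPACK builds, and a real build usually needs further flags, which should be taken from the matching file in /arch.

```makefile
# Sketch of the precompiler block of makefile.include (not a complete flag set).
# -DMPI and -DscaLAPACK are the flags referred to elsewhere on this page;
# take the full list for your platform from the matching file in /arch.
CPP_OPTIONS = -DMPI -DscaLAPACK

# CPP must end up containing $(CPP_OPTIONS), otherwise the flags are ignored.
CPP = fpp -f_com=no -free -w0 $*$(FUFFIX) $*$(SUFFIX) $(CPP_OPTIONS)
```

Keeping the flags in CPP_OPTIONS and referencing that variable once in CPP means that only one line changes when a flag is added or removed.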
Compiler variables
The Fortran compiler will be invoked as:
$(FC) $(FREE) $(FFLAGS) $(OFLAG) $(INCS)
- Specify the options that your Fortran compiler needs for it to accept free-form source layout, without line-length limitation. For instance:
- Using Intel's Fortran compiler:
FREE=-free -names lowercase
- Using gfortran:
- The command to invoke your Fortran compiler (e.g. gfortran, ifort, mpif90, mpiifort, ... ).
- The command that invokes the linker. In most cases:
FCL=$(FC) [+ some options]
- Using the Intel Composer suite (Fortran compiler + MKL libraries), typically:
- The general level of optimization (default: OFLAG=-O2).
- Additional compiler flags.
- (default: -O2) In the vast majority of makefiles this variable is set:
- The optimization level with which the main program (main.F) will be compiled, usually:
- Use this variable to specify objects to be included in the sense of:
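As an illustration of how these variables fit together, the following sketch shows a compiler block for a GNU toolchain. The values are assumptions rather than settings taken from this page: mpif90 is assumed to wrap gfortran, and the FREE options are the gfortran switches for free-form source without a line-length limit. Compare with the GNU file in /arch before using it.

```makefile
# Sketch of the compiler block for a GNU toolchain (assumed values, adapt as needed).

# Assumption: mpif90 is an MPI wrapper around gfortran on this system.
FC     = mpif90
# Link with the same command; add linker options here if needed.
FCL    = $(FC)
# gfortran options for free-form source without a line-length limit.
FREE   = -ffree-form -ffree-line-length-none
# Additional compiler flags; -w (suppress warnings) is optional.
FFLAGS = -w
# General level of optimization (the default named above).
OFLAG  = -O2
```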
Linking against libraries
The linker will be invoked as:
$(FCL) -o vasp ..all-objects.. $(LLIBS) $(LINK)
- Specify libraries and/or objects to be linked against, in the usual ways:
LLIBS=[-Ldirectory -llibrary] [path/library.a] [path/object.o]
Usually one has to specify several numerical libraries (BLAS, LAPACK or ScaLAPACK, etc.). For instance, using the Intel Composer suite (and compiling with CPP_OPTIONS= .. -DscaLAPACK ..):
MKL_PATH  = $(MKLROOT)/lib/intel64
BLACS     = -lmkl_blacs_openmpi_lp64
SCALAPACK = $(MKL_PATH)/libmkl_scalapack_lp64.a $(BLACS)
LLIBS     = $(SCALAPACK) $(LAPACK)
For other configurations please take a lead from the makefile.include.arch files under /arch, or look at the examples below.
The list of objects
The standard list of objects needed to compile VASP is given by the variable SOURCE in the root/src/.objects file that is part of the distribution.
Objects to be added to this list can be specified in makefile.include by means of:
OBJECTS= .. your list of objects ..
N.B.: Several objects will *have* to be added in this manner (see the following section on "Fast-Fourier-Transforms").
Fast-Fourier-Transforms
- Add the objects to be compiled (or linked against) that provide the FFTs (may include static libraries of objects .a).
- In case one compiles using the fftw-library, i.e.,
OBJECTS= .. fftw3d.o fftmpiw.o ..
- then INCS can be set to the directory that holds fftw3.f (needed because the sources of these objects include fftw3.f).
- N.B.: If in the aforementioned case INCS is not set, then fftw3.f has to be present in /src.
Common choices are:
- To use Intel's MKL wrapper of fftw (and compiling with CPP_OPTIONS= .. -DMPI ..):
OBJECTS= fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o \
         $(MKLROOT)/interfaces/fftw3xf/libfftw3xf_intel.a
INCS=-I$(MKLROOT)/include/fftw
- Or to use Juergen Furtmueller's FFT implementation (and -DMPI):
OBJECTS= fftmpi.o fftmpi_map.o fft3dfurth.o fft3dlib.o
INCS=
For other configurations please take a lead from the makefile.include.arch files under /arch, or look at the examples below.
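For a stand-alone FFTW installation (rather than the MKL wrapper), the FFT-related settings could be sketched as below. The variable FFTW and the path /opt/fftw are hypothetical names introduced for this example, and the object list is assumed to be the same as in the MKL wrapper case above, without the wrapper archive.

```makefile
# Hypothetical installation prefix of FFTW 3; replace with your own path.
FFTW   ?= /opt/fftw

# FFT objects for an MPI build that uses the FFTW interface
# (assumed to match the MKL wrapper case, minus the wrapper archive).
OBJECTS = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o

# Directory that holds fftw3.f.
INCS    = -I$(FFTW)/include

# Link against the FFTW library itself (appended to an existing LLIBS).
LLIBS  += -L$(FFTW)/lib -lfftw3
```

Using ?= lets the prefix be overridden from the command line or the environment without editing the file.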
The makefiles of our old build systems contained a set of special rules for the optimization level allowed in the compilation of the FFT related objects. In the current build system these special rules can be duplicated by adding the following:
OBJECTS_O1 += fft3dfurth.o fftw3d.o fftmpi.o fftmpiw.o
OBJECTS_O2 += fft3dlib.o
Special rules in general
src/makefile contains a set of recipes to allow for the compilation of objects at different levels of optimization (other than the general level specified by OFLAG). These recipes replace the special rules section of the makefiles in our old build system.
In these recipes the compiler will be invoked as:
$(FC) $(FREE) $(FFLAGS_x) $(OFLAG_x) $(INCS_x)
where x stands for: 1, 2, 3, or IN.
- Default: FFLAGS_x=$(FFLAGS), for x=1, 2, 3, and IN.
- Default: OFLAG_x=-Ox (for x=1, 2, 3), and OFLAG_IN=-O2
- Default: INCS_x=$(INCS), for x=1, 2, 3, and IN.
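Written out, the defaults listed above correspond to the assignments below. This is only a restatement for illustration: the actual definitions live in src/makefile and may be expressed differently there. Setting one of these variables in makefile.include is assumed to replace the corresponding default.

```makefile
# Defaults as described above (illustration only, not copied from src/makefile).
FFLAGS_1  = $(FFLAGS)
FFLAGS_2  = $(FFLAGS)
FFLAGS_3  = $(FFLAGS)
FFLAGS_IN = $(FFLAGS)

# Optimization level per recipe: -Ox for x = 1, 2, 3, and -O2 for IN.
OFLAG_1   = -O1
OFLAG_2   = -O2
OFLAG_3   = -O3
OFLAG_IN  = -O2

INCS_1    = $(INCS)
INCS_2    = $(INCS)
INCS_3    = $(INCS)
INCS_IN   = $(INCS)
```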
The objects to be compiled in accordance with these recipes have to be specified by means of the variables:
- OBJECTS_O1, OBJECTS_O2, OBJECTS_O3, OBJECTS_IN
Several objects are compiled at -O1 and -O2 by default. These lists of objects are specified in the .objects file through the variables:
- SOURCE_O1, SOURCE_O2, SOURCE_IN
and reflect the special rules as they were present in most of the makefiles of the old build system.
To completely overrule a default setting (for instance for the -O1 special rules) use the following construct:
SOURCE_O1=
OBJECTS_O1= .. your list of objects ..
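A concrete instance of this construct might look as follows. It is a sketch that reuses the FFT object names from the special rules shown earlier on this page; whether dropping the other default -O1 objects is safe depends on your compiler and is not something this example establishes.

```makefile
# Clear the default list of objects compiled at -O1 ...
SOURCE_O1  =
# ... and compile only the FFT-related objects with the -O1 recipe.
OBJECTS_O1 = fft3dfurth.o fftw3d.o fftmpi.o fftmpiw.o
```

Note the difference from the += form used for the FFT objects: += extends the default list, whereas emptying SOURCE_O1 first discards it.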
For the VASP library (lib)
- The command to invoke the precompiler. In most cases it will suffice to set:
- The command to invoke your Fortran compiler. In most cases:
- N.B.: the library can be compiled without MPI support, i.e., when FC=mpif90, FC_LIB may specify a Fortran compiler without MPI support.
- Fortran compiler flags, including a specification of the level of optimization. In most cases:
- Specify the options that your Fortran compiler needs for it to accept free-form source layout, without line-length limitation. In most cases it will suffice to set:
- The command to invoke your C compiler (e.g. gcc, icc, ..).
- N.B.: the library can be compiled without MPI support.
- C compiler flags, including a specification of the level of optimization. In most cases:
- List of "non-standard" objects to be added to the library. In most cases:
- When compiling VASP with -Duse_shmem, one has to add getshmem.o as well, i.e.,
OBJECTS_LIB= .. getshmem.o ..
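The settings described in this section could be collected into a block like the one below. Only FC_LIB and OBJECTS_LIB are named in the text above; the other variable names follow the same _LIB pattern and are assumptions, as are all values, including gcc as the C compiler and linpack_double.o as the "non-standard" object. Check them against the file in /arch that matches your system.

```makefile
# Sketch of the VASP library block; names other than FC_LIB and OBJECTS_LIB
# and all values are assumptions to be checked against the files in /arch.

# Reuse the precompiler and Fortran settings of the main build.
CPP_LIB     = $(CPP)
FC_LIB      = $(FC)
FFLAGS_LIB  = -O1
FREE_LIB    = $(FREE)

# C compiler and flags; MPI support is not required for the library.
CC_LIB      = gcc
CFLAGS_LIB  = -O

# Extra objects; getshmem.o is only needed when compiling with -Duse_shmem.
OBJECTS_LIB = linpack_double.o getshmem.o
```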
For the GPU port
- Location of CUDA toolkit install. For example:
CUDA_ROOT := /opt/cuda
- CUDA toolkit libraries to link to. Typically:
CUDA_LIB := -L$(CUDA_ROOT)/lib64 -lnvToolsExt -lcudart -lcuda -lcufft -lcublas
- Location of CUDA compiler and flags. Typically:
NVCC := $(CUDA_ROOT)/bin/nvcc -g
- Add the objects to be compiled (or linked against) that provide the FFTs (may include static libraries of objects .a). For FFTW:
OBJECTS_GPU = fftmpiw.o fftmpi_map.o fft3dlib.o fftw3d_gpu.o fftmpiw_gpu.o
- CUDA compiler options to generate code for your particular GPU architecture.
- For Kepler:
GENCODE_ARCH := -gencode=arch=compute_35,code=\"sm_35,compute_35\"
- For Maxwell:
GENCODE_ARCH := -gencode=arch=compute_53,code=\"sm_53,compute_53\"
- Multiple -gencode statements can be combined to create executables that run on several GPU architectures.
- For details see the NVIDIA nvcc documentation.
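For example, the Kepler and Maxwell settings quoted above can be combined into a single GENCODE_ARCH. This sketch only concatenates the two values from this page; the set of architectures worth including depends on the GPUs you actually target.

```makefile
# One -gencode entry per target architecture; both values are the ones quoted above.
GENCODE_ARCH := -gencode=arch=compute_35,code=\"sm_35,compute_35\" \
                -gencode=arch=compute_53,code=\"sm_53,compute_53\"
```

Each additional architecture increases compile time and binary size, so it is usually worth listing only the ones present on your machines.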
- Path to MPI include files so the CUDA compiler can find them. For example:
MPI_INC := /opt/openmpi/include
- These can often be found with
- Preprocessor options for GPU compilation.
- Always include:
-DCUDA_GPU to build cross-platform sources for GPU,
-DUSE_PINNED_MEMORY to use pinned memory for transfer buffers, and
-DRPROMU_CPROJ_OVERLAP to overlap communication and computation in RPROJ_MU.
-DCUFFT_MIN=N to intercept any FFT calls of size greater than N^3 and evaluate them on the GPU.
-DUSE_MAGMA to use MAGMA for LAPACK-like calls on the GPU.
- So typically:
CPP_GPU = -DCUDA_GPU -DRPROMU_CPROJ_OVERLAP -DUSE_PINNED_MEMORY -DCUFFT_MIN=28
- If using the experimental MAGMA support, path to MAGMA 1.6. Typically:
MAGMA_ROOT := /opt/magma/lib
Patches
Unfortunately several bugs were reported for vasp.5.4.1.24Jun15. To fix them download the patch(es) below:
To apply these patch(es) gunzip the patch file(s) and
patch -p1 < patch.5.4.1.ddmmyyyy
within your vasp.X.X.X root-directory.
For vasp.5.4.1.05Feb16 (with GPU support)
The following patch improves the mapping between MPI-ranks and GPUs on multi-node/multi-GPU systems (this is a performance improvement only, not a bugfix):
The following patch addresses several bugs:
- For noncollinear calculations LOPTICS=.TRUE. didn't work correctly with Blöchl-smearing (ISMEAR≤-4).
- The Zeroth-Order-Regular-Approximation (ZORA) that accounts for the relativistic mass correction in the Spin-Orbit-Coupling operator was not implemented correctly.
- N.B.: Unfortunately this bugfix affects the total energy. Effects are expected to be negligible except for heavy elements.
To apply these patches gunzip the patch files and
patch -p0 < patch.5.4.1.14032016
patch -p0 < patch.5.4.1.03082016
within your vasp.X.X.X root-directory.
Validation
We are currently constructing a suite of tests to be used to validate your VASP executables.