Next: 8.3 Elastic band method
Up: 8 Special switches for
Previous: 8.1 NPAR switchand
The memory requirements of the serial version
can be estimated using the makeparam utility (see
Sec. 6.15). At present, however, there is no
way to estimate the memory requirements of
the parallel version.
In fact, it might be difficult to run huge jobs on "thin" T3E or
SP2 nodes. Most tables (pseudopotentials etc.) and the executable
must be held on all nodes (10-20 Mbytes).
In addition one complex array of the size
is allocated on each node;
during dynamic simulation even up to three such arrays are allocated.
When the charge density is read or written, a complex
array that can hold all data points of the charge density
is allocated (8*NGXF*NGYF*NGZF). Finally, three such arrays
are allocated (and deallocated) during the charge density symmetrisation
(the charge density symmetrisation usually requires the largest amount
of memory).
All other data are distributed among all nodes.
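
The array sizes quoted above can be turned into a rough per-node figure. The following Python sketch is only an illustration: it assumes that 8*NGXF*NGYF*NGZF is a byte count, that 1 Mbyte is 2**20 bytes, and it uses a hypothetical 120x120x120 fine grid. The function names are invented for this example and are not part of VASP.

```python
# Rough estimate of the non-distributed charge density arrays per node.
# Assumptions (not from the VASP sources): the expression
# 8*NGXF*NGYF*NGZF is a size in bytes, and 1 Mbyte = 2**20 bytes.

MBYTE = 2 ** 20


def charge_density_array_bytes(ngxf, ngyf, ngzf, bytes_per_point=8):
    """Size of one array holding every point of the fine grid."""
    return bytes_per_point * ngxf * ngyf * ngzf


def symmetrisation_bytes(ngxf, ngyf, ngzf, n_arrays=3):
    """Size of the three arrays allocated during symmetrisation."""
    return n_arrays * charge_density_array_bytes(ngxf, ngyf, ngzf)


if __name__ == "__main__":
    grid = (120, 120, 120)  # hypothetical NGXF, NGYF, NGZF
    one = charge_density_array_bytes(*grid)
    sym = symmetrisation_bytes(*grid)
    print(f"one charge density array: {one / MBYTE:.1f} Mbytes")
    print(f"symmetrisation (3 arrays): {sym / MBYTE:.1f} Mbytes")
```

These figures come on top of the 10-20 Mbytes for tables and the executable, so the result is better read as a lower bound on what each node needs than as a full estimate.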
The following measures can be tried to reduce the memory
requirements on each node:
- The executable may become smaller if the options -G1 (T3E) and -g
are removed from the lines OFLAG and DEBUG in the makefile.
- Switch off symmetrisation (ISYM=0). Symmetrisation is done locally
on each node and requires three huge arrays.
vasp.4.4.2 has a switch for a more memory-conserving symmetrisation,
which can be selected by specifying ISYM=2. Results might, however,
differ somewhat from ISYM=1 (usually by only 1/100th of a meV).
Also avoid writing or reading the file CHGCAR (LCHARG=F).
- Use NPAR=1.
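
Taken together, the INCAR settings named in the list above might look like the fragment below. It is only a sketch of one possible combination; with vasp.4.4.2, ISYM = 2 could be used in place of ISYM = 0 if symmetrisation is to be kept.

```
ISYM   = 0
LCHARG = .FALSE.
NPAR   = 1
```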
VASP relies heavily on dynamic memory
allocation (ALLOCATE and DEALLOCATE). As far as we know there
is no memory leakage (ALLOCATE without DEALLOCATE), but unfortunately
it is impossible to be entirely sure that no leakage exists.
Some users have observed that the memory used by the code grows during
dynamic simulations on the T3E.
This is, however, most likely due to "problematic"
dynamic memory management in the f90 runtime system and not due to
a programming error in VASP. Unfortunately, the
dynamic memory subsystems of most f90 compilers are still
rather inefficient. As a result, the memory might become
more and more fragmented during the run, so that large pieces
of memory cannot be allocated. We can only hope for
improvements in the dynamic memory management (for instance
the introduction of garbage collectors).
MASTER USER VASP
Mon Mar 29 10:38:29 MEST 1999