Registered Member #223
Joined: Tue Jun 14 2005, 05:18PM
posts 25

Hello all,

I tried to calculate a dilute impurity system with VASP and obtained the error message shown below. For a transition-metal alloy with an impurity concentration of the order of 1/100 (for example, a cubic cell with 107 solvent atoms in a periodic FCC arrangement and one impurity atom at the center of the cell), the memory required with a 15x15x15 Monkhorst-Pack grid is of the order of 20 GB. That is not the main problem: our computing system can allocate this much memory. After three hours of execution, however, I get the following error message:
LAPACK: Routine ZPOTRF failed! 118
Does anybody know what is happening? I traced this message back to the choleski module but could not find out why the ZPOTRF routine returns a nonzero INFO value. It obviously means that the Cholesky decomposition failed for this system, but I don't know whether something is necessarily wrong with the calculation. Is the Cholesky routine from LAPACK robust, or does it depend on the size of the input array? Here are the parameters for the calculation:

Dimension of arrays:
k-Points NKPTS = 120 number of bands NBANDS= 712
number of dos NEDOS = 1100 number of ions NIONS = 108
non local maximal LDIM = 6 non local SUM 2l+1 LMDIM = 18
total plane-waves NPLWV = 216000
max r-space proj IRMAX = 2367 max aug-charges IRDMAX= 6116
dimension x,y,z NGX = 60 NGY = 60 NGZ = 60
dimension x,y,z NGXF= 120 NGYF= 120 NGZF= 120
support grid NGXF= 120 NGYF= 120 NGZF= 120
ions per type = 1 107
NGX,Y,Z is equivalent to a cutoff of 9.18, 9.18, 9.18 a.u.
NGXF,Y,Z is equivalent to a cutoff of 18.37, 18.37, 18.37 a.u.

I would recommend the setting:
dimension x,y,z NGX = 59 NGY = 59 NGZ = 59
SYSTEM = NiCu bulk cell 108 atoms PAW-GGA
POSCAR = NiCu bulk cell 108 atoms

Startparameter for this run:
NWRITE = 2 write-flag & timer
PREC = accura medium, high low
ISTART = 0 job : 0-new 1-cont 2-samecut
ICHARG = 2 charge: 1-file 2-atom 10-const
ISPIN = 1 spin polarized calculation?
LNONCOLLINEAR = F non collinear calculations
LSORBIT = F spin-orbit coupling
INIWAV = 1 electr: 0-lowe 1-rand 2-diag
LASPH = F aspherical Exc in radial PAW
METAGGA= F non-selfconsistent MetaGGA calc.

Electronic Relaxation 1
ENCUT = 273.2 eV 20.08 Ry 4.48 a.u. 14.64 14.64 14.64*2*pi/ulx,y,z
ENINI = 273.2 initial cutoff
ENAUG = 544.6 eV augmentation charge cutoff
NELM = 250; NELMIN= 2; NELMDL= 6 # of ELM steps
EDIFF = 0.1E-03 stopping-criterion for ELM
LREAL = T real-space projection
LCOMPAT= F compatible to vasp.4.4
LREAL_COMPAT= F compatible to vasp.4.5.1-3
GGA_COMPAT = T GGA compatible to vasp.4.4-vasp.4.6
LMAXPAW = -100 max onsite density
LMAXMIX = 2 max onsite mixed and CHGCAR
VOSKOWN= 0 Vosko Wilk Nusair interpolation
ROPT = 0.00000 0.00000
Ionic relaxation
EDIFFG = -.1E-01 stopping-criterion for IOM
NSW = 0 number of steps for IOM
NBLOCK = 1; KBLOCK = 1 inner block; outer block
IBRION = -1 ionic relax: 0-MD 1-quasi-New 2-CG
NFREE = 1 steps in history (QN), initial steepest desc. (CG)
ISIF = 3 stress and relaxation
IWAVPR = 0 prediction: 0-non 1-charg 2-wave 3-comb
ISYM = 2 0-nonsym 1-usesym 2-fastsym
LCORR = T Harris-Foulkes like correction to forces

POTIM = 0.50 time-step for ionic-motion
TEIN = 0.0 initial temperature
TEBEG = 0.0; TEEND = 0.0 temperature during run
SMASS = -3.00 Nose mass-parameter (am)
estimated Nose-frequenzy (Omega) = 0.10E-29 period in steps =****** mass= -0.270E-26a.u.
NPACO = 256; APACO = 16.0 distance and # of slots for P.C.
PSTRESS= 0.0 pullay stress

Mass of Ions in am
POMASS = 58.69 63.55
Ionic Valenz
ZVAL = 10.00 11.00
Atomic Wigner-Seitz radii
RWIGS = 1.28 1.28
NELECT = 1187.0000 total number of electrons
NUPDOWN= -1.0000 fix difference up-down

DOS related values:
EMIN = -6.00; EMAX = 5.00 energy-range for DOS
ISMEAR = 1; SIGMA = 0.20 broadening in eV -4-tet -1-fermi 0-gaus
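
As a rough cross-check of the "order of 20 GB" figure, the size of the plane-wave coefficients alone can be estimated from the values printed above. This is only a sketch under stated assumptions: the number of plane waves per k-point is approximated by the volume of the cutoff sphere (radius 14.64 in units of 2*pi/L, from the ENCUT line), each coefficient is assumed to be a 16-byte double-precision complex number, and all other arrays are ignored. The function name is illustrative, not part of VASP.

```python
import math

def estimate_wavefunction_bytes(g_cut_units, n_bands, n_kpoints, bytes_per_coeff=16):
    """Rough size in bytes of the plane-wave coefficients for all bands and k-points."""
    # Assumption: plane waves per k-point ~ volume of the cutoff sphere,
    # with the radius given in units of 2*pi/L (one grid point per unit volume).
    n_plane_waves = 4.0 / 3.0 * math.pi * g_cut_units ** 3
    return n_plane_waves * n_bands * n_kpoints * bytes_per_coeff

# Values taken from the output above: 14.64 (ENCUT line), NBANDS = 712, NKPTS = 120
n_bytes = estimate_wavefunction_bytes(14.64, n_bands=712, n_kpoints=120)
print(f"{n_bytes / 1e9:.1f} GB")
```

With these inputs the estimate comes out at roughly 18 GB, which is consistent with the quoted order of magnitude. In a parallel run this total would be spread over the processes.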

ZPOTRF is usually stable with respect to the matrix size.
The two most common reasons for this error are:
1) errors in the LAPACK installation (check whether the error persists with an alternative LAPACK)
2) an unreasonable density,
--) either from your setup in POSCAR, or as a result of previous relaxation steps (I see you do not relax your system in THIS run)
--) or from mixing (please check at which electronic step this error occurs)
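
On the meaning of the number after "failed!": per the LAPACK documentation for ZPOTRF, INFO = i > 0 means that the leading minor of order i is not positive definite, so the factorization could not be completed. The sketch below is a plain-Python re-implementation that mimics this return convention. It is not the LAPACK code itself and is meant only to show how a non-positive pivot at step i leads to INFO = i. The function name cholesky_info is hypothetical.

```python
import math

def cholesky_info(a):
    """Lower-triangular Cholesky factor of a Hermitian matrix (list of lists).

    Returns (factor, info), where info follows the xPOTRF convention:
    0 on success, i > 0 if the leading minor of order i is not positive definite.
    """
    n = len(a)
    factor = [[0j] * n for _ in range(n)]
    for j in range(n):
        # Pivot: diagonal element minus the squared norm of the row computed so far
        pivot = a[j][j].real - sum(abs(factor[j][k]) ** 2 for k in range(j))
        if not pivot > 0.0:  # also catches NaN
            return factor, j + 1
        factor[j][j] = complex(math.sqrt(pivot))
        for i in range(j + 1, n):
            s = a[i][j] - sum(factor[i][k] * factor[j][k].conjugate() for k in range(j))
            factor[i][j] = s / factor[j][j]
    return factor, 0

good = [[2, 1j], [-1j, 2]]   # Hermitian and positive definite
bad = [[1, 2], [2, 1]]       # Hermitian but indefinite
info_good = cholesky_info(good)[1]
info_bad = cholesky_info(bad)[1]
print(info_good, info_bad)
```

Here the second matrix fails at the second pivot, so the returned info is 2. By the same convention, "failed! 118" indicates a breakdown at pivot 118 and "failed! 1" a breakdown at the very first pivot.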

Registered Member #3558
Joined: Tue Feb 16 2010, 05:21PM
posts 1

With respect to the comment made by admin:
"""
--) either from your setup in POSCAR, or as a result of previous relaxation steps (I see you do not relax your system in THIS run)
"""

I am trying to run optimizations of random starting configurations. My jobs keep crashing with this same "LAPACK: Routine zpotrf failed! 1" error.

My jobs progress normally, but somewhere along the optimization steps the error is encountered. Is there something I could be doing to avoid these crashes during my optimizations?
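
One thing that may be worth checking for random starting configurations is whether any two atoms end up very close together, since that is one plausible way to get an unreasonable density from the POSCAR setup. This is an assumption, not something confirmed in this thread. The sketch below finds the smallest interatomic distance, including periodic images, from lattice vectors and fractional coordinates. The function name and the example cell are hypothetical, and the search over the 27 neighbouring image cells assumes a cell that is not strongly skewed.

```python
import itertools
import math

SHIFTS = list(itertools.product((-1, 0, 1), repeat=3))

def min_pair_distance(lattice, frac_coords):
    """Return (distance, i, j) for the closest pair of atoms, including periodic images.

    lattice: three lattice vectors as rows (Angstrom); frac_coords: fractional positions.
    """
    best = (math.inf, None, None)
    for i, j in itertools.combinations(range(len(frac_coords)), 2):
        diff = [frac_coords[j][k] - frac_coords[i][k] for k in range(3)]
        diff = [x - round(x) for x in diff]  # wrap into [-0.5, 0.5]
        for shift in SHIFTS:
            f = [diff[k] + shift[k] for k in range(3)]
            cart = [sum(f[m] * lattice[m][k] for m in range(3)) for k in range(3)]
            dist = math.sqrt(sum(c * c for c in cart))
            if dist < best[0]:
                best = (dist, i, j)
    return best

# Hypothetical example: 10 Angstrom cubic cell, two atoms close across the cell boundary
lattice = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
positions = [[0.02, 0.0, 0.0], [0.98, 0.0, 0.0], [0.5, 0.5, 0.5]]
dist, atom_a, atom_b = min_pair_distance(lattice, positions)
print(f"closest pair: atoms {atom_a} and {atom_b}, {dist:.3f} Angstrom")
```

Configurations whose closest pair falls below a threshold of your choosing (for example, well below the expected bond length) could be rejected or regenerated before the job is submitted. The same check could be applied to intermediate geometries if the crash occurs partway through an optimization.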

Registered Member #3785
Joined: Thu Apr 29 2010, 12:29AM
Location: Corvallis, OR
posts 21

One time, when I got "ZPOTRF failed" immediately, right after the "reading WAVECAR" message (there was no WAVECAR, by the way), I was able to solve *that* problem either by setting PREC=Accurate or by increasing NGX, NGXF, and the like to values between the PREC=Normal and PREC=Accurate values.

I was using ALGO=Normal, not specifying NSIM or NPAR, and had lowered AMIX; I have found these settings to be the best way to avoid other types of errors in parallel computations with many atoms. Unfortunately, in this case one of the other common problems, "WARNING: Sub-Space-Matrix is not hermitian in DAV", did occur after the first line of iteration output energies was reported.

Perhaps something could be improved in my LAPACK setup. I am using Intel MKL 9.1.023. I guess I'll try the VASP lapack_double.o instead.

There was no problem with the positions of the atoms, by the way. The problem started when I raised ENCUT to 410 to check my system (largest ENMAX = 273) at a modestly high ENCUT (1.5 * 273). The cell had 128 atoms. Despite what the VASP manual says, I find that accurate results (<1 meV per atom, or ~10 meV in the band gap) are often achieved with ENCUT in the range of 1.4 to 1.7 times the maximum ENMAX.