memory corruption in DFPT calc with ISYM = -1, NPAR !=1

weixie4
Newbie
Posts: 2
Joined: Sat May 06, 2017 2:24 am

memory corruption in DFPT calc with ISYM = -1, NPAR !=1

#1 Post by weixie4 » Mon Oct 29, 2018 6:59 pm

Dear VASP developers and users,

I was running DFPT calculations to get the macroscopic dielectric constant, Born effective charges, etc. for a semiconductor (gap ~ 1.5 eV). I understand that, with symmetry on, VASP changes the k-point mesh internally during DFPT (LEPSILON = T) or Berry-phase (LCALCEPS = T) calculations, which would crash the job if I used band parallelization. I therefore turned symmetry off (ISYM = -1) and expected that I could then use band parallelization. The k-mesh indeed turned out not to be a problem, but instead I always got a memory corruption error the moment the DFPT loop started. The error message looked like this:
vasp_std': malloc(): memory corruption: 0x00000000108c3d70 ***
Clearly this is a memory issue, so as a sanity check I increased the memory to 6 GB/core, but the problem remained. I also tried on another supercomputer, still to no avail. Of course, if I turn off band parallelization instead, no such error occurs. This is very annoying because I have a big system with only one k-point, so band parallelization is essential for me. Can I get any help here?

Here's my INCAR:

Code:

ALGO = F

NSW = 1
ISIF = 3
IBRION = 2
EDIFFG = -0.001

ISMEAR = 0
SIGMA = 0.01

LWAVE = .FALSE.
LCHARG = .FALSE.

ISPIN = 1

ENCUT = 520
PREC = Accurate
LREAL = .FALSE.
EDIFF = 1e-08

NELMIN = 1
NELM = 100

#KPAR = 2
NPAR = 2

ADDGRID = .TRUE.
LEPSILON = .TRUE.
LPEAD = .TRUE.
LASPH = .TRUE.

NWRITE = 3
ISYM = -1

I tried to upload the whole set of input/output files but failed with the message "Sorry, the board attachment quota has been reached." Please download it from:
"link removed by moderator"

Thank you

merzuk.kaltak
Administrator
Posts: 282
Joined: Mon Sep 24, 2018 9:39 am

Re: memory corruption in DFPT calc with ISYM = -1, NPAR !=1

#2 Post by merzuk.kaltak » Mon Nov 05, 2018 9:36 am

Dear weixie4,

we will take a look at this issue.
In the meantime, the attachment quota has been changed to 8 MiB. Your bug report is in the attachment.

merzuk.kaltak
Administrator
Posts: 282
Joined: Mon Sep 24, 2018 9:39 am

Re: memory corruption in DFPT calc with ISYM = -1, NPAR !=1

#3 Post by merzuk.kaltak » Wed Nov 07, 2018 10:10 am

We have been able to reproduce your issue with VASP 5.4.4 and can offer the following two-step solution (unfortunately, it requires recompiling the code):

First, note that setting NPAR = 2 in the INCAR file tells VASP to treat only 2 bands in parallel, with each band distributed over half of the MPI ranks. In your case the job was executed with 144 MPI ranks, which makes this a very inefficient setting; the default is NPAR = "number of ranks", as written on our wiki page. Since you are using LPEAD to calculate Born effective charges, we suggest not setting NPAR at all.
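
The arithmetic behind this remark can be sketched as follows. This is a minimal illustration, assuming the relation given on the VASP wiki, NCORE = number of ranks / (KPAR * NPAR); the function name `ranks_per_band` is hypothetical and not part of VASP, and rejecting non-divisible combinations is a choice of this sketch only.

```python
def ranks_per_band(total_ranks: int, npar: int, kpar: int = 1) -> int:
    """Return how many MPI ranks share the work on a single band (NCORE).

    Assumes NCORE = total_ranks / (KPAR * NPAR), as documented on the
    VASP wiki. Illustrative helper only; it is not part of VASP.
    """
    groups = kpar * npar
    if total_ranks % groups != 0:
        # Simplification of this sketch: only accept even splits.
        raise ValueError("total_ranks must be divisible by KPAR * NPAR")
    return total_ranks // groups


# The job in this thread: 144 MPI ranks with NPAR = 2.
print(ranks_per_band(144, npar=2))    # 72 ranks per band
# Default behaviour, NPAR = number of ranks.
print(ranks_per_band(144, npar=144))  # 1 rank per band
```

With NPAR = 2 on 144 ranks, every band is spread over 72 ranks, whereas the default assigns each band to a single rank.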

Second, even if you comment out the NPAR line in the INCAR, VASP will abort with the error message:

Code:

 internal error in SETUP_DEG_CLUSTERS: NB_TOT exceeds NMAX_DEG
    increase NMAX_DEG to          64

This means that VASP internally allows a maximum of 48 degenerate states in src/subrot_cluster.F, which is too small for your system.
You will therefore have to change the following line in src/subrot_cluster.F:

Code:

INTEGER, PARAMETER :: NMAX_DEG=48
to

Code:

INTEGER, PARAMETER :: NMAX_DEG=64

and recompile the code. We have been able to execute your job successfully with 80 MPI ranks using the gamma-only binary.
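
For anyone who prefers to script the source change, here is a minimal sketch of the one-line edit described above. It assumes the declaration appears exactly once, in the form quoted in this post; the function name `bump_nmax_deg` is hypothetical, and the example operates on a sample string instead of the real file.

```python
import re

# Matches the declaration quoted above, tolerating extra whitespace.
_NMAX_DEG = re.compile(
    r"(INTEGER\s*,\s*PARAMETER\s*::\s*NMAX_DEG\s*=\s*)\d+",
    re.IGNORECASE,
)


def bump_nmax_deg(source: str, new_value: int = 64) -> str:
    """Return `source` with the NMAX_DEG parameter set to `new_value`."""
    patched, count = _NMAX_DEG.subn(
        lambda m: m.group(1) + str(new_value), source
    )
    if count == 0:
        raise ValueError("NMAX_DEG declaration not found")
    return patched


# Sample line standing in for the contents of src/subrot_cluster.F.
sample = "      INTEGER, PARAMETER :: NMAX_DEG=48\n"
print(bump_nmax_deg(sample), end="")
```

To apply this to an actual source tree, the contents of src/subrot_cluster.F would be read, passed through the function, and written back (keeping a backup copy) before recompiling.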

weixie4
Newbie
Posts: 2
Joined: Sat May 06, 2017 2:24 am

Re: memory corruption in DFPT calc with ISYM = -1, NPAR !=1

#4 Post by weixie4 » Mon Nov 12, 2018 8:31 pm

Dear Dr. Merzuk Kaltak,

Thank you for the testing and the suggestions.

I did encounter the "internal error in SETUP_DEG_CLUSTERS: NB_TOT exceeds NMAX_DEG" error, but following suggestions given earlier in this forum, I had already increased NMAX_DEG to 64 before reporting the issue.

My main question is whether it is possible to use band parallelization (i.e., NPAR != total number of MPI ranks) in DFPT calculations (LEPSILON = .TRUE. and/or IBRION = 7/8) with symmetry switched off (ISYM = -1). If it is possible, I will try to work around the memory issue; if this feature is not implemented in VASP in the first place, I will not spend more time on it.

Thanks and regards
Wei
