forrtl: severe (41): insufficient virtual memory
Posted: Fri Dec 14, 2012 8:31 am
Dear all:
Can you help me out? I am calculating the phonon dispersion relation of CoSb3 with the supercell method. The unit cell contains 32 atoms, so the 2*2*2 supercell is a large system of 256 atoms. My biggest problem is the memory requirement. Our server provides at most 48 GB of memory, and the calculation easily exceeds that. I eventually reduced the memory requirement to 27 GB by decreasing the k-points to 2*2*2. Although the server can afford this, the job is killed after the last step of the convergence calculation. I think the saving process may be involved, but I am not sure. The log is posted below:
********************************************************
running on 10 total cores
distrk: each k-point on 10 cores, 1 groups
distr: one band on 1 cores, 10 groups
using from now: INCAR
vasp.5.3.2 13Sep12 (build Nov 16 2012 12:11:05) complex
POSCAR found : 2 types and 256 ions
-----------------------------------------------------------------------------
| |
| W W AA RRRRR N N II N N GGGG !!! |
| W W A A R R NN N II NN N G G !!! |
| W W A A R R N N N II N N N G !!! |
| W WW W AAAAAA RRRRR N N N II N N N G GGG ! |
| WW WW A A R R N NN II N NN G G |
| W W A A R R N N II N N GGGG !!! |
| |
| For optimal performance we recommend that you set |
| NPAR = 4 - approx SQRT( number of cores) |
| (number of cores/NPAR must be integer) |
| This setting will greatly improve the performance of VASP for DFT. |
| The default NPAR=number of cores might be grossly inefficient |
| on modern multi-core architectures or massively parallel machines. |
| Do your own testing. |
| Unfortunately you need to use the default for hybrid, GW and RPA |
| calculations. |
| |
-----------------------------------------------------------------------------
LDA part: xc-table for Pade appr. of Perdew
POSCAR, INCAR and KPOINTS ok, starting setup
WARNING: small aliasing (wrap around) errors must be expected
FFT: planning ...
WAVECAR not read
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 28166 on node mez503 exited on signal 1 (Hangup).
--------------------------------------------------------------------------
running on 10 total cores
distrk: each k-point on 10 cores, 1 groups
distr: one band on 1 cores, 10 groups
using from now: INCAR
vasp.5.3.2 13Sep12 (build Nov 16 2012 12:11:05) complex
POSCAR found : 2 types and 256 ions
-----------------------------------------------------------------------------
| |
| W W AA RRRRR N N II N N GGGG !!! |
| W W A A R R NN N II NN N G G !!! |
| W W A A R R N N N II N N N G !!! |
| W WW W AAAAAA RRRRR N N N II N N N G GGG ! |
| WW WW A A R R N NN II N NN G G |
| W W A A R R N N II N N GGGG !!! |
| |
| For optimal performance we recommend that you set |
| NPAR = 4 - approx SQRT( number of cores) |
| (number of cores/NPAR must be integer) |
| This setting will greatly improve the performance of VASP for DFT. |
| The default NPAR=number of cores might be grossly inefficient |
| on modern multi-core architectures or massively parallel machines. |
| Do your own testing. |
| Unfortunately you need to use the default for hybrid, GW and RPA |
| calculations. |
| |
-----------------------------------------------------------------------------
LDA part: xc-table for Pade appr. of Perdew
POSCAR, INCAR and KPOINTS ok, starting setup
WARNING: small aliasing (wrap around) errors must be expected
FFT: planning ...
WAVECAR not read
WARNING: random wavefunctions but no delay for mixing, default for NELMDL
entering main loop
N E dE d eps ncg rms rms(c)
DAV: 1 0.964684502134E+04 0.96468E+04 -0.46569E+05 26130 0.117E+03
DAV: 2 -0.130558796811E+03 -0.97774E+04 -0.94529E+04 27030 0.305E+02
DAV: 3 -0.127435783296E+04 -0.11438E+04 -0.10730E+04 30730 0.118E+02
DAV: 4 -0.134728200243E+04 -0.72924E+02 -0.71739E+02 34160 0.286E+01
DAV: 5 -0.134978919538E+04 -0.25072E+01 -0.24962E+01 37230 0.518E+00 0.803E+01
DAV: 6 -0.157473892594E+04 -0.22495E+03 -0.18645E+03 37900 0.110E+02 0.143E+02
DAV: 7 -0.128652372902E+04 0.28822E+03 -0.10597E+03 35760 0.999E+01 0.490E+01
DAV: 8 -0.130485299317E+04 -0.18329E+02 -0.20252E+02 34820 0.136E+01 0.386E+01
DAV: 9 -0.129201157920E+04 0.12841E+02 -0.67339E+01 36970 0.200E+01 0.115E+01
DAV: 10 -0.129220219877E+04 -0.19062E+00 -0.35622E+00 28510 0.253E+00 0.942E+00
DAV: 11 -0.129221790995E+04 -0.15711E-01 -0.73262E-01 39710 0.324E+00 0.903E+00
DAV: 12 -0.129197917364E+04 0.23874E+00 -0.11862E+00 45690 0.484E+00 0.696E+00
DAV: 13 -0.129200130008E+04 -0.22126E-01 -0.73376E-01 37500 0.245E+00 0.460E+00
DAV: 14 -0.129196465820E+04 0.36642E-01 -0.19233E-01 37930 0.181E+00 0.407E+00
DAV: 15 -0.129194981111E+04 0.14847E-01 -0.32349E-02 35120 0.427E-01 0.340E+00
DAV: 16 -0.129194167092E+04 0.81402E-02 -0.14702E-02 37880 0.528E-01 0.304E+00
DAV: 17 -0.129192167918E+04 0.19992E-01 -0.21565E-02 34700 0.347E-01 0.205E+00
DAV: 18 -0.129191622979E+04 0.54494E-02 -0.11808E-02 37400 0.294E-01 0.165E+00
DAV: 19 -0.129191151554E+04 0.47143E-02 -0.50656E-03 35660 0.247E-01 0.131E+00
DAV: 20 -0.129190867389E+04 0.28416E-02 -0.29148E-03 36140 0.101E-01 0.104E+00
DAV: 21 -0.129190740557E+04 0.12683E-02 -0.98540E-04 38980 0.104E-01 0.895E-01
DAV: 22 -0.129190403719E+04 0.33684E-02 -0.58654E-03 34730 0.125E-01 0.358E-01
DAV: 23 -0.129190369198E+04 0.34520E-03 -0.17033E-03 37670 0.106E-01 0.210E-01
DAV: 24 -0.129190349339E+04 0.19860E-03 -0.55831E-04 37130 0.823E-02 0.117E-01
DAV: 25 -0.129190346892E+04 0.24471E-04 -0.18947E-04 37970 0.288E-02 0.805E-02
DAV: 26 -0.129190346078E+04 0.81302E-05 -0.46326E-05 34590 0.244E-02 0.621E-02
DAV: 27 -0.129190345701E+04 0.37782E-05 -0.14741E-05 32640 0.946E-03 0.384E-02
DAV: 28 -0.129190345728E+04 -0.26899E-06 -0.80157E-06 31270 0.112E-02
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
vasp5 0000000001100A6E Unknown Unknown Unknown
vasp5 00000000010FF506 Unknown Unknown Unknown
vasp5 00000000010B4AB2 Unknown Unknown Unknown
vasp5 00000000010672BB Unknown Unknown Unknown
vasp5 00000000010988E3 Unknown Unknown Unknown
vasp5 00000000005A66D6 Unknown Unknown Unknown
vasp5 000000000045857A Unknown Unknown Unknown
vasp5 000000000043333C Unknown Unknown Unknown
libc.so.6 00002AD68C5BDD8E Unknown Unknown Unknown
vasp5 0000000000433239 Unknown Unknown Unknown
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
vasp5 0000000001100A6E Unknown Unknown Unknown
vasp5 00000000010FF506 Unknown Unknown Unknown
vasp5 00000000010B4AB2 Unknown Unknown Unknown
vasp5 00000000010672BB Unknown Unknown Unknown
vasp5 00000000010988E3 Unknown Unknown Unknown
vasp5 00000000005A66D6 Unknown Unknown Unknown
vasp5 000000000045857A Unknown Unknown Unknown
vasp5 000000000043333C Unknown Unknown Unknown
libc.so.6 00002AF4560DED8E Unknown Unknown Unknown
vasp5 0000000000433239 Unknown Unknown Unknown
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
vasp5 0000000001100A6E Unknown Unknown Unknown
vasp5 00000000010FF506 Unknown Unknown Unknown
vasp5 00000000010B4AB2 Unknown Unknown Unknown
vasp5 00000000010672BB Unknown Unknown Unknown
vasp5 00000000010988E3 Unknown Unknown Unknown
vasp5 00000000005A66D6 Unknown Unknown Unknown
vasp5 000000000045857A Unknown Unknown Unknown
vasp5 000000000043333C Unknown Unknown Unknown
libc.so.6 00002B95BDF8DD8E Unknown Unknown Unknown
vasp5 0000000000433239 Unknown Unknown Unknown
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
vasp5 0000000001100A6E Unknown Unknown Unknown
vasp5 00000000010FF506 Unknown Unknown Unknown
vasp5 00000000010B4AB2 Unknown Unknown Unknown
vasp5 00000000010672BB Unknown Unknown Unknown
vasp5 00000000010988E3 Unknown Unknown Unknown
vasp5 00000000005A66D6 Unknown Unknown Unknown
vasp5 000000000045857A Unknown Unknown Unknown
vasp5 000000000043333C Unknown Unknown Unknown
libc.so.6 00002AC528119D8E Unknown Unknown Unknown
vasp5 0000000000433239 Unknown Unknown Unknown
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
vasp5 0000000001100A6E Unknown Unknown Unknown
vasp5 00000000010FF506 Unknown Unknown Unknown
vasp5 00000000010B4AB2 Unknown Unknown Unknown
vasp5 00000000010672BB Unknown Unknown Unknown
vasp5 00000000010988E3 Unknown Unknown Unknown
vasp5 00000000005A66D6 Unknown Unknown Unknown
vasp5 000000000045857A Unknown Unknown Unknown
vasp5 000000000043333C Unknown Unknown Unknown
libc.so.6 00002B71D0FF5D8E Unknown Unknown Unknown
vasp5 0000000000433239 Unknown Unknown Unknown
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
vasp5 0000000001100A6E Unknown Unknown Unknown
vasp5 00000000010FF506 Unknown Unknown Unknown
vasp5 00000000010B4AB2 Unknown Unknown Unknown
vasp5 00000000010672BB Unknown Unknown Unknown
vasp5 00000000010988E3 Unknown Unknown Unknown
vasp5 00000000005A66D6 Unknown Unknown Unknown
vasp5 000000000045857A Unknown Unknown Unknown
vasp5 000000000043333C Unknown Unknown Unknown
libc.so.6 00002B0BCA415D8E Unknown Unknown Unknown
vasp5 0000000000433239 Unknown Unknown Unknown
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 28296 on
node mez503 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
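The warning banner in the log recommends setting NPAR to roughly SQRT(number of cores), with number of cores/NPAR an integer. As a rough sketch of that rule only (the helper name `npar_candidates` is made up for illustration, and I have not checked whether changing NPAR affects the memory error above), the candidates for the 10 cores used here can be listed like this:

```python
import math


def npar_candidates(cores: int) -> list[int]:
    """Return the divisors of `cores`, closest to sqrt(cores) first.

    Hypothetical helper: it only encodes the banner's rule of thumb
    (NPAR ~ sqrt(cores), cores / NPAR must be an integer).
    """
    target = math.sqrt(cores)
    # Only divisors satisfy "number of cores/NPAR must be integer".
    divisors = [d for d in range(1, cores + 1) if cores % d == 0]
    # Rank by distance from sqrt(cores); break ties with the smaller value.
    return sorted(divisors, key=lambda d: (abs(d - target), d))


if __name__ == "__main__":
    # 10 cores, as reported by "running on 10 total cores" in the log.
    print(npar_candidates(10))
```

For 10 cores the first candidate is 2, which would go into INCAR as `NPAR = 2`. The banner itself says "Do your own testing", so this is only a starting point.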