Problems while running VASP 5.2.11 and 5.2.12 on multiple nodes

Posted: Tue Jun 12, 2012 11:23 am
by nkwem
Hi

We are experiencing problems when running VASP 5.2.11 and 5.2.12 on multiple nodes. Jobs run on a single node with 12 cores, but they fail when using 2 or more nodes. We have also noticed that some input files run on multiple nodes while others do not. Both executables were compiled using the Intel compilers and MKL libraries. We are using Intel MPI to run the jobs. What could be the problem?

Below is the INCAR file that fails to run on multiple nodes:
ISPIN=2
ISMEAR=0; SIGMA=0.05
NSW = 90
IBRION = 2
ISIF = 4
LREAL= Auto

The POSCAR is as follows:
C :fcc
3.63666666666667
2.00000000000000 0.00000000000000 0.00000000000000
0.00000000000000 2.00000000000000 0.00000000000000
0.00000000000000 0.00000000000000 2.00000000000000
1 63
Selective dynamics
Direct
0.0 0.0 0.0 T T T
0.25000000000000 0.25000000000000 0.00000000000000 T T T
0.50000000000000 0.50000000000000 0.00000000000000 T T T
0.75000000000000 0.75000000000000 0.00000000000000 T T T
0.00000000000000 0.25000000000000 0.25000000000000 T T T
0.25000000000000 0.50000000000000 0.25000000000000 T T T
0.50000000000000 0.75000000000000 0.25000000000000 T T T
0.75000000000000 0.00000000000000 0.25000000000000 T T T
0.00000000000000 0.50000000000000 0.50000000000000 T T T
0.25000000000000 0.75000000000000 0.50000000000000 T T T
0.50000000000000 0.00000000000000 0.50000000000000 T T T
0.75000000000000 0.25000000000000 0.50000000000000 T T T
0.00000000000000 0.75000000000000 0.75000000000000 T T T
0.25000000000000 0.00000000000000 0.75000000000000 T T T
0.50000000000000 0.25000000000000 0.75000000000000 T T T
0.75000000000000 0.50000000000000 0.75000000000000 T T T
0.25000000000000 0.00000000000000 0.25000000000000 T T T
0.50000000000000 0.25000000000000 0.25000000000000 T T T
0.75000000000000 0.50000000000000 0.25000000000000 T T T
0.00000000000000 0.75000000000000 0.25000000000000 T T T
0.25000000000000 0.25000000000000 0.50000000000000 T T T
0.50000000000000 0.50000000000000 0.50000000000000 T T T
0.75000000000000 0.75000000000000 0.50000000000000 T T T
0.00000000000000 0.00000000000000 0.50000000000000 T T T
0.25000000000000 0.50000000000000 0.75000000000000 T T T
0.50000000000000 0.75000000000000 0.75000000000000 T T T
0.75000000000000 0.00000000000000 0.75000000000000 T T T
0.00000000000000 0.25000000000000 0.75000000000000 T T T
0.25000000000000 0.75000000000000 0.00000000000000 T T T
0.50000000000000 0.00000000000000 0.00000000000000 T T T
0.75000000000000 0.25000000000000 0.00000000000000 T T T
0.00000000000000 0.50000000000000 0.00000000000000 T T T
0.12500000000000 0.12500000000000 0.12500000000000 T T T
0.37500000000000 0.37500000000000 0.12500000000000 T T T
0.62500000000000 0.62500000000000 0.12500000000000 T T T
0.87500000000000 0.87500000000000 0.12500000000000 T T T
0.12500000000000 0.37500000000000 0.37500000000000 T T T
0.37500000000000 0.62500000000000 0.37500000000000 T T T
0.62500000000000 0.87500000000000 0.37500000000000 T T T
0.87500000000000 0.12500000000000 0.37500000000000 T T T
0.12500000000000 0.62500000000000 0.62500000000000 T T T
0.37500000000000 0.87500000000000 0.62500000000000 T T T
0.62500000000000 0.12500000000000 0.62500000000000 T T T
0.87500000000000 0.37500000000000 0.62500000000000 T T T
0.12500000000000 0.87500000000000 0.87500000000000 T T T
0.37500000000000 0.12500000000000 0.87500000000000 T T T
0.62500000000000 0.37500000000000 0.87500000000000 T T T
0.87500000000000 0.62500000000000 0.87500000000000 T T T
0.37500000000000 0.12500000000000 0.37500000000000 T T T
0.62500000000000 0.37500000000000 0.37500000000000 T T T
0.87500000000000 0.62500000000000 0.37500000000000 T T T
0.12500000000000 0.87500000000000 0.37500000000000 T T T
0.37500000000000 0.37500000000000 0.62500000000000 T T T
0.62500000000000 0.62500000000000 0.62500000000000 T T T
0.87500000000000 0.87500000000000 0.62500000000000 T T T
0.12500000000000 0.12500000000000 0.62500000000000 T T T
0.37500000000000 0.62500000000000 0.87500000000000 T T T
0.62500000000000 0.87500000000000 0.87500000000000 T T T
0.87500000000000 0.12500000000000 0.87500000000000 T T T
0.12500000000000 0.37500000000000 0.87500000000000 T T T
0.37500000000000 0.87500000000000 0.12500000000000 T T T
0.62500000000000 0.12500000000000 0.12500000000000 T T T
0.87500000000000 0.37500000000000 0.12500000000000 T T T
0.12500000000000 0.62500000000000 0.12500000000000 T T T


Regards,
Nkwe

Posted: Tue Jun 12, 2012 1:51 pm
by alex
Hi Nkwe,

If jobs run on a single node but fail on several, your VASP inputs are fine as they are. There are many things to check:

a) Log into node01, then try to ssh to e.g. node02 (or use whichever remote login shell you have chosen). Can you log in without a password? Was the login successful?

Try this one first; otherwise the list of things to check is very long ...

Cheers,

alex

Posted: Wed Jun 13, 2012 4:09 pm
by nkwem
Hi Alex

Thank you for responding.

Yes, I can ssh to different nodes without a password and I can also ssh from a compute node to other compute nodes.

Regards,
Nkwe

Posted: Wed Jun 13, 2012 9:10 pm
by jlbettis
Try setting NPAR.
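
A minimal sketch of what that could look like for the INCAR posted above, assuming the failing job used 2 nodes with 12 cores each (24 MPI ranks). The value NPAR = 4 is only an illustration and was not reported in this thread. A commonly quoted rule of thumb is to pick NPAR near the square root of the total core count, using a value that divides the number of MPI ranks evenly. As far as I know, when NPAR is not set these versions default to NPAR = number of cores.

```
ISPIN = 2
ISMEAR = 0; SIGMA = 0.05
NSW = 90
IBRION = 2
ISIF = 4
LREAL = Auto
# assumed value for 24 MPI ranks (2 nodes x 12 cores); adjust to your job size
NPAR = 4
```

If the job size changes, NPAR probably needs to change with it, so it may be worth trying a couple of divisors of the rank count (for 24 ranks: 2, 4, 6) and comparing.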

Posted: Tue Jul 31, 2012 10:44 am
by nkwem
Hi jlbettis,

Thank you. Your suggestion works perfectly.

Regards,
Nkwe