Problem of running VASP in parallel on different nodes

Questions regarding the compilation of VASP on various platforms: hardware, compilers and libraries, etc.


Moderators: Global Moderator, Moderator

Locked
nujjj
Newbie
Posts: 3
Joined: Thu Nov 11, 2004 1:03 am

Problem of running VASP in parallel on different nodes

#1 Post by nujjj » Wed Nov 17, 2004 4:02 pm

Hello everybody,

I use pgf90 and LAM 7.0.6 to compile the parallel version of vasp.4.6.

The program runs well on each of my AMD Opteron nodes, which have 2 CPUs each.

Then I tried to run pvasp across different nodes. First, I made sure that logging in from one node to another does not require entering a password. Then I ran a simple test program that prints "Hello world" from each CPU, and it worked fine. However, when I started running pvasp, it showed an error message.

Input file:

The only change is NPAR=4 in the INCAR; everything else is the same.

Commands used:

$ lamboot hostfile
$ mpirun -np 4 pvasp
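
For reference, a LAM boot schema (hostfile) for two dual-CPU nodes would look roughly like this; the hostnames are just placeholders:

node01 cpu=2
node02 cpu=2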

error message:

FORTRAN STOP
Error reading item 'IMAGES' from file INCAR.
FORTRAN STOP
Error reading item 'IMAGES' from file INCAR.
MPI_Recv: process in local group is dead (rank 0, comm 3)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Recv()
Rank (0, MPI_COMM_WORLD): - MPI_Barrier()
Rank (0, MPI_COMM_WORLD): - MPI_Barrier()
Rank (0, MPI_COMM_WORLD): - main()
MPI_Recv: process in local group is dead (rank 1, SSI:coll:smp:local comm for CID 0)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Recv()
Rank (1, MPI_COMM_WORLD): - MPI_Bcast()
Rank (1, MPI_COMM_WORLD): - MPI_Barrier()
Rank (1, MPI_COMM_WORLD): - main()

Aside from being confused about this MPI problem, I am also confused because it looks like pvasp on the remote node tried to find the INCAR but failed. Shouldn't the input file INCAR be transferred from the local host to the remote node during the run?

I would be very grateful if somebody could give a hint on how I could solve this problem. Thanks a lot!
Last edited by nujjj on Wed Nov 17, 2004 4:02 pm, edited 1 time in total.

doris
Newbie
Posts: 6
Joined: Tue Aug 31, 2004 7:30 am
License Nr.: staff

Problem of running VASP in parallel on different nodes

#2 Post by doris » Tue Nov 23, 2004 12:57 pm

p4vasp is a tool to visualize VASP results, which are written to vasprun.xml.
Please have a look at the p4vasp site for how to install and run it.
Last edited by doris on Tue Nov 23, 2004 12:57 pm, edited 1 time in total.

mbhabani
Newbie
Posts: 2
Joined: Fri Feb 25, 2005 8:02 am
License Nr.: 289

Problem of running VASP in parallel on different nodes

#3 Post by mbhabani » Sun Feb 27, 2005 5:52 am

I have the same type of problem. I have compiled vasp on a Linux P4 cluster running RedHat 9.0, with IFC 7.0 and mpiifc for the parallel version, but when I run it as

mpirun -np 6 vasp

Error reading item 'IMAGES' from file INCAR.
Error reading item 'IMAGES' from file INCAR.
Error reading item 'IMAGES' from file INCAR.
Error reading item 'IMAGES' from file INCAR.
Error reading item 'IMAGES' from file INCAR.
MPI_Recv: process in local group is dead (rank 0, MPI_COMM_WORLD)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Recv()
Rank (0, MPI_COMM_WORLD): - MPI_Barrier()
Rank (0, MPI_COMM_WORLD): - main()

But when I run it as a single-node job, it runs fine:
mpirun -np 1 vasp

It would be great if someone could supply me with the Makefile for this platform.
Please help me. Help will be greatly appreciated.

Thanks for help.
Bhabani
Last edited by mbhabani on Sun Feb 27, 2005 5:52 am, edited 1 time in total.

masato
Newbie
Posts: 3
Joined: Tue Mar 08, 2005 2:02 am
License Nr.: 85

Problem of running VASP in parallel on different nodes

#4 Post by masato » Tue Mar 08, 2005 2:08 am

I encountered the same problem when I forgot to mount the working directory over NFS; the problem was solved once the input files were shared among the nodes via NFS.
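
As a rough sketch of such an NFS setup (hostnames and paths are only placeholders, assuming node01 is the machine holding the input files):

# on node01, export the working directory in /etc/exports and re-export
/home/user/vaspwork  node02(rw,sync)
$ exportfs -a
# on node02, mount it under the same path
$ mount -t nfs node01:/home/user/vaspwork /home/user/vaspwork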
Last edited by masato on Tue Mar 08, 2005 2:08 am, edited 1 time in total.

admin
Administrator
Posts: 2921
Joined: Tue Aug 03, 2004 8:18 am
License Nr.: 458

Problem of running VASP in parallel on different nodes

#5 Post by admin » Mon Apr 11, 2005 1:25 pm

The reason for that error is most probably an MPI error.
Please check whether all nodes were accessed correctly;
it seems that all nodes except the master node fail to find and read the INCAR file correctly.
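A quick way to check this (hostname and path below are only examples) is to list the nodes LAM has booted and verify that the INCAR is visible on each of them:

$ lamnodes
$ ssh node02 ls /home/user/vaspwork/INCAR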

makefile.linux-ifc-P4 generates code which runs well on P4 clusters, provided there were no compilation errors.
Last edited by admin on Mon Apr 11, 2005 1:25 pm, edited 1 time in total.

vasp
Newbie
Posts: 29
Joined: Mon Jun 19, 2006 7:03 am
License Nr.: 853
Location: FAMU

Problem of running VASP in parallel on different nodes

#6 Post by vasp » Tue Sep 01, 2009 2:46 am

This is not a reply; rather, I am asking for help. I have read the previous replies, but they are not clear to me.

I am getting a similar error message on an Intel 64 Linux cluster running RedHat Enterprise Linux 4 (Linux 2.6.9):

Error reading item 'IMAGES' from file INCAR.

I installed VASP in my directory for my own use, using makefiles supplied by the computing center. Here is the MPI command I was told to use:

mpirun_rsh -np ${NP} -hostfile ${PBS_NODEFILE} ~/VASP/src/vasp.4.6/vasp

In my INCAR file, I use NPAR = number of nodes.
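
For example, when running on 4 nodes the corresponding INCAR line would be:

NPAR = 4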
Last edited by vasp on Tue Sep 01, 2009 2:46 am, edited 1 time in total.

pafell
Newbie
Posts: 24
Joined: Wed Feb 18, 2009 11:40 pm
License Nr.: 196
Location: Poznań, Poland

Problem of running VASP in parallel on different nodes

#7 Post by pafell » Thu Sep 03, 2009 10:42 am

[quote="nujjj"]
Aside from being confused about this MPI problem, I am also confused because it looks like pvasp on the remote node tried to find the INCAR but failed. Shouldn't the input file INCAR be transferred from the local host to the remote node during the run?
[/quote]

If I understand you correctly, you don't share the case files over NFS (or in some other manner). VASP's working directory (with the case files) must be visible to all nodes.
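
A simple sanity check (hostnames and path are placeholders) is to confirm that every node in the hostfile sees the same INCAR, e.g.:

$ for h in node01 node02; do ssh $h ls /home/user/vaspwork/INCAR; done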
Last edited by pafell on Thu Sep 03, 2009 10:42 am, edited 1 time in total.
