Restarting VASP_ML with modified ML_AB file

askhetan
Jr. Member
Posts: 81
Joined: Wed Sep 28, 2011 4:15 pm
License Nr.: 5-1441
Location: Germany

#1 Post by askhetan » Wed Dec 20, 2023 5:54 pm

Dear vasp admins,
I want to train VASP_ML forcefields with precalculated ab initio data. I know that the way to re-start VASP_ML calculations in version 6.4.2 is to copy the ML_ABN file to ML_AB and then run the calculation in train mode.

To train a new VASP_ML force field with existing ab initio data, I tried an experiment. I took the ML_AB file and removed the last two configurations. I also reduced the total number of configurations given right at the top of the ML_AB file accordingly.

When I rerun this job I get the following segmentation fault:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
vasp_std 0000000002DF4B6A for__signal_handl Unknown Unknown
libpthread-2.31.s 000014CC1B907420 Unknown Unknown Unknown
...........
libc-2.31.so 000014CC1B5D6083 __libc_start_main Unknown Unknown
vasp_std 000000000041CB2E Unknown Unknown Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
vasp_std 0000000002DF4B6A for__signal_handl Unknown Unknown
libpthread-2.31.s 0000154BF5211420 Unknown Unknown Unknown
vasp_std 00000000005D0DBF Unknown Unknown Unknown
............
libc-2.31.so 0000154BF4EE0083 __libc_start_main Unknown Unknown
vasp_std 000000000041CB2E Unknown Unknown Unknown
This error doesn't occur when I do not remove the last two configurations from the ML_AB file. Here is my INCAR:
ISMEAR = 0
SIGMA = 0.1
ISYM = 0
ENCUT = 520
LASPH = .True.
EDIFF = 1E-5
ISPIN = 2
LREAL = Auto
LCHARG = .FALSE.
LWAVE = .FALSE.
ALGO = VeryFast
IVDW = 12
NPAR = 2
KPAR = 2
NELM = 150
IBRION = 0
POTIM = 2
MDALGO = 2
SMASS = 0
ISIF = 2
NSW = 50000
SMASS = 1.0
TEBEG = 50
#PSTRESS = 0.001 # kiloBar
ML_LMLFF = .TRUE.
ML_MODE = train
ML_MCONF = 1000 # default = 1500
#ML_MCONF_NEW = 5 # default = 5
Could you please tell me whether my strategy is correct or whether I am missing something? Alternatively, could you tell me how one can train a VASP_ML force field with existing data?
My ultimate aim is to use mode=train and then mode=select. The select step can help me refine the dataset, so that if I want to train again I can keep in the new ML_AB file only those configurations that mode=select deemed worth training the force field on. This way I want to mitigate some of the enormous memory requirements.
Thanks in advance for your assistance.

pedro_melo
Global Moderator
Posts: 127
Joined: Thu Nov 03, 2022 1:03 pm

Re: Restarting VASP_ML with modified ML_AB file

#2 Post by pedro_melo » Thu Dec 21, 2023 2:28 pm

Dear askhetan,

Could you send us the original and the modified ML_AB files? You may have removed the last two configurations incorrectly. At the start of the ML_AB file there is a list of local reference configurations, which must also be updated when you remove configurations. If it is not updated and the removed configurations are on that list, the code will fail with a segmentation fault.
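
As a rough illustration of the check described above, the following Python sketch lists basis-set entries that point at configurations no longer present in the file. It rests on assumptions about the ML_AB layout that are not stated in this thread: sections start after a line of asterisks, and the integers in each "Basis set for X" section come in pairs whose first member is the configuration number. The function names are hypothetical and not part of VASP.

```python
import re

STAR = re.compile(r"^\*{10,}\s*$")


def split_sections(text):
    """Split ML_AB-like text into sections, each starting after a line of asterisks."""
    sections, current = [], None
    for line in text.splitlines():
        if STAR.match(line):
            current = []
            sections.append(current)
        elif current is not None:
            current.append(line)
    return sections


def dangling_references(text):
    """Return {atom_type: [(config, atom), ...]} for basis-set entries whose
    configuration number has no matching 'Configuration num.' block."""
    present = set()
    basis = {}
    for sec in split_sections(text):
        if not sec:
            continue
        title = sec[0].strip()
        m = re.match(r"Configuration num\.\s+(\d+)", title)
        if m:
            present.add(int(m.group(1)))
        elif title.startswith("Basis set for"):
            atom_type = title.split()[-1]
            pairs = []
            for row in sec[2:]:  # skip the title and the dashed separator
                nums = [int(x) for x in row.split()]
                # Assumption: integers come in (configuration, atom) pairs.
                pairs.extend(zip(nums[0::2], nums[1::2]))
            basis[atom_type] = pairs
    return {t: [p for p in pairs if p[0] not in present]
            for t, pairs in basis.items()}
```

An empty list for every atom type means that no basis-set entry refers to a missing configuration. It does not check the counts stored elsewhere in the header.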

I would also advise you to change

ML_MODE=train
to

ML_MODE=select
since the local reference configurations have to be updated whenever the crystal structure changes. In "select" mode you can also train a force field on existing ab initio data, and the list of local reference configurations at the top of the ML_AB file does not matter.

Lastly, more information on ML_MODE can be found in the VASP Wiki entry for ML_MODE.

Best,
Pedro Melo

askhetan

Re: Restarting VASP_ML with modified ML_AB file

#3 Post by askhetan » Thu Dec 21, 2023 3:35 pm

Dear pedro_melo,
Thanks for the tips and hints. Based on these I looked more closely at the description of the ML_AB file and found that deleting certain configurations and changing the total number of configurations at the top isn't enough. One must also remove the lines corresponding to the deleted configurations from the basis set of each atom type and change the number of basis sets of that type accordingly. When I modified all four of these, that is:
**************************************************
The number of configurations
--------------------------------------------------
**************************************************
The numbers of basis sets per atom type
--------------------------------------------------
**************************************************
Basis set for X
--------------------------------------------------
**************************************************
Configuration num. NNN
==================================================
.... then I can run with mode=train without problems. However, I found that all four of these changes are also necessary for mode=select.
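
The four edits listed above could be scripted roughly as follows. This is only a sketch under assumptions not confirmed in this thread: each row of a "Basis set for X" section holds one pair whose first integer is the configuration number, the counts in "The numbers of basis sets per atom type" follow the order of the basis-set sections, and the column widths used when rewriting numbers are a guess. The function name is hypothetical, and other header fields are left untouched.

```python
import re

STAR = re.compile(r"^\*{10,}\s*$")
CONF = re.compile(r"^Configuration num\.\s+(\d+)")


def drop_last_configurations(text, k):
    """Remove the k highest-numbered configurations and update the header."""
    preamble, sections = [], []
    for line in text.splitlines():
        if STAR.match(line):
            sections.append([line])
        elif sections:
            sections[-1].append(line)
        else:
            preamble.append(line)

    def title(sec):
        return sec[1].strip() if len(sec) > 1 else ""

    conf_ids = []
    for sec in sections:
        m = CONF.match(title(sec))
        if m:
            conf_ids.append(int(m.group(1)))
    if not 0 <= k <= len(conf_ids):
        raise ValueError("k must be between 0 and the number of configurations")
    keep = set(sorted(conf_ids)[: len(conf_ids) - k])

    out, basis_counts = [], []
    for sec in sections:
        t = title(sec)
        m = CONF.match(t)
        if m:
            # Edit 1: drop the configuration blocks themselves.
            if int(m.group(1)) in keep:
                out.append(sec)
        elif t.startswith("Basis set for"):
            # Edit 2: drop basis-set rows that refer to removed configurations.
            rows = [r for r in sec[3:]
                    if r.split() and int(r.split()[0]) in keep]
            basis_counts.append(len(rows))
            out.append(sec[:3] + rows)
        else:
            out.append(sec)

    for sec in out:
        t = title(sec)
        if t == "The number of configurations":
            # Edit 3: update the total number of configurations.
            sec[3:] = [f"{len(keep):>11d}"]
        elif t == "The numbers of basis sets per atom type":
            # Edit 4: update the number of basis sets per atom type.
            sec[3:] = ["".join(f"{n:>9d}" for n in basis_counts)]

    return "\n".join(preamble + [l for sec in out for l in sec]) + "\n"
```

It would be prudent to run this on a copy of ML_AB and compare the result with a file written by VASP itself before starting a calculation.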

Thanks and Best regards
askhetan

ferenc_karsai
Global Moderator
Posts: 460
Joined: Mon Nov 04, 2019 12:44 pm

Re: Restarting VASP_ML with modified ML_AB file

#4 Post by ferenc_karsai » Tue Jan 02, 2024 10:31 am

I also tested this, just for ML_MODE=select. Unfortunately, the list of local reference configurations must not contain structures that don't exist in the file, although the data read from that part is never used for anything. We will change that in a future release.

What does work is to write a single " 1 1" entry in the basis set of each atom type and to set the corresponding numbers of basis sets to 1. So for a binary system it would look like this:

**************************************************
     The numbers of basis sets per atom type
--------------------------------------------------
        1    1
**************************************************
     Basis set for A
--------------------------------------------------
          1      1
**************************************************
     Basis set for B
--------------------------------------------------
          1      1
**************************************************
     Configuration num.      1
...
This way you save the work of deleting the matching local reference configurations.
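
A minimal Python sketch of this placeholder approach is given below. It assumes that every section starts after a line of asterisks followed by a title line and a separator line, and it copies the spacing of the example above; whether VASP requires exactly these column widths is not stated here. The function name is hypothetical, and the workaround was only described for ML_MODE=select.

```python
import re

STAR = re.compile(r"^\*{10,}\s*$")


def use_placeholder_basis(text):
    """Replace every basis-set list by a single '1 1' entry and set the
    number of basis sets per atom type to 1."""
    lines = text.splitlines()
    n_types = sum(1 for l in lines if l.strip().startswith("Basis set for"))
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        out.append(line)
        i += 1
        title = line.strip()
        is_counts = title == "The numbers of basis sets per atom type"
        is_basis = title.startswith("Basis set for")
        if is_counts or is_basis:
            # Keep the dashed separator that follows the title.
            if i < len(lines):
                out.append(lines[i])
                i += 1
            # Skip the old body up to the next line of asterisks.
            while i < len(lines) and not STAR.match(lines[i]):
                i += 1
            if is_counts:
                out.append(" " * 8 + "    ".join(["1"] * n_types))
            else:
                out.append(f"{1:>11d}{1:>7d}")
    return "\n".join(out) + "\n"
```

The configuration blocks and the total number of configurations are left unchanged, so any configurations still have to be removed and counted separately.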
