BUG in vasp.6.3.0: internal error in: radial.F
Moderators: Global Moderator, Moderator
-
- Newbie
- Posts: 14
- Joined: Tue Nov 19, 2019 9:45 am
BUG in vasp.6.3.0: internal error in: radial.F
Dear developer,
I ran a calculation with on-the-fly ML in VASP.6.3.0 and found that it terminated at ionic step 3568 and reported a bug. I encountered this problem in two different calculations.
Here is one of the task files: Here is the bug:
-
- Global Moderator
- Posts: 460
- Joined: Mon Nov 04, 2019 12:44 pm
Re: BUG in vasp.6.3.0: internal error in: radial.F
At first glance I see it is in the 3569th step, which is an ab-initio step.
Hence I don't see an obvious bug in machine learning.
Your force fits are very inaccurate.
I looked at your input settings and saw that you set 8000K as target temperature. Do you really need that high temperature?
Of course you encounter more configurations at high temperature, but it also introduces more noise.
I need to look further into the ab initio parts.
Could you please also upload the OUTCAR and CONTCAR files? I would like the CONTCAR so that I can quickly look at the structure at the 3568th step.
I think it will be very hard for me to reproduce the problem at the current system size, because it would be computationally very demanding.
If I don't find anything suspicious in the OUTCAR files, then we would need to reduce the problem size.
Have you also tried the calculation on smaller cells?
Could you also post your other calculation where this happens?
-
- Newbie
- Posts: 14
- Joined: Tue Nov 19, 2019 9:45 am
Re: BUG in vasp.6.3.0: internal error in: radial.F
ferenc_karsai wrote: ↑Mon Mar 14, 2022 9:38 am
I need to look further into the ab initio parts.
Could you please also upload the OUTCAR and CONTCAR file. The CONTCAR I would like to have to quickly look at how the structure looks at the 3568th step.
I think it will be very hard for me to reproduce the problem on the current size, because it would be computationally very demanding.
If I don't find anything suspicious in the OUTCAR files then we would need to reduce the problem size.
Have you tried the calculation also on smaller cells?
Could you post your other calculation too where this happens.
Dear Moderator,
Thank you for your help.
I've noticed this bug appearing in many different tasks over the last few days (same system but different volumes). The OUTCAR file of the previous task is too big to upload. Here is another task with the same error; the attachments contain more complete input and output files.
As you said, all errors occur at the ab initio step. 8000K should not be the cause, as I successfully ran these input files on the same cluster using vasp.5.4 a long time ago. The only difference is that I am now using vasp.6.3.0 and have added machine learning parameters to the INCAR file.
Here are the input and output files:
-
- Newbie
- Posts: 14
- Joined: Tue Nov 19, 2019 9:45 am
Re: BUG in vasp.6.3.0: internal error in: radial.F
ferenc_karsai wrote: ↑Mon Mar 14, 2022 9:38 am
I need to look further into the ab initio parts.
Could you please also upload the OUTCAR and CONTCAR file. The CONTCAR I would like to have to quickly look at how the structure looks at the 3568th step.
I think it will be very hard for me to reproduce the problem on the current size, because it would be computationally very demanding.
If I don't find anything suspicious in the OUTCAR files then we would need to reduce the problem size.
Have you tried the calculation also on smaller cells?
Could you post your other calculation too where this happens.
Here are files from another task with the same error. Almost all tasks for this system report this bug, but the calculations for another system I am studying do not have this problem.
-
- Global Moderator
- Posts: 460
- Joined: Mon Nov 04, 2019 12:44 pm
Re: BUG in vasp.6.3.0: internal error in: radial.F
I reran the last structure (CONTCAR and also the next structure by selecting NSW=0 and NSW=1) but I could not get the error. So the problem is most likely not the structure itself. Together with Georg Kresse we looked at your input and saw that you set MAXMIX=30. For machine learning calculations it is very important that one does not set MAXMIX>0. We've now even put this information onto the VASP wiki:
wiki/index.php/Machine_learning_force_f ... ns:_Basics
and
wiki/index.php/MAXMIX
The information on the wiki contains:
"Do not set MAXMIX>0 when using MLFF. During machine learning, the first principles calculations are often bypassed for hundreds or even thousands of ionic steps, and the ions might move considerably between first principles calculations. In this cases using MAXMIX will very often lead to electronic divergence or strange errors during the self-consistency cycle."
So in your calculation you set MAXMIX=30, but very often you have more than 30 steps. Also, after a few hundred to a thousand steps the number of ab initio calculations drastically decreases. The problem with MAXMIX is that it continues counting even between ionic steps and restarts the mixer after 30 electronic steps. Most likely in your case it unluckily cuts in at the 4th electronic step of the 3569th ionic iteration. You can also see that after that step the electronic calculation starts to diverge strongly.
So maybe you could try running without MAXMIX.
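As a quick sanity check before resubmitting, something like the following Python sketch could flag an INCAR that combines machine learning with a positive MAXMIX. It is only an illustration: the helper names `parse_incar` and `maxmix_conflicts_with_mlff` are made up here, the parser handles just simple `TAG = value` lines, and it assumes that an absent MAXMIX tag is harmless.

```python
def parse_incar(text):
    """Parse simple INCAR-style 'TAG = value' lines into a dict (tags upper-cased)."""
    tags = {}
    for raw in text.splitlines():
        # Drop trailing comments introduced by '#' or '!'
        line = raw.split("#", 1)[0].split("!", 1)[0]
        # Several assignments may share one line, separated by ';'
        for part in line.split(";"):
            if "=" in part:
                key, value = part.split("=", 1)
                tags[key.strip().upper()] = value.strip()
    return tags


def maxmix_conflicts_with_mlff(tags):
    """Return True if ML_LMLFF is switched on and MAXMIX is set to a positive value."""
    # Accept spellings such as '.TRUE.', 'T' or '.T.'
    mlff_on = tags.get("ML_LMLFF", ".FALSE.").strip(".").upper().startswith("T")
    try:
        # Assumption of this sketch: a missing MAXMIX tag is treated as "not > 0"
        maxmix = int(tags.get("MAXMIX", "0"))
    except ValueError:
        maxmix = 0
    return mlff_on and maxmix > 0


# Made-up INCAR fragment mirroring the settings discussed in this thread
example_incar = """
ML_LMLFF = .TRUE.   # on-the-fly machine learning
TEBEG = 8000
MAXMIX = 30
"""
print(maxmix_conflicts_with_mlff(parse_incar(example_incar)))
```

To check a real input, the file contents can be passed in with `parse_incar(open("INCAR").read())`.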
Another big problem in your calculation is that the quality of your force field is terrible. In particular, the real error of the forces (the 4th column in "grep ERR ML_LOGFILE") is too large (usually it should be around 0.1 eV/Ang) and growing. I'm not sure whether removing MAXMIX will cure this. We generally see that magnetic structures are very hard to learn. Also, the high temperature you have will induce larger energy differences, which will lead to larger standard deviations and increase the error of the force field. I would definitely set a lower temperature, maybe at most 500 K above the expected melting point.
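To follow the force error over the run rather than reading the grep output by eye, a small Python sketch along these lines might help. It rests on assumptions: that data lines in ML_LOGFILE begin with the tag `ERR`, that the second whitespace-separated field is the ionic step, and that the fourth field is the force error in eV/Ang, as described above. The function names and the sample lines are invented for illustration.

```python
def force_errors(log_lines):
    """Collect (step, force_error) pairs from ERR lines of an ML_LOGFILE."""
    pairs = []
    for line in log_lines:
        fields = line.split()
        # Skip header lines such as '# ERR ...' and anything that is not an ERR record
        if len(fields) < 4 or fields[0] != "ERR":
            continue
        try:
            # Assumed layout: ERR <step> <energy error> <force error> ...
            pairs.append((int(fields[1]), float(fields[3])))
        except ValueError:
            continue
    return pairs


def summarize(pairs, target=0.1):
    """Report the latest force error, whether it exceeds the target, and whether it grew."""
    if not pairs:
        return None
    first, last = pairs[0][1], pairs[-1][1]
    return {"last": last, "above_target": last > target, "growing": last > first}


# Invented sample lines, not taken from the attached files
sample = [
    "# ERR header line",
    "ERR 100 0.012 0.35 5.1",
    "ERR 3568 0.030 0.82 9.7",
]
print(summarize(force_errors(sample)))
```

With a real run, the open file can be passed directly: `summarize(force_errors(open("ML_LOGFILE")))`. The default `target` of 0.1 eV/Ang is the rough guideline mentioned above, not a hard limit.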
-
- Newbie
- Posts: 14
- Joined: Tue Nov 19, 2019 9:45 am
Re: BUG in vasp.6.3.0: internal error in: radial.F
ferenc_karsai wrote: ↑Tue Mar 22, 2022 8:39 am
I reran the last structure (CONTCAR and also the next structure by selecting NSW=0 and NSW=1) but I could not get the error. So the problem is most likely not the structure itself. Together with Georg Kresse we looked at your input and saw that you set MAXMIX=30. For machine learning calculations it is very important that one does not set MAXMIX>0. We've now even put this information onto the VASP wiki:
wiki/index.php/Machine_learning_force_f ... ns:_Basics
and
wiki/index.php/MAXMIX
Thank you very much for your detailed answer!
I will try a new test based on your suggestion.