Optimizing VASP Compilation for Multi-GPU Models Compatibility Using OpenACC.

#1 Post by **hszhao.cn@gmail.com** » Sat May 04, 2024 3:23 am

Hello VASP Community,

I am currently working on compiling VASP using OpenACC with the intention of leveraging GPU acceleration, and I have some questions regarding the compatibility of the compiled binaries across different GPU models.

1. Specific GPU Targeting: When compiling VASP, if a specific GPU model is targeted (e.g., nvc -acc -ta=tesla:cc70 for NVIDIA's Volta), will the resulting binary only deliver optimal performance on that specified model? Additionally, are there risks of compatibility issues if the same binary is used on a different GPU architecture (e.g., from Volta to Turing or Kepler)?

2. Generalization Approach: Is there a recommended strategy for compiling a more generally applicable VASP binary that performs reasonably well across different NVIDIA GPU architectures? Would targeting a lower compute capability during compilation (e.g., nvc -acc -ta=tesla:cc35 for Kepler or nvc -acc -ta=tesla:cc50 for Maxwell) ensure broader compatibility, or would this significantly compromise performance on newer architectures?

3. Multiple Binaries: Would it be advisable to compile multiple versions of VASP, each optimized for a specific GPU architecture, and then select the appropriate version based on the hardware available for execution? What are the best practices for managing such an environment efficiently?

I am looking to balance performance gains with broad usability, and any recommendations or insights based on your experiences would be greatly appreciated. Thank you in advance for your assistance in navigating these complexities.

Best regards,
Zhao

#2 Post by **michael_wolloch** » Mon May 06, 2024 7:43 am

Dear Zhao,

In the makefile.include.nvhpc_*acc templates that we provide, we specify several compute capabilities (cc), e.g. for the Fortran compiler:

FC          = mpif90 -acc -gpu=cc60,cc70,cc80,cuda11.8 -mp

This means that the compiler will generate optimized code for each of the listed compute capabilities, here 6.0, 7.0, and 8.0. You can check compute capabilities in the respective NVIDIA table, e.g. Tesla P100: 6.0, Tesla V100: 7.0, or Tesla A100: 8.0. If you already have access to an H100, you should add cc90 to the list.

Thus, you should compile one version of VASP, which encompasses all compute capabilities of available GPUs. At runtime, the optimal instructions for the specific GPU will be selected. Multiple binaries are not needed.

Note that you might get a compiler error if you specify a compute capability with a nonzero minor version. See, for example, the thread about an M60 that was not recognized. In that case, just drop down to the .0 value of the same major version, e.g. from 5.2 to 5.0.
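
As a rough illustration of the two points above (list every compute capability you need, and fall back to the .0 value of each major version), here is a minimal shell sketch. The helper name cc_flags is hypothetical and not part of the VASP makefiles or the NVHPC tools; it only builds the comma-separated string that would go after -gpu= in the FC line. Whether a particular ccXY value is accepted still depends on your NVHPC version.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with VASP or NVHPC): turn compute
# capabilities such as "6.0 7.0 8.0" into a list like "cc60,cc70,cc80".
cc_flags() {
    out=""
    for cap in "$@"; do
        major=${cap%%.*}          # keep only the part before the dot
        token="cc${major}0"       # round the minor version down to zero
        case ",$out," in
            *",$token,"*) ;;      # already in the list, skip the duplicate
            *) out="${out:+$out,}$token" ;;
        esac
    done
    printf '%s\n' "$out"
}

cc_flags 6.0 7.0 8.0 5.2   # prints: cc60,cc70,cc80,cc50
```

The resulting string would replace the cc60,cc70,cc80 part of the -gpu option shown earlier. On recent NVIDIA drivers, `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` should report the values to pass in, but treat that as an assumption about your driver version and check the NVIDIA table if the query is not available.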

Let me know if this fully answers your question, so I can lock the topic.
Cheers, Michael

#3 Post by **hszhao.cn@gmail.com** » Tue May 07, 2024 7:07 am

Dear Michael,

Got it. Thank you very much for your valuable comments and explanations.

Regards,
Zhao
