Speaker: Dr. Michael Klemm, AMD
Slides: https://hpc.fau.de/files/2024/05/NHR-Perf-Lab-OpenMP-Offloading-final-split.pdf
Abstract:
Today’s supercomputers are heterogeneous systems, and GPUs have become the established accelerators for speeding up data-parallel algorithms. However, programming GPUs remains complex and requires specialized expertise to optimize code for a specific architecture. The resulting code is often not portable between GPU vendors, so programmers may have to maintain multiple versions targeting different architectures, which is a time-consuming and error-prone process. Many software packages reflect this circumstance; they typically provide tailored implementations for HIP, CUDA, and OpenCL.
This talk has two parts. Firstly, it will introduce target offloading using the OpenMP API, an easy and portable way to exploit GPUs’ massive parallelism without being tied to a specific vendor. Secondly, the talk will explain the architecture of the MI300A accelerator and the process of offloading to it, including the software changes required to take advantage of new features like Unified Shared Memory.
Short Bio:
Dr. Michael Klemm is a Principal Member of Technical Staff in the Compilers, Languages, Runtimes & Tools team of the Machine Learning & Software Engineering group at AMD. He is part of the OpenMP compiler team, focusing on Fortran and kernel performance for AMD Instinct accelerators for High Performance and Throughput Computing. Michael is the Chief Executive Officer of the OpenMP Architecture Review Board.
For a list of past and upcoming NHR PerfLab seminar events, see: https://hpc.fau.de/research/nhr-perflab-seminar-series/