Maël MARTIN is currently a PhD student at CEA, supervised by Hugo TABOADA. His thesis director is Patrick Carribault, HDR Engineer Researcher at CEA.
Maël’s thesis topic is the following: “Driving HPC Parallel Optimizations with DSL”.
Performance portability of parallel applications is a major issue because supercomputer architectures evolve much faster than the lifespan of the applications that run on them.
Using a Domain-Specific Language (DSL) makes it possible to adapt to a new machine without rewriting the applications. That is why we want to define a methodology to improve parallel code generation from a DSL. The first step is to define which properties the DSL guarantees, so that they can guide optimized code generation. We then propose to adapt the intermediate representation of this language by integrating concepts related to parallelism, and to generate code that targets an existing runtime. In this way, the scientific part of the code can be separated from the underlying programming models, so that applications do not have to be rewritten yet still benefit from parallel optimizations on future machines.
IWOMP 23 - International Workshop on OpenMP, 2023
Abstract
Many-core and heterogeneous architectures now require programmers to compose multiple asynchronous programming models to fully exploit hardware capabilities. As a shared-memory parallel programming model, OpenMP has the responsibility of orchestrating the suspension and progression of asynchronous operations occurring on a compute node, such as MPI communications or CUDA/HIP streams. Yet the specification only provides the task detach(event) API to suspend tasks until an asynchronous operation is completed, which presents a few drawbacks. In this paper, we introduce the design and implementation of an extension to the taskwait construct that suspends a task until an asynchronous event completes. It aims to reduce the runtime costs induced by the current solution, and to provide a standard API to automate portable task suspension solutions. The results show half the overhead of the existing task detach clause.
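For readers unfamiliar with the existing mechanism the abstract refers to, here is a minimal sketch of the standard OpenMP 5.0 detach clause. It does not show the taskwait extension proposed in the paper, whose syntax is not given in the abstract. The function name run_detached_task, the variable async_result, and the second task that plays the role of an MPI or CUDA/HIP completion callback are all illustrative assumptions; a serial fallback is included only so that the sketch still compiles without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical result slot written by the simulated asynchronous operation. */
static int async_result = 0;

/* Returns the value produced by the simulated asynchronous operation. */
int run_detached_task(void)
{
    int consumed = -1;
    async_result = 0;

#ifdef _OPENMP
    omp_event_handle_t ev;   /* initialized when the detach task is created */

    #pragma omp parallel num_threads(2)
    #pragma omp single
    {
        /* Task A: its body would start the asynchronous operation and
         * return at once. The task only counts as complete after
         * omp_fulfill_event(ev) has been called. */
        #pragma omp task detach(ev)
        {
            /* e.g. post a non-blocking communication here (assumption) */
        }

        /* Task B: stands in for the completion callback of the
         * asynchronous operation (assumption for this sketch). */
        #pragma omp task
        {
            async_result = 42;
            omp_fulfill_event(ev);
        }

        /* Waits for both child tasks, hence also for the event. */
        #pragma omp taskwait
        consumed = async_result;
    }
#else
    /* Serial fallback when compiled without OpenMP support. */
    async_result = 42;
    consumed = async_result;
#endif

    return consumed;
}
```

Calling run_detached_task() from a main function, with the program built by an OpenMP 5.0 capable compiler (for example with the -fopenmp flag), should return only after the event has been fulfilled. The point of the pattern is that no thread blocks while the operation is in flight: task A's body finishes immediately and the thread is free to run other tasks.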