Maël MARTIN

Maël MARTIN was a PhD student at CEA, supervised by Hugo TABOADA. His thesis director was Patrick Carribault, an HDR research engineer at CEA.

Maël’s thesis topic was the following: “Driving HPC Parallel Optimizations with DSL”.

Performance portability of parallel applications is a major issue, as supercomputer architectures evolve much faster than the lifespan of the applications that run on them.

Using a Domain-Specific Language (DSL) makes it possible to adapt to a new machine without rewriting applications. That is why we want to define a methodology to improve parallel code generation from a DSL. It is first necessary to define which properties the DSL guarantees, in order to guide optimized code generation. We then propose to adapt the intermediate representation of this language by integrating concepts related to parallelism, and to generate code targeting an existing runtime. In this way, the scientific part of the code is separated from the underlying programming models, so that applications need not be rewritten while still benefiting from parallel optimizations on future machines.

IO-SEA: Storage I/O and Data Management for Exascale Architectures
Daniel Medeiros   Eric B. Gregory   Philippe Couvee   James Hawkes   Sebastien Gougeaud   Maike Gilliot   Olivier Bressand   Yoann Valeri   Julien Jaeger   Damien Chapon   Frederic Bournaud   Loı̈c Strafella   Daniel Caviedes-Voullième   Ghazal Tashakor   Jolanta Zjupa   Max Holicki   Tom Ridley   Yanik Müller   Filipe Souza Mendes Guimarães   Wolfgang Frings   Jan-Oliver Mirus   Ilya Zhukov   Eric Rodrigues Borba   Nafiseh Moti   Reza Salkhordeh   Nadia Derbey   Salim Mimouni   Simon Derr   Buket Benek Gursoy   James Grogan   Radek Furmánek   Martin Golasowski   Kateřina Slaninová   Jan Martinovič   Jan Faltýnek   Jenny Wong   Metin Cakircali   Tiago Quintino   Simon Smart   Olivier Iffrig   Sai Narasimhamurthy   Sonja Happ   Michael Rauh   Stephan Krempel   Mark Wiggins   Jiřı́ Nováček   André Brinkmann   Stefano Markidis   Philippe Deniel  
Proceedings of the 21st ACM International Conference on Computing Frontiers: Workshops and Special Sessions, Association for Computing Machinery, pp. 94-100, 2024

Abstract

The new scientific workloads to be executed on the upcoming exascale supercomputers face major challenges in terms of storage, given their extreme volume of data. In particular, intelligent data placement, instrumentation, and workflow handling are central to application performance. The IO-SEA project developed multiple solutions to aid the scientific community in addressing these challenges: a Workflow Manager, a hierarchical storage management system, and a semantic API for storage. All of these major products incorporate additional minor products that support their mission. In this paper, we discuss the roles of all these products and how they can assist the scientific community in achieving exascale performance.

Suspending OpenMP Tasks on Asynchronous Events: Extending the Taskwait Construct
Romain Pereira   Maël Martin   Adrien Roussel   Thierry Gautier   Patrick Carribault  
IWOMP 2023 - International Workshop on OpenMP, 2023

Abstract

Many-core and heterogeneous architectures now require programmers to compose multiple asynchronous programming models to fully exploit hardware capabilities. As a shared-memory parallel programming model, OpenMP has the responsibility of orchestrating the suspension and progression of asynchronous operations occurring on a compute node, such as MPI communications or CUDA/HIP streams. Yet the specification only provides the task detach(event) API to suspend tasks until an asynchronous operation completes, which presents a few drawbacks. In this paper, we introduce the design and implementation of an extension of the taskwait construct that suspends a task until an asynchronous event completes. It aims to reduce the runtime costs induced by the current solution and to provide a standard API for portable task-suspension solutions. The results show half the overhead of the existing task detach clause.