Philippe Deniel is a research engineer and CEA Expert Fellow. He graduated from Ecole Centrale Paris in 1996 and holds a PhD in Computer Science. From 2015 to 2023 he led the teams in charge of storage systems. His research interests include massive storage for HPC, the system integration of Quantum Computing into HPC, and HPC/QC hybridization.
Proceedings of the 21st ACM International Conference on Computing Frontiers: Workshops and Special Sessions, Association for Computing Machinery, p. 94-100, 2024
Abstract
The new scientific workloads to be executed on the upcoming exascale supercomputers face major storage challenges, given their extreme volumes of data. In particular, intelligent data placement, instrumentation, and workflow handling are central to application performance. The IO-SEA project developed multiple solutions to help the scientific community in addressing these challenges: a Workflow Manager, a hierarchical storage management system, and a semantic API for storage. All of these major products incorporate additional minor products that support their mission. In this paper, we discuss both the roles of all these products and how they can assist the scientific community in achieving exascale performance.
PhD Thesis, Université Paris-Saclay, 2023
Abstract
This thesis presents NFS-Ganesha, a user-space NFS server for HPC, and its evolution from its creation in the early 2000s to the current Exascale era. Originally created for operational needs related to the administration of large storage systems, NFS-Ganesha was designed to be generic and parallelized. The joint emergence of parallel file systems, which gave rise to "data-centric" computing-center architectures, and of the NFSv4 protocol drove the evolution of NFS-Ganesha into a generic NFS server able to interface with many backends. The evolution of NFSv4, in the form of NFSv4.1 and the pNFS protocol, turned NFS-Ganesha into a standard adopted by a strong open-source community involving both researchers and industry. NFS-Ganesha was used to implement the IO-Proxy feature and to create new related parallel protocols. Through its involvement in European R&D projects, NFS-Ganesha also served to implement the ephemeral-server feature in order to meet Exascale requirements.
SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 2023
Abstract
HPC application developers and administrators need to understand the complex interplay between compute clusters and storage systems to make effective optimization decisions. Ad hoc investigations of this interplay based on isolated case studies can lead to conclusions that are incorrect or difficult to generalize. The I/O Trace Initiative aims to improve the scientific community’s understanding of I/O operations by building a searchable collaborative archive of I/O traces from a wide range of applications and machines, with a focus on high-performance computing and scalable AI/ML. This initiative advances the accessibility of I/O trace data by enabling users to locate and compare traces based on user-specified criteria. It also provides a visual analytics platform for in-depth analysis, paving the way for the development of advanced performance optimization techniques. By acting as a hub for trace data, the initiative fosters collaborative research by encouraging data sharing and collective learning.
Zenodo, 2022
Abstract
This document feeds research and development priorities developed by the European HPC ecosystem into EuroHPC’s Research and Innovation Advisory Group, with an aim to define the HPC Technology research Work Programme and the calls for proposals included in it and to be launched from 2023 to 2026. This SRA also describes the major trends in the deployment of HPC and HPDA methods and systems, driven by economic and societal needs in Europe, taking into account the changes expected in the technologies and architectures of the expanding underlying IT infrastructure. The goal is to draw a complete picture of the state of the art and the challenges for the next three to four years rather than to focus on specific technologies, implementations or solutions.
CHEOPS '21: Proceedings of the Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems, p. 1-9, 2021
Abstract
The emergence of Exascale machines in HPC will have the foreseen consequence of putting more pressure on the storage systems in place, not only in terms of capacity but also bandwidth and latency. With a limited budget we cannot imagine using only storage class memory, which leads to the use of a heterogeneous tiered storage hierarchy. In order to make the most efficient use of the high performance tier in this storage hierarchy, we need to be able to place user data on the right tier and at the right time. In this paper, we assume a 2-tier storage hierarchy with a high performance tier and a high capacity archival tier. Files are placed on the high performance tier at creation time and moved to the capacity tier once their lifetime expires, that is, once they are no longer accessed. The main contribution of this paper lies in the design of a file lifetime prediction model based solely on the file path, using a Convolutional Neural Network. Results show that our solution strikes a good trade-off between accuracy and under-estimation. Compared to previous work, our model reaches a comparable accuracy (around 98.60% compared to 98.84%) while reducing underestimations by almost 10x, down to 2.21% (compared to 21.86%). The reduction in underestimations is crucial as it avoids misplacing files in the capacity tier while they are still in use.
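As a rough illustration of the approach (not the model from the paper; the byte vocabulary, filter sizes and lifetime buckets below are assumptions), a character-level CNN over the file path could look like this:

```python
# Illustrative sketch only: a character-level CNN mapping a file path to a
# lifetime class, in the spirit of the paper. Architecture details are assumed.
import torch
import torch.nn as nn

MAX_LEN = 128          # paths truncated/padded to a fixed length (assumption)
VOCAB = 256            # raw byte vocabulary
N_CLASSES = 5          # lifetime buckets, e.g. <1h, <1d, <1w, <1m, longer (assumption)

class PathLifetimeCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 32)
        self.conv = nn.Conv1d(32, 64, kernel_size=5, padding=2)
        self.head = nn.Linear(64, N_CLASSES)

    def forward(self, x):                  # x: (batch, MAX_LEN) of byte ids
        h = self.embed(x).transpose(1, 2)  # (batch, 32, MAX_LEN)
        h = torch.relu(self.conv(h))       # (batch, 64, MAX_LEN)
        h = h.max(dim=2).values            # global max pooling over positions
        return self.head(h)                # unnormalized class scores

def encode_path(path: str) -> torch.Tensor:
    ids = list(path.encode("utf-8"))[:MAX_LEN]
    ids += [0] * (MAX_LEN - len(ids))      # zero-pad to MAX_LEN
    return torch.tensor(ids, dtype=torch.long)

model = PathLifetimeCNN()
scores = model(encode_path("/scratch/project/run42/output.h5").unsqueeze(0))
predicted_bucket = scores.argmax(dim=1)    # index of the predicted lifetime class
```

In practice such a model would be trained on paths labeled with observed lifetimes; only the path string is needed at prediction time, which is what makes placement at file creation possible.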
Simulation Tools and Techniques, Springer International Publishing, p. 796-810, 2021
Abstract
In the High Performance Computing (HPC) field, the metadata server cluster is a critical aspect of storage system performance, and with the growth of object storage, systems must now be able to distribute metadata across servers by means of distributed metadata servers. Storage systems achieve better performance if the workload remains balanced over time. Indeed, an unbalanced distribution can lead to frequent requests to a subset of servers while other servers are completely idle. To avoid this issue, different metadata distribution methods exist, each with its best use cases. Moreover, each system has different usages and different workloads, which means that one distribution method may fit a specific kind of storage system but not another. To this end, we propose a tool to evaluate metadata distribution methods with different workloads. In this paper, we describe this tool and use it to compare state-of-the-art methods and one method we developed. We also show how the outputs generated by our tool enable us to identify distribution weaknesses and choose the most suitable method.
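A toy sketch of what such a comparison measures (the two placement methods and the imbalance metric below are simple stand-ins, not the methods evaluated in the paper):

```python
# Minimal sketch (not the paper's tool): distribute metadata entries across
# servers with two simple methods and compare their load balance.
import hashlib
from collections import Counter

SERVERS = 8

def hash_mod(path: str) -> int:
    """Plain hash-modulo placement."""
    h = int(hashlib.md5(path.encode()).hexdigest(), 16)
    return h % SERVERS

def subtree(path: str) -> int:
    """Subtree-style placement: all entries under the same top-level directory
    land on the same server (keeps locality, may unbalance)."""
    top = path.split("/")[1] if "/" in path else path
    return hash_mod(top)

# Hypothetical workload: a few projects, each with many runs and files.
workload = [f"/proj{p}/run{r}/file{i}"
            for p in range(4) for r in range(10) for i in range(50)]

for method in (hash_mod, subtree):
    load = Counter(method(p) for p in workload)
    counts = [load.get(s, 0) for s in range(SERVERS)]
    print(method.__name__, "max/avg imbalance:",
          round(max(counts) / (sum(counts) / SERVERS), 2))
```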
2018 International Conference on High Performance Computing and Simulation (HPCS), p. 1059-1060, 2018
Abstract
The use of object storage in the HPC world is becoming common, as it overcomes some POSIX limitations in scalability and performance. Indeed, object stores use a flat namespace, avoiding hierarchy in access requests and the cost of maintaining dependencies between multiple entries. Object stores also separate the data flow from the metadata flow, providing better concurrency and throughput. They can store trillions of objects, and each object has its own customized metadata attributes, so this metadata can be richer than POSIX attributes.
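The flat-namespace and rich-metadata model can be pictured with a toy sketch (this is not any specific object store's API, just an illustration of the data model):

```python
# Toy illustration of a flat namespace with user-defined metadata per object,
# as opposed to the fixed POSIX attribute set. Not a real object-store API.
from dataclasses import dataclass, field

@dataclass
class Object:
    data: bytes
    metadata: dict = field(default_factory=dict)   # rich, user-defined attributes

class FlatStore:
    def __init__(self):
        self._objects = {}                         # flat namespace: key -> object

    def put(self, key: str, data: bytes, **metadata):
        self._objects[key] = Object(data, metadata)

    def get(self, key: str) -> Object:
        return self._objects[key]                  # no directory traversal needed

store = FlatStore()
store.put("sim/run42/out.h5", b"...", owner="pdeniel", experiment="exa", step=42)
print(store.get("sim/run42/out.h5").metadata)
```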
Proceedings of the 4th ACM International Conference of Computing for Engineering and Sciences, Association for Computing Machinery, 2018
Abstract
Simulation is the most appropriate technique to evaluate the performance of current data storage systems and to predict that of future ones as part of data centers or cloud infrastructures. It assesses the potential of a system to meet users' requirements in terms of storage capacity, device heterogeneity, delivered performance and robustness. We developed a simulation tool called OGSSim to address these criteria efficiently within a reduced execution time. However, the number of threads on the test machine puts an upper bound on the size of the simulated systems. To push back this limitation and improve the simulation time, we define in this paper a parallel version of OGSSim. We explain how the parallelization process generates both design and implementation challenges, due to the multi-node environment and the related communications, and how the MPI and ZeroMQ libraries respectively help us address those challenges.
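A hedged sketch of the multi-node idea (this is not OGSSim's implementation; the device split and placeholder latencies are assumptions, and the intra-node ZeroMQ messaging is omitted):

```python
# Sketch: mpi4py splits the simulated devices across ranks, each rank simulates
# its share, and partial results are gathered on rank 0.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N_DEVICES = 500                                  # simulated devices (assumption)
my_devices = range(rank, N_DEVICES, size)        # round-robin split across ranks

# Each rank "simulates" its devices; here a placeholder per-device latency.
local_latencies = {d: 0.004 + 0.00001 * d for d in my_devices}

all_latencies = comm.gather(local_latencies, root=0)   # collect partial results
if rank == 0:
    merged = {k: v for part in all_latencies for k, v in part.items()}
    print("simulated devices:", len(merged))
```

Run with `mpirun -n 4 python simulate.py` to spread the work over four processes (possibly on several nodes).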
2017 International Conference on High Performance Computing and Simulation (HPCS), p. 236-243, 2017
Abstract
The storage capacity provided by data centers does not cease to increase, currently reaching the exabyte scale using thousands of disks. In this context, the resiliency of such systems becomes critical, to avoid data loss and reduce the impact of the reconstruction process on data access time. We propose SD2S, a method to create a placement scheme for declustered RAID organizations based on a shifting placement. It consists in the calculation of degree matrices, which represent the distance between the source sets of each pair of physical disks, and thus the number of data blocks that would have to be reconstructed in case of a double failure. The scheme is created by computing a score function for all possible shifting offsets and selecting the one ensuring the reconstruction of the highest percentage of data. Results show the distribution of data reconstruction against the number of double-failure events. The overhead generated by the calculation of the shifting offsets is also compared to greedy SD2S and to CRUSH without replicas for systems reaching hundreds of disks. These results confirm that selecting the best offset can lead to a complete data reconstruction with a small overhead, especially for large systems.
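A minimal sketch of the general idea of scoring shifting offsets (the placement rule and the score below are simplified stand-ins, not the SD2S degree matrices):

```python
# Hedged sketch, not the exact SD2S algorithm: blocks of a stripe are placed
# with a per-stripe shift, and each candidate offset is scored by how evenly a
# double failure would spread reconstruction work across disk pairs.
from itertools import combinations

N_DISKS = 10     # physical disks (assumption)
STRIPE_W = 4     # blocks per stripe (assumption)
N_STRIPES = 200

def placement(offset):
    """Disk hosting block b of stripe s under a given shifting offset."""
    return lambda s, b: (s * offset + b) % N_DISKS

def score(offset):
    place = placement(offset)
    shared = {pair: 0 for pair in combinations(range(N_DISKS), 2)}
    for s in range(N_STRIPES):
        disks = {place(s, b) for b in range(STRIPE_W)}
        for pair in combinations(sorted(disks), 2):
            shared[pair] += 1          # blocks both disks would need for rebuild
    return max(shared.values())        # heaviest disk pair = worst double failure

best = min(range(1, N_DISKS), key=score)   # offset with the lowest worst case
print("best shifting offset:", best, "worst pair load:", score(best))
```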
2017 International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS), p. 1-8, 2017
Abstract
Using simulation to study the behavior of large-scale data storage systems is essential to predict their performance and reliability at a lower cost. This helps to make the right decisions before the system is developed and deployed. OGSSim is a simulation tool for large and heterogeneous storage systems that uses parallelism to provide information about the behavior of such systems in a reduced time. It uses the ZeroMQ communication library to implement not only the data communication but also the synchronization functions between the generated threads. These synchronization points occur during the parallel execution of requests and need to be handled efficiently to ensure data coherency for the fast and accurate computation of performance metrics. In this work, different issues due to the parallel execution of our simulation tool OGSSim are presented and the solutions adopted using ZeroMQ are discussed. The impact of these solutions in terms of simulation time overhead is measured for various system configurations. The obtained results show that ZeroMQ has almost no impact on the simulation time, even for complex and large configurations.
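A small pyzmq sketch of the kind of synchronization point described above (not OGSSim code; socket names and message contents are illustrative):

```python
# Worker threads signal completion over an inproc PUSH socket and the
# coordinator waits for all of them before computing metrics.
import threading
import zmq

N_WORKERS = 4
ctx = zmq.Context.instance()

def worker(wid):
    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://sync")
    # ... process this worker's share of requests here ...
    push.send_json({"worker": wid, "requests_done": 100})
    push.close()

pull = ctx.socket(zmq.PULL)
pull.bind("inproc://sync")            # bind before workers connect (inproc requirement)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()

done = [pull.recv_json() for _ in range(N_WORKERS)]   # synchronization point
for t in threads:
    t.join()
pull.close()
ctx.term()
print("all workers reported:", sum(d["requests_done"] for d in done), "requests")
```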
2016 International Conference on High Performance Computing and Simulation (HPCS), p. 342-349, 2016
Abstract
Modern disks (SSDs, HDDs) are very large and their capacities will certainly keep increasing in the future. Storage systems use a large number of such devices to compose storage pools and fulfil the storage capacity demands. The result is a higher probability of failure and a longer reconstruction duration. Consequently, the whole system is penalized, as the response time is higher and a second failure would cause data loss. In this paper, we propose a new method based on a block shifting layout which increases the efficiency of a declustered RAID storage system and improves its robustness in both normal and failure modes. We define four mapping rules to reach these objectives. The tests conducted reveal that exploiting the coprimality between the number of devices and the block shifting factor leads to an optimal layout. It significantly reduces the redirection time, proportionally to the number of disks, reaching 50% for 1000 disks, with a negligible memory cost since we avoid the use of a redirection table. It also allows the recovery of additional data in case of a second failure during degraded mode, which makes the proposed method highly attractive for large storage systems compared to other existing methods.
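The coprime property can be illustrated with a short sketch (the mapping rule below is a simplified stand-in for the paper's four rules):

```python
# When the shifting factor is coprime with the number of devices, the
# block -> device mapping is a pure computation, needs no redirection table,
# and visits every device evenly.
from math import gcd
from collections import Counter

N_DEVICES = 10
SHIFT = 3                      # chosen so that gcd(SHIFT, N_DEVICES) == 1
assert gcd(SHIFT, N_DEVICES) == 1, "shifting factor must be coprime with device count"

def device_of(stripe: int, block: int) -> int:
    """Compute the device holding a block directly, without any lookup table."""
    return (block + stripe * SHIFT) % N_DEVICES

# Over N_DEVICES consecutive stripes, a coprime shift rotates the layout so
# that every device hosts each block position exactly once.
load = Counter(device_of(s, b) for s in range(N_DEVICES) for b in range(4))
print(load)   # every device appears the same number of times
```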
2016 International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS), p. 1-8, 2016
Abstract
Current storage systems are very large, with complex and distributed architectural configurations, composed of devices of various technologies. However, the simulation, analysis and evaluation tools in the literature do not handle this complex design and these heterogeneous components. This paper presents OGSSim (Open and Generic Storage systems Simulation tool), a new simulation tool for such systems. Being generic with respect to device technologies and open to diverse management strategies and architecture layouts, it fulfills the representativeness needs of storage systems. It has also been validated against real systems; its accuracy makes it a useful tool for the design of future storage systems, the choice of hardware components, and the analysis of the adequacy between application needs and the management strategies combined with the configuration layout. This validation showed at most a 15% difference between real and simulated execution times. Moreover, OGSSim runs in a competitive time, just 3.5 s for common workloads on a large system of 500 disks, making it a compelling simulation and evaluation tool. It is thus an appropriate and accurate tool for the design, evaluation and maintenance of modern storage systems.
EAI Endorsed Transactions on Scalable Information Systems, ACM, 2015
Abstract
In this paper, an open and generic storage simulator is proposed. It simulates with accuracy multi-tiered storage systems based on heterogeneous devices, including HDDs, SSDs and the connecting buses. The target simulated system is constructed from the hardware configuration input, then sent to the simulator modules along with the trace file, and the appropriate simulator functions are selected and executed. Each module of the simulator is executed by a thread and communicates with the others via ZeroMQ, a message transmission API using sockets for the information transfer. The result is an accurate behavior of the simulated system submitted to a specific workload, represented by performance and reliability metrics. No restriction is put on the input hardware configuration, which can handle different levels of detail and makes this simulator generic. The diversity of the supported devices, regardless of their nature (disks, buses, etc.) and organisation (JBOD, RAID, etc.), makes the simulator open to many technologies. The modularity of its design and the independence of its execution functions make it open to handling any additional mapping, access, maintenance or reconstruction strategies. The conducted tests using OLTP and scientific workloads show accurate results, obtained in a competitive runtime.
IBISC, University of Evry / Paris-Saclay, 2014
Abstract
Distributed storage systems are nowadays ubiquitous, often under the form of multiple caches forming a hierarchy. A large amount of work has been dedicated to design, implement and optimise such systems. However, there exists to the best of our knowledge no attempt to use formal modelling and analysis in this field. This paper proposes a formal modelling framework to design distributed storage systems while separating the various concerns they involve like data-model, operations, placement, consistency, topology, etc. A system modelled in such a way can be analysed through model-checking to prove correctness properties, or through simulation to measure timed performance. In this paper, we define the modelling framework and then focus on timing analysis. We illustrate these two aspects on a simple example showing that our proposal has the potential to be used to make design decisions before the real system is implemented.
Chapman & Hall/CRC, 2013
Abstract
Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world’s leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the book provides a comprehensive overview of 18 HPC ecosystems from around the world. Each chapter in this section describes programmatic motivation for HPC and their important applications; a flagship HPC system overview covering computer architecture, system software, programming systems, storage, visualization, and analytics support; and an overview of their data center/facility. The last part of the book addresses the role of clouds and grids in HPC, including chapters on the Magellan, FutureGrid, and LLGrid projects. With contributions from top researchers directly involved in designing, deploying, and using these supercomputing systems, this book captures a global picture of the state of the art in HPC.
USENIX Association, 2009
M.I.S.C, 2005
Linux Magazine, Diamond Editions, 2003