Philippe DENIEL

Philippe Deniel is a research engineer and CEA Fellow-grade expert. He graduated from École Centrale Paris in 1996 and holds a PhD in Computer Science. He led the teams in charge of storage systems from 2015 to 2023. His research interests include massive storage for HPC, the system integration of quantum computing into HPC, and HPC/QC hybridization.

IO-SEA: Storage I/O and Data Management for Exascale Architectures
Daniel Medeiros   Eric B. Gregory   Philippe Couvee   James Hawkes   Sebastien Gougeaud   Maike Gilliot   Olivier Bressand   Yoann Valeri   Julien Jaeger   Damien Chapon   Frederic Bournaud   Loı̈c Strafella   Daniel Caviedes-Voullième   Ghazal Tashakor   Jolanta Zjupa   Max Holicki   Tom Ridley   Yanik Müller   Filipe Souza Mendes Guimarães   Wolfgang Frings   Jan-Oliver Mirus   Ilya Zhukov   Eric Rodrigues Borba   Nafiseh Moti   Reza Salkhordeh   Nadia Derbey   Salim Mimouni   Simon Derr   Buket Benek Gursoy   James Grogan   Radek Furmánek   Martin Golasowski   Kateřina Slaninová   Jan Martinovič   Jan Faltýnek   Jenny Wong   Metin Cakircali   Tiago Quintino   Simon Smart   Olivier Iffrig   Sai Narasimhamurthy   Sonja Happ   Michael Rauh   Stephan Krempel   Mark Wiggins   Jiřı́ Nováček   André Brinkmann   Stefano Markidis   Philippe Deniel  
Proceedings of the 21st ACM International Conference on Computing Frontiers: Workshops and Special Sessions, Association for Computing Machinery, p. 94-100, 2024

Abstract

The new scientific workloads emerging on the upcoming exascale supercomputers face major storage challenges, given their extreme volumes of data. In particular, intelligent data placement, instrumentation, and workflow handling are central to application performance. The IO-SEA project developed multiple solutions to aid the scientific community in addressing these challenges: a Workflow Manager, a hierarchical storage management system, and a semantic API for storage. All of these major products incorporate additional minor products that support their mission. In this paper, we discuss both the roles of all these products and how they can assist the scientific community in achieving exascale performance.

Strategic Research Agenda for High-Performance Computing in Europe European HPC Research Priorities for 2025 - 2029
Nico Mittenzwey   Fabrizio Magugliani   Marc Duranton   Craig Prunty   Pascale Rossé-Laurent   Manolis Marazakis   Paul Carpenter   Gabriel Antoniu   Sarah Neuwirth   Philippe Deniel   Dirk Pleiter   Utz-Uwe Haus   Erwin Laure   Andreas Wierse   Tobias Becker   Robert Haas   Michael Malms   Hans-Christian Hoppe   Valeria Bartsch   Sagar Dolas   Ondřej Vysocký   Maria Perez   Andy Forrester   Kristel Michielsen   Estela Suarez   Sai Narasimhamurthy   Marcin Ostacz   Gabriella Povero   Pascale Bernier-Bruna   Jean-Pierre Panziera  
Zenodo, 2024

NFS-Ganesha : évolutions d'un serveur NFS pour le HPC du Terascale à l'Exascale
Philippe Deniel  
PhD thesis, Université Paris-Saclay, 2023

Abstract

This thesis presents NFS-Ganesha, a user-space NFS server for HPC, and its evolution from its creation at the dawn of the 2000s to the current Exascale era. Originally created to meet operational needs arising from the administration of large storage systems, NFS-Ganesha was designed to be generic and parallelized. The joint emergence of parallel file systems, which gave rise to "data-centric" compute-center architectures, and of the NFSv4 protocol drove the evolution of NFS-Ganesha into a generic NFS server capable of interfacing with many backends. The evolution of NFSv4, in the form of NFSv4.1 and the pNFS protocol, made NFS-Ganesha a standard adopted by a strong open-source community of researchers and industry partners. NFS-Ganesha was used to implement the IO-Proxy feature and to create the related new parallel protocols. Through its involvement in European R&D projects, NFS-Ganesha has served to implement the ephemeral-server feature in order to meet Exascale requirements.

Adapting the ARC Cache Management Policy to File Granularity
Hocine Mahni   Stéphane Rubini   Jalil Boukhobza   Sebastien Gougeaud   Philippe Deniel  
7th Workshop on Performance and Scalability of Storage Systems (Per3S), 2023

The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC
Nafiseh Moti   André Brinkmann   Marc-André Vef   Philippe Deniel   Jesus Carretero   Philip Carns   Jean-Thomas Acquaviva   Reza Salkhordeh  
SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 2023

Abstract

HPC application developers and administrators need to understand the complex interplay between compute clusters and storage systems to make effective optimization decisions. Ad hoc investigations of this interplay based on isolated case studies can lead to conclusions that are incorrect or difficult to generalize. The I/O Trace Initiative aims to improve the scientific community’s understanding of I/O operations by building a searchable collaborative archive of I/O traces from a wide range of applications and machines, with a focus on high-performance computing and scalable AI/ML. This initiative advances the accessibility of I/O trace data by enabling users to locate and compare traces based on user-specified criteria. It also provides a visual analytics platform for in-depth analysis, paving the way for the development of advanced performance optimization techniques. By acting as a hub for trace data, the initiative fosters collaborative research by encouraging data sharing and collective learning.

ETP4HPC's SRA 5 - Strategic Research Agenda for High-Performance Computing in Europe - 2022
Michael Malms   Laurent Cargemel   Estela Suarez   Nico Mittenzwey   Marc Duranton   Sakir Sezer   Craig Prunty   Pascale Rosse-Laurent   Maria Perez-Harnandez   Manolis Marazakis   Cristiano Malossi   Francois Bodin   Jean-Francois Lavignon   Jean-Philippe Nominé   Mark Asch   Ovidiu Vermesan   Peter Bauer   Stephane Requena   Alberto Scionti   Alexandru Costan   Andrea Ferretti   Angelos Bilas   Ani Anciaux-Sedrakian   Anna Queralt   Antonio Peña   Benjamin Depardon   Carmine D'Amico   Christophe Calvin   Christos Kozanitis   Colin Morey   Daniel Molka   Dario Garcia-Gasulla   Dirk Hartmann   Edouard Audit   Emeric Brun   Fabien Chaix   France Boillod-Cerneux   Gilad Shainer   Gilles Wiber   Guillaume Colin de Verdière   Jacques-Charles Lafoucrière   Jean-Marc Denis   Jean-Thomas Acquaviva   Jordi Guitart   Julien Bigot   Julita Corbolan   Gomez Bautista   Arturo Leonardo   Lillit Axner   Luke Mason   Manolis Ploumidis   Marc Casas   Marc Perache   Matthieu Hautreux   Miguel Vazquez   Nejc Bat   Nicolas Bergeret   Nicolas Tonello   Nils Wedi   Olivier Marsden   Olivier Terzo   Osman Unsal   Patrick Carribault   Petar Radojkovic   Philippe Bricard   Philippe Deniel   Polyvios Pratikakis   Ramon Nou   Ricard Borrell   Richard Graham   Robin Pinning   Rossen Apostolov   Sabri Pllana   Sinead Ryan   Somnath Mazumdar   Stefano Markidis   Sven-Arne Reinemo   Thierry Goubier   Tiago Quintino   Utz-Uwe Haus   Valentin Plugaru   Valeria Bartsch   Vassil Alexandrov   Vassilis Papaefstathiou   Vicenc Beltran   Xavier Martorell   Xing Cai   Yannis Papaefstathiou   Yolanda Becerra  
Zenodo, 2022

Abstract

This document feeds research and development priorities developed by the European HPC ecosystem into EuroHPC’s Research and Innovation Advisory Group, with the aim of defining the HPC Technology research Work Programme and the calls for proposals included in it, to be launched from 2023 to 2026. This SRA also describes the major trends in the deployment of HPC and HPDA methods and systems, driven by economic and societal needs in Europe, taking into account the changes expected in the technologies and architectures of the expanding underlying IT infrastructure. The goal is to draw a complete picture of the state of the art and the challenges for the next three to four years rather than to focus on specific technologies, implementations or solutions.

Predicting file lifetimes for data placement in multi-tiered storage systems for HPC
Luis Thomas   Sebastien Gougeaud   Stéphane Rubini   Philippe Deniel   Jalil Boukhobza  
CHEOPS '21: Proceedings of the Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems, p. 1-9, 2021

Abstract

The emergence of Exascale machines in HPC will have the foreseen consequence of putting more pressure on the storage systems in place, not only in terms of capacity but also of bandwidth and latency. With a limited budget, using only storage-class memory is not realistic, which leads to the use of a heterogeneous tiered storage hierarchy. In order to make the most efficient use of the high-performance tier in this storage hierarchy, we need to be able to place user data on the right tier at the right time. In this paper, we assume a 2-tier storage hierarchy with a high-performance tier and a high-capacity archival tier. Files are placed on the high-performance tier at creation time and moved to the capacity tier once their lifetime expires (that is, once they are no longer accessed). The main contribution of this paper lies in the design of a file lifetime prediction model based solely on the file's path, using a Convolutional Neural Network. Results show that our solution strikes a good trade-off between accuracy and under-estimation. Compared to previous work, our model reaches a comparable accuracy (around 98.60% versus 98.84%) while reducing underestimations by almost 10x, down to 2.21% (compared to 21.86%). The reduction in underestimations is crucial, as it avoids misplacing files in the capacity tier while they are still in use.
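The placement policy the abstract describes can be sketched as follows. This is a hedged illustration only: the paper's trained CNN is not reproduced here, `predict_lifetime` is a stand-in heuristic, and all names and thresholds (`MAX_LEN`, `ARCHIVE_AFTER_S`) are illustrative assumptions, not values from the paper.

```python
# Sketch: encode a file path as a fixed-length vector of character codes
# (the kind of input a 1-D CNN over paths consumes), then route the file
# between tiers based on its predicted lifetime.

MAX_LEN = 128           # fixed input width; paths are padded or truncated
ARCHIVE_AFTER_S = 3600  # hypothetical lifetime threshold (1 hour)

def encode_path(path: str) -> list[int]:
    """Map a path to a fixed-length vector of character codes (0 = padding)."""
    codes = [ord(c) % 256 for c in path[:MAX_LEN]]
    return codes + [0] * (MAX_LEN - len(codes))

def predict_lifetime(path: str) -> float:
    """Placeholder for the CNN: scratch-like paths get short lifetimes."""
    return 60.0 if "/scratch/" in path or path.endswith(".tmp") else 86400.0

def choose_tier(path: str) -> str:
    """Long predicted lifetime -> performance tier; short -> capacity tier."""
    return "performance" if predict_lifetime(path) > ARCHIVE_AFTER_S else "capacity"

print(choose_tier("/scratch/job42/out.tmp"))   # short-lived file
print(choose_tier("/home/alice/thesis.tex"))   # long-lived file
```

Encoding paths at character granularity is what lets such a model work on files it has never seen, since naming conventions (project directories, extensions) carry the lifetime signal.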

Workload Evaluation Tool for Metadata Distribution Method
Éloïse Billa   Philippe Deniel   Soraya Zertal  
Simulation Tools and Techniques, Springer International Publishing, p. 796-810, 2021

Abstract

In High Performance Computing (HPC), the metadata server cluster is a critical factor in storage system performance, and with the growth of object storage, systems must now be able to distribute metadata across distributed metadata servers. Storage systems achieve better performance when the workload remains balanced over time. Indeed, an unbalanced distribution can lead to frequent requests to a subset of servers while other servers remain completely idle. To avoid this issue, several metadata distribution methods exist, each with its own best use cases. Moreover, each system has different usages and workloads, which means that a distribution method that fits one kind of storage system may not fit another. To this end, we propose a tool to evaluate metadata distribution methods under different workloads. In this paper, we describe this tool and use it to compare state-of-the-art methods and a method we developed. We also show how the outputs generated by our tool enable us to identify distribution weaknesses and choose the most suitable method.
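The kind of evaluation the abstract describes can be sketched as follows. The distribution methods and the imbalance metric here are illustrative stand-ins, not the paper's own: replay a workload of metadata keys through candidate placement functions and compare how evenly requests land on the servers.

```python
# Sketch: compare two toy metadata distribution methods on one workload.
import hashlib
from collections import Counter

N_SERVERS = 8

def dist_hash(key: str) -> int:
    """Hash-based placement: spreads keys evenly but ignores locality."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % N_SERVERS

def dist_subtree(key: str) -> int:
    """Subtree placement: a whole top-level directory sticks to one server."""
    top = key.split("/")[1] if "/" in key[1:] else key
    return hash(top) % N_SERVERS  # hash() varies per run; illustrative only

def imbalance(method, workload) -> float:
    """Busiest server's load divided by the ideal per-server load (1.0 = perfect)."""
    load = Counter(method(k) for k in workload)
    return max(load.values()) / (len(workload) / N_SERVERS)

# 4 projects x 1000 files: subtree placement can use at most 4 of 8 servers.
workload = [f"/proj{p}/file{i}" for p in range(4) for i in range(1000)]
print(imbalance(dist_hash, workload), imbalance(dist_subtree, workload))
```

On this workload the hash method stays close to 1.0 while the subtree method cannot do better than 2.0, which is exactly the sort of weakness such a tool makes visible before a method is deployed.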

Toward a Versatile and Scalable Metadata Distribution Framework for Object Storage (Research Poster)
Eloise Billa   Soraya Zertal   Thomas Leibovici   Philippe Deniel  
2018 International Conference on High Performance Computing and Simulation (HPCS), p. 1059-1060, 2018

Abstract

The use of object storage in the HPC world is becoming common, as it overcomes some POSIX limitations in scalability and performance. Indeed, object stores use a flat namespace, avoiding hierarchy in access requests and the cost of maintaining dependencies between multiple entries. Object stores also separate the data flow from the metadata flow, providing better concurrency and throughput. They can store trillions of objects, and each object has its own customized metadata attributes, so this metadata can be richer than POSIX attributes.

Designing a parallel OGSSim through library specificities
Sebastien Gougeaud   Soraya Zertal   Jacques-Charles Lafoucriere   Philippe Deniel  
Proceedings of the 4th ACM International Conference of Computing for Engineering and Sciences, Association for Computing Machinery, 2018

Abstract

Simulation is the most appropriate technique to evaluate the performance of current data storage systems and to predict that of future ones as part of data centers or cloud infrastructures. It assesses the potential of a system to meet user requirements in terms of storage capacity, device heterogeneity, delivered performance, and robustness. We developed a simulation tool called OGSSim to address these criteria efficiently within a reduced execution time. However, the number of threads on the test machine puts an upper bound on the size of the simulated systems. To push back this limitation and improve the simulation time, we define in this paper a parallel version of OGSSim. We explain how the parallelization process generates both design and implementation challenges, due to the multi-node environment and the related communications, and how the MPI and ZeroMQ libraries respectively help us address them.

Contemporary High Performance Computing
Mickaël Amiet   Patrick Carribault   Elisabeth Charon   Guillaume Colin de Verdière   Philippe Deniel   Gilles Grospellier   Guénolé Harel   François Jollet   Jacques-Charles Lafoucrière   Jacques-Bernard Lekien   Stéphane Mathieu   Marc Pérache   Jean-Christophe Weill   Gilles Wiber  
Chapman; Hall/CRC, p. 45-74, 2017

Optimizing Data Robustness in Large-Scale Storage Systems
Sebastien Gougeaud   Soraya Zertal   Jacques-Charles Lafoucriere   Philippe Deniel  
2017 International Conference on High Performance Computing and Simulation (HPCS), p. 236-243, 2017

Abstract

The storage capacity provided by data centers does not cease to increase, currently reaching the exabyte scale using thousands of disks. As a result, the resiliency of such systems becomes critical, both to avoid data loss and to reduce the impact of the reconstruction process on data access time. We propose SD2S, a method to create a placement scheme for declustered RAID organizations based on a shifting placement. It consists in the calculation of degree matrices, which represent the distance between the source sets of each pair of physical disks, i.e. the number of data blocks that will have to be reconstructed in case of a double failure. The scheme is created by computing a score function for all possible shifting offsets and selecting the one that ensures the reconstruction of the highest percentage of data. Results show the data reconstruction distribution against the number of double failure events. The overhead generated by the calculation of the shifting offsets is also compared to greedy SD2S and to CRUSH without replicas, for systems reaching hundreds of disks. These results confirm that selecting the best offset can lead to a complete data reconstruction at a small overhead, especially for large systems.
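The degree-matrix idea described above can be sketched as follows. This is a hedged illustration under simplifying assumptions: the real SD2S score function is richer, and the placement rule, parameters, and scoring here are stand-ins.

```python
# Sketch: for a shifting placement, entry (i, j) of the "degree matrix"
# counts the stripes that put blocks on both disks i and j, i.e. the data
# at risk if disks i and j fail together. We then score each candidate
# offset by its worst pair and keep the offset whose worst pair is smallest.
from itertools import combinations

def degree_matrix(n_disks: int, n_stripes: int, stripe_width: int, offset: int):
    deg = [[0] * n_disks for _ in range(n_disks)]
    for s in range(n_stripes):
        # toy placement rule: stripe s starts at disk s*offset
        disks = {(s * offset + k) % n_disks for k in range(stripe_width)}
        for i, j in combinations(sorted(disks), 2):
            deg[i][j] += 1
    return deg

def best_offset(n_disks: int, n_stripes: int, stripe_width: int) -> int:
    """Pick the offset with the lowest worst-case pair degree."""
    def worst(off: int) -> int:
        d = degree_matrix(n_disks, n_stripes, stripe_width, off)
        return max(max(row) for row in d)
    return min(range(1, n_disks), key=worst)

print(best_offset(10, 100, 3))
```

Minimizing the worst pair degree is what bounds how much data a double failure can destroy, which is the property the score function is selecting for.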

Using ZeroMQ as communication/synchronization mechanisms for IO requests simulation
Sebastien Gougeaud   Soraya Zertal   Jacques-Charles Lafoucriere   Philippe Deniel  
2017 International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS), p. 1-8, 2017

Abstract

Using simulation to study the behavior of large-scale data storage systems is becoming essential to predict their performance and reliability at a lower cost. This helps to take the right decisions before the system is developed and deployed. OGSSim is a simulation tool for large and heterogeneous storage systems that uses parallelism to provide information about the behavior of such systems in a reduced time. It uses the ZeroMQ communication library to implement not only the data communication but also the synchronization functions between the generated threads. These synchronization points occur during the parallel execution of requests and need to be handled efficiently to ensure data coherency for the fast and accurate computation of performance metrics. In this work, different issues due to the parallel execution of our simulation tool OGSSim are presented, and the solutions adopted using ZeroMQ are discussed. The impact of these solutions in terms of simulation time overhead is measured for various system configurations. The results obtained show that ZeroMQ has almost no impact on the simulation time, even for complex and large configurations.

Block shifting layout for efficient and robust large declustered storage systems
Sebastien Gougeaud   Soraya Zertal   Jacques-Charles Lafoucriere   Philippe Deniel  
2016 International Conference on High Performance Computing and Simulation (HPCS), p. 342-349, 2016

Abstract

Modern disks (SSDs, HDDs) are very large and their capacities will certainly increase in the future. Storage systems use a large number of such devices to compose storage pools and fulfil storage capacity demands. The result is a higher probability of failure and a longer reconstruction duration. Consequently, the whole system is penalized: the response time is higher, and a second failure will cause data loss. In this paper, we propose a new method based on a block shifting layout which increases the efficiency of a declustered RAID storage system and improves its robustness in both normal and failure modes. We define four mapping rules to reach these objectives. The tests conducted reveal that exploiting the coprime property between the number of devices and the block shifting factor leads to an optimal layout. It significantly reduces the redirection time, proportionally to the number of disks, reaching 50% for 1000 disks, at a negligible memory cost since we avoid the use of a redirection table. It also allows the recovery of additional data in case of a second failure during the degraded mode, which gives our proposed method a strong appeal for large storage systems compared with other existing methods.
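The coprime property the abstract relies on can be illustrated with a minimal sketch. The mapping rule below is an assumption for illustration (the paper defines four mapping rules, not reproduced here): block k of stripe s lands on device (s*shift + k) mod n, computed on the fly instead of being looked up in a redirection table.

```python
# Sketch: a shifting layout where the device is computed arithmetically.
# When shift and n_devices are coprime, successive stripes rotate through
# every device, so reconstruction load spreads evenly across the pool.
from math import gcd

def device_for(stripe: int, block: int, n_devices: int, shift: int) -> int:
    assert gcd(shift, n_devices) == 1, "shift must be coprime with n_devices"
    return (stripe * shift + block) % n_devices

# With 7 devices and shift 3 (coprime), the first block of 7 consecutive
# stripes touches all 7 devices exactly once.
first_blocks = {device_for(s, 0, 7, 3) for s in range(7)}
print(sorted(first_blocks))  # → [0, 1, 2, 3, 4, 5, 6]
```

Because the mapping is a closed-form expression, no redirection table is needed, which is where the negligible memory cost comes from.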

A generic and open simulation tool for large multi-tiered hierarchical storage systems
Sebastien Gougeaud   Soraya Zertal   Jacques-Charles Lafoucriere   Philippe Deniel  
2016 International Symposium on Performance Evaluation of Computer and Telecommunication Systems (SPECTS), p. 1-8, 2016

Abstract

Today's storage systems are very large, with complex and distributed architectural configurations, composed of devices of various technologies. However, the simulation, analysis and evaluation tools in the literature do not handle this complex design and these heterogeneous components. This paper presents OGSSim (Open and Generic Storage systems Simulation tool), a new simulation tool for such systems. Being generic with respect to device technologies and open to diverse management strategies and architecture layouts, it fulfills storage systems' needs in terms of representativeness. It has also been validated against real systems: its accuracy makes it a useful tool for the conception of future storage systems, the choice of hardware components, and the analysis of the adequacy between application needs and the management strategies combined with the configuration layout. This validation showed at most a 15% difference between real and simulated execution times. Moreover, OGSSim runs in a competitive time, just 3.5 s for common workloads on a large system of 500 disks, which makes it a compelling simulation and evaluation tool. It is thus an appropriate and accurate tool for the conception, evaluation and maintenance of modern storage systems.

OGSSim: Open Generic data Storage systems Simulation tool
Sebastien Gougeaud   Soraya Zertal   Jacques-Charles Lafoucriere   Philippe Deniel  
EAI Endorsed Transactions on Scalable Information Systems, ACM, 2015

Abstract

In this paper, an open and generic storage simulator is proposed. It simulates with accuracy multi-tiered storage systems based on heterogeneous devices, including HDDs, SSDs and the connecting buses. The target simulated system is constructed from the hardware configuration input, then sent to the simulator modules along with the trace file, and the appropriate simulator functions are selected and executed. Each module of the simulator is executed by a thread and communicates with the others via ZeroMQ, a message transmission API using sockets for information transfer. The result is an accurate behavior of the simulated system submitted to a specific workload, represented by performance and reliability metrics. No restriction is put on the input hardware configuration, which can handle different levels of detail and makes this simulator generic. The diversity of the supported devices, regardless of their nature (disks, buses, etc.) and organisation (JBOD, RAID, etc.), makes the simulator open to many technologies. The modularity of its design and the independence of its execution functions make it open to handle any additional mapping, access, maintenance or reconstruction strategies. The tests conducted using OLTP and scientific workloads show accurate results, obtained in a competitive runtime.

Formal modelling and analysis of distributed storage systems
Jordan La Houssaye   Franck Pommereau   Philippe Deniel  
IBISC, university of Evry / Paris-Saclay, 2014

Abstract

Distributed storage systems are nowadays ubiquitous, often under the form of multiple caches forming a hierarchy. A large amount of work has been dedicated to design, implement and optimise such systems. However, there exists to the best of our knowledge no attempt to use formal modelling and analysis in this field. This paper proposes a formal modelling framework to design distributed storage systems while separating the various concerns they involve like data-model, operations, placement, consistency, topology, etc. A system modelled in such a way can be analysed through model-checking to prove correctness properties, or through simulation to measure timed performance. In this paper, we define the modelling framework and then focus on timing analysis. We illustrate these two aspects on a simple example showing that our proposal has the potential to be used to make design decisions before the real system is implemented.

Contemporary High Performance Computing: From Petascale toward Exascale
Jeffrey Vetter   Jack Dongarra   Piotr Luszczek   Wu-Chun Feng   Kirk Cameron   Thomas Scogland   Mickaël Amiet   Patrick Carribault   Elisabeth Charon   Philippe Deniel   Gilles Grospellier   Guenole Harel   François Jollet   Jacques-Charles Lafoucriere   Stephane Mathieu   Marc Pérache   Jean-Christophe Weill   Gilles Wiber   Guillaume Colin de Verdiere  
Chapman; Hall/CRC, 2013

Abstract

Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world’s leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the book provides a comprehensive overview of 18 HPC ecosystems from around the world. Each chapter in this section describes programmatic motivation for HPC and their important applications; a flagship HPC system overview covering computer architecture, system software, programming systems, storage, visualization, and analytics support; and an overview of their data center/facility. The last part of the book addresses the role of clouds and grids in HPC, including chapters on the Magellan, FutureGrid, and LLGrid projects. With contributions from top researchers directly involved in designing, deploying, and using these supercomputing systems, this book captures a global picture of the state of the art in HPC.

NFSv4 Proxy in User Space on a Massive Cluster Architecture: Issues and Perspectives
Philippe Deniel  
USENIX Association, 2009

GANESHA, a multi-usage with large cache NFSv4 server
Ph. Deniel   Th. Leibovici   J-Ch. Lafoucrière  
WiP session at FAST'07, 2007

GANESHA, a multi-usage with large cache NFSv4 server
Ph. Deniel   Th. Leibovici   J-Ch. Lafoucrière  
Proceedings of the Linux Symposium, p. 113-124, 2007

Authentification dans les ONC/RPC
Ph. Deniel  
M.I.S.C, 2005

Introduction à la GSSAPI
Ph. Deniel  
Linux Magazine, Diamond Editions, 2003