Hercule is a platform designed for managing scientific data produced by high-performance computing (HPC) codes. In production for over 20 years on HPC clusters, it is integrated into several multi-code, multi-physics simulation chains covering 1D, 2D, and 3D geometries. Hercule's primary goal is to provide an efficient API for recording and retrieving data, while offering services and processes that meet the needs of sharing numerical simulation data. It supports three main categories of data:
- Checkpoints and restarts (Prot/Rep): managing data for backups and restarts.
- Inter-code communication (InterCodes): facilitating data exchange between different simulation codes or tools.
- Post-processing (PostTraitement): exporting data for analysis or visualization tools.
To ensure effective communication, Hercule is built on several key concepts:
- Generic API: a standardized interface for all use cases.
- Type dictionary: defining common data models (structured and unstructured meshes, AMR, scalar and vector fields, material descriptions, point clouds, probes, etc.) to facilitate interoperability between software.
- Parallel I/O: distributing I/O operations across multiple files for improved performance.
- Parallel database: splitting databases into self-contained sub-domains, enabling independent processing of each segment.
- Metadata: incorporating metadata (bounding boxes, min/max values) to quickly locate specific data and position sub-domains relative to each other.
- Data manipulation services: tools for reconfiguring or filtering data during read operations, supporting inter-code communication and data analysis.
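To make the metadata and parallel-database concepts concrete, the sketch below shows how bounding-box metadata can be used to locate data and select sub-domains independently. This is a minimal illustration in Python, not Hercule's actual API: the `SubDomain` type, file names, and function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# An axis-aligned bounding box: (min corner, max corner) in 3D.
Box = Tuple[Tuple[float, float, float], Tuple[float, float, float]]

@dataclass
class SubDomain:
    path: str   # hypothetical file holding this sub-domain's data
    bbox: Box   # bounding box stored as metadata alongside the data

def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes intersect."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))

def select_subdomains(index: List[SubDomain], region: Box) -> List[SubDomain]:
    """Use bounding-box metadata alone to pick the sub-domains that
    intersect a region of interest, without opening any data file."""
    return [sd for sd in index if overlaps(sd.bbox, region)]
```

Because each selected sub-domain is self-contained, a reader can then process the matching files independently and in parallel, skipping everything outside the region of interest.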
An open-source version is available to foster collaboration and testing, with a public release coming soon on GitHub. For further inquiries, please contact Olivier Bressand.
Publications
2024
Optimizing I/O performance for AMR Code: A case study with RAMSES in Astrophysics
Author: Loïc Strafella
https://numpex.org/wp-content/uploads/2024/04/Loic_Straffela_Exa_DI_Workshop.pdf
2022
LightAMR format standard and lossless compression algorithms for adaptive mesh refinement grids: RAMSES use case
Authors: Loïc Strafella, Damien Chapon
Abstract: The evolution of parallel I/O libraries, as well as new concepts such as ‘in transit’ and ‘in situ’ visualization and analysis, have been identified as key technologies to circumvent the I/O bottleneck in pre-exascale applications. Nevertheless, data structures and data formats can also be improved, both to reduce I/O volume and to improve data interoperability between data producers and data consumers. In this paper, we propose a very lightweight and purpose-specific post-processing data model for AMR meshes, called lightAMR. Based on this data model, we introduce a tree pruning algorithm that removes data redundancy from a fully threaded AMR octree. In addition, we present two lossless compression algorithms, one for the AMR grid structure description and one for AMR double/single precision physical quantity scalar fields. Then we present performance benchmarks on RAMSES simulation datasets of this new lightAMR data model and the pruning and compression algorithms. We show that our pruning algorithm can reduce the total number of cells from RAMSES AMR datasets by 10-40% without loss of information. Finally, we show that the RAMSES AMR grid structure can be compacted by ~ 3 orders of magnitude and that the float scalar fields can be compressed by a factor of ~ 1.2 for double precision and ~ 1.3 - 1.5 for single precision, with a compression speed of ~ 1 GB/s.
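The large compaction factor reported for the AMR grid structure is plausible because a refinement description is highly repetitive. The toy sketch below illustrates the general idea with a simple run-length encoder over a 0/1 refinement mask; it is a simplified stand-in, not the actual lossless algorithm described in the paper.

```python
from typing import List, Tuple

def rle_encode(mask: List[int]) -> List[Tuple[int, int]]:
    """Run-length encode a 0/1 refinement mask as (value, run_length) pairs.
    Long uniform regions (unrefined or fully refined) collapse to one pair."""
    runs: List[Tuple[int, int]] = []
    for bit in mask:
        if runs and runs[-1][0] == bit:
            runs[-1] = (bit, runs[-1][1] + 1)
        else:
            runs.append((bit, 1))
    return runs

def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Invert rle_encode: expand (value, run_length) pairs losslessly."""
    return [bit for bit, n in runs for _ in range(n)]
```

On a mask with long uniform runs (the common case for AMR grids, where refinement is localized), the encoded form can be orders of magnitude smaller than the raw mask while remaining exactly invertible.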
https://arxiv.org/abs/2208.11958v1
https://doi.org/10.1016/j.jcp.2022.111577
https://www.sciencedirect.com/science/article/pii/S0021999122006398?via%3Dihub
2019
ASTRONUM 2019: Boosting I/O and visualization for exascale era using Hercule: test case on RAMSES
Authors: Loïc Strafella, Damien Chapon
Abstract: It has been clearly identified that I/O is one of the bottlenecks to extending applications to the exascale era. New concepts such as ‘in transit’ and ‘in situ’ visualization and analysis have been identified as key technologies to circumvent this particular issue. A new parallel I/O and data management library called Hercule, developed at CEA-DAM, has been integrated into Ramses, an AMR simulation code for self-gravitating fluids. Splitting the original Ramses output format into Hercule database formats dedicated to either checkpoints/restarts (HProt format) or post-processing (HDep format) not only improved the I/O performance and scalability of the Ramses code but also introduced much more flexibility in the simulation outputs to help astrophysicists prepare their DMP (Data Management Plan). Furthermore, the very lightweight and purpose-specific post-processing format (HDep) will significantly improve the overall performance of analysis and visualization tools such as PyMSES 5. An introduction to the Hercule parallel I/O library as well as I/O benchmark results will be discussed.
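The checkpoint/post-processing split described above can be sketched in a few lines: a restart record must keep the complete state (including solver internals), while a post-processing record keeps only the fields analysts request, which is why it can be much lighter. This is an illustrative Python sketch under assumed names, not the HProt/HDep formats themselves.

```python
from typing import Dict, List, Any

# Hypothetical full simulation state: everything a restart needs,
# including solver-internal data that is useless for analysis.
state: Dict[str, Any] = {
    "rho": [1.0, 0.9, 1.1],     # density field
    "vel": [0.1, 0.0, -0.2],    # velocity field
    "rng_state": 12345,         # solver internals, required only for restart
}

def checkpoint_record(state: Dict[str, Any]) -> Dict[str, Any]:
    """Checkpoint-style output: keep the complete state so the run can restart."""
    return dict(state)

def postprocessing_record(state: Dict[str, Any], fields: List[str]) -> Dict[str, Any]:
    """Post-processing-style output: keep only the fields requested for analysis."""
    return {name: state[name] for name in fields}
```

Keeping the two outputs separate lets the checkpoint stay exhaustive while the post-processing dump stays small and tailored, which is the flexibility the abstract attributes to the HProt/HDep split.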
https://arxiv.org/pdf/2006.02759
https://hal.science/hal-02886874v1
2018
Ramses User Meeting 2018: HERCULE data management library: boosting RAMSES I/O for exascale era
Author: Loïc Strafella