SEISM-IO: A High Level Parallel I/O Library for High-Performance Seismic Applications

Yifeng Cui, Dawei Mu, & Daniel Roten

Submitted August 15, 2017, SCEC Contribution #7809, 2017 SCEC Annual Meeting Poster #278

SEISM-IO is a high-level parallel I/O library developed to simplify parallel I/O programming for seismic applications at scale. The library now supports sub-libraries and has been optimized for both performance and programming efficiency. An open-source version on GitHub will soon be shared with the SCEC computational community.

By adding a unified software layer between the application and the file system, SEISM-IO delegates low-level I/O operations to widely used high-performance I/O libraries, including PHDF5, PnetCDF and ADIOS, which it treats as sub-libraries. We developed an easy-to-use application programming interface (API) for both C and Fortran that integrates the different initialization, open, read, write and finalize procedures of the sub-libraries. By calling this lightweight interface, users can switch between sub-libraries and their corresponding file formats without changing the application source code. Supported by the NSF’s Petascale Application Improvement Discovery (PAID) program, we also improved HDF5 I/O performance in SEISM-IO by introducing optimal striping and block-size settings and by adding the neverfill and microversion features.
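The idea of a single interface in front of interchangeable sub-libraries can be illustrated with a minimal Python sketch. It is not the SEISM-IO API: the names `MemoryBackend`, `BACKENDS`, `io_init` and `write_snapshot` are hypothetical, and the backends only record data in memory where a real layer would call PHDF5, PnetCDF or ADIOS. The file extensions are the ones conventionally used with those formats.

```python
from typing import Callable, Dict, List


class MemoryBackend:
    """Hypothetical stand-in for a sub-library; keeps records in memory."""

    def __init__(self, file_format: str) -> None:
        self.file_format = file_format
        self.records: List[tuple] = []
        self.is_open = False

    def open(self, path: str) -> None:
        self.path = path
        self.is_open = True

    def write(self, name: str, values: List[float]) -> None:
        if not self.is_open:
            raise RuntimeError("write called before open")
        self.records.append((name, list(values)))

    def close(self) -> None:
        self.is_open = False


# Registry mapping a method name to a backend factory (assumed names).
BACKENDS: Dict[str, Callable[[], MemoryBackend]] = {
    "phdf5": lambda: MemoryBackend("h5"),
    "pnetcdf": lambda: MemoryBackend("nc"),
    "adios": lambda: MemoryBackend("bp"),
}


def io_init(method: str) -> MemoryBackend:
    """Pick a backend by name, e.g. from a run-time configuration value."""
    try:
        return BACKENDS[method.lower()]()
    except KeyError:
        raise ValueError(f"unknown I/O method: {method!r}") from None


def write_snapshot(method: str, path: str, values: List[float]) -> MemoryBackend:
    """Application-side code: identical for every backend."""
    backend = io_init(method)
    backend.open(path)
    backend.write("velocity_x", values)
    backend.close()
    return backend
```

In this sketch the application calls `write_snapshot` in the same way regardless of the method string, so switching the output format only requires changing a configuration value.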

SEISM-IO supports partitioning functions for structured meshes. Given global and local parameters such as the maximum model dimension, the model size, the number of ghost cell layers and the number of CPU cores along each axis, the library calculates the exact local model chunk for each process. The library handles the spatial mapping between each local data chunk and the global dataset, so users can pass the processed local dataset directly to their solvers. In addition, SEISM-IO natively supports data buffering: it builds a data buffer on each process to reduce the number of expensive storage accesses.
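As an illustration of these two ideas, the following Python sketch computes a block decomposition with ghost layers and batches writes through a small buffer. It is written under assumptions and does not reproduce SEISM-IO's implementation: the function and class names are hypothetical, the mesh is split as evenly as possible along each axis, ghost layers are clipped at the domain boundary, and a plain list stands in for the storage system.

```python
from typing import List, Sequence, Tuple

Range = Tuple[int, int]


def split_axis(n: int, parts: int, index: int) -> Range:
    """Return the half-open range [start, stop) owned by block `index`."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def local_chunk(
    global_shape: Sequence[int],
    procs: Sequence[int],
    coords: Sequence[int],
    ghost: int,
) -> Tuple[List[Range], List[Range]]:
    """Return (owned, padded) index ranges per axis for one process.

    `owned` is the part of the global mesh assigned to the process;
    `padded` adds up to `ghost` layers per side, clipped to the domain.
    """
    owned, padded = [], []
    for n, p, c in zip(global_shape, procs, coords):
        start, stop = split_axis(n, p, c)
        owned.append((start, stop))
        padded.append((max(start - ghost, 0), min(stop + ghost, n)))
    return owned, padded


class WriteBuffer:
    """Collect records in memory and flush them to storage in batches."""

    def __init__(self, capacity: int, sink: List[list]) -> None:
        self.capacity = capacity
        self.sink = sink  # stands in for the storage system
        self.pending: list = []

    def add(self, record) -> None:
        self.pending.append(record)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            self.sink.append(self.pending)
            self.pending = []
```

For example, `local_chunk((10, 8), (3, 2), (1, 0), ghost=2)` describes the process at grid position (1, 0) in a 3 x 2 process layout over a 10 x 8 mesh. With a buffer capacity of 3, seven records reach the sink in three batches (after a final `flush`) rather than seven separate writes.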

To verify the library in a realistic case, we ran a 0-4 Hz ShakeOut simulation of an Mw 7.7 earthquake on the southern San Andreas fault, comparing SEISM-IO against the MPI-IO implementation embedded in the CPU version of AWP-ODC. The simulation ran successfully on 48,000 CPU cores and wrote a total of 52 TB of data to disk. The results demonstrated improved performance as well as the robustness, correctness and stability of the library.

Key Words
SEISM-IO, IO, HPC, structured mesh, ShakeOut

Citation
Cui, Y., Mu, D., & Roten, D. (2017, 08). SEISM-IO: A High Level Parallel I/O Library for High-Performance Seismic Applications. Poster Presentation at 2017 SCEC Annual Meeting.


Related Projects & Working Groups
Computational Science (CS)