Software Containers and Version Control for High Performance Computing Applications

Liberty Locsin & Scott Callaghan

Submitted August 5, 2020, SCEC Contribution #10267, 2020 SCEC Annual Meeting Poster #190

Scientific software implements the computational models that aid scientific analysis. Because some scientific applications are complex, they may require specialized parallel high performance computing (HPC) environments such as supercomputers. However, installing scientific software on supercomputers can require extensive setup and be time-intensive. One approach to reducing the burden of installation and improving the ease of distribution is to encapsulate software applications in containers. Popular container technologies such as Docker pose security concerns for system administrators at supercomputing centers. Therefore, we investigated specialized container software designed for HPC clusters and compared different container technologies to decide which tool best suits the needs of scientists. As a proof of concept, we demonstrate moving the CyberShake project, a physics-based probabilistic seismic hazard analysis application that uses the MPI library, into a Singularity container on the Frontera supercomputer at the Texas Advanced Computing Center.
HPC containers improve ease of use for scientists by removing the need to install dependencies and libraries, and they make code more portable. Software availability can also be improved by moving code into widely used, more accessible version control systems such as Git. Hosting code in public repositories on a ubiquitous version control system also improves portability and increases the transparency of the work. Currently, the CyberShake codebase is hosted in a Subversion repository. However, Subversion is no longer commonly used, and the repository contains a number of software elements that have been removed from the CyberShake workflows. Therefore, we are migrating CyberShake to a Git repository. We believe this will improve portability, not only allowing for more reproducible science but also establishing a path toward putting CyberShake under a Continuous Integration workflow in conjunction with a system testing suite.

Key Words
containers, high performance computing

Citation
Locsin, L., & Callaghan, S. (2020, 08). Software Containers and Version Control for High Performance Computing Applications. Poster Presentation at 2020 SCEC Annual Meeting.


Related Projects & Working Groups
Computational Science (CS)