SeedMe: Data sharing building blocks

Amit Chourasia, David R. Nadeau, John Moreland, Dmitry Mishin, & Michael L. Norman

Submitted August 12, 2016, SCEC Contribution #6678, 2016 SCEC Annual Meeting Poster #348

Computational simulations have become an indispensable tool in a wide variety of science and engineering investigations. Nearly all scientific computation and analyses create important transient data and preliminary results. Transient data include information dumped while a job is running, such as coarse output and run statistics. Preliminary results include data output by a running or finished job that need to be processed quickly to gauge the job’s success or failure. These job output data provide vital guidance that helps scientists review the current job and adjust parameters for the next one. Quick and effective assessment of these data is necessary for efficient use of computational resources, but it is complicated when a large collaborating team is geographically dispersed or when some team members do not have direct access to the computational resource and its output data. Current methods for sharing and assessing transient data and preliminary results are cumbersome, labor intensive, and largely unsupported by useful tools and procedures. Each research team is forced to create its own scripts and ad hoc procedures to push data from system to system and from user to user, and to make quick plots, images, and videos to guide the next step in its research. These custom efforts often rely on email, ftp, and scp, despite the ubiquity of much more flexible dynamic web-based technologies and the impressive display and interaction abilities of today’s mobile devices. Better tools, building blocks, and cyberinfrastructure are needed to support the sharing of transient data and preliminary results among collaborating computational science teams.

The SeedMe project is developing web-based building blocks and cyberinfrastructure to enable easy sharing and streaming of transient data and preliminary results from computing resources to a variety of platforms, from mobile devices to workstations. The goal is to make it possible to view and assess results quickly and conveniently, and to provide an essential missing component in High Performance Computing (HPC) and cloud computing infrastructure. This work is an evolution of the SeedMe project [1, 2, 3] that will ultimately offer modular and flexible data sharing building blocks to the computation community. The building blocks will include authentication/authorization, granular access controls, data sharing and indexing, and microformat ingestion and presentation for dashboard-like functionality.

The SeedMe building blocks are broadly applicable to a diverse set of scientific and engineering communities, and SeedMe will be released as a suite of open source modular building blocks that may be extended by others. With this poster we would like to showcase current progress on the project and engage with the HPC community to get feedback.

Initial design and documentation for the project are available at the project website:
http://dibbs.seedme.org/documentation

1. SeedMe. 2016. SeedMe (Stream Encode, Explore and Disseminate My Experiments). Retrieved Aug 12, 2016 from https://www.seedme.org
2. Amit Chourasia, Mona Wong-Barnum, Dmitry Mishin, David R. Nadeau and Michael L. Norman. SeedMe: A scientific data sharing and collaboration platform. Presented at the XSEDE 2016 conference. Miami, FL Jul 17-21, 2016. DOI=10.1145/2949550.2949590

Key Words
Data sharing, collaboration, visualization

Citation
Chourasia, A., Nadeau, D. R., Moreland, J., Mishin, D., & Norman, M. L. (2016, 08). SeedMe: Data sharing building blocks. Poster Presentation at 2016 SCEC Annual Meeting.


Related Projects & Working Groups
Computational Science (CS)