Cloud Computing and Big Data – Using the Southern California Earthquake Data Center (SCEDC) and the Southern California Seismic Network (SCSN) Products and Services for Earthquake Research

Ellen Yu, Prabha Acharya, Aparna Bhaskaran, Shang-Lin Chen, Jennifer R. Andrews, Valerie Thomas, Zachary E. Ross, Egill Hauksson, & Robert W. Clayton

Submitted August 13, 2019, SCEC Contribution #9487, 2019 SCEC Annual Meeting Poster #301 (PDF)

Southern California Earthquake Data Now in the Amazon Cloud

• As of the date of this abstract, we have uploaded one year of data (2016,180 to 2017,180, in year,day-of-year notation). Efforts to load the remaining years (1999-present) into the archive are ongoing. The AWS bucket name is s3://scedc-pds, and it is hosted in the us-west-2 (Oregon) region.
• To minimize the learning curve for users, we chose a data format familiar to most users of the archive and a file/directory naming convention that allows users to perform time-based and channel-based searches. Each file is in miniSEED format and represents one channel for one day.
• Development of additional tools for more fine-grained searches is ongoing, as is the creation of Docker and Amazon Machine Images to facilitate access to the data set. The poster will also present a cost analysis for a variety of analysis activities to give users an idea of the costs incurred when working with a cloud archive.
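
As a rough illustration of a time-based search over one-file-per-channel-per-day data, the sketch below converts the year,day-of-year notation used above into calendar dates and builds one S3 prefix per day. The bucket name comes from this abstract; the `continuous_waveforms/YYYY/YYYY_DDD/` layout and the helper names are assumptions made for illustration only, so the bucket's own documentation should be consulted for the actual naming convention.

```python
from datetime import date, datetime, timedelta

BUCKET = "scedc-pds"  # bucket name given in the abstract


def parse_year_doy(text: str) -> date:
    """Convert a 'YYYY,DDD' string such as '2016,180' to a calendar date."""
    return datetime.strptime(text.strip(), "%Y,%j").date()


def day_prefix(day: date) -> str:
    """Build a per-day key prefix.

    ASSUMPTION: this directory layout is hypothetical; the real
    convention used in the bucket may differ.
    """
    return f"continuous_waveforms/{day:%Y}/{day:%Y}_{day:%j}/"


def days_between(start: date, end: date):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


# Time range quoted in the abstract for the data uploaded so far.
start = parse_year_doy("2016,180")
end = parse_year_doy("2017,180")

# One S3 URI per day in the range, under the assumed layout.
prefixes = [f"s3://{BUCKET}/{day_prefix(d)}" for d in days_between(start, end)]
print(prefixes[0])
print(prefixes[-1])
```

Each resulting prefix could then be passed to an S3 client or to `aws s3 ls` to enumerate that day's channel files; whether anonymous access (for example the AWS CLI `--no-sign-request` option) is permitted depends on how the bucket is configured.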

New Data Availability Web Services and Station Map to Help Plan Your Research

• Users can query the SCEDC data availability web services to determine the time ranges for which triggered and continuous waveform time series are available for download. This service is compliant with the IRIS data availability service. A new dynamic station map lets users view this information geographically.
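
A minimal sketch of how an availability request might be assembled is shown here. It assumes the service follows the FDSN-style availability interface used by IRIS (a `query` or `extent` method with `net`, `sta`, `loc`, `cha`, `start`, and `end` parameters). The base URL, the helper name `availability_url`, and the example station are assumptions, not details taken from this abstract; the code only builds the URL and does not contact any server.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL of the SCEDC availability service; verify it
# against the SCEDC web services documentation before use.
BASE_URL = "https://service.scedc.caltech.edu/fdsnws/availability/1"


def availability_url(network, station, channel, start, end,
                     location="*", method="query"):
    """Build an availability request URL (hypothetical helper).

    method is 'query' for individual time spans or 'extent' for the
    overall earliest/latest times of each channel.
    """
    if method not in ("query", "extent"):
        raise ValueError("method must be 'query' or 'extent'")
    params = {
        "net": network,
        "sta": station,
        "loc": location,
        "cha": channel,
        "start": start,
        "end": end,
    }
    # Keep '*' and ':' readable instead of percent-encoding them.
    return f"{BASE_URL}/{method}?{urlencode(params, safe='*:')}"


# Illustrative request: one day of a single channel (example codes only).
url = availability_url("CI", "PAS", "BHZ",
                       "2016-06-28T00:00:00", "2016-06-29T00:00:00")
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client to check coverage before requesting waveforms.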

New Researcher Provided Datasets

• The Quake Template Matching (QTM) seismicity catalog (Ross et al., 2019) is available for download.
• The SCEDC data holdings now include a double-difference catalog (Hauksson et al., 2011), available via STP, and a focal mechanism catalog (Yang et al., 2011), both spanning 1981 through 2018.
• The SCEDC website now hosts training and validation datasets for deep learning research. There are sets for P-wave picking, first-motion polarity, and phase identification (Ross et al., 2018), and for signal/noise discrimination (Meier et al., 2019).

Key Words
Cloud Computing, Seismology, Data Archive

Citation
Yu, E., Acharya, P., Bhaskaran, A., Chen, S., Andrews, J. R., Thomas, V., Ross, Z. E., Hauksson, E., & Clayton, R. W. (2019, 08). Cloud Computing and Big Data – Using the Southern California Earthquake Data Center (SCEDC) and the Southern California Seismic Network (SCSN) Products and Services for Earthquake Research. Poster Presentation at 2019 SCEC Annual Meeting.


Related Projects & Working Groups
Computational Science (CS)