SCEC Award Number 22114
Proposal Category Individual Proposal (Integration and Theory)
Proposal Title Flexible and Scalable Earthquake Forecasting
Investigator(s)
Name Organization
Emily Brodsky University of California, Santa Cruz
Other Participants Kelian Dascher-Cousineau (Graduate Student)
SCEC Priorities 5c, 5d, 5e SCEC Groups Seismology, EFP, CS
Report Due Date 03/15/2023 Date Report Submitted 12/05/2023
Project Abstract
Seismology is witnessing explosive growth in the diversity and scale of earthquake catalogs. A key motivation for this community effort is that more data should translate into better earthquake forecasts. Such improvements are yet to be seen. Here, we introduce the Recurrent Earthquake foreCAST (RECAST), a deep-learning model based on recent developments in neural temporal point processes. The model is designed for a greater volume and diversity of earthquake observations, overcoming theoretical and computational limitations of traditional approaches. We benchmark against a temporal Epidemic Type Aftershock Sequence (ETAS) model. Tests on synthetic data suggest that with a modest-sized dataset, RECAST accurately models earthquake-like point processes directly from cataloged data. Tests on earthquake catalogs in Southern California indicate improved fit and forecast accuracy compared to our benchmark when the training set is sufficiently long (>10^4 events). The basic components in RECAST add flexibility and scalability for earthquake forecasting without sacrificing performance.
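The comparison against the ETAS benchmark is made with the time-averaged joint log-likelihood (see the exemplary figure). For a temporal point process with conditional intensity \lambda(t \mid H_t), where H_t denotes the event history up to time t, the joint log-likelihood over a test window [0, T] takes the standard form

\log L = \sum_{i:\, t_i \in [0, T]} \log \lambda(t_i \mid H_{t_i}) - \int_0^T \lambda(t \mid H_t)\, dt ,

and dividing by the window length yields a time-averaged score. The notation and normalization here follow common point-process convention rather than a definition stated in the report.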
Intellectual Merit The diversity and scale of earthquake catalogs have exploded in recent years due to dense seismic networks and increasingly automated data processing techniques (Mousavi et al., 2020; Mousavi & Beroza, 2023; Obara, 2003; Z. E. Ross et al., 2019; Shelly, 2017; Tan et al., 2021; White et al., 2019). A motivation for this community effort is that more detailed observations should translate into better earthquake forecasts. However, clear improvement in forecasting skill has yet to materialize (Beroza et al., 2021). One factor may be the nature of the models used for forecasting. Current operational earthquake forecasts build on seminal models that were designed for sparse earthquake records and grounded in the canonical statistical laws of seismology (Llenos & Michael, 2017; van der Elst et al., 2022). While the past decades have seen advances in the regionalization of these models (Field et al., 2017; Mai et al., 2016; Ogata, 2017), catalog bias correction (Mizrahi et al., 2021; G. J. Ross, 2021), and spatial forecasts (Ogata, 1998), these advances do not leverage the wealth of available geophysical data (Mancini et al., 2022; Mousavi & Beroza, 2023). This is largely due to limitations inherent in current parametric models, which frequently constrain the analysis to only a small portion of the available catalogs.
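As a concrete example of the parametric form such models assume, a temporal ETAS model specifies the conditional earthquake rate as

\lambda(t \mid H_t) = \mu + \sum_{i:\, t_i < t} k\, e^{\alpha (m_i - M_0)} (t - t_i + c)^{-p} ,

where \mu is the background rate, m_i are the magnitudes of past events, M_0 is a reference (completeness) magnitude, and k, \alpha, c, and p are fitted parameters. The symbols follow the standard Ogata-type formulation rather than the exact parameterization of any operational model cited above; the point is that every dependence on the catalog must pass through a small set of fixed functional forms, which is what limits the use of richer observations.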
In this study, we turn to recent advances in machine learning, using neural temporal point processes to complement existing forecasting capabilities. Like most current work, we confine our attention to statistical rather than deterministic forecasts. In principle, such approaches hold the promise of general-purpose, flexible, and scalable forecasting (Shchur et al., 2021). Here we use the term flexible in the sense that models do not presume a functional form and thus can both incorporate additional earthquake information and learn complex dependencies in the data (Gareth et al., 2021). We use the term scalable in the sense that models can both train efficiently on large datasets (Grover & Leskovec, 2016) and continue to improve with additional data (Kaplan et al., 2020). Despite these desirable characteristics, it is unclear whether deep learning is well suited to earthquake data (Mignan & Broccardo, 2020): the earthquake record is highly stochastic and strongly influenced by extreme events. Our goal here is to define and implement the basic requirements for a neural temporal point process model applied to earthquake forecasting and to assess whether it is well suited for the task.
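The ingredients of such a model can be sketched compactly. The code below, assuming a PyTorch implementation, encodes the event history with a recurrent network and outputs the rate of an exponential distribution over the next inter-event time; the class name, feature choices, and output distribution are illustrative assumptions and do not reproduce the published RECAST architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTPP(nn.Module):
    # Minimal neural temporal point process: a GRU encodes the event history;
    # a linear head maps each hidden state to the rate of an exponential
    # distribution over the waiting time to the next event.
    def __init__(self, hidden_size=32):
        super().__init__()
        # each event is summarized by (log inter-event time, magnitude)
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden_size, batch_first=True)
        self.to_rate = nn.Linear(hidden_size, 1)

    def forward(self, dt, mag):
        # dt, mag: (batch, seq_len) inter-event times and magnitudes
        feats = torch.stack([torch.log(dt + 1e-9), mag], dim=-1)
        h, _ = self.rnn(feats)
        # conditional rate predicted after each event
        return F.softplus(self.to_rate(h)).squeeze(-1)

    def neg_log_likelihood(self, dt, mag):
        rate = self.forward(dt, mag)
        # the rate produced after event i models the gap to event i+1
        rate_prev, dt_next = rate[:, :-1], dt[:, 1:]
        log_like = torch.log(rate_prev) - rate_prev * dt_next  # exponential log-density
        return -log_like.mean()

# usage sketch with synthetic data
model = NeuralTPP()
dt = torch.rand(8, 100)           # hypothetical inter-event times (days)
mag = 2.0 + torch.rand(8, 100)    # hypothetical magnitudes
loss = model.neg_log_likelihood(dt, mag)
loss.backward()

Because the mapping from history to rate is learned rather than prescribed, additional per-event features can be appended to the input without changing the likelihood machinery, which is the sense in which such models are flexible.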
Broader Impacts This funding supported the PhD research of graduate student Kelian Dascher-Cousineau. The project has also advanced operational earthquake forecasting by introducing a flexible and scalable forecasting model.
Exemplary Figure Figure 2 | Performance on modern earthquake catalogs.
(A) Seismicity around the San Jacinto Fault zone in Southern California (White et al., 2019). (B) Relative goodness of fit on the out-of-sample test period, as measured by the time-averaged joint log-likelihood, for the RECAST and ETAS models. Error bars indicate the 95% confidence interval from 1000 bootstrap samples for five random initializations of RECAST. The models were trained with incrementally longer training and validation sets. The inset shows the time series of the seismicity considered; each white bar shows the corresponding period used to train and validate ETAS and RECAST. Given a training and validation catalog in excess of ~10,000 earthquakes (fourth white bar from the top in the inset), the test period is best modeled by RECAST.
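The bootstrap comparison in panel (B) can be reproduced in outline as follows, assuming per-event log-likelihood scores for both models on the same test events are in hand; the array names and resampling scheme are illustrative, as the report does not spell out implementation details.

import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(ll_recast, ll_etas, n_boot=1000, alpha=0.05):
    # ll_recast, ll_etas: per-event log-likelihoods of the two models on the
    # same out-of-sample test events. Resample events with replacement and
    # collect the mean log-likelihood difference (RECAST minus ETAS).
    ll_recast, ll_etas = np.asarray(ll_recast), np.asarray(ll_etas)
    n = len(ll_recast)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        diffs[b] = np.mean(ll_recast[idx] - ll_etas[idx])
    low, high = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return np.mean(ll_recast - ll_etas), (low, high)

A mean difference whose confidence interval excludes zero indicates that one model fits the test period measurably better than the other.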