Accelerating AWP-ODC-OS Using Intel Xeon Phi Processors

Josh Tobin, Alexander N. Breuer, Charles Yount, Alexander Heinecke, & Yifeng Cui

Submitted August 15, 2016, SCEC Contribution #6795, 2016 SCEC Annual Meeting Poster #340

AWP-ODC simulates dynamic rupture and wave propagation using a staggered-grid finite-difference scheme and is widely used in the SCEC community. A unified open-source version, AWP-ODC-OS, was recently released. We present an extension of AWP-ODC-OS to the Intel Xeon Phi processor, codenamed Knights Landing (KNL). The extension also includes a new lightweight system that streamlines the execution and verification of AWP-ODC-OS simulations.
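To illustrate what a staggered-grid finite-difference update looks like, here is a minimal 1D velocity-stress sketch. It is not taken from AWP-ODC-OS: the function name `step_1d`, the uniform material parameters `rho` and `mu`, and the second-order differences are all assumptions chosen for brevity, and the production code is three-dimensional and considerably more elaborate.

```python
def step_1d(v, s, rho, mu, dx, dt):
    """Advance a 1D velocity-stress system by one leapfrog time step.

    Hypothetical layout: v[i] is the velocity at grid node i, and s[i] is
    the stress halfway between nodes i and i + 1, so len(s) == len(v) - 1.
    The boundary velocities v[0] and v[-1] are held fixed.
    """
    n = len(v)
    # Velocity update at interior nodes, driven by the stress gradient.
    for i in range(1, n - 1):
        v[i] += dt / rho * (s[i] - s[i - 1]) / dx
    # Stress update at half nodes, driven by the velocity gradient.
    for i in range(n - 1):
        s[i] += dt * mu * (v[i + 1] - v[i]) / dx
    return v, s


# Usage: a unit velocity pulse in the middle of a five-node grid.
v = [0.0, 0.0, 1.0, 0.0, 0.0]
s = [0.0, 0.0, 0.0, 0.0]
v, s = step_1d(v, s, rho=1.0, mu=1.0, dx=1.0, dt=0.5)
```

Each grid point is read and written once per step with very little arithmetic in between, which is the access pattern that makes schemes of this kind sensitive to memory bandwidth.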

In recent years, the AWP-ODC code has run on GPU-accelerated supercomputers. The hybrid version we have developed additionally allows AWP-ODC-OS to run on Xeon Phi processors and is designed to take full advantage of the KNL architecture. In particular, KNL processors provide 16 GB of high-bandwidth MCDRAM. Because the time-to-solution of AWP-ODC is limited primarily by memory bandwidth, this memory has the potential to reduce time-to-solution substantially. Moreover, since even top-of-the-line GPUs are equipped with less than 16 GB of memory, KNL presents the opportunity to run simulations on larger domains than were previously possible. In addition to the 16 GB of MCDRAM, KNL nodes are outfitted with up to 384 GB of DDR memory, which presents further opportunities to increase simulation size. Our extension incorporates all of the features available in AWP-ODC-OS: purely elastic kernels, frequency-dependent viscoelasticity, free-surface boundary conditions, absorbing boundary conditions, and point sources. We also present our new verification tool, which automatically generates a report with synthetic seismograms and global velocity statistics after a simulation, allowing the user to quickly verify the results of a run.
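The link between memory capacity and domain size can be made concrete with a rough back-of-the-envelope estimate. The sketch below is illustrative only: the helper names, the count of 16 single-precision arrays per grid point, and the reading of "GB" as GiB are assumptions, not figures from AWP-ODC-OS.

```python
GIB = 2**30  # assumption: "GB" in the text is treated as GiB here


def max_grid_points(capacity_bytes, arrays_per_point, bytes_per_value=4):
    """Largest number of grid points whose arrays fit in capacity_bytes."""
    if arrays_per_point <= 0 or bytes_per_value <= 0:
        raise ValueError("arrays_per_point and bytes_per_value must be positive")
    return capacity_bytes // (arrays_per_point * bytes_per_value)


def cube_edge(points):
    """Edge length of the largest cubic grid holding at most `points` points."""
    edge = round(points ** (1.0 / 3.0))
    # Correct for floating-point rounding in the cube root.
    while edge**3 > points:
        edge -= 1
    while (edge + 1) ** 3 <= points:
        edge += 1
    return edge


# Hypothetical footprint: 16 single-precision arrays per grid point.
ARRAYS_PER_POINT = 16

mcdram_points = max_grid_points(16 * GIB, ARRAYS_PER_POINT)
ddr_points = max_grid_points(384 * GIB, ARRAYS_PER_POINT)

print("MCDRAM:", mcdram_points, "points, cube edge", cube_edge(mcdram_points))
print("DDR:   ", ddr_points, "points, cube edge", cube_edge(ddr_points))
```

Under these assumptions the capacity grows 24-fold from MCDRAM to DDR, but the edge of a cubic domain grows only by the cube root of that factor, a little under 3.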

Our performance results show that our extension of AWP-ODC runs more than 7.5 times faster on an Intel Xeon Phi 7210 processor than on an Intel Xeon E5-2680v3, and more than 1.4 times faster than on the recent Nvidia M40 GPU.

Tobin, J., Breuer, A. N., Yount, C., Heinecke, A., & Cui, Y. (2016, 08). Accelerating AWP-ODC-OS Using Intel Xeon Phi Processors. Poster Presentation at 2016 SCEC Annual Meeting.

Related Projects & Working Groups
Computational Science (CS)