Seminars & Colloquia
Dong Li
UC Merced
"Is Big Memory Useful for HPC Applications? A Case Study with Molecular Dynamics Simulation"
Friday October 23, 2020 11:00 AM
Location: N/A, EB2 NCSU Centennial Campus
Zoom Meeting Info (Visitor parking instructions)
This talk is part of the System Research Seminar series
Optane DC persistent memory-based system providing up to 9TB memory per
machine and Amazon EC2 high memory instance providing up to 24 TB
memory per machine. However, the impact of those big memory platforms
on high performance computing (HPC) applications is largely unknown. Is
the big memory platform useful for HPC applications? On the one hand,
the big memory platform enables scientific simulations with larger
problem scales, because of its large memory capacity; on the other hand,
we observe that in production supercomputers, 90% of jobs utilize less
than 15% of the node memory capacity, and for 90% of the time, memory
utilization is less than 35%. Many computation-intensive HPC
applications cannot benefit from the big memory system. In this talk,
we discuss challenges and opportunities that the big memory platform
brings to HPC applications. We use molecular dynamics (MD) simulation,
a computation-intensive application, as a case study. We introduce a
memoization framework (named MD-PM) that trades large memory capacity
for high computation capability. Evaluating with nine realistic MD
simulation problems on Optane DC PM, we show that MD-PM consistently
outperforms LAMMPS, a state-of-the-art MD simulation package, with an
average speedup of 22.96x. The big memory system has great potential to
accelerate HPC applications.
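
The idea of trading memory capacity for computation can be illustrated with a minimal memoization sketch. This is not MD-PM's actual design, which the abstract does not describe; it only assumes a Lennard-Jones pair force in reduced units and a lookup table keyed by quantized inter-atomic distance. The names lj_force, MemoizedForce, and resolution are hypothetical and chosen for illustration.

```python
def lj_force(r, epsilon=1.0, sigma=1.0):
    """Magnitude of the Lennard-Jones pair force at separation r (reduced units)."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


class MemoizedForce:
    """Hypothetical cache: stores one force value per quantized distance.

    A finer resolution gives more accurate results but a larger table,
    which is the memory-for-computation trade-off in miniature.
    """

    def __init__(self, resolution=1e-3):
        self.resolution = resolution
        self.table = {}   # quantized distance -> cached force
        self.hits = 0
        self.misses = 0

    def __call__(self, r):
        key = round(r / self.resolution)
        if key <= 0:
            raise ValueError("separation is too small for this resolution")
        cached = self.table.get(key)
        if cached is not None:
            self.hits += 1
            return cached
        # Cache miss: compute at the quantized distance and remember it.
        self.misses += 1
        value = lj_force(key * self.resolution)
        self.table[key] = value
        return value


force = MemoizedForce(resolution=1e-3)
first = force(1.5)      # miss: computed and stored
second = force(1.5002)  # hit: same quantized distance, no recomputation
print(force.hits, force.misses, len(force.table))
```

In this toy version the second lookup reuses the stored value instead of recomputing the force. A real framework would also have to bound the approximation error and decide what to keep in persistent memory, which this sketch does not attempt.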
California, Merced. Previously, he was a research scientist at the Oak
Ridge National Laboratory (ORNL), studying computer architecture and
programming models for next-generation supercomputer systems. Dong
earned his PhD in computer science from Virginia Tech. His research
focuses on high performance computing (HPC) and maintains a strong
relevance to computer systems. The core theme of his research is to
study how to enable scalable and efficient execution of scientific
applications on increasingly complex large-scale parallel systems. Dong
received a CAREER Award from the U.S. National Science Foundation in 2016
and an ORNL/CSMD Distinguished Contributor Award in 2013. His SC'14
paper was nominated for the best student paper award. He is also the lead PI
for the NVIDIA CUDA Research Center at UC Merced. He is a review board
member of IEEE Transactions on Parallel and Distributed Systems (TPDS).
Host: Frank Mueller, CSC