Lori Diachin will lead the Exascale Computing Project as it approaches final milestones

The ultimate goal is in sight for the multi-institutional Exascale Computing Project (ECP), which was launched in 2016 with a mandate from the Department of Energy (DOE) and the National Nuclear Security Administration (NNSA) to achieve exascale readiness in the United States by the 2022 time horizon. The project could not fully prove its mettle until at least one exascale machine was deployed and operational. HPCwire recently learned that this is now definitively the case, with Frontier’s formal acceptance at Oak Ridge National Laboratory (ORNL). Frontier is now open for early scientific workloads, including many of the 24 ECP target applications. These are true scientific codes in their own right, but they also serve to evaluate the success of ECP, which must meet certain key performance parameters (KPPs) before it can be completed successfully.

Leading ECP to that milestone now is Lori Diachin, who has served as ECP’s Deputy Director since 2018, as well as being the Principal Deputy Associate Director for Lawrence Livermore National Laboratory’s (LLNL) Computing Directorate. Diachin is the third person to lead the ECP in its nearly eight years. She takes over from Doug Kothe, who has ably led the project for the past six years after taking over from Paul Messina, who was instrumental in launching the program. [Kothe is leaving his post at ORNL (where he was associate lab director for the Computing and Computational Systems Directorate, in addition to being ECP chief) to join Sandia National Laboratories on June 5 as chief research officer and associate labs director of Sandia’s Advanced Science and Technology Division.] Diachin retains her position and employment at LLNL, but is spending a significant amount of time at the ORNL campus in Oak Ridge, Tennessee. She will report to ORNL Acting Director Jeff Smith in her new capacity as ECP Director.

Today, the day before her official start date, I spoke to Diachin, whom I first met a few years ago when she was involved in the HPC4 program, and received an update hot off the press. Diachin provided a status report on the project itself, including where it stands and what the next milestones are. On a more personal note, she shared how stepping into this role is a natural fit and an extension of a rewarding career in scientific computing, most of it spent within the DOE labs.

Here’s a transcript of that interview.

Tiffany Trader: Let’s start with why exascale is necessary and important, for science and for the United States. Can you give some clarifying examples?

Lori Diachin: Exascale will clearly give us a significant edge in many different areas related to science and security. As for Lawrence Livermore (an NNSA lab), we have the stockpile that we certify through high-performance computing each year, and El Capitan will be a big part of that story when it comes online. We have a partnership with, for example, the National Cancer Institute and the National Institutes of Health in the area of mechanistic analysis of RAS proteins for tumors, and RAS protein tumors account for about 30% of all tumors. And so being able to understand a little better what these mechanisms are can help us understand the mechanism of the disease. And similarly, the same project is looking into precision medicine, where they try to predict the particular ways drugs interact with a particular cancer or disease within a specific patient. And so exascale computing can really start to give us those kinds of benefits and insights through modeling and simulation. And it runs the gamut: modeling wind farms on realistic topographies, where there are multiple turbines together interacting with each other, and how does that affect the overall wind farm output? So there are just a great many examples that we could talk about.

Trader: So some of that cancer research is under the CANDLE project, which is an ECP application. Is it coming to a conclusion?

Diachin: Yes. Whether they can continue the partnership beyond the ECP will be up to the program sponsors at the DOE to determine. But in terms of ECP, all ECP projects will come to an end. We’re completing the technical work at the end of December 2023. And at that point, we’re wrapping up the ECP portion, but many projects will move into related work at the various stakeholder offices that they have.

Trader: Did you mention that December 31st is the end date for the ECP?

Diachin: That will be the end of the technical work, and then all teams will need to wrap up their work on their key performance metrics and software applications and technologies, again, as part of the ECP. A lot of those efforts, as I mentioned, are transitioning into new programs, new efforts, new projects, all over the Department of Energy. But the ECP project itself is a formal DOE project, which has an end date, and that is December 31st for the technical work. We will do a final review in April. And then we formally shut down the project; you know, there are a lot of mechanisms that need to happen to shut down the project. And this must happen by September 2024.

Trader: Will there be a public report?

Diachin: We are working on several communications for the ECP. One of the things we’re really excited about as a leadership team is working on a book proposal, where we want to talk about the lessons we’ve learned, both from a technical standpoint, but also from a standpoint of how do you collaborate on such a big project in computational science? It’s really one of the few mega-projects we’ve seen in computational science. And so what are the lessons that we’ve learned that we hope will be helpful, not just for projects that are this large (which, truth be told, there aren’t many that large, right?) but for small to medium-sized projects. And then also the lessons on project management. You know, we’ve done a lot of work in applying formal project mechanisms to a research, development and implementation project. And there have been a lot of interesting lessons that we’ve learned about the value of some of those practices in the environment for computational science. So we’re working on a book, and we’re working on a series of high-level communications aimed primarily at a non-technical audience. There is also a podcast series in the works.

Trader: Regarding the key performance parameters (KPPs), will you only need to run the KPPs on Frontier (ORNL) to successfully conclude the ECP, or is there a plan to include Aurora (ANL) and El Capitan (LLNL)?

Diachin: We will accept any KPP submissions from our technical teams between now and the end of the technical work. So, as we know, Aurora is coming; we’ll be getting early access here this summer, with more comprehensive access planned for the fall. So where there’s an overlap with our teams, we’re definitely giving the highest priority to teams to go onto Aurora. As Doug [Kothe] has said, we don’t count on Aurora for the success of ECP, but some of our software technology teams and application teams will be able and are strongly encouraged to operate in that environment, to demonstrate performance portability across multiple architectures, and to demonstrate their challenge problems and their science on the Aurora system as well.

Trader: Are you already running some of the ECP applications on Sunspot (a mini, 2-rack, 128-node version of Aurora) and Aurora itself?

Diachin: All ECP teams have access to Sunspot. So I would say that a very large percentage, if not all, of our teams have gone onto Sunspot and are using it to work through every problem they can. This is a much smaller-scale system, but it is the [same] hardware, and therefore they are working through the software and any differences that emerge between Aurora and Frontier with respect to the software stack, GPU performance, etc. So all of our teams are working on it.

Trader: What has impressed you the most, having been part of ECP in a leadership role since 2018, nearly five full years? What did you find particularly meaningful or satisfying?

Diachin: I think one of the things that we as a leadership team have found very satisfying is how much progress we can make as a community when we have a large-scale funded effort where we collaborate across software technology and application teams. And we’re really able to bring all of these together in a sustained way over a period of many years. So that really gives us time to take those advances that are happening in software technologies, like advanced math libraries and visualization techniques and data science, and really see them start to bear fruit in applications, and it gives time for the application teams to provide feedback, and for that iteration and co-design process between software technologies and applications to really work. And so that’s one of the things that I’ve personally found most satisfying: we’re seeing that at scale with ECP. And there have been programs that have addressed this in the past, like SciDAC, which I was a part of before I joined ECP, that have very similar motivations, but just the scale and ability to back it up has been remarkable.

Trader: So given the successes, achievements and progress you’ve seen at ECP, it seems like leading the Exascale Computing Project is a natural extension of your career trajectory.

Diachin: Ah, definitely. So I’ve been a part of the DOE family for 30 years and have worked primarily with ASCR for that time. I was at Argonne in the mathematics and computer science division. You know Rick Stevens. I was in that division when he was the division head. And all that time, I’ve been very interested in these collaborative projects. I was one of the first PIs in the SciDAC program in 2000. And I worked as a PI in a leadership role on projects in SciDAC until I became the deputy here, and I was also involved in the HPC4 program, HPC for Energy Innovation.

Trader: That’s where we first met. I believe it was an HPC4Manufacturing meeting in San Diego a few years ago.

Diachin: Yes. So seeing those connections between my education (I’m a mathematician) and what we can do with the numerical methods and the software technologies that we develop, and the impact that can have: it’s something I’ve been interested in and have been working towards, you know, my entire career. And ECP has been particularly satisfying in this regard.

In a statement released by LLNL, Diachin credited the two ECP directors who preceded her. “[Kothe’s] leadership in ECP’s application development portfolio, and subsequently leadership of the project as a whole, have positioned this one-of-a-kind project to be a huge success,” said Diachin. “The DOE community owes him and the original ECP director who steered this project from concept to reality, Paul Messina, a great debt of gratitude for their leadership and service.”

Diachin further noted that in the project’s history, ECP has engaged more than 1,000 researchers in developing and documenting next-generation computational tools and applications, which will pay dividends for DOE and the nation for many years to come.

ORNL’s Ashley Barker will serve as Diachin’s deputy.

For more details, see the official announcement from LLNL.
