1
Hicks CB, Martinez TJ. Massively scalable workflows for quantum chemistry: BigChem and ChemCloud. J Chem Phys 2024; 160:142501. [PMID: 38591672 DOI: 10.1063/5.0190834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 03/14/2024] [Indexed: 04/10/2024]
Abstract
Electronic structure theory, i.e., quantum chemistry, is the fundamental building block for many problems in computational chemistry. We present a new distributed computing framework (BigChem), which allows for an efficient solution of many quantum chemistry problems in parallel. BigChem is designed to be easily composable and leverages industry-standard middleware (e.g., Celery, RabbitMQ, and Redis) for distributed approaches to large-scale problems. BigChem can harness any collection of worker nodes, including ones on cloud providers (such as AWS or Azure), local clusters, or supercomputer centers (and any mixture of these). BigChem builds upon MolSSI packages, such as QCEngine, to standardize the operation of numerous computational chemistry programs, demonstrated here with Psi4, xtb, geomeTRIC, and TeraChem. BigChem delivers full utilization of compute resources at scale, offers a programmable canvas for designing sophisticated quantum chemistry workflows, and is fault tolerant to node failures and network disruptions. We demonstrate linear scalability of BigChem running computational chemistry workloads on up to 125 GPUs. Finally, we present ChemCloud, a web API to BigChem and successor to TeraChem Cloud. ChemCloud delivers scalable and secure access to BigChem over the Internet.
Affiliation(s)
- Colton B Hicks
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, USA and SLAC National Accelerator Laboratory, Menlo Park, California 94025, USA
- Todd J Martinez
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, USA and SLAC National Accelerator Laboratory, Menlo Park, California 94025, USA
2
Fortenberry RC. Quantum Chemistry and Astrochemistry: A Match Made in the Heavens. J Phys Chem A 2024; 128:1555-1565. [PMID: 38381079 DOI: 10.1021/acs.jpca.3c07601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024]
Abstract
Quantum chemistry can uniquely answer astrochemical questions that no other technique can provide. Computations can be parallelized, automated, and left to run continuously, providing exceptional molecular throughput that cannot be done through experimentation. Additionally, the granularity of the individual computations that are required of potential energy surfaces, reaction mechanism pathways, or other quantum chemically derived observables produces a unique mosaic that makes up the larger whole. These pieces can be dissected for their individual contributions or evaluated in an ad hoc fashion for each of their roles in generating the larger whole. No other scientific approach is capable of reporting such fine-grained insights. Quantum chemistry also works from a bottom-up approach in providing properties directly from the desired molecule instead of a top-down perspective as required of experiment where molecules have to be linked to observed phenomena. Furthermore, modern quantum chemistry is well within the range of "chemical accuracy" and is approaching "spectroscopic accuracy." As such, the seemingly difficult questions asked by astrochemistry that would not be asked initially for any other application require quantum chemical reference data. While the results of quantum chemical computations are needed to interpret astrochemical observation, modeling, or laboratory experimentation, such hard questions, regardless of the original need to answer them, produce unique solutions. While questions in astrochemistry often require novel developments in and implementations of quantum chemistry as outlined herein, the applications of these solutions will stretch beyond astrochemistry and may yet impact fields much closer to Earth.
Affiliation(s)
- Ryan C Fortenberry
  - Department of Chemistry & Biochemistry, University of Mississippi, Oxford, Mississippi 38677-1848, United States
3
Borges R, Colby SM, Das S, Edison AS, Fiehn O, Kind T, Lee J, Merrill AT, Merz KM, Metz TO, Nunez JR, Tantillo DJ, Wang LP, Wang S, Renslow RS. Quantum Chemistry Calculations for Metabolomics. Chem Rev 2021; 121:5633-5670. [PMID: 33979149 PMCID: PMC8161423 DOI: 10.1021/acs.chemrev.0c00901] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 08/24/2020] [Indexed: 02/07/2023]
Abstract
A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials ("standards"), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for "standards-free" identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices, sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.
Affiliation(s)
- Ricardo M. Borges
  - Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro 21941-901, Brazil
- Sean M. Colby
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Susanta Das
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Arthur S. Edison
  - Departments of Genetics and Biochemistry and Molecular Biology, Complex Carbohydrate Research Center and Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, United States
- Oliver Fiehn
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
- Tobias Kind
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
- Jesi Lee
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Amy T. Merrill
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Kenneth M. Merz
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Thomas O. Metz
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Jamie R. Nunez
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Dean J. Tantillo
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Lee-Ping Wang
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Shunyang Wang
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Ryan S. Renslow
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
4
Raucci U, Valentini A, Pieri E, Weir H, Seritan S, Martínez TJ. Voice-controlled quantum chemistry. Nature Computational Science 2021; 1:42-45. [PMID: 38217155 DOI: 10.1038/s43588-020-00012-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/07/2020] [Accepted: 12/03/2020] [Indexed: 01/15/2024]
Abstract
Over the past decade, artificial intelligence has been propelled forward by advances in machine learning algorithms and computational hardware, opening up myriads of new avenues for scientific research. Nevertheless, virtual assistants and voice control have yet to be widely used in the natural sciences. Here, we present ChemVox, an interactive Amazon Alexa skill that uses speech recognition to perform quantum chemistry calculations. This new application interfaces Alexa with cloud computing and returns the results through a capable device. ChemVox paves the way to making computational chemistry routinely accessible to the wider community.
Affiliation(s)
- Umberto Raucci
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, CA, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Alessio Valentini
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, CA, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Elisa Pieri
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, CA, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Hayley Weir
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, CA, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Stefan Seritan
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, CA, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Todd J Martínez
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, CA, USA
  - SLAC National Accelerator Laboratory, Menlo Park, CA, USA
5
Seritan S, Thompson K, Martínez TJ. TeraChem Cloud: A High-Performance Computing Service for Scalable Distributed GPU-Accelerated Electronic Structure Calculations. J Chem Inf Model 2020; 60:2126-2137. [PMID: 32267693 DOI: 10.1021/acs.jcim.9b01152] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/01/2023]
Abstract
The encapsulation and commoditization of electronic structure arise naturally as interoperability, and the use of nontraditional compute resources (e.g., new hardware accelerators, cloud computing) remains important for the computational chemistry community. We present TeraChem Cloud, a high-performance computing service (HPCS) that offers on-demand electronic structure calculations on both traditional HPC clusters and cloud-based hardware. The framework is designed using off-the-shelf web technologies and containerization to be extremely scalable and portable. Within the HPCS model, users can quickly develop new methods and algorithms in an interactive environment on their laptop while allowing TeraChem Cloud to distribute ab initio calculations across all available resources. This approach greatly increases the accessibility of hardware accelerators such as graphics processing units (GPUs) and flexibility for the development of new methods as additional electronic structure packages are integrated into the framework as alternative backends. Cost-performance analysis indicates that traditional nodes are the most cost-effective long-term solution, but commercial cloud providers offer cutting-edge hardware with competitive rates for short-term large-scale calculations. We demonstrate the power of the TeraChem Cloud framework by carrying out several showcase calculations, including the generation of 300,000 density functional theory energy and gradient evaluations on medium-sized organic molecules and reproducing 300 fs of nonadiabatic dynamics on the B800-B850 antenna complex in LH2, with the latter demonstration using over 50 Tesla V100 GPUs in a commercial cloud environment in 8 h for approximately $1250.
Affiliation(s)
- Stefan Seritan
  - Department of Chemistry and the PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, Menlo Park, California 94305, United States
- Keiran Thompson
  - Department of Chemistry and the PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, Menlo Park, California 94305, United States
- Todd J Martínez
  - Department of Chemistry and the PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, Menlo Park, California 94305, United States
6
Fortenberry RC, Thackston R, Francisco JS, Lee TJ. Toward the laboratory identification of the not-so-simple NS2 neutral and anion isomers. J Chem Phys 2017; 147:074303. [DOI: 10.1063/1.4985901] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022]
Affiliation(s)
- Ryan C. Fortenberry
  - Department of Chemistry and Biochemistry, Georgia Southern University, Statesboro, Georgia 30460-8064, USA
- Russell Thackston
  - Department of Information Technology, Georgia Southern University, Statesboro, Georgia 30460-8150, USA
- Joseph S. Francisco
  - Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska 68588, USA
- Timothy J. Lee
  - MS 245-1, NASA Ames Research Center, Moffett Field, California 94035-1000, USA
7
Abstract
Simulations in neuroscience are performed on local servers or High Performance Computing (HPC) facilities. Recently, cloud computing has emerged as a potential computational platform for neuroscience simulation. In this paper we compare and contrast HPC and cloud resources for scientific computation, then report how we deployed NEURON, a widely used simulator of neuronal activity, in three clouds: Chameleon Cloud, a hybrid private academic cloud for cloud technology research based on the OpenStack software; Rackspace, a public commercial cloud, also based on OpenStack; and Amazon Elastic Cloud Computing, based on Amazon's proprietary software. We describe the manual procedures and how to automate cloud operations. We describe extending our simulation automation software called NeuroManager (Stockton and Santamaria, Frontiers in Neuroinformatics, 2015), so that the user is capable of recruiting private cloud, public cloud, HPC, and local servers simultaneously with a simple common interface. We conclude by performing several studies in which we examine speedup, efficiency, total session time, and cost for sets of simulations of a published NEURON model.
Affiliation(s)
- David B Stockton
  - Department of Biomedical Engineering, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
- Fidel Santamaria
  - Department of Biology, The University of Texas at San Antonio, San Antonio, TX, 78249, USA
8
Affiliation(s)
- Ryan C. Fortenberry
  - Georgia Southern University, Department of Chemistry, Statesboro, Georgia 30460, United States
9
Fortenberry RC, McDonald AR, Shepherd TD, Kennedy M, Sherrill CD. PSI4Education: Computational Chemistry Labs Using Free Software. The Promise of Chemical Education: Addressing Our Students’ Needs 2015. [DOI: 10.1021/bk-2015-1193.ch007] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/13/2022]
Affiliation(s)
- Ryan C. Fortenberry
  - Department of Chemistry, Georgia Southern University, Statesboro, Georgia 30460-8064, United States
  - Department of Chemistry and Biochemistry, California Polytechnic State University, San Luis Obispo, California 93407, United States
  - Department of Chemistry, St. Edward’s University, Austin, Texas 78704, United States
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, 30332-0400, United States
- Ashley Ringer McDonald
  - Department of Chemistry, Georgia Southern University, Statesboro, Georgia 30460-8064, United States
  - Department of Chemistry and Biochemistry, California Polytechnic State University, San Luis Obispo, California 93407, United States
  - Department of Chemistry, St. Edward’s University, Austin, Texas 78704, United States
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, 30332-0400, United States
- Tricia D. Shepherd
  - Department of Chemistry, Georgia Southern University, Statesboro, Georgia 30460-8064, United States
  - Department of Chemistry and Biochemistry, California Polytechnic State University, San Luis Obispo, California 93407, United States
  - Department of Chemistry, St. Edward’s University, Austin, Texas 78704, United States
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, 30332-0400, United States
- Matthew Kennedy
  - Department of Chemistry, Georgia Southern University, Statesboro, Georgia 30460-8064, United States
  - Department of Chemistry and Biochemistry, California Polytechnic State University, San Luis Obispo, California 93407, United States
  - Department of Chemistry, St. Edward’s University, Austin, Texas 78704, United States
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, 30332-0400, United States
- C. David Sherrill
  - Department of Chemistry, Georgia Southern University, Statesboro, Georgia 30460-8064, United States
  - Department of Chemistry and Biochemistry, California Polytechnic State University, San Luis Obispo, California 93407, United States
  - Department of Chemistry, St. Edward’s University, Austin, Texas 78704, United States
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, 30332-0400, United States