1
Pokhilko P, Yeh CN, Morales MA, Zgid D. Tensor hypercontraction for fully self-consistent imaginary-time GF2 and GWSOX methods: Theory, implementation, and role of the Green's function second-order exchange for intermolecular interactions. J Chem Phys 2024; 161:084108. [PMID: 39185845 DOI: 10.1063/5.0215954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Accepted: 08/05/2024] [Indexed: 08/27/2024] Open
Abstract
We present an efficient MPI-parallel algorithm and its implementation for evaluating the self-consistent correlated second-order exchange term (SOX), which is employed as a correction to the fully self-consistent GW scheme called scGWSOX (GW plus the SOX term iterated to achieve full Green's function self-consistency). Due to the application of the tensor hypercontraction (THC) in our computational procedure, the scaling of the evaluation of scGWSOX is reduced from O(n_τ n_AO^5) to O(n_τ N^2 n_AO^2). This fully MPI-parallel and THC-adapted approach enabled us to conduct the largest fully self-consistent scGWSOX calculations with over 1100 atomic orbitals with only negligible errors attributed to THC fitting. Utilizing our THC implementation for scGW, scGF2, and scGWSOX, we evaluated energies of intermolecular interactions. This approach allowed us to circumvent issues related to reference dependence and ambiguity in energy evaluation, which are common challenges in non-self-consistent calculations. We demonstrate that scGW exhibits a slight overbinding tendency for large systems, contrary to the underbinding observed with non-self-consistent RPA. Conversely, scGWSOX exhibits a slight underbinding tendency for such systems. This behavior is both physical and systematic and is caused by exclusion-principle violating diagrams or corresponding corrections. Our analysis elucidates the role played by these different diagrams, which is crucial for the construction of rigorous, accurate, and systematic methods. Finally, we explicitly show that all perturbative fully self-consistent Green's function methods are size-extensive and size-consistent.
Affiliation(s)
- Pavel Pokhilko
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
- Chia-Nan Yeh
  - Center for Computational Quantum Physics, Flatiron Institute, New York, New York 10010, USA
- Miguel A Morales
  - Center for Computational Quantum Physics, Flatiron Institute, New York, New York 10010, USA
- Dominika Zgid
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Physics, University of Michigan, Ann Arbor, Michigan 48109, USA
2
Nagy PR. State-of-the-art local correlation methods enable affordable gold standard quantum chemistry for up to hundreds of atoms. Chem Sci 2024:d4sc04755a. [PMID: 39246365 PMCID: PMC11376132 DOI: 10.1039/d4sc04755a] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/27/2024] [Accepted: 07/30/2024] [Indexed: 09/10/2024] Open
Abstract
In this feature, we review the current capabilities of local electron correlation methods up to the coupled cluster model with single, double, and perturbative triple excitations [CCSD(T)], which is a gold standard in quantum chemistry. The main computational aspects of the local method types are assessed from the perspective of applications, but the focus is kept on how to achieve chemical accuracy (i.e., <1 kcal mol⁻¹ uncertainty), as well as on the broad scope of chemical problems made accessible. The performance of state-of-the-art methods is also compared, including the most employed DLPNO and, in particular, our local natural orbital (LNO) CCSD(T) approach. The high accuracy and efficiency of the LNO method makes chemically accurate CCSD(T) computations accessible for molecules of hundreds of atoms with resources affordable to a broad computational community (days on a single CPU and 10-100 GB of memory). Recent developments in LNO-CCSD(T) enable systematic convergence and robust error estimates even for systems of complicated electronic structure or larger size (up to 1000 atoms). The predictive power of current local CCSD(T) methods, usually at about 1-2 order of magnitude higher cost than hybrid density functional theory (DFT), has become outstanding on the palette of computational chemistry applicable for molecules of practical interest. We also review more than 50 LNO-based and other advanced local-CCSD(T) applications for realistic, large systems across molecular interactions as well as main group, transition metal, bio-, and surface chemistry. The examples show that properly executed local-CCSD(T) can contribute to binding, reaction equilibrium, rate constants, etc. which are able to match measurements within the error estimates. These applications demonstrate that modern, open-access, and broadly affordable local methods, such as LNO-CCSD(T), already enable predictive computations and atomistic insight for complicated, real-life molecular processes in realistic environments.
Affiliation(s)
- Péter R Nagy
  - Department of Physical Chemistry and Materials Science, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
  - HUN-REN-BME Quantum Chemistry Research Group, Műegyetem rkp. 3., H-1111 Budapest, Hungary
  - MTA-BME Lendület Quantum Chemistry Research Group, Műegyetem rkp. 3., H-1111 Budapest, Hungary
3
Tehrani A, Richer M, Heidar-Zadeh F. CuGBasis: High-performance CUDA/Python library for efficient computation of quantum chemistry density-based descriptors for larger systems. J Chem Phys 2024; 161:072501. [PMID: 39158048 DOI: 10.1063/5.0216781] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/01/2024] [Accepted: 06/17/2024] [Indexed: 08/20/2024] Open
Abstract
CuGBasis is a free and open-source CUDA®/Python library for efficient computation of scalar, vector, and matrix quantities crucial for the post-processing of electronic structure calculations. CuGBasis integrates high-performance Graphical Processing Unit (GPU) computing with the ease and flexibility of Python programming, making it compatible with a vast ecosystem of libraries. We showcase its utility as a Python library and demonstrate its seamless interoperability with existing Python software to gain chemical insight from quantum chemistry calculations. Leveraging GPU-accelerated code, cuGBasis exhibits remarkable performance, making it highly applicable to larger systems or large databases. Our benchmarks reveal a 100-fold performance gain compared to alternative software packages, including serial/multi-threaded Central Processing Unit and GPU implementations. This paper outlines various features and computational strategies that lead to cuGBasis's enhanced performance, guiding developers of GPU-accelerated code.
Affiliation(s)
- Alireza Tehrani
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
- Michelle Richer
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
- Farnaz Heidar-Zadeh
  - Department of Chemistry, Queen's University, Kingston, Ontario K7L-3N6, Canada
4
Di Felice R, Mayes ML, Richard RM, Williams-Young DB, Chan GKL, de Jong WA, Govind N, Head-Gordon M, Hermes MR, Kowalski K, Li X, Lischka H, Mueller KT, Mutlu E, Niklasson AMN, Pederson MR, Peng B, Shepard R, Valeev EF, van Schilfgaarde M, Vlaisavljevich B, Windus TL, Xantheas SS, Zhang X, Zimmerman PM. A Perspective on Sustainable Computational Chemistry Software Development and Integration. J Chem Theory Comput 2023; 19:7056-7076. [PMID: 37769271 PMCID: PMC10601486 DOI: 10.1021/acs.jctc.3c00419] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 04/14/2023] [Indexed: 09/30/2023]
Abstract
The power of quantum chemistry to predict the ground and excited state properties of complex chemical systems has driven the development of computational quantum chemistry software, integrating advances in theory, applied mathematics, and computer science. The emergence of new computational paradigms associated with exascale technologies also poses significant challenges that require a flexible forward strategy to take full advantage of existing and forthcoming computational resources. In this context, the sustainability and interoperability of computational chemistry software development are among the most pressing issues. In this perspective, we discuss software infrastructure needs and investments with an eye to fully utilize exascale resources and provide unique computational tools for next-generation science problems and scientific discoveries.
Affiliation(s)
- Rosa Di Felice
  - Departments of Physics and Astronomy and Quantitative and Computational Biology, University of Southern California, Los Angeles, California 90089, United States
  - CNR-NANO Modena, Modena 41125, Italy
- Maricris L. Mayes
  - Department of Chemistry and Biochemistry, University of Massachusetts Dartmouth, North Dartmouth, Massachusetts 02747, United States
- Garnet Kin-Lic Chan
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Wibe A. de Jong
  - Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Niranjan Govind
  - Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Martin Head-Gordon
  - Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States
- Matthew R. Hermes
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Karol Kowalski
  - Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Xiaosong Li
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Hans Lischka
  - Department of Chemistry and Biochemistry, Texas Tech University, Lubbock, Texas 79409, United States
- Karl T. Mueller
  - Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Erdal Mutlu
  - Advanced Computing, Mathematics, and Data Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Anders M. N. Niklasson
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Mark R. Pederson
  - Department of Physics, The University of Texas at El Paso, El Paso, Texas 79968, United States
- Bo Peng
  - Physical Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Ron Shepard
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Edward F. Valeev
  - Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
- Bess Vlaisavljevich
  - Department of Chemistry, University of South Dakota, Vermillion, South Dakota 57069, United States
- Theresa L. Windus
  - Department of Chemistry, Iowa State University and Ames Laboratory, Ames, Iowa 50011, United States
- Sotiris S. Xantheas
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
  - Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Xing Zhang
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Paul M. Zimmerman
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
5
Manathunga M, Aktulga HM, Götz AW, Merz KM. Quantum Mechanics/Molecular Mechanics Simulations on NVIDIA and AMD Graphics Processing Units. J Chem Inf Model 2023; 63:711-717. [PMID: 36720086 DOI: 10.1021/acs.jcim.2c01505] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Indexed: 02/02/2023]
Abstract
We have ported and optimized the graphics processing unit (GPU)-accelerated QUICK and AMBER-based ab initio quantum mechanics/molecular mechanics (QM/MM) implementation on AMD GPUs. This encompasses the entire Fock matrix build and force calculation in QUICK including one-electron integrals, two-electron repulsion integrals, exchange-correlation quadrature, and linear algebra operations. General performance improvements to the QUICK GPU code are also presented. Benchmarks carried out on NVIDIA V100 and AMD MI100 cards display similar performance on both hardware for standalone HF/DFT calculations with QUICK and QM/MM molecular dynamics simulations with QUICK/AMBER. Furthermore, with respect to the QUICK/AMBER release version 21, significant speedups are observed for QM/MM molecular dynamics simulations. This significantly increases the range of scientific problems that can be addressed with open-source QM/MM software on state-of-the-art computer hardware.
Affiliation(s)
- Madushanka Manathunga
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1322, United States
- Hasan Metin Aktulga
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan 48824-1322, United States
- Andreas W Götz
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093-0505, United States
- Kenneth M Merz
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1322, United States
6
Guo M, Wang Z, Lu Y, Wang F. Energy correction and analytic energy gradients due to triples in CCSD(T) with spin–orbit coupling on graphic processing units using single-precision data. Mol Phys 2021. [DOI: 10.1080/00268976.2021.1974591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Minggang Guo
  - Institute of Atomic and Molecular Physics, Key Laboratory of High Energy Density Physics and Technology, Ministry of Education, Sichuan University, Chengdu, People’s Republic of China
- Zhifan Wang
  - College of Chemistry and Life Science, Chengdu Normal University, Chengdu, People’s Republic of China
  - School of Electronic Engineering, Chengdu Technological University, Chengdu, People’s Republic of China
- Yanzhao Lu
  - Institute of Atomic and Molecular Physics, Key Laboratory of High Energy Density Physics and Technology, Ministry of Education, Sichuan University, Chengdu, People’s Republic of China
- Fan Wang
  - Institute of Atomic and Molecular Physics, Key Laboratory of High Energy Density Physics and Technology, Ministry of Education, Sichuan University, Chengdu, People’s Republic of China
7
Hohenstein EG, Martínez TJ. GPU acceleration of rank-reduced coupled-cluster singles and doubles. J Chem Phys 2021; 155:184110. [PMID: 34773962 DOI: 10.1063/5.0063467] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022] Open
Abstract
We have developed a graphical processing unit (GPU) accelerated implementation of our recently introduced rank-reduced coupled-cluster singles and doubles (RR-CCSD) method. RR-CCSD introduces a low-rank approximation of the doubles amplitudes. This is combined with a low-rank approximation of the electron repulsion integrals via Cholesky decomposition. The result of these two low-rank approximations is the replacement of the usual fourth-order CCSD tensors with products of second- and third-order tensors. In our implementation, only a single fourth-order tensor must be constructed as an intermediate during the solution of the amplitude equations. Owing in large part to the compression of the doubles amplitudes, the GPU-accelerated implementation shows excellent parallel efficiency (95% on eight GPUs). Our implementation can solve the RR-CCSD equations for up to 400 electrons and 1550 basis functions, roughly 50% larger than the largest canonical CCSD computations that have been performed on any hardware. In addition to increased scalability, the RR-CCSD computations are faster than the corresponding CCSD computations for all but the smallest molecules. We test the accuracy of RR-CCSD for a variety of chemical systems including up to 1000 basis functions and determine that accuracy to better than 0.1% error in the correlation energy can be achieved with roughly 95% compression of the ov space for the largest systems considered. We also demonstrate that conformational energies can be predicted to be within 0.1 kcal mol⁻¹ with efficient compression applied to the wavefunction. Finally, we find that low-rank approximations of the CCSD doubles amplitudes used in the similarity transformation of the Hamiltonian prior to a conventional equation-of-motion CCSD computation will not introduce significant errors (on the order of a few hundredths of an electronvolt) into the resulting excitation energies.
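Editorial illustration: the abstract above relies on a low-rank approximation of the electron repulsion integrals via Cholesky decomposition. The minimal Python sketch below applies a pivoted (incomplete) Cholesky factorization to a small positive semidefinite toy matrix that merely stands in for the integral matrix. The function name `pivoted_cholesky`, the tolerance, and the toy data are assumptions made for this illustration; none of it is taken from the paper or its GPU implementation.

```python
def pivoted_cholesky(matrix, tol=1e-10):
    """Return vectors v_k such that sum_k v_k[i] * v_k[j] approximates matrix[i][j].

    `matrix` must be symmetric positive semidefinite (list of lists).
    The loop stops once the largest residual diagonal element is <= tol,
    so the number of vectors is the numerical rank at that tolerance.
    """
    n = len(matrix)
    diag = [matrix[i][i] for i in range(n)]  # residual diagonal
    vectors = []
    while len(vectors) < n:
        pivot = max(range(n), key=lambda i: diag[i])
        if diag[pivot] <= tol:
            break
        scale = diag[pivot] ** 0.5
        # Residual column at the pivot, after removing earlier vectors.
        vec = [
            (matrix[i][pivot] - sum(v[i] * v[pivot] for v in vectors)) / scale
            for i in range(n)
        ]
        vectors.append(vec)
        for i in range(n):
            diag[i] -= vec[i] ** 2
    return vectors


# Toy rank-2 matrix (hypothetical data): M = a a^T + b b^T.
a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 0.0, -1.0, 0.0]
toy = [[a[i] * a[j] + b[i] * b[j] for j in range(4)] for i in range(4)]

vectors = pivoted_cholesky(toy)
max_error = max(
    abs(toy[i][j] - sum(v[i] * v[j] for v in vectors))
    for i in range(4)
    for j in range(4)
)
print(len(vectors), max_error)
```

In this toy case the 4 x 4 matrix is stored as a handful of length-4 vectors, which mirrors in miniature how a fourth-order integral tensor can be replaced by a product of third-order ones.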
Affiliation(s)
- Edward G Hohenstein
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, USA
- Todd J Martínez
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, USA
8
Manathunga M, Jin C, Cruzeiro VWD, Miao Y, Mu D, Arumugam K, Keipert K, Aktulga HM, Merz KM, Götz AW. Harnessing the Power of Multi-GPU Acceleration into the Quantum Interaction Computational Kernel Program. J Chem Theory Comput 2021; 17:3955-3966. [PMID: 34062061 DOI: 10.1021/acs.jctc.1c00145] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
We report a new multi-GPU capable ab initio Hartree-Fock/density functional theory implementation integrated into the open source QUantum Interaction Computational Kernel (QUICK) program. Details on the load balancing algorithms for electron repulsion integrals and exchange correlation quadrature across multiple GPUs are described. Benchmarking studies carried out on up to four GPU nodes, each containing four NVIDIA V100-SXM2 type GPUs demonstrate that our implementation is capable of achieving excellent load balancing and high parallel efficiency. For representative medium to large size protein/organic molecular systems, the observed parallel efficiencies remained above 82% for the Kohn-Sham matrix formation and above 90% for nuclear gradient calculations. The accelerations on NVIDIA A100, P100, and K80 platforms also have realized parallel efficiencies higher than 68% in all tested cases, paving the way for large-scale ab initio electronic structure calculations with QUICK.
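Editorial illustration: parallel efficiency, as quoted in the abstract above, is conventionally the measured speedup divided by the ideal speedup. The short Python sketch below shows that arithmetic. The function name `parallel_efficiency` and the timings are hypothetical and are not benchmark data from the paper.

```python
def parallel_efficiency(t_ref, n_ref, t_par, n_par):
    """Efficiency of a run on n_par devices relative to a run on n_ref devices.

    t_ref and t_par are wall times in the same unit.
    A value of 1.0 means ideal (linear) scaling.
    """
    speedup = t_ref / t_par
    ideal_speedup = n_par / n_ref
    return speedup / ideal_speedup


# Hypothetical timings in seconds: 1 GPU versus 8 GPUs.
efficiency = parallel_efficiency(t_ref=800.0, n_ref=1, t_par=125.0, n_par=8)
print(f"parallel efficiency: {efficiency:.0%}")
```

Using a multi-device run as the reference (for example `n_ref=4`) gives the relative efficiency often reported when a single-device run is impractical.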
Affiliation(s)
- Madushanka Manathunga
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824-1322, United States
- Chi Jin
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824-1322, United States
- Vinícius Wilian D Cruzeiro
  - San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093-0505, United States
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Yipu Miao
  - Facebook, 1 Hacker Way, Menlo Park, California 94025, United States
- Dawei Mu
  - National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, 1205 W Clark Street, Urbana, Illinois 61801, United States
- Kamesh Arumugam
  - NVIDIA Corporation, Santa Clara, California 95051, United States
- Hasan Metin Aktulga
  - Department of Computer Science and Engineering, Michigan State University, 428 S. Shaw Lane, East Lansing, Michigan 48824-1322, United States
- Kenneth M Merz
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824-1322, United States
- Andreas W Götz
  - San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093-0505, United States
9
Laqua H, Kussmann J, Ochsenfeld C. Accelerating seminumerical Fock-exchange calculations using mixed single- and double-precision arithmethic. J Chem Phys 2021; 154:214116. [PMID: 34240990 DOI: 10.1063/5.0045084] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022] Open
Abstract
We investigate the applicability of single-precision (fp32) floating point operations within our linear-scaling, seminumerical exchange method sn-LinK [Laqua et al., J. Chem. Theory Comput. 16, 1456 (2020)] and find that the vast majority of the three-center-one-electron (3c1e) integrals can be computed with reduced numerical precision with virtually no loss in overall accuracy. This leads to a near doubling in performance on central processing units (CPUs) compared to pure fp64 evaluation. Since the cost of evaluating the 3c1e integrals is less significant on graphic processing units (GPUs) compared to CPU, the performance gains from accelerating 3c1e integrals alone is less impressive on GPUs. Therefore, we also investigate the possibility of employing only fp32 operations to evaluate the exchange matrix within the self-consistent-field (SCF) followed by an accurate one-shot evaluation of the exchange energy using mixed fp32/fp64 precision. This still provides very accurate (1.8 µE_h maximal error) results while providing a sevenfold speedup on a typical "gaming" GPU (GTX 1080Ti). We also propose the use of incremental exchange-builds to further reduce these errors. The proposed SCF scheme (i-sn-LinK) requires only one mixed-precision exchange matrix calculation, while all other exchange-matrix builds are performed with only fp32 operations. Compared to pure fp64 evaluation, this leads to 4-7× speedups for the whole SCF procedure without any significant deterioration of the results or the convergence behavior.
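Editorial illustration: the abstract above combines one accurate build with fp32-only builds on later iterations. The Python sketch below shows one plausible reading of the incremental idea for a generic linear map: because the map is linear, a result can be updated by applying the low-precision kernel only to the small change in the input, so the rounding error scales with the size of that change. This is an assumption-laden toy and not the authors' sn-LinK algorithm; the matrix, the inputs, and the helper names (`to_fp32`, `matvec_fp32`, `matvec_fp64`) are hypothetical, and fp32 arithmetic is emulated by rounding through the standard `struct` module.

```python
import struct


def to_fp32(x):
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def matvec_fp64(a, x):
    """Reference matrix-vector product in ordinary double precision."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def matvec_fp32(a, x):
    """Matrix-vector product with inputs, products, and partial sums rounded to fp32."""
    out = []
    for row in a:
        acc = 0.0
        for a_ij, x_j in zip(row, x):
            prod = to_fp32(to_fp32(a_ij) * to_fp32(x_j))
            acc = to_fp32(acc + prod)
        out.append(acc)
    return out


# Hypothetical linear map and a converging sequence of input updates.
a = [[0.9, -0.3, 0.1], [-0.3, 0.7, 0.2], [0.1, 0.2, 0.5]]
p0 = [1.0, 0.5, -0.25]
updates = [[1.3e-2, -0.7e-2, 0.4e-2], [1.1e-4, 0.6e-4, -0.9e-4]]

# One high-precision build, then low-precision builds on the small updates only.
k_incremental = matvec_fp64(a, p0)
p_final = list(p0)
for dp in updates:
    dk = matvec_fp32(a, dp)
    k_incremental = [k + d for k, d in zip(k_incremental, dk)]
    p_final = [p + d for p, d in zip(p_final, dp)]

k_reference = matvec_fp64(a, p_final)
k_direct_fp32 = matvec_fp32(a, p_final)

err_incremental = max(abs(k - r) for k, r in zip(k_incremental, k_reference))
err_direct = max(abs(k - r) for k, r in zip(k_direct_fp32, k_reference))
print(err_incremental, err_direct)
```

Since fp32 rounding error is relative to the magnitude of the operands, one would expect the incremental result to track the fp64 reference more closely than a single fp32 build on the full input, although the exact figures depend on the data.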
Affiliation(s)
- Henryk Laqua
  - Department of Chemistry, Chair of Theoretical Chemistry, University of Munich (LMU), D-81377 München, Germany
- Jörg Kussmann
  - Department of Chemistry, Chair of Theoretical Chemistry, University of Munich (LMU), D-81377 München, Germany
- Christian Ochsenfeld
  - Department of Chemistry, Chair of Theoretical Chemistry, University of Munich (LMU), D-81377 München, Germany
10
Gyevi-Nagy L, Kállay M, Nagy PR. Accurate Reduced-Cost CCSD(T) Energies: Parallel Implementation, Benchmarks, and Large-Scale Applications. J Chem Theory Comput 2021; 17:860-878. [PMID: 33400527 PMCID: PMC7884001 DOI: 10.1021/acs.jctc.0c01077] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 10/13/2020] [Indexed: 11/28/2022]
Abstract
The accurate and systematically improvable frozen natural orbital (FNO) and natural auxiliary function (NAF) cost-reducing approaches are combined with our recent coupled-cluster singles, doubles, and perturbative triples [CCSD(T)] implementations. Both of the closed- and open-shell FNO-CCSD(T) codes benefit from OpenMP parallelism, completely or partially integral-direct density-fitting algorithms, checkpointing, and hand-optimized, memory- and operation count effective implementations exploiting all permutational symmetries. The closed-shell CCSD(T) code requires negligible disk I/O and network bandwidth, is MPI/OpenMP parallel, and exhibits outstanding peak performance utilization of 50-70% up to hundreds of cores. Conservative FNO and NAF truncation thresholds benchmarked for challenging reaction, atomization, and ionization energies of both closed- and open-shell species are shown to maintain 1 kJ/mol accuracy against canonical CCSD(T) for systems of 31-43 atoms even with large basis sets. The cost reduction of up to an order of magnitude achieved extends the reach of FNO-CCSD(T) to systems of 50-75 atoms (up to 2124 atomic orbitals) with triple- and quadruple-ζ basis sets, which is unprecedented without local approximations. Consequently, a considerably larger portion of the chemical compound space can now be covered by the practically "gold standard" quality FNO-CCSD(T) method using affordable resources and about a week of wall time. Large-scale applications are presented for organocatalytic and transition-metal reactions as well as noncovalent interactions. Possible applications for benchmarking local CCSD(T) methods, as well as for the accuracy assessment or parametrization of less complete models, for example, density functional approximations or machine learning potentials, are also outlined.
Affiliation(s)
- László Gyevi-Nagy
  - Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
- Mihály Kállay
  - Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
- Péter R. Nagy
  - Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
11
Calvin JA, Peng C, Rishi V, Kumar A, Valeev EF. Many-Body Quantum Chemistry on Massively Parallel Computers. Chem Rev 2020; 121:1203-1231. [DOI: 10.1021/acs.chemrev.0c00006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/09/2023]
Affiliation(s)
- Justus A. Calvin
  - Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
- Chong Peng
  - Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
- Varun Rishi
  - Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
- Ashutosh Kumar
  - Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
- Edward F. Valeev
  - Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
12
Fales BS, Curtis ER, Johnson KG, Lahana D, Seritan S, Wang Y, Weir H, Martínez TJ, Hohenstein EG. Performance of Coupled-Cluster Singles and Doubles on Modern Stream Processing Architectures. J Chem Theory Comput 2020; 16:4021-4028. [PMID: 32567305 DOI: 10.1021/acs.jctc.0c00336] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/29/2022]
Abstract
We develop a new implementation of coupled-cluster singles and doubles (CCSD) optimized for the most recent graphical processing unit (GPU) hardware. We find that a single node with 8 NVIDIA V100 GPUs is capable of performing CCSD computations on roughly 100 atoms and 1300 basis functions in less than 1 day. Comparisons against massively parallel implementations of CCSD suggest that more than 64 CPU-based nodes (each with 16 cores) are required to match this performance.
Affiliation(s)
- B Scott Fales
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- Ethan R Curtis
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- K Grace Johnson
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- Dean Lahana
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- Stefan Seritan
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- Yuanheng Wang
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- Hayley Weir
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- Todd J Martínez
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
- Edward G Hohenstein
  - Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, United States
13
Wang Z, Guo M, Wang F. Single-precision open-shell CCSD and CCSD(T) calculations on graphics processing units. Phys Chem Chem Phys 2020; 22:25103-25111. [DOI: 10.1039/d0cp03800h] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
It has been shown that coupled-cluster calculations with single-precision data are able to provide correlation energy with insignificant loss of accuracy.
Affiliation(s)
- Zhifan Wang
  - College of Chemistry and Life Science/Sichuan Provincial Key Laboratory for Structural Optimization and Application of Functional Molecules, Chengdu Normal University, Chengdu, P. R. China
- Minggang Guo
  - Institute of Atomic and Molecular Physics, Sichuan University, Chengdu, P. R. China
- Fan Wang
  - Institute of Atomic and Molecular Physics, Sichuan University, Chengdu, P. R. China
14
Yoshikawa T, Komoto N, Nishimura Y, Nakai H. GPU-Accelerated Large-Scale Excited-State Simulation Based on Divide-and-Conquer Time-Dependent Density-Functional Tight-Binding. J Comput Chem 2019; 40:2778-2786. [PMID: 31441083 DOI: 10.1002/jcc.26053] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/18/2019] [Revised: 08/04/2019] [Accepted: 08/07/2019] [Indexed: 01/09/2023]
Abstract
The present study implemented the divide-and-conquer time-dependent density-functional tight-binding (DC-TDDFTB) code on a graphical processing unit (GPU). The DC method, which is a linear-scaling scheme, divides a total system into several fragments. By separately solving local equations in individual fragments, the DC method could reduce slow central processing unit (CPU)-GPU memory access, as well as computational cost, and avoid shortfalls of GPU memory. Numerical applications confirmed that the present code on GPU significantly accelerated the TDDFTB calculations, while maintaining accuracy. Furthermore, the DC-TDDFTB simulation of 2-acetylindan-1,3-dione displays excited-state intramolecular proton transfer and provides reasonable absorption and fluorescence energies with the corresponding experimental values. © 2019 Wiley Periodicals, Inc.
Affiliation(s)
- Takeshi Yoshikawa
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan
- Nana Komoto
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan
- Yoshifumi Nishimura
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan
- Hiromi Nakai
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan; Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan; Elements Strategy Initiative for Catalysts and Batteries (ESICB), Kyoto University, Katsura, Kyoto, 615-8520, Japan
|
15
|
Gyevi-Nagy L, Kállay M, Nagy PR. Integral-Direct and Parallel Implementation of the CCSD(T) Method: Algorithmic Developments and Large-Scale Applications. J Chem Theory Comput 2019; 16:366-384. [DOI: 10.1021/acs.jctc.9b00957] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/15/2022]
Affiliation(s)
- László Gyevi-Nagy
- Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
|
16
|
Nagy PR, Kállay M. Approaching the Basis Set Limit of CCSD(T) Energies for Large Molecules with Local Natural Orbital Coupled-Cluster Methods. J Chem Theory Comput 2019; 15:5275-5298. [DOI: 10.1021/acs.jctc.9b00511] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Indexed: 01/06/2023]
Affiliation(s)
- Péter R. Nagy
- Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
- Mihály Kállay
- Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
|
17
|
Pokhilko P, Epifanovsky E, Krylov AI. Double Precision Is Not Needed for Many-Body Calculations: Emergent Conventional Wisdom. J Chem Theory Comput 2018; 14:4088-4096. [DOI: 10.1021/acs.jctc.8b00321] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 11/29/2022]
Affiliation(s)
- Pavel Pokhilko
- Department of Chemistry, University of Southern California, Los Angeles, California 90089-0482, United States
- Evgeny Epifanovsky
- Q-Chem Inc., 6601 Owens Drive, Suite 105, Pleasanton, California 94588, United States
- Anna I. Krylov
- Department of Chemistry, University of Southern California, Los Angeles, California 90089-0482, United States
|
18
|
Nagy PR, Samu G, Kállay M. Optimization of the Linear-Scaling Local Natural Orbital CCSD(T) Method: Improved Algorithm and Benchmark Applications. J Chem Theory Comput 2018; 14:4193-4215. [DOI: 10.1021/acs.jctc.8b00442] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Indexed: 12/21/2022]
Affiliation(s)
- Péter R. Nagy
- MTA-BME Lendület Quantum Chemistry Research Group, Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
- Gyula Samu
- MTA-BME Lendület Quantum Chemistry Research Group, Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
- Mihály Kállay
- MTA-BME Lendület Quantum Chemistry Research Group, Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
|
19
|
Nagy PR, Kállay M. Optimization of the linear-scaling local natural orbital CCSD(T) method: Redundancy-free triples correction using Laplace transform. J Chem Phys 2017; 146:214106. [PMID: 28576082 PMCID: PMC5453808 DOI: 10.1063/1.4984322] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Received: 03/19/2017] [Accepted: 05/05/2017] [Indexed: 01/30/2023] Open
Abstract
An improved algorithm is presented for the evaluation of the (T) correction as a part of our local natural orbital (LNO) coupled-cluster singles and doubles with perturbative triples [LNO-CCSD(T)] scheme [Z. Rolik et al., J. Chem. Phys. 139, 094105 (2013)]. The new algorithm is an order of magnitude faster than our previous one and removes the bottleneck related to the calculation of the (T) contribution. First, a numerical Laplace transformed expression for the (T) fragment energy is introduced, which requires on average 3 to 4 times fewer floating point operations with negligible compromise in accuracy eliminating the redundancy among the evaluated triples amplitudes. Second, an additional speedup factor of 3 is achieved by the optimization of our canonical (T) algorithm, which is also executed in the local case. These developments can also be integrated into canonical as well as alternative fragmentation-based local CCSD(T) approaches with minor modifications. As it is demonstrated by our benchmark calculations, the evaluation of the new Laplace transformed (T) correction can always be performed if the preceding CCSD iterations are feasible, and the new scheme enables the computation of LNO-CCSD(T) correlation energies with at least triple-zeta quality basis sets for realistic three-dimensional molecules with more than 600 atoms and 12 000 basis functions in a matter of days on a single processor.
Affiliation(s)
- Péter R Nagy
- MTA-BME Lendület Quantum Chemistry Research Group, Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
- Mihály Kállay
- MTA-BME Lendület Quantum Chemistry Research Group, Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, P.O. Box 91, H-1521 Budapest, Hungary
|