1. Moberg DR, Jasper AW, Davis MJ. Parsimonious Potential Energy Surface Expansions Using Dictionary Learning with Multipass Greedy Selection. J Phys Chem Lett 2021; 12:9169-9174. [PMID: 34525799 DOI: 10.1021/acs.jpclett.1c02721] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/13/2023]
Abstract
Potential energy surfaces fit with basis set expansions have been shown to provide accurate representations of electronic energies and have enabled a variety of high-accuracy dynamics, kinetics, and spectroscopy applications. The number of terms in these expansions scales poorly with system size, a drawback that challenges their use for systems with more than ∼10 atoms. A solution is presented here using dictionary learning. Subsets of the full set of conventional basis functions are optimized using a newly developed multipass greedy regression method inspired by forward and backward selection methods from the statistics, signal processing, and machine learning literatures. The optimized representations have accuracies comparable to the full set but are 1 or more orders of magnitude smaller, and notably, the number of terms in the optimized multipass greedy expansions scales approximately linearly with the number of atoms.
Affiliation(s)
- Daniel R Moberg
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Ahren W Jasper
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Michael J Davis
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
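
The abstract of entry 1 describes selecting a small subset of basis functions from a larger dictionary by greedy regression. As a rough illustration only, the sketch below performs a single forward-selection pass on a toy one-dimensional polynomial dictionary. It is not the authors' multipass method (there is no backward pass and no repeated passes), and the names `solve_linear`, `fit_subset`, `forward_select`, `columns`, and `y` are hypothetical. It assumes the candidate columns are linearly independent, so no singular-matrix handling is included.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_subset(columns, y, subset):
    """Least-squares fit of y on the chosen columns via the normal equations.

    Returns (coefficients, sum of squared errors).
    """
    cols = [columns[j] for j in subset]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    coeffs = solve_linear(gram, rhs)
    pred = [sum(c * col[i] for c, col in zip(coeffs, cols)) for i in range(len(y))]
    sse = sum((p - t) ** 2 for p, t in zip(pred, y))
    return coeffs, sse


def forward_select(columns, y, n_terms):
    """Single greedy pass: repeatedly add the column that lowers the error most."""
    chosen = []
    for _ in range(n_terms):
        best = min(
            (j for j in range(len(columns)) if j not in chosen),
            key=lambda j: fit_subset(columns, y, chosen + [j])[1],
        )
        chosen.append(best)
    return chosen


# Toy dictionary: monomials x^0 .. x^4 sampled on a small grid (illustrative data).
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
columns = [[x ** p for x in xs] for p in range(5)]
y = [3.0 + 2.0 * x ** 2 for x in xs]  # target uses only x^0 and x^2

selected = forward_select(columns, y, 2)
coeffs, sse = fit_subset(columns, y, selected)
```

On this toy target, the pass picks the constant term first and then the quadratic term, so two of the five dictionary entries reproduce the data. A backward pass would revisit earlier choices and drop terms that later additions make redundant.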
2. Moberg DR, Jasper AW. Permutationally Invariant Polynomial Expansions with Unrestricted Complexity. J Chem Theory Comput 2021; 17:5440-5455. [PMID: 34469127 DOI: 10.1021/acs.jctc.1c00352] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
A general strategy is presented for constructing and validating permutationally invariant polynomial (PIP) expansions for chemical systems of any stoichiometry. Demonstrations are made for three categories of gas-phase dynamics and kinetics: collisional energy-transfer trajectories for predicting pressure-dependent kinetics, three-body collisions for describing transient van der Waals adducts relevant to atmospheric chemistry, and nonthermal reactivity via quasiclassical trajectories. In total, 30 systems are considered with up to 15 atoms and 39 degrees of freedom. Permutational invariance is enforced in PIP expansions with as many as 13 million terms and 13 permutationally distinct atom types by taking advantage of petascale computational resources. The quality of the PIP expansions is demonstrated through the systematic convergence of in-sample and out-of-sample errors with respect to both the number of training data and the order of the expansion, and these errors are shown to predict errors in the dynamics for both reactive and nonreactive applications. The parallelized code distributed as part of this work enables the automation of PIP generation for complex systems with multiple channels and flexible user-defined symmetry constraints and for automatically removing unphysical unconnected terms from the basis set expansions, all of which are required for simulating complex reactive systems.
Affiliation(s)
- Daniel R Moberg
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Ahren W Jasper
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
3. Guo H, Yang X, Zwier T. Virtual Issue on Combustion Chemistry. J Phys Chem A 2020; 124:5995-5996. [PMID: 32698590 DOI: 10.1021/acs.jpca.0c05674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
4. Liu D, Li L, Rostami-Hodjegan A, Bois FY, Jamei M. Considerations and Caveats when Applying Global Sensitivity Analysis Methods to Physiologically Based Pharmacokinetic Models. AAPS J 2020; 22:93. [PMID: 32681207 PMCID: PMC7367914 DOI: 10.1208/s12248-020-00480-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/21/2020] [Accepted: 07/07/2020] [Indexed: 02/06/2023]
Abstract
Three global sensitivity analysis (GSA) methods (Morris, Sobol, and extended Sobol) are applied to a minimal physiologically based PK (mPBPK) model using three model drugs given orally, namely quinidine, alprazolam, and midazolam. We investigated how correlations among input parameters affect the determination of the key parameters influencing pharmacokinetic (PK) properties of general interest, i.e., the maximal plasma concentration (Cmax), the time at which Cmax is reached (Tmax), and the area under the plasma concentration curve (AUC). The influential parameters determined by the Morris and Sobol methods (suitable for independent model parameters) were compared to those determined by the extended Sobol method (which considers model parameter correlations). For the three drugs investigated, the Morris method was as informative as the Sobol method. The extended Sobol method identified different sets of influential parameters than Morris and Sobol. The Morris and Sobol methods overestimated the influence of volume of distribution at steady state (Vss) on AUC24h for quinidine and alprazolam. They also underestimated the effect of volume of liver (Vliver) for all three drugs, the impact of enzyme intrinsic clearance of CYP2C9 and CYP2E1 for quinidine, and that of UGT1A4 abundance for midazolam. Our investigation showed that the interpretation of GSA results is not straightforward. When existing model parameter correlations are dismissed, GSA methods such as Morris and Sobol can lead to biased determination of the key parameters for the selected outputs of interest. Decisions regarding parameters' influence (or otherwise) should be made in light of available knowledge, including the model assumptions, GSA method limitations, and inter-correlations between model parameters, particularly in complex models.
Affiliation(s)
- Dan Liu
  - Simcyp Division, Certara UK Limited, Level 2-Acero, 1 Concourse Way, Sheffield, S1 2BJ, UK
- Linzhong Li
  - Simcyp Division, Certara UK Limited, Level 2-Acero, 1 Concourse Way, Sheffield, S1 2BJ, UK
- Amin Rostami-Hodjegan
  - Simcyp Division, Certara UK Limited, Level 2-Acero, 1 Concourse Way, Sheffield, S1 2BJ, UK
- Frederic Y Bois
  - Simcyp Division, Certara UK Limited, Level 2-Acero, 1 Concourse Way, Sheffield, S1 2BJ, UK
- Masoud Jamei
  - Simcyp Division, Certara UK Limited, Level 2-Acero, 1 Concourse Way, Sheffield, S1 2BJ, UK
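
The abstract of entry 4 notes that the Morris method is suited to independent model parameters. To make that concrete, here is a minimal sketch of a Morris-style screening measure (the mean absolute elementary effect, often written mu*). It uses a simplified radial one-at-a-time design rather than the trajectory design of the standard method, assumes every input is independent and uniform on [0, 1], and is not taken from the paper. The names `morris_mu_star` and `toy_model` are hypothetical.

```python
import random


def morris_mu_star(model, n_inputs, n_repeats=20, delta=0.1, seed=0):
    """Mean absolute elementary effect per input (simplified radial design).

    Each repeat draws a random base point, then perturbs one input at a time
    by `delta` and records |change in output| / delta for that input.
    Inputs are sampled independently, which is the assumption that breaks
    down when model parameters are correlated.
    """
    rng = random.Random(seed)
    totals = [0.0] * n_inputs
    for _ in range(n_repeats):
        # Keep base + delta inside the unit interval.
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_inputs)]
        f_base = model(base)
        for i in range(n_inputs):
            stepped = base[:]
            stepped[i] += delta
            totals[i] += abs((model(stepped) - f_base) / delta)
    return [t / n_repeats for t in totals]


def toy_model(x):
    """Illustrative linear model: input 1 has no influence on the output."""
    return 2.0 * x[0] + 5.0 * x[2]


mu_star = morris_mu_star(toy_model, n_inputs=3)
```

For this linear toy model the measure recovers the magnitudes of the coefficients, so the unused input scores zero. Because each input is sampled on its own, any dependence between inputs is ignored by construction, which is the limitation the extended Sobol method is meant to address.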
5. Huang Y, Li J, Ma Y. Determining optimum sampling numbers for survey of soil heavy metals in decision-making units: taking cadmium as an example. Environ Sci Pollut Res Int 2020; 27:24466-24479. [PMID: 32304065 DOI: 10.1007/s11356-020-08793-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2020] [Accepted: 04/06/2020] [Indexed: 06/11/2023]
Abstract
Determining the optimum sampling number (OSN) is critical to achieving credible results when surveying heavy metals in soil and undertaking risk assessment for sustainable land use or remediation decisions. Although traditional methods, such as classical statistics, geostatistics, and simulated annealing algorithms, have been used to determine the OSN for surveying soil heavy metals, their usefulness is limited because soil heavy metal concentrations approximately follow a log-normal distribution. Furthermore, existing correction equations for the log-normal distribution may overestimate or underestimate the OSN, and they have not been applied to estimate the OSN of soil heavy metals. The objective of the present study was to find a simple model under the log-normal distribution that determines the OSN for surveying soil heavy metals in decision-making units. To test the effectiveness and accuracy of this model, soil heavy metals in 17 contaminated areas generating 200 multiscale units were analyzed. Equations for determining the OSN, including classical statistics and approximate correction equations, were compared. Results showed that the equation for determining the OSN by ordinary least squares (OSN_OLS) was computationally simple and straightforward because it adjusts the classic log-normal equation without relying on the adjusted Student t-tables for a noncentralized data distribution. Compared with other OSN-determining equations, sampling numbers from OSN_OLS were closer to the optimum numbers and effectively avoided the risk of overestimation or underestimation. Descriptive statistics indicated that the pollution results estimated by OSN_OLS in representative units were very similar to those from the original sampling, which carried more sampling information. Furthermore, compared with other OSN-determining equations, the mapping based on OSN_OLS not only described the trends of spatial variation but also improved mapping accuracy. We conclude that OSN_OLS is an effective, straightforward, and exact model to estimate the OSN for surveying soil heavy metals in decision-making units.
Affiliation(s)
- Yajie Huang
  - Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Jumei Li
  - Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Yibing Ma
  - Macau Environmental Research Institute, Macau University of Science and Technology, Macau, 999078, China
6. Jasper AW, Davis MJ. Parameterization Strategies for Intermolecular Potentials for Predicting Trajectory-Based Collision Parameters. J Phys Chem A 2019; 123:3464-3480. [DOI: 10.1021/acs.jpca.9b01918] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/29/2022]
Affiliation(s)
- Ahren W. Jasper
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Michael J. Davis
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
7. Magnotti GM, Wang Z, Liu W, Sivaramakrishnan R, Som S, Davis MJ. Sparsity Facilitates Chemical-Reaction Selection for Engine Simulations. J Phys Chem A 2018; 122:7227-7237. [PMID: 30102539 DOI: 10.1021/acs.jpca.8b05436] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Analysis of large-scale, realistic models incorporating detailed chemistry can be challenging because each simulation is computationally expensive, and a complete analysis may require many simulations. This paper addresses one such problem, chemical-reaction selection in engine simulations. In this computationally challenging case, it is demonstrated how sparsity can facilitate chemical-reaction selection, which is the process of finding the most important chemical reactions for modeling a chemical process. It is difficult to perform accurate reaction selection for engine simulations using realistic models of the chemistry, as each simulation takes processor weeks to complete. We developed a procedure to accomplish this selection efficiently with a relatively small number of simulations, using a form of global sensitivity analysis based on sparse regression. The chemical-reaction selection leads to an analysis of the ignition chemistry as it evolves within the compression-ignition engine simulations and allows the spatial development of the selected chemical reactions to be studied in detail.
8. Global Sensitivity Analysis of Large Reaction Mechanisms Using Fourier Amplitude Sensitivity Test. J CHEM-NY 2018. [DOI: 10.1155/2018/5127393] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
Abstract
Global sensitivity analysis (GSA) of large chemical reaction mechanisms remains a challenge because uncertainties in a large number of input parameters produce a high-dimensional input parameter space, which makes it difficult to evaluate the effect of input parameters on model outputs. In this paper, a criterion for assigning frequencies to input parameters is proposed so that the Fourier amplitude sensitivity test (FAST) method can evaluate complex models with a low sample size. The developed FAST method establishes the relationship between the number of input parameters and the sample size needed to measure sensitivity indices with high accuracy. The performance of this FAST method, which allows both qualitative and quantitative analysis of complex systems, is validated with an H2/air combustion model and a CH4/air combustion model. The method is also compared with other GSA methods to illustrate its features. The results show that the FAST method can evaluate reaction systems with a low sample size, and that the sensitivity indices obtained from it can provide important information that the variance-based GSA methods cannot obtain. The FAST method can be a remarkably effective tool for the modelling and diagnosis of large chemical reaction mechanisms.
9. Wei P, Liu F, Lu Z, Wang Z. A probabilistic procedure for quantifying the relative importance of model inputs characterized by second-order probability models. Int J Approx Reason 2018. [DOI: 10.1016/j.ijar.2018.04.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 10/17/2022]