1
Vásquez-Pérez JM, Zárate-Hernández LÁ, Gómez-Castro CZ, Nolasco-Hernández UA. A Practical Algorithm to Solve the Near-Congruence Problem for Rigid Molecules and Clusters. J Chem Inf Model 2023; 63:1157-1165. [PMID: 36749172 DOI: 10.1021/acs.jcim.2c01187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
We present an improved algorithm to solve the near-congruence problem for rigid molecules and clusters based on the iterative application of assignment and alignment steps with biased Euclidean costs. The algorithm is formulated as a quasi-local optimization procedure with each optimization step involving a linear assignment (LAP) and a singular value decomposition (SVD). The efficiency of the algorithm is increased by up to 5 orders of magnitude with respect to the original unbiased noniterative method and can be applied to systems with hundreds or thousands of atoms, outperforming all state-of-the-art methods published so far in the literature. The Fortran implementation of the algorithm is available as an open source library (https://github.com/qcuaeh/molalignlib) and is suitable to be used in global optimization methods for the identification of local minima or basins.
2
Lu Q. Identifying molecular structural features by pattern recognition methods. RSC Adv 2022; 12:17559-17569. [PMID: 35765452 PMCID: PMC9192268 DOI: 10.1039/d2ra00764a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2022] [Accepted: 06/06/2022] [Indexed: 11/21/2022] Open
Abstract
Identification of molecular structural features is a central part of computational chemistry. It would be beneficial if pattern recognition techniques could be incorporated to facilitate the identification. Currently, the quantification of the structural dissimilarity is mainly carried out by root-mean-square-deviation (RMSD) calculations such as in molecular dynamics simulations. However, the RMSD calculation underperforms for large molecules, showing the so-called "curse of dimensionality" problem. Also, it requires consistent ordering of atoms in two comparing structures, which needs nontrivial effort to fulfill. In this work, we propose to take advantage of the point cloud recognition using convex hulls as the basis to recognize molecular structural features. Two advantages of the method can be highlighted. First, the dimension of the input data structure is largely reduced from the number of atoms of molecules to the number of atoms of convex hulls. Therefore, the dimensionality curse problem is avoided, and the atom ordering process is saved. Second, the construction of convex hulls can be used to define new molecular descriptors, such as the contact area of molecular interactions. These new molecular descriptors have different properties from existing ones, therefore they are expected to exhibit different behaviors for certain machine learning studies. Several illustrative applications have been carried out, which provide promising results for structure-activity studies.
Affiliation(s)
- Qing Lu
- Beijing National Laboratory for Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, China
3
Schmitz G, Yönder Ö, Schnieder B, Schmid R, Hättig C. An automatized workflow from molecular dynamic simulation to quantum chemical methods to identify elementary reactions and compute reaction constants. J Comput Chem 2021; 42:2264-2282. [PMID: 34636424 DOI: 10.1002/jcc.26757] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/14/2022]
Abstract
We present an automatized workflow which, starting from molecular dynamics simulations, identifies reaction events, filters them, and prepares them for accurate quantum chemical calculations using, for example, Density Functional Theory (DFT) or Coupled Cluster methods. The capabilities of the automatized workflow are demonstrated by the example of simulations for the combustion of some polycyclic aromatic hydrocarbons (PAHs). It is shown how key elementary reaction candidates are filtered out of a much larger set of redundant reactions and refined further. The molecular species in question are optimized using DFT and reaction energies, barrier heights, and reaction rates are calculated. The setup is general enough to include at this stage configurational sampling, which can be exploited in the future. Using the introduced machinery, we investigate how the observed reaction types depend on the gas atmosphere used in the molecular dynamics simulation. For the re-optimization on the DFT level, we show how the additional information needed to switch from reactive force-field to electronic structure calculations can be filled in and study how well ReaxFF and DFT agree with each other and shine light on the perspective of using more accurate semi-empirical methods in the MD simulation.
Affiliation(s)
- Gunnar Schmitz
- Computational Materials Chemistry Group, Ruhr-Universität Bochum, Bochum, Germany
- Özlem Yönder
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum, Germany
- Bastian Schnieder
- Computational Materials Chemistry Group, Ruhr-Universität Bochum, Bochum, Germany
- Rochus Schmid
- Computational Materials Chemistry Group, Ruhr-Universität Bochum, Bochum, Germany
- Christof Hättig
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum, Germany
4
Gunde M, Salles N, Hémeryck A, Martin-Samos L. IRA: A Shape Matching Approach for Recognition and Comparison of Generic Atomic Patterns. J Chem Inf Model 2021; 61:5446-5457. [PMID: 34704748 DOI: 10.1021/acs.jcim.1c00567] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
We propose a versatile, parameter-less approach for solving the shape matching problem, specifically in the context of atomic structures when atomic assignments are not known a priori. The algorithm Iteratively suggests Rotated atom-centered reference frames and Assignments (iterative rotations and assignments (IRA)). The frame for which a permutationally invariant set-set distance, namely, the Hausdorff distance, returns a minimal value is chosen as the solution of the matching problem. IRA is able to find rigid rotations, reflections, translations, and permutations between structures with different numbers of atoms, for any atomic arrangement and pattern, periodic or not. When distortions are present between the structures, optimal rotation and translation are found by further applying a standard singular value decomposition-based method. To compute the atomic assignments under the one-to-one assignment constraint, we develop our own algorithm, constrained shortest distance assignments (CShDA). The overall approach is extensively tested on several structures, including distorted structural fragments. The efficiency of the proposed algorithm is shown as a benchmark comparison against two other shape matching algorithms. We discuss the use of our approach for the identification and comparison of structures and structural fragments through two examples: a replica-exchange trajectory of a cyanine molecule, in which we show how our approach could aid the exploration of relevant collective coordinates for clustering the data, and a SiO2 amorphous model, in which we compute distortion scores, and compare them with a classical strain-based potential. The source code and benchmark data are available at https://github.com/mammasmias/IterativeRotationsAssignments.
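The permutationally invariant set-set distance named in this abstract can be illustrated with a minimal sketch. The function name `hausdorff` and the sample coordinates are illustrative assumptions and are not taken from the IRA source code; the sketch only shows why a Hausdorff distance does not depend on the order in which atoms are listed.

```python
import math

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets (lists of 3-tuples)."""
    def directed(p, q):
        # Largest nearest-neighbour distance from any point of p to the set q.
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

# Hypothetical three-atom structure, a reordered copy, and a distorted copy.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shuffled = [ref[2], ref[0], ref[1]]
shifted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0)]

print(hausdorff(ref, shuffled))  # reordering the atoms leaves the distance at zero
print(hausdorff(ref, shifted))   # moving one atom by 0.5 gives a distance of 0.5
```

The sketch compares structures in a fixed frame only; the search over rotated reference frames and the one-to-one assignment step described in the abstract are not reproduced here.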
Affiliation(s)
- Miha Gunde
- LAAS-CNRS, Université de Toulouse, CNRS, 7 avenue du Colonel Roche, 31031 Toulouse, France; CNR-IOM, Democritos National Simulation Center, Istituto Officina dei Materiali, c/o SISSA, via Bonomea 265, IT-34136 Trieste, Italy
- Nicolas Salles
- CNR-IOM, Democritos National Simulation Center, Istituto Officina dei Materiali, c/o SISSA, via Bonomea 265, IT-34136 Trieste, Italy
- Anne Hémeryck
- LAAS-CNRS, Université de Toulouse, CNRS, 7 avenue du Colonel Roche, 31031 Toulouse, France
- Layla Martin-Samos
- CNR-IOM, Democritos National Simulation Center, Istituto Officina dei Materiali, c/o SISSA, via Bonomea 265, IT-34136 Trieste, Italy
5
Lu Q. Molecular structure recognition by blob detection. RSC Adv 2021; 11:35879-35886. [PMID: 35492772 PMCID: PMC9043223 DOI: 10.1039/d1ra05752a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/28/2021] [Accepted: 10/31/2021] [Indexed: 11/23/2022] Open
Abstract
Molecular structure recognition is fundamental in computational chemistry. The most common approach is to calculate the root mean square deviation (RMSD) between two sets of molecular coordinates. However, this method does not perform well for large molecules. In this work, a new method is proposed for structure comparison. Blob detection is used for recognizing structural features. Fragmentation of molecules is proposed as the pre-treatment. Mapping between blobs and atoms is developed as the post-treatment. A set of key parameters important for blob detections are determined. The dissimilarity is quantified by calculating the Euclidean metric of the blob vectors. The overall algorithm is found to be accurate to distinguish structural dissimilarity. The method has potential to be combined with other pattern recognition techniques for new chemistry discoveries.
Affiliation(s)
- Qing Lu
- Beijing National Laboratory for Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, China
6
Zhai H, Alexandrova AN. Local Fluxionality of Surface-Deposited Cluster Catalysts: The Case of Pt7 on Al2O3. J Phys Chem Lett 2018; 9:1696-1702. [PMID: 29551071 DOI: 10.1021/acs.jpclett.8b00379] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Indexed: 05/18/2023]
Abstract
Subnano surface-supported catalytic clusters can be generally characterized by many low-energy isomers accessible at elevated temperatures of catalysis. The most stable isomer may not be the most catalytically active. Additionally, isomers may interconvert across barriers, i.e., exhibit fluxionality, during catalysis. To study the big picture of the cluster fluxional behavior, we model such a process as isomerization graph using bipartite matching algorithm, harmonic transition state theory, and paralleled nudged elastic band method. All the minimal energy paths form a minimum spanning tree (MST) of the original graph. Detailed inspection shows that, at temperatures typical for catalysis, the cluster geometry changes frequently within several regions in the MST, while transition across regions is less likely. As a further confirmation, the structural similarity analysis was additionally performed based on molecular dynamics trajectories. This local fluxionality picture provides a new perspective on understanding finite-temperature catalytic processes.
Affiliation(s)
- Huanchen Zhai
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- Anastassia N Alexandrova
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California 90095, United States
- California NanoSystems Institute, Los Angeles, California 90095, United States
7
Woodley S, Lazauskas T, Illingworth M, Carter AC, Sokol AA. What is the best or most relevant global minimum for nanoclusters? Predicting, comparing and recycling cluster structures with WASP@N. Faraday Discuss 2018; 211:593-611. [DOI: 10.1039/c8fd00060c] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/14/2023]
Abstract
Our WASP@N project is an open-access database of cluster structures with a web-assisted interface and toolkit for structure prediction.
Affiliation(s)
- Scott M. Woodley
- University College London, Department of Chemistry, London WC1H 0AJ, UK
- Tomas Lazauskas
- University College London, Department of Chemistry, London WC1H 0AJ, UK
- Alexey A. Sokol
- University College London, Department of Chemistry, London WC1H 0AJ, UK
8
Temelso B, Mabey JM, Kubota T, Appiah-Padi N, Shields GC. ArbAlign: A Tool for Optimal Alignment of Arbitrarily Ordered Isomers Using the Kuhn-Munkres Algorithm. J Chem Inf Model 2017; 57:1045-1054. [PMID: 28398732 DOI: 10.1021/acs.jcim.6b00546] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 02/04/2023]
Abstract
When assessing the similarity between two isomers whose atoms are ordered identically, one typically translates and rotates their Cartesian coordinates for best alignment and computes the pairwise root-mean-square distance (RMSD). However, if the atoms are ordered differently or the molecular axes are switched, it is necessary to find the best ordering of the atoms and check for optimal axes before calculating a meaningful pairwise RMSD. The factorial scaling of finding the best ordering by looking at all permutations is too expensive for any system with more than ten atoms. We report use of the Kuhn-Munkres matching algorithm to reduce the cost of finding the best ordering from factorial to polynomial scaling. That allows the application of this scheme to any arbitrary system efficiently. Its performance is demonstrated for a range of molecular clusters as well as rigid systems. The largely standalone tool is freely available for download and distribution under the GNU General Public License v3.0 (GNU_GPL_v3) agreement. An online implementation is also provided via a web server (http://www.arbalign.org) for convenient use.
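The factorial cost that this abstract refers to can be made concrete with a minimal sketch of the brute-force baseline, that is, the search the Kuhn-Munkres algorithm is reported to replace. This is not ArbAlign code: the function names and coordinates are hypothetical, all atoms are assumed to be the same species, and the translation and rotation steps are omitted so that only the reordering search is shown.

```python
import itertools
import math

def rmsd(a, b):
    """RMSD between two equally long, identically ordered coordinate lists."""
    total = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(a))

def best_ordering_brute_force(ref, other):
    """Try every permutation of `other`; the loop runs N! times for N atoms."""
    best_perm, best_value = None, math.inf
    for perm in itertools.permutations(range(len(other))):
        value = rmsd(ref, [other[i] for i in perm])
        if value < best_value:
            best_perm, best_value = perm, value
    return best_perm, best_value

# Hypothetical four-atom structure and a copy with scrambled atom order.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
other = [ref[3], ref[0], ref[2], ref[1]]

perm, value = best_ordering_brute_force(ref, other)
print(perm, value)         # the permutation that restores the reference order
print(math.factorial(10))  # number of permutations to test for ten atoms
```

Four atoms need only 24 trials, but ten atoms already need 3,628,800, which is why a polynomial-time assignment solver is attractive for larger systems.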
Affiliation(s)
- Berhane Temelso
- Dean's Office, College of Arts and Sciences, and Department of Chemistry, Bucknell University, Lewisburg, Pennsylvania 17837, United States
- Joel M Mabey
- Dean's Office, College of Arts and Sciences, and Department of Chemistry, Bucknell University, Lewisburg, Pennsylvania 17837, United States
- Toshiro Kubota
- Department of Mathematical Sciences, Susquehanna University, Selinsgrove, Pennsylvania 17870, United States
- Nana Appiah-Padi
- Dean's Office, College of Arts and Sciences, and Department of Chemistry, Bucknell University, Lewisburg, Pennsylvania 17837, United States; Lewisburg Area High School, Lewisburg, Pennsylvania 17837, United States
- George C Shields
- Dean's Office, College of Arts and Sciences, and Department of Chemistry, Bucknell University, Lewisburg, Pennsylvania 17837, United States
9
Lazauskas T, Sokol AA, Woodley SM. An efficient genetic algorithm for structure prediction at the nanoscale. Nanoscale 2017; 9:3850-3864. [PMID: 28252128 DOI: 10.1039/c6nr09072a] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 05/09/2023]
Abstract
We have developed and implemented a new global optimization technique based on a Lamarckian genetic algorithm with the focus on structure diversity. The key process in the efficient search on a given complex energy landscape proves to be the removal of duplicates that is achieved using a topological analysis of candidate structures. The careful geometrical prescreening of newly formed structures and the introduction of new mutation move classes improve the rate of success further. The power of the developed technique, implemented in the Knowledge Led Master Code, or KLMC, is demonstrated by its ability to locate and explore a challenging double funnel landscape of a Lennard-Jones 38 atom system (LJ38). We apply the redeveloped KLMC to investigate three chemically different systems: ionic semiconductor (ZnO)1-32, metallic Ni13 and covalently bonded C60. All four systems have been systematically explored on the energy landscape defined using interatomic potentials. The new developments allowed us to successfully locate the double funnels of LJ38, find new local and global minima for ZnO clusters, extensively explore the Ni13 and C60 (the buckminsterfullerene, or buckyball) potential energy surfaces.
Affiliation(s)
- Tomas Lazauskas
- University College London, Kathleen Lonsdale Materials Chemistry, Department of Chemistry, 20 Gordon Street, London WC1H 0AJ, UK.
- Alexey A Sokol
- University College London, Kathleen Lonsdale Materials Chemistry, Department of Chemistry, 20 Gordon Street, London WC1H 0AJ, UK.
- Scott M Woodley
- University College London, Kathleen Lonsdale Materials Chemistry, Department of Chemistry, 20 Gordon Street, London WC1H 0AJ, UK.
10
Affiliation(s)
- Arne Wagner
- Institute of Inorganic Chemistry, Ruprecht-Karls University Heidelberg, Im Neuenheimer Feld 270, 69120 Heidelberg, Germany
- Hans-Jörg Himmel
- Institute of Inorganic Chemistry, Ruprecht-Karls University Heidelberg, Im Neuenheimer Feld 270, 69120 Heidelberg, Germany
11
Ramirez-Manzanares A, Peña J, Azpiroz JM, Merino G. A hierarchical algorithm for molecular similarity (H-FORMS). J Comput Chem 2015; 36:1456-66. [DOI: 10.1002/jcc.23947] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 02/16/2015] [Revised: 04/24/2015] [Accepted: 04/27/2015] [Indexed: 11/10/2022]
Affiliation(s)
- Alonso Ramirez-Manzanares
- Computer Science Department; Centro de Investigación en Matemáticas, A.C. Callejón Jalisco S/N; Guanajuato Gto Mexico
- Joaquin Peña
- Computer Science Department; Centro de Investigación en Matemáticas, A.C. Callejón Jalisco S/N; Guanajuato Gto Mexico
- Jon M. Azpiroz
- Kimika Fakultatea, Euskal Herriko Unibertsitatea (UPV/EHU); Donostia International Physics Center (DIPC); P. K. 1072 20080 Donostia Euskadi Spain
- Computational Laboratory for Hybrid/Organic Photovoltaics (CLHYO); CNR-ISTM; Via Elce di Sotto 8 06123 Perugia Italy
- Gabriel Merino
- Departamento de Física Aplicada; Centro de Investigación y de Estudios Avanzados Unidad Mérida. Km 6 Antigua carretera a Progreso. Apdo. Postal 73; Cordemex 97310 Mérida Yuc México
12
Sadeghi A, Ghasemi SA, Schaefer B, Mohr S, Lill MA, Goedecker S. Metrics for measuring distances in configuration spaces. J Chem Phys 2014; 139:184118. [PMID: 24320265 DOI: 10.1063/1.4828704] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Indexed: 11/14/2022] Open
Abstract
In order to characterize molecular structures we introduce configurational fingerprint vectors which are counterparts of quantities used experimentally to identify structures. The Euclidean distance between the configurational fingerprint vectors satisfies the properties of a metric and can therefore safely be used to measure dissimilarities between configurations in the high dimensional configuration space. In particular we show that these metrics are a perfect and computationally cheap replacement for the root-mean-square distance (RMSD) when one has to decide whether two noise contaminated configurations are identical or not. We introduce a Monte Carlo approach to obtain the global minimum of the RMSD between configurations, which is obtained from a global minimization over all translations, rotations, and permutations of atomic indices.
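The idea of comparing configurations through the Euclidean distance between fingerprint vectors can be sketched as follows. The abstract does not specify how the fingerprints are built, so the sorted list of interatomic distances used here is purely an assumed stand-in chosen because it is invariant under translation, rotation, and atom reordering; the function names and coordinates are likewise hypothetical.

```python
import itertools
import math

def fingerprint(coords):
    # ASSUMPTION: sorted interatomic distances, a simple stand-in and not
    # the fingerprint construction used in the cited paper.
    return sorted(math.dist(p, q) for p, q in itertools.combinations(coords, 2))

def fingerprint_distance(a, b):
    """Euclidean distance between the fingerprint vectors of two structures."""
    return math.dist(fingerprint(a), fingerprint(b))

# Hypothetical three-atom structure, a translated and reordered copy,
# and a copy with one atom displaced.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = [(2.0, 3.0, 2.0), (2.0, 2.0, 2.0), (3.0, 2.0, 2.0)]
stretched = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

print(fingerprint_distance(ref, moved))      # same shape, so the distance vanishes
print(fingerprint_distance(ref, stretched))  # different shape, so it is positive
```

No alignment or permutation search is needed for this comparison, which is the practical appeal of fingerprint-based distances over a globally minimized RMSD.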
Affiliation(s)
- Ali Sadeghi
- Department of Physics, Universität Basel, Klingelbergstr. 82, 4056 Basel, Switzerland
13
Allen WJ, Rizzo RC. Implementation of the Hungarian algorithm to account for ligand symmetry and similarity in structure-based design. J Chem Inf Model 2014; 54:518-29. [PMID: 24410429 PMCID: PMC3958141 DOI: 10.1021/ci400534h] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 01/13/2023]
Abstract
False negative docking outcomes for highly symmetric molecules are a barrier to the accurate evaluation of docking programs, scoring functions, and protocols. This work describes an implementation of a symmetry-corrected root-mean-square deviation (RMSD) method into the program DOCK based on the Hungarian algorithm for solving the minimum assignment problem, which dynamically assigns atom correspondence in molecules with symmetry. The algorithm adds only a trivial amount of computation time to the RMSD calculations and is shown to increase the reported overall docking success rate by approximately 5% when tested over 1043 receptor–ligand systems. For some families of protein systems the results are even more dramatic, with success rate increases up to 16.7%. Several additional applications of the method are also presented including as a pairwise similarity metric to compare molecules during de novo design, as a scoring function to rank-order virtual screening results, and for the analysis of trajectories from molecular dynamics simulation. The new method, including source code, is available to registered users of DOCK6 (http://dock.compbio.ucsf.edu).
Affiliation(s)
- William J Allen
- Department of Applied Mathematics & Statistics, Stony Brook University, Stony Brook, New York 11794, United States
14
Furche F, Ahlrichs R, Hättig C, Klopper W, Sierka M, Weigend F. Turbomole. Wiley Interdisciplinary Reviews: Computational Molecular Science 2013. [DOI: 10.1002/wcms.1162] [Citation(s) in RCA: 666] [Impact Index Per Article: 60.5] [Indexed: 12/25/2022]
Affiliation(s)
- Filipp Furche
- University of California, Irvine, Department of Chemistry, Irvine, CA, USA
- Reinhart Ahlrichs
- Institute of Physical Chemistry, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Christof Hättig
- Lehrstuhl für Theoretische Chemie, Ruhr‐Universität Bochum, Bochum, Germany
- Wim Klopper
- Institute of Physical Chemistry, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Marek Sierka
- Otto‐Schott‐Institut für Materialforschung, Friedrich‐Schiller‐Universität Jena, Jena, Germany
- Florian Weigend
- Institute of Nanotechnology, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany