1
Shoura MJ, Giovan SM, Vetcher AA, Ziraldo R, Hanke A, Levene SD. Loop-closure kinetics reveal a stable, right-handed DNA intermediate in Cre recombination. Nucleic Acids Res 2020; 48:4371-4381. [PMID: 32182357 PMCID: PMC7192630 DOI: 10.1093/nar/gkaa153] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/31/2019] [Revised: 02/24/2020] [Accepted: 02/29/2020] [Indexed: 11/12/2022]
Abstract
In Cre site-specific recombination, the synaptic intermediate is a recombinase homotetramer containing a pair of loxP DNA target sites. The enzyme system's strand-exchange mechanism proceeds via a Holliday-junction (HJ) intermediate; however, the geometry of DNA segments in the synapse has remained highly controversial. In particular, all crystallographic structures are consistent with an achiral, planar Holliday-junction (HJ) structure, whereas topological assays based on Cre-mediated knotting of plasmid DNAs are consistent with a right-handed chiral junction. We use the kinetics of loop closure involving closely spaced (131-151 bp) loxP sites to investigate the in-aqueo ensemble of conformations for the longest-lived looped DNA intermediate. Fitting the experimental site-spacing dependence of the loop-closure probability, J, to a statistical-mechanical theory of DNA looping provides evidence for substantial out-of-plane HJ distortion, which unequivocally stands in contrast to the square-planar intermediate geometry from Cre-loxP crystal structures and those of other int-superfamily recombinases. J measurements for an HJ-isomerization-deficient Cre mutant suggest that the apparent geometry of the wild-type complex is consistent with temporal averaging of right-handed and achiral structures. Our approach connects the static pictures provided by crystal structures and the natural dynamics of macromolecules in solution, thus advancing a more comprehensive dynamic analysis of large nucleoprotein structures and their mechanisms.
Affiliation(s)
- Massa J Shoura
  - Department of Bioengineering, University of Texas at Dallas, Richardson, TX 75080, USA
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080, USA
- Stefan M Giovan
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080, USA
- Alexandre A Vetcher
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080, USA
- Riccardo Ziraldo
  - Department of Bioengineering, University of Texas at Dallas, Richardson, TX 75080, USA
- Andreas Hanke
  - Department of Physics, University of Texas Rio Grande Valley, Brownsville, TX 78520, USA
- Stephen D Levene
  - Department of Bioengineering, University of Texas at Dallas, Richardson, TX 75080, USA
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080, USA
  - Physics, University of Texas at Dallas, Richardson, TX 75080, USA
2
Abstract
Recently, machine learning (ML) has established itself in various worldwide benchmarking competitions in computational biology, including Critical Assessment of Structure Prediction (CASP) and Drug Design Data Resource (D3R) Grand Challenges. However, the intricate structural complexity and high ML dimensionality of biomolecular datasets obstruct the efficient application of ML algorithms in the field. In addition to data and algorithms, an efficient ML machinery for biomolecular predictions must include structural representation as an indispensable component. Mathematical representations that simplify the biomolecular structural complexity and reduce ML dimensionality have emerged as a prime winner in D3R Grand Challenges. This review is devoted to the recent advances in developing low-dimensional and scalable mathematical representations of biomolecules in our laboratory. We discuss three classes of mathematical approaches: algebraic topology, differential geometry, and graph theory. We elucidate how the physical and biological challenges have guided the evolution and development of these mathematical apparatuses for massive and diverse biomolecular data. We focus the performance analysis in this review on protein-ligand binding predictions, although these methods have had tremendous success in many other applications, such as protein classification, virtual screening, and the prediction of solubility, solvation free energies, toxicity, partition coefficients, and protein folding stability changes upon mutation.
Affiliation(s)
- Duc Duy Nguyen
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Zixuan Cang
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
3
Cang Z, Wei GW. Integration of element specific persistent homology and machine learning for protein-ligand binding affinity prediction. International Journal for Numerical Methods in Biomedical Engineering 2018; 34. [PMID: 28677268 DOI: 10.1002/cnm.2914] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 05/14/2017] [Revised: 06/27/2017] [Accepted: 06/29/2017] [Indexed: 05/17/2023]
Abstract
Protein-ligand binding is a fundamental biological process that is paramount to many other biological processes, such as signal transduction, metabolic pathways, enzyme construction, cell secretion, and gene expression. Accurate prediction of protein-ligand binding affinities is vital to rational drug design and to the understanding of protein-ligand binding and binding-induced function. Existing binding affinity prediction methods are inundated with geometric detail and involve excessively high dimensions, which undermines their predictive power for massive binding data. Topology provides the ultimate level of abstraction and thus incurs too much reduction in geometric information. Persistent homology embeds geometric information into topological invariants and bridges the gap between complex geometry and abstract topology. However, it oversimplifies biological information. This work introduces element specific persistent homology (ESPH), or multicomponent persistent homology, to retain crucial biological information during topological simplification. The combination of ESPH and machine learning gives rise to a powerful paradigm for macromolecular analysis. Tests on 2 large data sets indicate that the proposed topology-based machine-learning paradigm outperforms other existing methods in protein-ligand binding affinity predictions. ESPH reveals protein-ligand binding mechanisms that cannot be attained from other conventional techniques. The present approach reveals that protein-ligand hydrophobic interactions extend to 40 Å away from the binding site, which has significant ramifications for drug and protein design.
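The element-specific idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that atoms are selected by element type and computes only the 0-dimensional persistence barcode of a Vietoris-Rips filtration, using the standard fact that H0 bars die at the edge lengths of a minimum spanning tree. The function names `h0_barcode` and `element_specific_h0`, the `(element, xyz)` tuple layout, and the toy coordinates are all hypothetical. The published method additionally distinguishes protein from ligand atoms and uses higher-dimensional homology, neither of which is shown here.

```python
import itertools
import math


def h0_barcode(points):
    """Return the finite H0 death values of the Vietoris-Rips filtration.

    Every connected component is born at distance 0. A component dies when
    Kruskal's algorithm merges it into another one, so the death values are
    exactly the minimum-spanning-tree edge lengths, in increasing order.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 finite bars; one essential bar never dies


def element_specific_h0(atoms, elements):
    """Keep only atoms whose element is in `elements`, then compute H0."""
    pts = [xyz for elem, xyz in atoms if elem in elements]
    return h0_barcode(pts)


# Hypothetical toy structure: (element, (x, y, z)) in angstroms.
atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("C", (3.0, 0.0, 0.0)),
    ("N", (0.0, 4.0, 0.0)),
    ("O", (0.0, 0.0, 1.0)),
]
carbon_bars = element_specific_h0(atoms, {"C"})
carbon_nitrogen_bars = element_specific_h0(atoms, {"C", "N"})
```

Each choice of element set yields a different barcode from the same structure, which is how element-level chemical information survives the topological simplification.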
Affiliation(s)
- Zixuan Cang
  - Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
4
Cang Z, Mu L, Wei GW. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening. PLoS Comput Biol 2018; 14:e1005929. [PMID: 29309403 PMCID: PMC5774846 DOI: 10.1371/journal.pcbi.1005929] [Citation(s) in RCA: 139] [Impact Index Per Article: 23.2] [Received: 09/01/2017] [Revised: 01/19/2018] [Accepted: 12/15/2017] [Indexed: 12/05/2022]
Abstract
This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to the conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test respectively the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the present topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
Affiliation(s)
- Zixuan Cang
  - Department of Mathematics, Michigan State University, East Lansing, Michigan, United States of America
- Lin Mu
  - Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan, United States of America
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, United States of America
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, United States of America
5
Cang Z, Wei GW. TopologyNet: Topology based deep convolutional and multi-task neural networks for biomolecular property predictions. PLoS Comput Biol 2017; 13:e1005690. [PMID: 28749969 PMCID: PMC5549771 DOI: 10.1371/journal.pcbi.1005690] [Citation(s) in RCA: 155] [Impact Index Per Article: 22.1] [Received: 04/06/2017] [Revised: 08/08/2017] [Accepted: 07/18/2017] [Indexed: 11/18/2022]
Abstract
Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the geometric and biological complexity. To address this problem we introduce the element-specific persistent homology (ESPH) method. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains important biological information via a multichannel image-like representation. This representation reveals hidden structure-function relationships in biomolecules. We further integrate ESPH and deep convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the deep learning limitations from small and noisy training sets, we propose a multi-task multichannel topological convolutional neural network (MM-TCNN). We demonstrate that TopologyNet outperforms the latest methods in the prediction of protein-ligand binding affinities, mutation induced globular protein folding free energy changes, and mutation induced membrane protein folding free energy changes. Availability: weilab.math.msu.edu/TDL/
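One way to picture a multichannel, image-like 1D representation is sketched below, under stated assumptions: each channel corresponds to one element-specific barcode, and each pixel counts the bars that are alive somewhere within a distance bin. The function name `barcode_to_channel`, the 4 Å cutoff, the bin count, the channel labels, and the toy barcodes are illustrative assumptions, not parameters or data from the paper.

```python
def barcode_to_channel(bars, cutoff=4.0, n_bins=4):
    """Turn a list of (birth, death) bars into a fixed-length 1D channel.

    Pixel k counts the bars whose lifetime overlaps the half-open distance
    interval [k * width, (k + 1) * width), where width = cutoff / n_bins.
    """
    width = cutoff / n_bins
    channel = [0] * n_bins
    for birth, death in bars:
        for k in range(n_bins):
            lo, hi = k * width, (k + 1) * width
            if birth < hi and death > lo:  # bar is alive inside this bin
                channel[k] += 1
    return channel


# Hypothetical element-specific barcodes (distances in angstroms).
barcodes = {
    "C-C": [(0.0, 1.5), (0.0, 3.2)],
    "C-N": [(0.0, 0.8), (1.0, 2.5)],
}

# Stack one channel per element pair into a 2 x 4 "image".
image = [barcode_to_channel(bars) for bars in barcodes.values()]
```

Because every channel has the same length regardless of molecule size, a stack like `image` can be fed to a 1D convolutional network in the same way a multichannel signal would be.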
Affiliation(s)
- Zixuan Cang
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
6
Seol Y, Neuman KC. The dynamic interplay between DNA topoisomerases and DNA topology. Biophys Rev 2016; 8:101-111. [PMID: 28510219 DOI: 10.1007/s12551-016-0240-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/08/2016] [Accepted: 06/07/2016] [Indexed: 01/03/2023]
Abstract
Topological properties of DNA influence its structure and biochemical interactions. Within the cell, DNA topology is constantly in flux. Transcription and other essential processes, including DNA replication and repair, not only alter the topology of the genome but also introduce additional complications associated with DNA knotting and catenation. These topological perturbations are counteracted by the action of topoisomerases, a specialized class of highly conserved and essential enzymes that actively regulate the topological state of the genome. This dynamic interplay among DNA topology, DNA processing enzymes, and DNA topoisomerases is a pervasive factor that influences DNA metabolism in vivo. Building on the extensive structural and biochemical characterization over the past four decades that has established the fundamental mechanistic basis of topoisomerase activity, scientists have begun to explore the unique roles played by DNA topology in modulating and influencing the activity of topoisomerases. In this review we survey established and emerging DNA topology-dependent protein-DNA interactions with a focus on in vitro measurements of the dynamic interplay between DNA topology and topoisomerase activity.
Affiliation(s)
- Yeonee Seol
  - Laboratory of Single Molecule Biophysics, National Heart, Lung, and Blood Institute (NHLBI), National Institutes of Health, 50 South Dr., Room 3517, Bethesda, MD, 20892, USA
- Keir C Neuman
  - Laboratory of Single Molecule Biophysics, National Heart, Lung, and Blood Institute (NHLBI), National Institutes of Health, 50 South Dr., Room 3517, Bethesda, MD, 20892, USA
7
Abstract
Topological properties of DNA influence its structure and biochemical interactions. Within the cell, DNA topology is constantly in flux. Transcription and other essential processes, including DNA replication and repair, alter the topology of the genome while introducing additional complications associated with DNA knotting and catenation. These topological perturbations are counteracted by the action of topoisomerases, a specialized class of highly conserved and essential enzymes that actively regulate the topological state of the genome. This dynamic interplay among DNA topology, DNA processing enzymes, and DNA topoisomerases is a pervasive factor that influences DNA metabolism in vivo. Building on the extensive structural and biochemical characterization over the past four decades that established the fundamental mechanistic basis of topoisomerase activity, the unique roles played by DNA topology in modulating and influencing the activity of topoisomerases have begun to be explored. In this review we survey established and emerging DNA topology-dependent protein-DNA interactions with a focus on in vitro measurements of the dynamic interplay between DNA topology and topoisomerase activity.
Affiliation(s)
- Yeonee Seol
  - Laboratory of Single Molecule Biophysics, NHLBI, National Institutes of Health, Bethesda, MD, 20892, USA
- Keir C Neuman
  - Laboratory of Single Molecule Biophysics, NHLBI, National Institutes of Health, Bethesda, MD, 20892, USA
8
FtsK-dependent XerCD-dif recombination unlinks replication catenanes in a stepwise manner. Proc Natl Acad Sci U S A 2013; 110:20906-11. [PMID: 24218579 DOI: 10.1073/pnas.1308450110] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Indexed: 11/18/2022]
Abstract
In Escherichia coli, complete unlinking of newly replicated sister chromosomes is required to ensure their proper segregation at cell division. Whereas replication links are removed primarily by topoisomerase IV, XerC/XerD-dif site-specific recombination can mediate sister chromosome unlinking in topoisomerase IV-deficient cells. This reaction is activated at the division septum by the DNA translocase FtsK, which coordinates the last stages of chromosome segregation with cell division. It has been proposed that, after being activated by FtsK, XerC/XerD-dif recombination removes DNA links in a stepwise manner. Here, we provide a mathematically rigorous characterization of this topological mechanism of DNA unlinking. We show that stepwise unlinking is the only possible pathway that strictly reduces the complexity of the substrates at each step. Finally, we propose a topological mechanism for this unlinking reaction.
9
Abstract
The Topological Aspects of DNA Function and Protein Folding international meeting provided an interdisciplinary forum for biological scientists, physicists and mathematicians to discuss recent developments in the application of topology to the study of DNA and protein structure. It had 111 invited participants, 48 talks and 21 posters. The present article discusses the importance of topology and introduces the articles from the meeting's speakers.