1
Chandy SK, Raghavachari K. MIM-ML: A Novel Quantum Chemical Fragment-Based Random Forest Model for Accurate Prediction of NMR Chemical Shifts of Nucleic Acids. J Chem Theory Comput 2023; 19:6632-6642. [PMID: 37703522 DOI: 10.1021/acs.jctc.3c00563] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 09/15/2023]
Abstract
We developed a random forest machine learning (ML) model for the prediction of 1H and 13C NMR chemical shifts of nucleic acids. Our ML model is trained entirely on reproducing computed chemical shifts obtained previously on 10 nucleic acids using a Molecules-in-Molecules (MIM) fragment-based density functional theory (DFT) protocol including microsolvation effects. Our ML model includes structural descriptors as well as electronic descriptors from an inexpensive low-level semiempirical calculation (GFN2-xTB) and is trained on a relatively small number of DFT chemical shifts (2080 1H chemical shifts and 1780 13C chemical shifts on the 10 nucleic acids). The ML model is then used to make chemical shift predictions on 8 new nucleic acids ranging in size from 600 to 900 atoms, and the predictions are compared directly to experimental data. Though no experimental data were used in the training, the performance of our model is excellent (mean absolute deviation of 0.34 ppm for 1H chemical shifts and 2.52 ppm for 13C chemical shifts for the test set), even though the test set includes some nonstandard structures. A simple analysis suggests that both structural and electronic descriptors are critical for achieving reliable predictions. This is the first attempt to combine ML from fragment-based DFT calculations to predict experimental chemical shifts accurately, making the MIM-ML model a valuable tool for NMR predictions of nucleic acids.
Affiliation(s)
- Sruthy K Chandy
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Krishnan Raghavachari
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
2
Chandy SK, Raghavachari K. Accurate and Cost-Effective NMR Chemical Shift Predictions for Nucleic Acids Using a Molecules-in-Molecules Fragmentation-Based Method. J Chem Theory Comput 2023; 19:544-561. [PMID: 36630261 DOI: 10.1021/acs.jctc.2c00967] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 01/12/2023]
Abstract
We have developed, implemented, and assessed an efficient protocol for the prediction of NMR chemical shifts of large nucleic acids using our molecules-in-molecules (MIM) fragment-based quantum chemical approach. To assess the performance of our approach, MIM-NMR calculations are calibrated on a test set of three nucleic acids, where the structure is derived from solution-phase NMR studies. For DNA systems with multiple conformers, the one-layer MIM method with trimer fragments (MIM1trimer) is benchmarked to get the lowest energy structure, with an average error of only 0.80 kcal/mol with respect to unfragmented full molecule calculations. The MIM1-NMRdimer calibration with respect to unfragmented full molecule calculations shows a mean absolute deviation (MAD) of 0.06 and 0.11 ppm, respectively, for 1H and 13C nuclei, but the performance with respect to experimental NMR chemical shifts is comparable to the more expensive MIM1-NMR and MIM2-NMR methods with trimer subsystems. To compare with the experimental chemical shifts, a standard protocol is derived using DNA systems with Protein Data Bank (PDB) IDs 1SY8, 1K2K, and 1KR8. The effect of structural minimizations is employed using a hybrid mechanics/semiempirical approach and used for computations in solution with implicit and explicit-implicit solvation models in our MIM1-NMRdimer methodology. To demonstrate the applicability of our protocol, we tested it on seven nucleic acids, including structures with nonstandard residues, heteroatom substitutions (F and B atoms), and side chain mutations with a size ranging from ∼300 to 1100 atoms. The major improvement for predicted MIM1-NMRdimer calculations is obtained from structural minimizations and implicit solvation effects. A significant improvement with the explicit-implicit solvation model is observed only for two smaller nucleic acid systems (1KR8 and 7NBK), where the expensive first solvation shell is replaced by the microsolvation model, in which a single water molecule is added for each solvent-exposed amino and imino proton, along with the implicit solvation. Overall, our target accuracy of ∼0.2-0.3 ppm for 1H and ∼2-3 ppm for 13C has been achieved for large nucleic acids. The proposed MIM-NMR approach is accurate and cost-effective (linear scaling with system size), and it can aid in the structural assignments of a wide range of complex biomolecules.
Affiliation(s)
- Sruthy K Chandy
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Krishnan Raghavachari
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
3
Clavé G, Reverte M, Vasseur JJ, Smietana M. Modified internucleoside linkages for nuclease-resistant oligonucleotides. RSC Chem Biol 2021; 2:94-150. [PMID: 34458777 PMCID: PMC8341215 DOI: 10.1039/d0cb00136h] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 07/27/2020] [Accepted: 10/16/2020] [Indexed: 12/21/2022] Open
Abstract
In the past few years, several drugs derived from nucleic acids have been approved for commercialization and many more are in clinical trials. The sensitivity of these molecules to nuclease digestion in vivo implies the need to exploit resistant non-natural nucleotides. Among all the possible modifications, the one concerning the internucleoside linkage is of particular interest. Indeed, minor changes to the natural phosphodiester may result in major modifications of the physico-chemical properties of nucleic acids. As this linkage is a key element of nucleic acids' chemical structures, its alteration can strongly modulate the plasma stability, binding properties, solubility, cell penetration and ultimately biological activity of nucleic acids. Over the past few decades, many research groups have provided knowledge about non-natural internucleoside linkage properties and participated in building biologically active nucleic acid derivatives. The recent renewed interest in nucleic acids as drugs, demonstrated by the emergence of new antisense, siRNA, aptamer and cyclic dinucleotide molecules, justifies the review of all these studies in order to provide new perspectives in this field. Thus, in this review we aim to provide the reader with insights into the modified internucleoside linkages described over the years whose impact on annealing properties and resistance to nucleases has been evaluated in order to assess their potential for biological applications. The syntheses of modified nucleotides as well as the protocols developed for their incorporation within oligonucleotides are described. Given the intended biological applications, the modifications described in the literature that have not been tested for their resistance to nucleases are not reported.
Affiliation(s)
- Maeva Reverte
- IBMM, Univ. Montpellier, CNRS, ENSCM Montpellier France
4
Ferris ZE, Li Q, Germann MW. Substituting Inosine for Guanosine in DNA: Structural and Dynamic Consequences. Nat Prod Commun 2019. [DOI: 10.1177/1934578x19850032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/19/2023] Open
Abstract
Inosine differs from the guanosine nucleoside only by the absence of the N2 amino group. Both nucleosides also have similar electrostatic potentials. Therefore, substituting I for G has been used to probe various properties of nucleic acids and to facilitate the interpretation of binding studies. In particular, the absence of the amino group permits the assessment of its importance in the binding of ligands to the minor groove of duplex DNA. It has been known for some time that an I-C base pair is of lower stability than a regular G-C base pair, which needs to be considered when making DNA constructs containing inosine. However, it is generally assumed that both base pairs are structurally highly similar. To test this assumption in an identical sequence environment, we have determined the fine structure of two hairpin DNA substrates that differ only in the substitution of an I-C base pair for a G-C base pair. The structures have been solved using nuclear magnetic resonance (NMR) restraints in conjunction with Mardigras and molecular dynamics. The structural data are complemented with thermodynamic and dynamic data to get a comprehensive evaluation of the consequences of G-C vs I-C base pair substitutions. Our data show a strong similarity in the structures of the hairpins, but a significant difference in the melting temperatures, Tm. This difference is also reflected in the drastically decreased I-C base pair lifetime of 7.4 milliseconds compared to the G-C base pair lifetime of 155 milliseconds. The substitution of I-C for G-C to probe for specific effects due to the amino group is satisfactory, as long as the lowered thermal stability and the drastically increased local dynamics are considered.
Affiliation(s)
- Qiushi Li
- Department of Chemistry, Georgia State University, Atlanta, GA, USA
- Markus W. Germann
- Department of Chemistry, Georgia State University, Atlanta, GA, USA
- Department of Biology, Georgia State University, Atlanta, GA, USA
5
Spring-Connell AM, Evich M, Germann MW. NMR Structure Determination for Oligonucleotides. Curr Protoc Nucleic Acid Chem 2019; 72:7.28.1-7.28.39. [PMID: 29927124 DOI: 10.1002/cpnc.48] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/29/2022]
Abstract
NMR spectroscopy is a versatile tool for determining the structure and dynamics of nucleic acids under solution conditions. In this unit, we provide an overview and detail of the experiments and methods used in our laboratory to determine the structure of oligonucleotides at natural abundance, thus limiting our approach to 1H, 13C, and 31P NMR techniques. Isotopic labeling is heavily used in RNA NMR studies; however, labeling of DNA is still less common and, if modified nucleotides are investigated, is exceptionally expensive or not feasible. Each method described here is extensively documented and annotated with tips and observations to facilitate its application. Sections are devoted to sample preparation, NMR experiments and setup, resonance assignment, structure generation protocols, evaluation, tips that may be useful, and software sources. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Marina Evich
- Department of Chemistry, Georgia State University, Atlanta, Georgia
- Markus W Germann
- Department of Chemistry, Georgia State University, Atlanta, Georgia
- Neuroscience Institute, Georgia State University, Atlanta, Georgia
6
Evich M, Spring-Connell AM, Germann MW. Impact of modified ribose sugars on nucleic acid conformation and function. Heterocycl Commun 2017. [DOI: 10.1515/hc-2017-0056] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/06/2023] Open
Abstract
The modification of the ribofuranose in nucleic acids is a widespread method of manipulating the activity of nucleic acids. These alterations, however, impact the local conformation and chemical reactivity of the sugar. Changes in the conformation and dynamics of the sugar moiety alter the local and potentially global structure and plasticity of nucleic acids, which in turn contributes to recognition, binding of ligands and enzymatic activity of proteins. This review article introduces the conformational properties of the (deoxy)ribofuranose ring and then explores sugar modifications and how they impact local and global structure and dynamics in nucleic acids.
Affiliation(s)
- Marina Evich
- Georgia State University, Department of Chemistry, 50 Decatur St. SE, Atlanta, GA 30303, USA
- Markus W. Germann
- Georgia State University, Department of Chemistry, 50 Decatur St. SE, Atlanta, GA 30303, USA
- Georgia State University, Department of Biology, P.O. 4010, Atlanta, GA 30303, USA
- Georgia State University, Neuroscience Institute, P.O. 5030, Atlanta, GA 30303, USA
7
Spring-Connell AM, Evich MG, Debelak H, Seela F, Germann MW. Using NMR and molecular dynamics to link structure and dynamics effects of the universal base 8-aza, 7-deaza, N8 linked adenosine analog. Nucleic Acids Res 2016; 44:8576-8587. [PMID: 27566150 PMCID: PMC5062995 DOI: 10.1093/nar/gkw736] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/04/2015] [Accepted: 08/10/2016] [Indexed: 12/30/2022] Open
Abstract
A truly universal nucleobase enables a host of novel applications such as simplified templates for PCR primers, randomized sequencing and DNA based devices. A universal base must pair indiscriminately to each of the canonical bases with little or preferably no destabilization of the overall duplex. In reality, many candidates either destabilize the duplex or do not base pair indiscriminately. The novel base 8-aza-7-deazaadenine (pyrazolo[3,4-d]pyrimidin-4-amine) N8-(2'-deoxyribonucleoside), a deoxyadenosine analog (UB), pairs with each of the natural DNA bases with little sequence preference. We have utilized NMR complemented with molecular dynamic calculations to characterize the structure and dynamics of a UB incorporated into a DNA duplex. The UB participates in base stacking with little to no perturbation of the local structure yet forms an unusual base pair that samples multiple conformations. These local dynamics result in the complete disappearance of a single UB proton resonance under native conditions. Accommodation of the UB is additionally stabilized via heightened backbone conformational sampling. NMR combined with various computational techniques has allowed for a comprehensive characterization of both structural and dynamic effects of the UB in a DNA duplex and underlines the UB as a strong candidate for universal base applications.
Affiliation(s)
- Marina G Evich
- Department of Chemistry, Georgia State University, Atlanta, GA 30303, USA
- Harald Debelak
- Laboratorium für Organische und Bioorganische Chemie, Institut für Chemie neuer Materialien, Universität Osnabrück, Barbarastraße 7, 49069 Osnabrück, Germany
- Frank Seela
- Laboratorium für Organische und Bioorganische Chemie, Institut für Chemie neuer Materialien, Universität Osnabrück, Barbarastraße 7, 49069 Osnabrück, Germany
- Laboratory of Bioorganic Chemistry and Chemical Biology, Center for Nanotechnology, Heisenbergstraße 11, 48149 Münster, Germany
- Markus W Germann
- Department of Chemistry, Georgia State University, Atlanta, GA 30303, USA
8
Wickstrom E. DNA and RNA derivatives to optimize distribution and delivery. Adv Drug Deliv Rev 2015; 87:25-34. [PMID: 25912659 DOI: 10.1016/j.addr.2015.04.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 12/11/2014] [Revised: 04/13/2015] [Accepted: 04/15/2015] [Indexed: 12/27/2022]
Abstract
Synthetic, complementary DNA single strands and short interfering RNA double strands have been found to inhibit the expression of animal, plant, and viral genes in cells, animals, and patients, in a dose dependent and sequence specific manner. DNAs and RNAs, however, are readily digested in biological systems. Hence, chemists are obliged to design and synthesize nuclease-resistant analogs of normal DNA (Fig. 1).
9
Victora A, Möller HM, Exner TE. Accurate ab initio prediction of NMR chemical shifts of nucleic acids and nucleic acids/protein complexes. Nucleic Acids Res 2014; 42:e173. [PMID: 25404135 PMCID: PMC4267612 DOI: 10.1093/nar/gku1006] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 02/02/2023] Open
Abstract
NMR chemical shift predictions based on empirical methods are nowadays indispensable tools during resonance assignment and 3D structure calculation of proteins. However, owing to the very limited statistical data basis, such methods are still in their infancy in the field of nucleic acids, especially when non-canonical structures and nucleic acid complexes are considered. Here, we present an ab initio approach for predicting proton chemical shifts of arbitrary nucleic acid structures based on state-of-the-art fragment-based quantum chemical calculations. We tested our prediction method on a diverse set of nucleic acid structures including double-stranded DNA, hairpins, DNA/protein complexes and chemically-modified DNA. Overall, our quantum chemical calculations yield highly accurate predictions with mean absolute deviations of 0.3–0.6 ppm and correlation coefficients (r2) usually above 0.9. This will allow for identifying misassignments and validating 3D structures. Furthermore, our calculations reveal that chemical shifts of protons involved in hydrogen bonding are predicted significantly less accurately. This is in part caused by insufficient inclusion of solvation effects. However, it also points toward shortcomings of current force fields used for structure determination of nucleic acids. Our quantum chemical calculations could therefore provide input for force field optimization.
Affiliation(s)
- Andrea Victora
- Department of Chemistry and Zukunftskolleg, Universität Konstanz, 78457 Konstanz, Germany
- Heiko M Möller
- Institute of Chemistry, University of Potsdam, Karl-Liebknecht-Strasse 24-25, 14476 Potsdam OT Golm, Germany
- Thomas E Exner
- Department of Chemistry and Zukunftskolleg, Universität Konstanz, 78457 Konstanz, Germany
- Institute of Pharmacy, Eberhard Karls Universität Tübingen, Auf der Morgenstelle 8, 72076 Tübingen, Germany
10
Thompson RA, Spring AM, Sheng J, Huang Z, Germann MW. The importance of fitting in: conformational preference of selenium 2' modifications in nucleosides and helical structures. J Biomol Struct Dyn 2014; 33:289-97. [PMID: 24558982 DOI: 10.1080/07391102.2014.880944] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
Abstract
Selenomethionine incorporation has proven useful in X-ray crystallography of proteins to obtain phase information. In nucleic acids, the introduction of selenium to different positions is beneficial for solving the phase problem as well, but its addition to the 2' position also significantly enhances the crystal formation. The selenium modification in a single nucleotide shows a preference towards 2'-endo sugar puckering, which is in conflict with existing crystal structures where the duplex incorporated 2'-selenium-modified nucleotide is exclusively found in a 3'-endo conformation. Our work provides a rationale why 2'-selenium modifications facilitate crystallization despite this contradictory behavior.
Affiliation(s)
- R Adam Thompson
- Department of Chemistry, Georgia State University, 50 Decatur Street, Atlanta, GA 30303, USA
11
Xu Z, Sergueeva ZA, Shaw BR. Synthesis and hydrolytic properties of thymidine boranomonophosphate. Tetrahedron Lett 2013. [DOI: 10.1016/j.tetlet.2013.03.110] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
12
Iwamoto N, Oka N, Wada T. Stereocontrolled synthesis of oligodeoxyribonucleoside boranophosphates by an oxazaphospholidine approach using acid-labile N-protecting groups. Tetrahedron Lett 2012. [DOI: 10.1016/j.tetlet.2012.06.015] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 10/28/2022]
13
Johnson CN, Spring AM, Desai S, Cunningham RP, Germann MW. DNA sequence context conceals α-anomeric lesions. J Mol Biol 2011; 416:425-37. [PMID: 22227386 DOI: 10.1016/j.jmb.2011.12.051] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 09/30/2011] [Revised: 12/14/2011] [Accepted: 12/23/2011] [Indexed: 11/16/2022]
Abstract
DNA sequence context has long been known to modulate detection and repair of DNA damage. Recent studies using experimental and computational approaches have sought to provide a basis for this observation. We have previously shown that an α-anomeric adenosine (αA) flanked by cytosines (5'-CαAC-3') resulted in a kinked DNA duplex with an enlarged minor groove. Comparison of different flanking sequences revealed that a DNA duplex containing a 5'-CαAG-3' motif exhibits unique substrate properties. However, this substrate was not distinguished by unusual thermodynamic properties. To understand the structural basis of the altered recognition, we have determined the solution structure of a DNA duplex with a 5'-CαAG-3' core, using an extensive set of restraints including dipolar couplings and backbone torsion angles. The NMR structure exhibits an excellent agreement with the data (total R(X) <5.3%). The αA base is intrahelical, in a reverse Watson-Crick orientation, and forms a weak base pair with a thymine of the opposite strand. In comparison to the DNA duplex with a 5'-CαAC-3' core, we observe a significant reduction of the local perturbation (backbone, stacking, tilt, roll, and twist), resulting in a straighter DNA with a narrower minor groove. Overall, these features result in a less perturbed DNA helix and obscure the presence of the lesion compared to the 5'-CαAC-3' sequence. The improved stacking of the 5'-CαAG-3' core also affects the energetics of the DNA deformation that is required to form a catalytically competent complex. These traits provide a rationale for the modulation of the recognition by endonuclease IV.