1
Rooman M, Pucci F. Estimating the Vertical Ionization Potential of Single-Stranded DNA Molecules. J Chem Inf Model 2023; 63:1766-1775. [PMID: 36877828 DOI: 10.1021/acs.jcim.2c01525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
Abstract
The electronic properties of DNA molecules, defined by the sequence-dependent ionization potentials of nucleobases, enable long-range charge transport along the DNA stacks. This has been linked to a range of key physiological processes in the cells and to the triggering of nucleobase substitutions, some of which may cause diseases. To gain molecular-level understanding of the sequence dependence of these phenomena, we estimated the vertical ionization potential (vIP) of all possible nucleobase stacks in B-conformation, containing one to four Gua, Ade, Thy, Cyt, or methylated Cyt. To do this, we used quantum chemistry calculations and more precisely the second-order Møller-Plesset perturbation theory (MP2) and three double-hybrid density functional theory methods, combined with several basis sets for describing atomic orbitals. The calculated vIP of single nucleobases were compared to experimental data and those of nucleobase pairs, triplets, and quadruplets, to observed mutability frequencies in the human genome, reported to be correlated with vIP values. This comparison selected MP2 with the 6-31G* basis set as the best of the tested calculation levels. These results were exploited to set up a recursive model, called vIPer, which estimates the vIP of all possible single-stranded DNA sequences of any length based on the calculated vIPs of overlapping quadruplets. vIPer's vIP values correlate well with oxidation potentials measured by cyclic voltammetry and activities obtained through photoinduced DNA cleavage experiments, further validating our approach. vIPer is freely available on the github.com/3BioCompBio/vIPer repository.
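The abstract states that vIPer builds its estimate from the calculated vIPs of overlapping quadruplets. The sketch below shows only that decomposition step, splitting a sequence into overlapping windows of four bases. It does not reproduce vIPer's recursion or any vIP values; the function name and the restriction to the four unmethylated bases are assumptions made for illustration.

```python
def overlapping_quadruplets(seq: str) -> list[str]:
    """Return every overlapping 4-base window of a single-stranded DNA sequence.

    Hypothetical helper: it illustrates the 'overlapping quadruplets' idea only
    and is not taken from the vIPer code base. Methylated Cyt is not handled.
    """
    seq = seq.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    # A sequence of length n yields n - 3 windows; shorter sequences yield none.
    return [seq[i:i + 4] for i in range(len(seq) - 3)]


print(overlapping_quadruplets("GATTACA"))
```

Sequences of one to three bases produce no window here; according to the abstract, stacks of that size were calculated directly.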
Affiliation(s)
- Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, 1050 Brussels, Belgium
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, 1050 Brussels, Belgium
2
Auboeuf D. The Physics-Biology continuum challenges darwinism: Evolution is directed by the homeostasis-dependent bidirectional relation between genome and phenotype. Prog Biophys Mol Biol 2021; 167:121-139. [PMID: 34097984 DOI: 10.1016/j.pbiomolbio.2021.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/19/2021] [Revised: 05/19/2021] [Accepted: 05/31/2021] [Indexed: 10/21/2022]
Abstract
The physics-biology continuum relies on the fact that life emerged from prebiotic molecules. Here, I argue that life emerged from the coupling between nucleic acid and protein synthesis during which proteins (or proto-phenotypes) maintained the physicochemical parameter equilibria (or proto-homeostasis) in the proximity of their encoding nucleic acids (or proto-genomes). This protected the proto-genome physicochemical integrity (i.e., atomic composition) from environmental physicochemical constraints, and therefore increased the probability of reproducing the proto-genome without variation. From there, genomes evolved depending on the biological activities they generated in response to environmental fluctuations. Thus, a genome maintaining homeostasis (i.e., internal physicochemical parameter equilibria), despite and in response to environmental fluctuations, maintains its physicochemical integrity and has therefore a higher probability to be reproduced without variation. Consequently, descendants have a higher probability to share the same phenotype than their parents. Otherwise, the genome is modified during replication as a consequence of the imbalance of the internal physicochemical parameters it generates, until new mutation-deriving biological activities maintain homeostasis in offspring. In summary, evolution depends on feedforward and feedback loops between genome and phenotype, as the internal physicochemical conditions that a genome generates (through its derived phenotype in response to environmental fluctuations) in turn either guarantee its stability or direct its variation. Evolution may not be explained by the Darwinism-derived, unidirectional principle (random mutations-phenotypes-natural selection) but rather by the bidirectional relationship between genome and phenotype, in which the phenotype in interaction with the environment directs the evolution of the genome it derives from.
Affiliation(s)
- Didier Auboeuf
- ENS de Lyon, Univ Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, 46 Allée D'Italie, Site Jacques Monod, F-69007, Lyon, France.
3
Atilano SR, Udar N, Satalich TA, Udar V, Chwa M, Kenney MC. Low frequency mitochondrial DNA heteroplasmy SNPs in blood, retina, and [RPE+choroid] of age-related macular degeneration subjects. PLoS One 2021; 16:e0246114. [PMID: 33513185 PMCID: PMC7846006 DOI: 10.1371/journal.pone.0246114] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/25/2020] [Accepted: 01/13/2021] [Indexed: 01/07/2023] Open
Abstract
Purpose: Mitochondrial (mt) DNA damage is associated with age-related macular degeneration (AMD) and other human aging diseases. This study was designed to quantify and characterize mtDNA low-frequency heteroplasmy single nucleotide polymorphisms (SNPs) of three different tissues isolated from AMD subjects using Next Generation Sequencing (NGS) technology.
Methods: DNA was extracted from neural retina, [RPE+choroid] and blood from three deceased age-related macular degeneration (AMD) subjects. Entire mitochondrial genomes were analyzed for low-frequency heteroplasmy SNPs using NGS technology that independently sequenced both mtDNA strands. This deep sequencing method (average sequencing depth of 30,000; range 1,000–100,000) can accurately differentiate low-frequency heteroplasmy SNPs from DNA modification artifacts. Twenty-three ‘hot-spot’ heteroplasmy mtDNA SNPs were analyzed in 222 additional blood samples.
Results: Germline homoplasmy SNPs that defined mtDNA haplogroups were consistent in the three tissues of each subject. Analyses of SNPs with <40% heteroplasmy revealed the blood had significantly greater numbers of heteroplasmy SNPs than retina alone (p≤0.05) or retina+choroid combined (p = 0.008). Twenty-three ‘hot-spot’ mtDNA heteroplasmy SNPs were present, with three being non-synonymous (amino acid change). Four ‘hot-spot’ heteroplasmy SNPs (m.1120C>T, m.1284T>C, m.1556C>T, m.7256C>T) were found in additional samples (n = 222). Five heteroplasmy SNPs (m.4104A>G, m.5320C>T, m.5471G>A, m.5474A>G, m.5498A>G) declined with age. Two heteroplasmy SNPs (m.13095T>C, m.13105A>G) increased in AMD compared to Normal samples. In the heteroplasmy SNPs, very few transversion mutations (purine to pyrimidine or vice versa, associated with oxidative damage) were found and the majority were transition changes (purine to purine or pyrimidine to pyrimidine, associated with replication errors).
Conclusion: Within an individual, the blood, retina and [RPE+choroid] contained identical homoplasmy SNPs representing inherited germline mtDNA haplogroup. NGS methodology showed significantly more mtDNA heteroplasmy SNPs in blood compared to retina and [RPE+choroid], suggesting the latter tissues have substantial protection. Significantly higher heteroplasmy levels of m.13095T>C and m.13105A>G may represent potential AMD biomarkers. Finally, high levels of transition mutations suggest that accumulation of heteroplasmic SNPs may occur through replication errors rather than oxidative damage.
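The transition/transversion distinction used in this abstract (purine to purine or pyrimidine to pyrimidine versus purine to pyrimidine or the reverse) can be sketched in a few lines. The function name and the accepted notation (`m.<position><ref>><alt>`) are assumptions chosen to match the SNP labels quoted above; this is not the authors' pipeline.

```python
import re

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(variant: str) -> str:
    """Label an mtDNA SNP such as 'm.1120C>T' as 'transition' or 'transversion'.

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r"m\.(\d+)([ACGT])>([ACGT])", variant)
    if match is None:
        raise ValueError(f"unrecognized variant notation: {variant!r}")
    ref, alt = match.group(2), match.group(3)
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # A transition keeps the base within its chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# SNPs named in the abstract.
reported = [
    "m.1120C>T", "m.1284T>C", "m.1556C>T", "m.7256C>T",
    "m.4104A>G", "m.5320C>T", "m.5471G>A", "m.5474A>G",
    "m.5498A>G", "m.13095T>C", "m.13105A>G",
]
for snp in reported:
    print(snp, classify_substitution(snp))
```

Every SNP named in the abstract falls in the transition class under this rule, in line with the authors' observation that transversions were rare.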
Affiliation(s)
- Shari R. Atilano
- Gavin Herbert Eye Institute, University of California Irvine, Irvine, CA, United States of America
- Nitin Udar
- Gavin Herbert Eye Institute, University of California Irvine, Irvine, CA, United States of America
- Timothy A. Satalich
- Institute for Mathematical Behavioral Science, University of California Irvine, Irvine, CA, United States of America
- Viraat Udar
- Gavin Herbert Eye Institute, University of California Irvine, Irvine, CA, United States of America
- Marilyn Chwa
- Gavin Herbert Eye Institute, University of California Irvine, Irvine, CA, United States of America
- M. Cristina Kenney
- Gavin Herbert Eye Institute, University of California Irvine, Irvine, CA, United States of America
- Department of Pathology and Laboratory Medicine, University of California Irvine, Irvine, CA, United States of America
4
Niccum BA, Coplen CP, Lee H, Mohammed Ismail W, Tang H, Foster PL. New complexities of SOS-induced "untargeted" mutagenesis in Escherichia coli as revealed by mutation accumulation and whole-genome sequencing. DNA Repair (Amst) 2020; 90:102852. [PMID: 32388005 DOI: 10.1016/j.dnarep.2020.102852] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/16/2020] [Revised: 03/19/2020] [Accepted: 04/06/2020] [Indexed: 01/23/2023]
Abstract
When its DNA is damaged, Escherichia coli induces the SOS response, which consists of about 40 genes that encode activities to repair or tolerate the damage. Certain alleles of the major SOS-control genes, recA and lexA, cause constitutive expression of the response, resulting in an increase in spontaneous mutations. These mutations, historically called "untargeted", have been the subject of many previous studies. Here we re-examine SOS-induced mutagenesis using mutation accumulation followed by whole-genome sequencing (MA/WGS), which allows a detailed picture of the types of mutations induced as well as their sequence-specificity. Our results confirm previous findings that SOS expression specifically induces transversion base-pair substitutions, with rates averaging about 60-fold above wild-type levels. Surprisingly, the rates of G:C to C:G transversions, normally an extremely rare mutation, were induced an average of 160-fold above wild-type levels. The SOS-induced transversions showed strong sequence specificity, the most extreme of which was the G:C to C:G transversions, 60% of which occurred at the middle base of 5'GGC3'+5'GCC3' sites, although these sites represent only 8% of the G:C base pairs in the genome. SOS-induced transversions were also DNA strand-biased, occurring, on average, 2 to 4 times more often when the purine was on the leading-strand template and the pyrimidine on the lagging-strand template than in the opposite orientation. However, the strand bias was also sequence specific, and even of reverse orientation at some sites. By eliminating constraints on the mutations that can be recovered, the MA/WGS protocol revealed new complexities of SOS "untargeted" mutations.
Affiliation(s)
- Brittany A Niccum
- Department of Biology, Indiana University, Bloomington, IN, 47405, USA
- Heewook Lee
- Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN, 47405, USA
- Wazim Mohammed Ismail
- Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN, 47405, USA
- Haixu Tang
- Luddy School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN, 47405, USA
- Patricia L Foster
- Department of Biology, Indiana University, Bloomington, IN, 47405, USA.
5
Auboeuf D. Physicochemical Foundations of Life that Direct Evolution: Chance and Natural Selection are not Evolutionary Driving Forces. Life (Basel) 2020; 10:life10020007. [PMID: 31973071 PMCID: PMC7175370 DOI: 10.3390/life10020007] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/22/2019] [Revised: 01/15/2020] [Accepted: 01/16/2020] [Indexed: 12/11/2022] Open
Abstract
The current framework of evolutionary theory postulates that evolution relies on random mutations generating a diversity of phenotypes on which natural selection acts. This framework was established using a top-down approach as it originated from Darwinism, which is based on observations made of complex multicellular organisms and, then, modified to fit a DNA-centric view. In this article, it is argued that based on a bottom-up approach starting from the physicochemical properties of nucleic and amino acid polymers, we should reject the facts that (i) natural selection plays a dominant role in evolution and (ii) the probability of mutations is independent of the generated phenotype. It is shown that the adaptation of a phenotype to an environment does not correspond to organism fitness, but rather corresponds to maintaining the genome stability and integrity. In a stable environment, the phenotype maintains the stability of its originating genome and both (genome and phenotype) are reproduced identically. In an unstable environment (i.e., corresponding to variations in physicochemical parameters above a physiological range), the phenotype no longer maintains the stability of its originating genome, but instead influences its variations. Indeed, environment- and cellular-dependent physicochemical parameters define the probability of mutations in terms of frequency, nature, and location in a genome. Evolution is non-deterministic because it relies on probabilistic physicochemical rules, and evolution is driven by a bidirectional interplay between genome and phenotype in which the phenotype ensures the stability of its originating genome in a cellular and environmental physicochemical parameter-depending manner.
Affiliation(s)
- Didier Auboeuf
- Laboratory of Biology and Modelling of the Cell, Univ Lyon, ENS de Lyon, Univ Claude Bernard, CNRS UMR 5239, INSERM U1210, 46 Allée d'Italie, Site Jacques Monod, F-69007, Lyon, France
6
Bacolla A, Ye Z, Ahmed Z, Tainer JA. Cancer mutational burden is shaped by G4 DNA, replication stress and mitochondrial dysfunction. Prog Biophys Mol Biol 2019; 147:47-61. [PMID: 30880007 PMCID: PMC6745008 DOI: 10.1016/j.pbiomolbio.2019.03.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 10/23/2018] [Revised: 03/08/2019] [Accepted: 03/12/2019] [Indexed: 02/01/2023]
Abstract
A hallmark of cancer is genomic instability, which can enable cancer cells to evade therapeutic strategies. Here we employed a computational approach to uncover mechanisms underlying cancer mutational burden by focusing upon relationships between 1) translocation breakpoints and the thousands of G4 DNA-forming sequences within retrotransposons impacting transcription and exemplifying probable non-B DNA structures and 2) transcriptome profiling and cancer mutations. We determined the location and number of G4 DNA-forming sequences in the Genome Reference Consortium Human Build 38 and found a total of 358,605 covering ∼13.4 million bases. By analyzing >97,000 unique translocation breakpoints from the Catalogue Of Somatic Mutations In Cancer (COSMIC), we found that breakpoints are overrepresented at G4 DNA-forming sequences within hominid-specific SVA retrotransposons, and generally occur in tumors with mutations in tumor suppressor genes, such as TP53. Furthermore, correlation analyses between mRNA levels and exome mutational loads from The Cancer Genome Atlas (TCGA) encompassing >450,000 gene-mutation regressions revealed strong positive and negative associations, which depended upon tissue of origin. The strongest positive correlations originated from genes not listed as cancer genes in COSMIC; yet, these show strong predictive power for survival in most tumor types by Kaplan-Meier estimation. Thus, correlation analyses of DNA structure and gene expression with mutation loads complement and extend more traditional approaches to elucidate processes shaping genomic instability in cancer. The combined results point to G4 DNA, activation of cell cycle/DNA repair pathways, and mitochondrial dysfunction as three major factors driving the accumulation of somatic mutations in cancer cells.
Affiliation(s)
- Albino Bacolla
- Departments of Cancer Biology and of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA.
- Zu Ye
- Departments of Cancer Biology and of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA.
- Zamal Ahmed
- Departments of Cancer Biology and of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA.
- John A Tainer
- Departments of Cancer Biology and of Molecular and Cellular Oncology, The University of Texas M.D. Anderson Cancer Center, 6767 Bertner Avenue, Houston, TX, 77030, USA.
7
Pucci F, Rooman M. Relation between DNA ionization potentials, single base substitutions and pathogenic variants. BMC Genomics 2019; 20:551. [PMID: 31307386 PMCID: PMC6631442 DOI: 10.1186/s12864-019-5867-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/11/2022] Open
Abstract
Background: It is nowadays clear that single base substitutions that occur in the human genome, of which some lead to pathogenic conditions, are non-random and influenced by their flanking nucleobase sequences. However, despite recent progress, the understanding of these "non-local" effects is still far from being achieved.
Results: To advance this problem, we analyzed the relationship between the base mutability in specific gene regions and the electron hole transport along the DNA base stacks, as it is one of the mechanisms that have been suggested to contribute to these effects. More precisely, we studied the connection between the normalized frequency of single base substitutions and the vertical ionization potential of the base and its flanking sequence, estimated using MP2/6-31G* ab initio quantum chemistry calculations. We found a statistically significant overall anticorrelation between these two quantities: the lower the vIP value, the more probable the substitution. Moreover, the slope of the regression lines varies. It is larger for introns than for exons and untranslated regions, and for synonymous than for missense substitutions. Interestingly, the correlation appears to be more pronounced when considering the flanking sequence of the substituted base in the 3’ rather than in the 5’ direction, which corresponds to the preferred direction of charge migration. A weaker but still statistically significant correlation is found between the ionization potentials and the pathogenicity of the base substitutions. Moreover, pathogenicity is also preferentially associated with larger changes in ionization potentials upon base substitution.
Conclusions: With this analysis we gained new insights into the complex biophysical mechanisms that are at the basis of mutagenesis and pathogenicity, and supported the role of electron-hole transport in these matters.
Electronic supplementary material: The online version of this article (10.1186/s12864-019-5867-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave. 50, Bruxelles, 1050, Belgium; John von Neumann Institute for Computing, Jülich Supercomputer Centre, Forschungszentrum Jülich, Jülich, 52428, Germany
- Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave. 50, Bruxelles, 1050, Belgium.
8
Singh V, Jolly B, Rajput NK, Pramanik S, Bhardwaj A. MtBrowse: An integrative genomics browser for human mitochondrial DNA. Mitochondrion 2019; 48:31-36. [PMID: 30738202 DOI: 10.1016/j.mito.2019.02.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/12/2018] [Revised: 11/10/2018] [Accepted: 02/06/2019] [Indexed: 12/18/2022]
Abstract
The human mitochondrion is a unique semi-autonomous organelle with a genome of its own that also requires nuclear-encoded components to carry out its functions. In addition to being the powerhouse of the cell, mitochondria play a central role in several metabolic pathways. It is therefore challenging to delineate the cause-effect relationship in the context of mitochondrial dysfunction. Several studies implicate mutations in mitochondrial DNA (mtDNA) in various complex diseases. The human mtDNA encodes a set of 37 genes (13 protein-coding genes, 22 tRNAs and two ribosomal RNAs), which are essential structural and functional components of the electron transport chain. As mentioned above, variations in these genes have been implicated in a broad spectrum of diseases and are extensively reported in literature and various databases. A large number of databases and prediction methods have been published to elucidate the role of human mitochondrial DNA in various disease phenotypes. However, there is no centralized resource to visualize this genotype-phenotype data. To this end, we have developed MtBrowse: an integrative genomics browser for human mtDNA. As of now, MtBrowse has four categories - Gene, Disease, Reported variation and Variation prediction. These categories have 105 tracks and house data on mitochondrial reference genes, around 600 variants reported in literature with respect to various disease phenotypes and predictions for potential pathogenic variations in protein-coding genes. MtBrowse also hosts genomic variation data from over 5000 individuals on 22 disease phenotypes. MtBrowse may be accessed at http://ab-openlab.csir.res.in/cgi-bin/gb2/gbrowse.
Affiliation(s)
- Vipin Singh
- University Institute of Biotechnology, Chandigarh University, Mohali, India
- Bani Jolly
- Bioinformatics Center, CSIR-Institute of Microbial Technology, Chandigarh, India
- Neeraj K Rajput
- Bioinformatics Center, CSIR-Institute of Microbial Technology, Chandigarh, India
- Sayan Pramanik
- Department of Biotechnology, Indian Institute of Technology, Kharagpur, India
- Anshu Bhardwaj
- Bioinformatics Center, CSIR-Institute of Microbial Technology, Chandigarh, India.