1
Shoer S, Reicher L, Zhao C, Pollard KS, Pilpel Y, Segal E. Pangenomes of human gut microbiota uncover links between genetic diversity and stress response. Cell Host Microbe 2024; 32:1744-1757.e2. [PMID: 39353429 DOI: 10.1016/j.chom.2024.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 07/11/2024] [Accepted: 08/23/2024] [Indexed: 10/04/2024]
Abstract
The genetic diversity of the gut microbiota has a central role in host health. Here, we created pangenomes for 728 human gut prokaryotic species, quadrupling the genes of strain-specific genomes. Each of these species has a core set of a thousand genes, differing even between closely related species, and an accessory set of genes unique to the different strains. Functional analysis shows high strain variability associates with sporulation, whereas low variability is linked with antibiotic resistance. We further map the antibiotic resistome across the human gut population and find 237 cases of extreme resistance even to last-resort antibiotics, with a predominance among Enterobacteriaceae. Lastly, the presence of specific genes in the microbiota relates to host age and sex. Our study underscores the genetic complexity of the human gut microbiota, emphasizing its significant implications for host health. The pangenomes and antibiotic resistance map constitute a valuable resource for further research.
Affiliation(s)
- Saar Shoer
  - Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel; Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel
- Lee Reicher
  - Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel; Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel; Lis Maternity and Women's Hospital, Sourasky Medical Center, Tel Aviv, Israel
- Chunyu Zhao
  - Gladstone Institute for Data Science and Biotechnology, San Francisco, CA, USA; Chan Zuckerberg Biohub San Francisco, San Francisco, CA, USA
- Katherine S Pollard
  - Gladstone Institute for Data Science and Biotechnology, San Francisco, CA, USA; Chan Zuckerberg Biohub San Francisco, San Francisco, CA, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
- Yitzhak Pilpel
  - Department of Molecular Genetics, The Weizmann Institute of Science, Rehovot, Israel
- Eran Segal
  - Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel; Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel.
2
Khandia R, Garg R, Pandey MK, Khan AA, Dhanda SK, Malik A, Gurjar P. Determination of codon pattern and evolutionary forces acting on genes linked to inflammatory bowel disease. Int J Biol Macromol 2024; 278:134480. [PMID: 39116987 DOI: 10.1016/j.ijbiomac.2024.134480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Revised: 06/25/2024] [Accepted: 07/31/2024] [Indexed: 08/10/2024]
Abstract
Inflammatory bowel disease (IBD) is an inflammatory disorder of the gastrointestinal tract. The present study attempted to understand the codon usage preferences in genes associated with IBD progression. Compositional analysis, codon usage bias (CUB), relative synonymous codon usage (RSCU), RNA structure, and expression analysis were performed to obtain a comprehensive picture of codon usage in IBD genes. Compositional analysis of 62 IBD-associated genes revealed that G and T are the most and least abundant nucleotides, respectively. ApG, CpA, and TpG dinucleotides were overrepresented or randomly used, while ApC, CpG, GpT, and TpA dinucleotides were either underrepresented or randomly used in genes related to IBD. The codons influencing codon usage the most in IBD genes were CGC and AGG. A comparison of codon usage between IBD and pancreatitis (a non-IBD inflammatory disease) indicated that only the usage of codon CTG differed significantly between the two. In contrast, usage of the codons ATA, ACA, CGT, CAA, GTA, CCT, ATT, GCT, CGG, TTG, and CAG differed significantly between the IBD and housekeeping gene sets. The results suggest similar codon usage in at least two inflammatory disorders, IBD and pancreatitis. The analysis helps to understand the codon biology of IBD-associated genes, the factors affecting their expression, and their evolution. The study helps reveal the molecular patterns associated with IBD.
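The RSCU statistic named in this abstract is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for the same amino acid were used equally. The following is a minimal Python sketch of that standard definition under the standard genetic code; it is not the authors' pipeline, and the function name `rscu` and the toy sequence are illustrative assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper).

    RSCU = observed codon count / mean count over the codon's synonymous family.
    Stop codons, single-codon families (Met, Trp), and amino acids absent from
    the sequence are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    # Toy sequence: three CTG and one TTA, all encoding leucine.
    values = rscu("CTGCTGCTGTTA")
    print(round(values["CTG"], 2), round(values["TTA"], 2))
```

An RSCU value above 1 indicates a codon used more often than expected under equal synonymous usage, and a value below 1 indicates the opposite; the values within a family sum to the family size.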
Affiliation(s)
- Rekha Khandia
  - Department of Biochemistry and Genetics, Barkatullah University, Bhopal 462026, MP, India.
- Rajkumar Garg
  - Department of Biosciences, Barkatullah University, Bhopal 462026, MP, India
- Megha Katare Pandey
  - Translational Medicine Center, All India Institute of Medical Sciences, Bhopal 462020, MP, India.
- Azmat Ali Khan
  - Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh 11451, Saudi Arabia.
- Sandeep Kumar Dhanda
  - Department of Oncology, St Jude Children's Research Hospital, Memphis, TN 38105, USA
- Abdul Malik
  - Department of Pharmaceutics, College of Pharmacy, King Saud University, Riyadh 11451, Saudi Arabia.
- Pankaj Gurjar
  - Centre for Global Health Research, Saveetha Medical College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India; Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, Australia.
3
Krishnakant Kushwaha S, Wu Y, Leonardo Avila H, Anand A, Sicheritz-Pontén T, Millard A, Amol Marathe S, Nobrega FL. Comprehensive blueprint of Salmonella genomic plasticity identifies hotspots for pathogenicity genes. PLoS Biol 2024; 22:e3002746. [PMID: 39110680 PMCID: PMC11305592 DOI: 10.1371/journal.pbio.3002746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 07/10/2024] [Indexed: 08/10/2024] Open
Abstract
Understanding the dynamic evolution of Salmonella is vital for effective bacterial infection management. This study explores the role of the flexible genome, organised in regions of genomic plasticity (RGP), in shaping the pathogenicity of Salmonella lineages. Through comprehensive genomic analysis of 12,244 Salmonella spp. genomes covering 2 species, 6 subspecies, and 46 serovars, we uncover distinct integration patterns of pathogenicity-related gene clusters into RGP, challenging traditional views of gene distribution. These RGP exhibit distinct preferences for specific genomic spots, and the presence or absence of such spots across Salmonella lineages profoundly shapes strain pathogenicity. RGP preferences are guided by conserved flanking genes surrounding integration spots, implicating their involvement in regulatory networks and functional synergies with integrated gene clusters. Additionally, we emphasise the multifaceted contributions of plasmids and prophages to the pathogenicity of diverse Salmonella lineages. Overall, this study provides a comprehensive blueprint of the pathogenicity potential of Salmonella. This unique insight identifies genomic spots in nonpathogenic lineages that hold the potential for harbouring pathogenicity genes, providing a foundation for predicting future adaptations and developing targeted strategies against emerging human pathogenic strains.
Affiliation(s)
- Simran Krishnakant Kushwaha
  - Department of Biological Sciences, Birla Institute of Technology & Science (BITS), Pilani, Rajasthan, India
  - School of Biological Sciences, University of Southampton, Southampton, United Kingdom
- Yi Wu
  - School of Biological Sciences, University of Southampton, Southampton, United Kingdom
- Hugo Leonardo Avila
  - Laboratory for Applied Science and Technology in Health, Instituto Carlos Chagas, FIOCRUZ Paraná, Brazil
- Abhirath Anand
  - Department of Computer Sciences and Information Systems, Birla Institute of Technology & Science (BITS), Pilani, Rajasthan, India
- Thomas Sicheritz-Pontén
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Centre of Excellence for Omics-Driven Computational Biodiscovery (COMBio), AIMST University, Bedong, Kedah, Malaysia
- Andrew Millard
  - Centre for Phage Research, Department of Genetics and Genome Biology, University of Leicester, Leicester, United Kingdom
- Sandhya Amol Marathe
  - Department of Biological Sciences, Birla Institute of Technology & Science (BITS), Pilani, Rajasthan, India
- Franklin L. Nobrega
  - School of Biological Sciences, University of Southampton, Southampton, United Kingdom
4
Hu Z, Chen J, Olatoye MO, Zhang H, Lin Z. Transcriptome-wide expression landscape and starch synthesis pathway co-expression network in sorghum. The Plant Genome 2024; 17:e20448. [PMID: 38602082 DOI: 10.1002/tpg2.20448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
The gene expression landscape across different tissues and developmental stages reflects their biological functions and evolutionary patterns. Integrative analyses of all transcriptomic data in an organism are instrumental to obtaining a comprehensive picture of its gene expression landscape. Such studies are still very limited in sorghum, which hampers the discovery of the genetic basis underlying complex agricultural traits in this crop. We characterized the genome-wide expression landscape for sorghum using 873 RNA-sequencing (RNA-seq) datasets representing 19 tissues. Our integrative analysis of these RNA-seq data provides the most comprehensive transcriptomic atlas for sorghum, which will be valuable to the sorghum research community for functional characterization of sorghum genes. Based on the transcriptome atlas, we identified 595 housekeeping genes (HKGs) and 2080 tissue-specific expression genes (TEGs) for the 19 tissues. We identified different gene features between HKGs and TEGs, and we found that HKGs have experienced stronger selective constraints than TEGs. Furthermore, we built a transcriptome-wide co-expression network (TW-CEN) comprising 35 modules, with each module enriched in specific Gene Ontology terms. High-connectivity genes in TW-CEN tend to be expressed at high levels while undergoing intensive selective pressure. We also built global and seed-preferential co-expression networks of starch synthesis pathways, which indicated that photosynthesis and microtubule-based movement play important roles in starch synthesis. The global transcriptome atlas of sorghum generated by this study provides an important functional genomics resource for trait discovery and insight into starch synthesis regulation in sorghum.
Affiliation(s)
- Zhenbin Hu
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
- Junhao Chen
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
- Marcus O Olatoye
  - USDA-ARS, Forage Seed and Cereal Research Unit, Prosser, Washington, USA
- Hengyou Zhang
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design and Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China
- Zhenguo Lin
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
5
McCoy MJ, Fire AZ. Parallel gene size and isoform expansion of ancient neuronal genes. Curr Biol 2024; 34:1635-1645.e3. [PMID: 38460513 PMCID: PMC11043017 DOI: 10.1016/j.cub.2024.02.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/16/2023] [Accepted: 02/11/2024] [Indexed: 03/11/2024]
Abstract
How nervous systems evolved is a central question in biology. A diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning hundreds of thousands of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. Although many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and, in many cases, the emergence of neurons as dedicated cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes potentially driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection, as demonstrated by low dN/dS ratios, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J McCoy
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA.
- Andrew Z Fire
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA; Department of Genetics, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA.
6
Lambourne L, Mattioli K, Santoso C, Sheynkman G, Inukai S, Kaundal B, Berenson A, Spirohn-Fitzgerald K, Bhattacharjee A, Rothman E, Shrestha S, Laval F, Yang Z, Bisht D, Sewell JA, Li G, Prasad A, Phanor S, Lane R, Campbell DM, Hunt T, Balcha D, Gebbia M, Twizere JC, Hao T, Frankish A, Riback JA, Salomonis N, Calderwood MA, Hill DE, Sahni N, Vidal M, Bulyk ML, Fuxman Bass JI. Widespread variation in molecular interactions and regulatory properties among transcription factor isoforms. bioRxiv 2024:2024.03.12.584681. [PMID: 38617209 PMCID: PMC11014633 DOI: 10.1101/2024.03.12.584681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Most human transcription factor (TF) genes encode multiple protein isoforms differing in DNA binding domains, effector domains, or other protein regions. The global extent to which this results in functional differences between isoforms remains unknown. Here, we systematically compared 693 isoforms of 246 TF genes, assessing DNA binding, protein binding, transcriptional activation, subcellular localization, and condensate formation. Relative to reference isoforms, two-thirds of alternative TF isoforms exhibit differences in one or more molecular activities, which often could not be predicted from sequence. We observed two primary categories of alternative TF isoforms: "rewirers" and "negative regulators", both of which were associated with differentiation and cancer. Our results support a model wherein the relative expression levels of, and interactions involving, TF isoforms add an understudied layer of complexity to gene regulatory networks, demonstrating the importance of isoform-aware characterization of TF functions and providing a rich resource for further studies.
Affiliation(s)
- Luke Lambourne
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Kaia Mattioli
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Clarissa Santoso
  - Department of Biology, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
- Gloria Sheynkman
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Sachi Inukai
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Babita Kaundal
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Anna Berenson
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
- Kerstin Spirohn-Fitzgerald
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Anukana Bhattacharjee
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Elisabeth Rothman
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Florent Laval
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Zhipeng Yang
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Deepa Bisht
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jared A Sewell
  - Department of Biology, Boston University, Boston, MA, USA
- Guangyuan Li
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Anisa Prasad
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Harvard College, Cambridge MA, USA
- Sabrina Phanor
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Ryan Lane
  - Department of Biology, Boston University, Boston, MA, USA
- Toby Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Dawit Balcha
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Marinella Gebbia
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
- Jean-Claude Twizere
  - TERRA Teaching and Research Centre, University of Liège, Gembloux, Belgium
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Tong Hao
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Adam Frankish
  - Laboratory of Viral Interactomes, GIGA Institute, University of Liège, Liège, Belgium
- Josh A Riback
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Nathan Salomonis
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Michael A Calderwood
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- David E Hill
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nidhi Sahni
  - Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Marc Vidal
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Martha L Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Juan I Fuxman Bass
  - Department of Biology, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Molecular Biology, Cell Biology & Biochemistry Program, Boston University, Boston, MA, USA
7
Liu Y, Xu W, Yang P, Liu X. Revealing Molecular Patterns of Alzheimer's Disease Risk Gene Expression Signatures in COVID-19 Brains. J Alzheimers Dis 2024; 101:31-48. [PMID: 39058446 DOI: 10.3233/jad-240609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2024]
Abstract
BACKGROUND Various virus infections are known to predispose to Alzheimer's disease (AD), and a linkage between COVID-19 and AD has been established. COVID-19 infection modulates the expression of genes implicated in the progression of AD. OBJECTIVE The study aimed to determine molecular patterns and to analyze codon usage and codon context in genes that are modulated during COVID-19 infection and implicated in AD. METHODS Our study employed relative synonymous codon usage, codon adaptation index analysis, neutrality and parity analysis, rare codon analysis, and codon context analysis to determine the molecular patterns present in genes up- or downregulated during COVID-19 infection. RESULTS G/C-ending codons were preferred in upregulated genes but not in downregulated genes, and in both gene sets, longer genes have high expressivity. Similarly, T was preferred over A, and selection was the major evolutionary force shaping codon usage in both gene sets. Apart from stop codons, codons CGU - Arg, AUA - Ile, UUA - Leu, UCG - Ser, GUA - Val, and CGA - Arg in upregulated genes, and CUA - Leu, UCG - Ser, and UUA - Leu in downregulated genes, were present at frequencies below 0.5%. Glutamine-initiated codon pairs have high residual values in upregulated genes. Identical codon pairs GAG-GAG and GUG-GUG were preferred in both gene sets. CONCLUSIONS The shared and unique molecular features in the up- and downregulated gene sets provide insights into the complex interplay between COVID-19 infection and AD. Further studies are required to elucidate the relationship of these molecular patterns with AD pathology.
Affiliation(s)
- Yan Liu
  - Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Third Hospital of Shanxi Medical University, Tongji Shanxi Hospital, Taiyuan, China
- Weiyue Xu
  - Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Third Hospital of Shanxi Medical University, Tongji Shanxi Hospital, Taiyuan, China
- Pan Yang
  - TEDA institute of Biological Science and Biotechnology, Nankai University, TEDA, Tianjin, China
- Xingshun Liu
  - Department of Biochemistry and Molecular Biology, Shanxi Medical University, Taiyuan, China
8
Gurjar P, Khan AA, Alanazi AM, Vasil'ev VG, Zouganelis G, Alexiou A. Molecular Dissection of Herpes Simplex Virus Type 1 to Elucidate Molecular Mechanisms Behind Latency and Comparison of Its Codon Usage Patterns with Genes Modulated During Alzheimer's Disease as a Part of Host-Pathogen Interaction. J Alzheimers Dis 2024; 97:1111-1123. [PMID: 38306057 DOI: 10.3233/jad-231083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
BACKGROUND Herpes simplex virus type 1 (HSV-1), which goes through cycles of latency and reactivation, is associated with Alzheimer's disease. The present study was designed to understand the reasons for latency and the specific molecular patterns present in HSV-1. OBJECTIVE The objective is the molecular dissection of herpes simplex virus type 1 to elucidate molecular mechanisms behind latency and to compare its codon usage patterns with those of genes modulated during Alzheimer's disease as a part of host-pathogen interaction. METHODS In the present study, we investigated the potential reasons for the latency of HSV-1 bioinformatically by determining the CpG patterns. We also investigated the codon usage pattern, the presence of rare codons, codon context, and protein properties. RESULTS The top 222 codon pairs, ranked by their frequency in the HSV-1 genome, revealed that with only one exception (CUG-UUU), all codon pairs have codons ending with G/C. Considering it an extension of host-pathogen interaction, we compared HSV-1 codon usage with the codon usage of genes modulated during Alzheimer's disease, and we found that CGT and TTT are the only two codons that exhibited similar usage patterns, while all other codons showed statistically highly significant differences in codon preference. Dinucleotide CpG tends to mutate to TpG, suggesting the presence of mutational forces and the imperative role of CpG methylation in HSV-1 latency. CONCLUSIONS Upon comparison of codon usage between HSV-1 and Alzheimer's disease genes, no similarities in codon usage were found as a part of host-pathogen interaction. CpG methylation plays an imperative role in HSV-1 latency.
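CpG patterns of the kind described in this abstract are often summarized with an observed/expected (odds) ratio: the frequency of the dinucleotide divided by the product of the frequencies of its two bases. The following is a minimal Python sketch of that common ratio, assuming a single-stranded count over one sequence; it is not the authors' method, and the function name `dinucleotide_odds_ratio` and the toy sequences are illustrative assumptions.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq, dinuc="CG"):
    """Observed/expected ratio f_xy / (f_x * f_y) for one dinucleotide.

    Hypothetical helper: frequencies are taken over the given strand only,
    with overlapping dinucleotide windows counted.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    base_counts = Counter(seq)
    pair_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)

    f_xy = pair_count / (n - 1)      # dinucleotide frequency
    f_x = base_counts[dinuc[0]] / n  # frequency of the first base
    f_y = base_counts[dinuc[1]] / n  # frequency of the second base
    if f_x == 0 or f_y == 0:
        return 0.0
    return f_xy / (f_x * f_y)


if __name__ == "__main__":
    # A ratio well below 1 would suggest CpG depletion; well above 1, enrichment.
    print(dinucleotide_odds_ratio("CGCG", "CG"))
    print(dinucleotide_odds_ratio("ACGT", "TG"))
```

Comparing the CpG and TpG ratios for the same sequence is one simple way to look for the CpG-to-TpG shift the abstract mentions, although any cutoff for calling a dinucleotide over- or underrepresented is a separate choice not specified here.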
Affiliation(s)
- Pankaj Gurjar
  - Centre for Global Health Research, Saveetha Medical College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India
  - Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, Australia
- Azmat Ali Khan
  - Department of Pharmaceutical Chemistry, Pharmaceutical Biotechnology Laboratory, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Amer M Alanazi
  - Department of Pharmaceutical Chemistry, Pharmaceutical Biotechnology Laboratory, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- George Zouganelis
  - School of Human Sciences, College of Life and Natural Sciences, University of Derby, Derby, UK
- Athanasios Alexiou
  - Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, Australia
  - AFNP Med, Vienna, Austria
9
Abstract
Transcription and replication both require large macromolecular complexes to act on a DNA template, yet these machineries cannot simultaneously act on the same DNA sequence. Conflicts between the replication and transcription machineries (transcription-replication conflicts, or TRCs) are widespread in both prokaryotes and eukaryotes and have the capacity to both cause DNA damage and compromise complete, faithful replication of the genome. This review will highlight recent studies investigating the genomic locations of TRCs and the mechanisms by which they may be prevented, mitigated, or resolved. We address work from both model organisms and mammalian systems but predominantly focus on multicellular eukaryotes owing to the additional complexities inherent in the coordination of replication and transcription in the context of cell type-specific gene expression and higher-order chromatin organization.
Affiliation(s)
- Liana Goehring
  - Department of Biochemistry & Molecular Pharmacology, New York University School of Medicine, New York, NY, USA
- Tony T Huang
  - Department of Biochemistry & Molecular Pharmacology, New York University School of Medicine, New York, NY, USA
- Duncan J Smith
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
10
Balasooriya GI, Wee TL, Spector DL. A sub-set of guanine- and cytosine-rich genes are actively transcribed at the nuclear Lamin B1 region. bioRxiv 2023:2023.10.28.564411. [PMID: 37961255 PMCID: PMC10634887 DOI: 10.1101/2023.10.28.564411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Chromatin organization in the mammalian cell nucleus plays a vital role in the regulation of gene expression. The lamina-associated domain at the inner nuclear membrane has been proposed to harbor heterochromatin, while the nuclear interior has been shown to contain most of the euchromatin. Here, we show that a sub-set of actively transcribing genes, marked by RNA Pol II pSer2, are associated with Lamin B1 at the inner nuclear envelope in mESCs, and the number of such genes increases proportionally upon in vitro differentiation of mESCs to olfactory precursor cells. These nuclear periphery-associated actively transcribing genes primarily represent housekeeping genes, and their gene bodies are significantly enriched with guanine and cytosine compared to genes actively transcribed at the nuclear interior. We found the promoters of these genes to also be significantly enriched with guanine and to be predominantly regulated by zinc finger protein transcription factors. We provide evidence supporting the emerging notion that the Lamin B1 region is not solely transcriptionally silent.
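Guanine and cytosine enrichment of a gene body, as discussed in this abstract, is usually reported as GC content: the percentage of unambiguous bases that are G or C. The following is a minimal Python sketch of that calculation, assuming ambiguous symbols such as N are excluded from the denominator; the function name `gc_content` and the toy sequences are illustrative assumptions and are not taken from the study.

```python
def gc_content(seq):
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Hypothetical helper: ambiguous symbols such as N are ignored rather than
    counted, which is one common convention among several.
    """
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy comparison of two made-up sequences, not real gene bodies.
    print(gc_content("GGCCAATT"))
    print(gc_content("ATGCNN"))
```

Computing this value per gene body and comparing the distributions of two gene sets would be one straightforward way to express the kind of enrichment described above; the statistical test used for such a comparison is a separate choice.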
Collapse
|
11
|
Bai MZ, Guo YY. Bioinformatics Analysis of MSH1 Genes of Green Plants: Multiple Parallel Length Expansions, Intron Gains and Losses, Partial Gene Duplications, and Alternative Splicing. Int J Mol Sci 2023; 24:13620. [PMID: 37686425 PMCID: PMC10487979 DOI: 10.3390/ijms241713620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Revised: 08/28/2023] [Accepted: 08/29/2023] [Indexed: 09/10/2023] Open
Abstract
MutS homolog 1 (MSH1) is involved in the recombination and repair of organelle genomes and is essential for maintaining their stability. Previous studies indicated that the length of the gene varied greatly among species and detected species-specific partial gene duplications in Physcomitrella patens. However, there are critical gaps in the understanding of the gene size expansion, and the extent of the partial gene duplication of MSH1 remains unclear. Here, we screened MSH1 genes in 85 selected species with genome sequences representing the main clades of green plants (Viridiplantae). We identified the MSH1 gene in all lineages of green plants, except for nine incomplete species, for bioinformatics analysis. The gene is a singleton gene in most of the selected species with conserved amino acids and protein domains. Gene length varies greatly among the species, ranging from 3234 bp in Ostreococcus tauri to 805,861 bp in Cycas panzhihuaensis. The expansion of MSH1 repeatedly occurred in multiple clades, especially in Gymnosperms, Orchidaceae, and Chloranthus spicatus. MSH1 has exceptionally long introns in certain species due to the gene length expansion, and the longest intron reaches 101,025 bp. Gene length is also positively correlated with the proportion of transposable elements (TEs) in the introns. In addition, gene structure analysis indicated that the MSH1 of green plants had undergone parallel intron gains and losses in all major lineages. However, the intron number of seed plants (gymnosperm and angiosperm) is relatively stable. All the selected gymnosperms contain 22 introns except for Gnetum montanum and Welwitschia mirabilis, while all the selected angiosperm species preserve 21 introns except for the ANA grade. Notably, the coding region of MSH1 in algae presents an exceptionally high GC content (47.7% to 75.5%). Moreover, over one-third of the selected species contain species-specific partial gene duplications of MSH1, except for the conserved mosses-specific partial gene duplication. Additionally, we found conserved alternatively spliced MSH1 transcripts in five species. The study of MSH1 sheds light on the evolution of the long genes of green plants.
Affiliation(s)
| | - Yan-Yan Guo
- College of Plant Protection, Henan Agricultural University, Zhengzhou 450046, China
|
12
|
Jain A, Begum T, Ahmad S. Analysis and Prediction of Pathogen Nucleic Acid Specificity for Toll-like Receptors in Vertebrates. J Mol Biol 2023; 435:168208. [PMID: 37479078 DOI: 10.1016/j.jmb.2023.168208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 06/20/2023] [Accepted: 07/13/2023] [Indexed: 07/23/2023]
Abstract
Identification of key sequence, expression and function related features of nucleic acid-sensing host proteins is of fundamental importance to understand the dynamics of pathogen-specific host responses. To meet this objective, we considered toll-like receptors (TLRs), a representative class of membrane-bound sensor proteins, from 17 vertebrate species covering mammals, birds, reptiles, amphibians, and fishes in this comparative study. We identified the molecular signatures of host TLRs that are responsible for sensing pathogen nucleic acids or other pathogen-associated molecular patterns (PAMPs), and potentially play important roles in host defence mechanism. Interestingly, our findings reveal that such host-specific features are directly related to the strand (single or double) specificity of nucleic acid from pathogens. However, during host-pathogen interactions, such features were unable to explain the pathogenic PAMP (i.e., DNA, RNA or other) selectivity, suggesting a more complex mechanism. Using these features, we developed a number of machine learning models, of which Random Forest achieved a high performance (94.57% accuracy) to predict strand specificity of TLRs from protein-derived features. We applied the trained model to propose strand specificity of some previously uncharacterized distinct fish-specific novel TLRs (TLR18, TLR23, TLR24, TLR25, TLR27).
Affiliation(s)
- Anuja Jain
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India. https://twitter.com/@Anuja334
| | - Tina Begum
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
| | - Shandar Ahmad
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
|
13
|
McCoy MJ, Fire AZ. Ancient origins of complex neuronal genes. bioRxiv 2023:2023.03.28.534655. [PMID: 37034725 PMCID: PMC10081198 DOI: 10.1101/2023.03.28.534655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
How nervous systems evolved is a central question in biology. An increasing diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning millions of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. While many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and in many cases the emergence of dedicated neuronal cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection as demonstrated by low dN/dS scores, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J. McCoy
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Whitman Center, Marine Biological Laboratory, Woods Hole, MA 02543, USA
| | - Andrew Z. Fire
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
|
14
|
Bogan SN, Strader ME, Hofmann GE. Associations between DNA methylation and gene regulation depend on chromatin accessibility during transgenerational plasticity. BMC Biol 2023; 21:149. [PMID: 37365578 DOI: 10.1186/s12915-023-01645-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/27/2022] [Accepted: 06/07/2023] [Indexed: 06/28/2023] Open
Abstract
BACKGROUND Epigenetic processes are proposed to be a mechanism regulating gene expression during phenotypic plasticity. However, environmentally induced changes in DNA methylation exhibit little-to-no association with differential gene expression in metazoans at a transcriptome-wide level. It remains unexplored whether associations between environmentally induced differential methylation and expression are contingent upon other epigenomic processes such as chromatin accessibility. We quantified methylation and gene expression in larvae of the purple sea urchin Strongylocentrotus purpuratus exposed to different ecologically relevant conditions during gametogenesis (maternal conditioning) and modeled changes in gene expression and splicing resulting from maternal conditioning as functions of differential methylation, incorporating covariates for genomic features and chromatin accessibility. We detected significant interactions between differential methylation, chromatin accessibility, and genic feature type associated with differential expression and splicing. RESULTS Differential gene body methylation had significantly stronger effects on expression among genes with poorly accessible transcriptional start sites while baseline transcript abundance influenced the direction of this effect. Transcriptional responses to maternal conditioning were 4-13 × more likely when accounting for interactions between methylation and chromatin accessibility, demonstrating that the relationship between differential methylation and gene regulation is partially explained by chromatin state. CONCLUSIONS DNA methylation likely possesses multiple associations with gene regulation during transgenerational plasticity in S. purpuratus and potentially other metazoans, but its effects are dependent on chromatin accessibility and underlying genic features.
Affiliation(s)
- Samuel N Bogan
- Department of Ecology, Evolution and Marine Biology, University of California Santa Barbara, Santa Barbara, USA.
| | - Marie E Strader
- Department of Ecology, Evolution and Marine Biology, University of California Santa Barbara, Santa Barbara, USA
- Department of Biology, Texas A&M University, College Station, USA
| | - Gretchen E Hofmann
- Department of Ecology, Evolution and Marine Biology, University of California Santa Barbara, Santa Barbara, USA
|
15
|
Fraimovitch E, Hagai T. Promoter evolution of mammalian gene duplicates. BMC Biol 2023; 21:80. [PMID: 37055747 PMCID: PMC10100218 DOI: 10.1186/s12915-023-01590-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 04/06/2023] [Indexed: 04/15/2023] Open
Abstract
BACKGROUND Gene duplication is thought to be a central process in evolution to gain new functions. The factors that dictate gene retention following duplication as well as paralog gene divergence in sequence, expression and function have been extensively studied. However, relatively little is known about the evolution of promoter regions of gene duplicates and how they influence gene duplicate divergence. Here, we focus on promoters of paralog genes, comparing their similarity in sequence, in the sets of transcription factors (TFs) that bind them, and in their overall promoter architecture. RESULTS We observe that promoters of recent duplications display higher sequence similarity between them and that sequence similarity rapidly declines between promoters of more ancient paralogs. In contrast, similarity in cis-regulation, as measured by the set of TFs that bind promoters of both paralogs, does not simply decrease with time from duplication and is instead related to promoter architecture: paralogs with CpG Islands (CGIs) in their promoters share a greater fraction of TFs, while CGI-less paralogs are more divergent in their TF binding set. Focusing on recent duplication events and partitioning them by their duplication mechanism enables us to uncover promoter properties associated with gene retention, as well as to characterize the evolution of promoters of newly born genes. In recent retrotransposition-mediated duplications, we observe asymmetry in cis-regulation of paralog pairs: retrocopy genes are lowly expressed and their promoters are bound by fewer TFs and are depleted of CGIs, in comparison with the original gene copy. Furthermore, looking at recent segmental duplication regions in primates enables us to compare successful retentions versus loss of duplicates, showing that duplicate retention is associated with fewer TFs and with CGI-less promoter architecture. CONCLUSIONS In this work, we profiled promoters of gene duplicates and their inter-paralog divergence. We also studied how their characteristics are associated with duplication time and duplication mechanism, as well as with the fate of these duplicates. These results underline the importance of cis-regulatory mechanisms in shaping the evolution of new genes and their fate following duplication.
Affiliation(s)
- Evgeny Fraimovitch
- Shmunis School of Biomedicine and Cancer Research, George S Wise Faculty of Life Sciences, Tel Aviv University, 69978, Tel Aviv, Israel
| | - Tzachi Hagai
- Shmunis School of Biomedicine and Cancer Research, George S Wise Faculty of Life Sciences, Tel Aviv University, 69978, Tel Aviv, Israel.
|
16
|
Khandia R, Pandey MK, Rzhepakovsky IV, Khan AA, Alexiou A. Synonymous Codon Variant Analysis for Autophagic Genes Dysregulated in Neurodegeneration. Mol Neurobiol 2023; 60:2252-2267. [PMID: 36637744 DOI: 10.1007/s12035-022-03081-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/03/2022] [Accepted: 09/27/2022] [Indexed: 01/14/2023]
Abstract
Neurodegenerative disorders are often a culmination of the accumulation of abnormally folded proteins and defective organelles. Autophagy is a process of removing these defective proteins, organelles, and harmful substances from the body, and it works to maintain homeostasis. If autophagic removal of defective proteins is interfered with, neuronal health is affected. Some of the autophagic genes are specifically found to be associated with neurodegenerative phenotypes. Non-functional or mutated gene copies, or copies carrying silent mutations (often termed synonymous variants), might explain this. However, these synonymous variants, which code for identical proteins, differ in translation rate, stability, and gene expression profile. Hence, it would be interesting to study the pattern of synonymous variant usage. In this study, synonymous variant usage is examined in various transcripts of the autophagic genes ATG5, ATG7, ATG8A, ATG16, and ATG17/FIP200, which are reported to cause neurodegeneration if dysregulated. These genes were analyzed for their synonymous variant usage; nucleotide composition; any possible nucleotide skew in a gene; physical properties of autophagic protein including GRAVY and AROMA; hydropathicity; instability index; frequency of acidic, basic, and neutral amino acids; and gene expression level. The study will help understand various evolutionary forces acting on these genes and the possible augmentation of a gene if showing unusual behavior.
Affiliation(s)
- Rekha Khandia
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, 462026, India.
| | - Megha Katare Pandey
- Department of Translational Medicine, All India Institute of Medical Sciences, Bhopal, 462020, India
| | | | - Azmat Ali Khan
- Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, 11451, Saudi Arabia.
| | - Athanasios Alexiou
- Novel Global Community Educational Foundation, Hebersham, Australia
- AFNP Med, Wien, Austria
|
17
|
Fajardo D, Saint Jean R, Lyons PJ. Acquisition of new function through gene duplication in the metallocarboxypeptidase family. Sci Rep 2023; 13:2512. [PMID: 36781897 PMCID: PMC9925722 DOI: 10.1038/s41598-023-29800-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Accepted: 02/10/2023] [Indexed: 02/15/2023] Open
Abstract
Gene duplication is a key first step in the process of expanding the functionality of a multigene family. In order to better understand the process of gene duplication and its role in the formation of new enzymes, we investigated recent duplication events in the M14 family of proteolytic enzymes. Within vertebrates, four of 23 M14 genes were frequently found in duplicate form. While AEBP1, CPXM1, and CPZ genes were duplicated once through a large-scale, likely whole-genome duplication event, the CPO gene underwent many duplication events within fish and Xenopus lineages. Bioinformatic analyses of enzyme specificity and conservation suggested a greater amount of neofunctionalization and purifying selection in CPO paralogs compared with other CPA/B enzymes. To examine the functional consequences of evolutionary changes on CPO paralogs, the four CPO paralogs from Xenopus tropicalis were expressed in Sf9 and HEK293T cells. Immunocytochemistry showed subcellular distribution of Xenopus CPO paralogs to be similar to that of human CPO. Upon activation with trypsin, the enzymes demonstrated differential activity against three substrates, suggesting an acquisition of new function following duplication and subsequent mutagenesis. Characteristics such as gene size and enzyme activation mechanisms are possible contributors to the evolutionary capacity of the CPO gene.
Affiliation(s)
- Daniel Fajardo
- Department of Biology, Andrews University, Berrien Springs, MI, 49104, USA
| | - Ritchie Saint Jean
- Department of Biology, Andrews University, Berrien Springs, MI, 49104, USA
| | - Peter J Lyons
- Department of Biology, Andrews University, Berrien Springs, MI, 49104, USA.
|
18
|
Li T, Yin L, Stoll CE, Lisch D, Zhao M. Conserved noncoding sequences and de novo Mutator insertion alleles are imprinted in maize. Plant Physiol 2023; 191:299-316. [PMID: 36173333 PMCID: PMC9806621 DOI: 10.1093/plphys/kiac459] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/21/2022] [Accepted: 08/30/2022] [Indexed: 05/20/2023]
Abstract
Genomic imprinting is an epigenetic phenomenon in which differential allele expression occurs in a parent-of-origin-dependent manner. Imprinting in plants is tightly linked to transposable elements (TEs), and it has been hypothesized that genomic imprinting may be a consequence of demethylation of TEs. Here, we performed high-throughput sequencing of ribonucleic acids from four maize (Zea mays) endosperms that segregated newly silenced Mutator (Mu) transposons and identified 110 paternally expressed imprinted genes (PEGs) and 139 maternally expressed imprinted genes (MEGs). Additionally, two potentially novel paternally suppressed MEGs are associated with de novo Mu insertions. In addition, we find evidence for parent-of-origin effects on expression of 407 conserved noncoding sequences (CNSs) in maize endosperm. The imprinted CNSs are largely localized within genic regions and near genes, but the imprinting status of the CNSs is largely independent of their associated genes. Both imprinted CNSs and PEGs have been subject to relaxed selection. However, our data suggest that although MEGs were already subject to a higher mutation rate prior to their being imprinted, imprinting may be the cause of the relaxed selection of PEGs. In addition, although DNA methylation is lower in the maternal alleles of both the maternally and paternally expressed CNSs (mat and pat CNSs), the difference between the two alleles in H3K27me3 levels was only observed in pat CNSs. Together, our findings point to the importance of both transposons and CNSs in genomic imprinting in maize.
Affiliation(s)
- Tong Li
- Department of Biology, Miami University, Oxford, Ohio 45056, USA
- State Key Laboratory of Plant Physiology and Biochemistry, National Maize Improvement Center, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, P.R. China
| | - Liangwei Yin
- Department of Biology, Miami University, Oxford, Ohio 45056, USA
| | - Claire E Stoll
- Department of Biology, Miami University, Oxford, Ohio 45056, USA
| | - Damon Lisch
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana 47907, USA
| | - Meixia Zhao
- Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, Florida 32611, USA
|
19
|
Stoeger T, Grant RA, McQuattie-Pimentel AC, Anekalla KR, Liu SS, Tejedor-Navarro H, Singer BD, Abdala-Valencia H, Schwake M, Tetreault MP, Perlman H, Balch WE, Chandel NS, Ridge KM, Sznajder JI, Morimoto RI, Misharin AV, Budinger GRS, Nunes Amaral LA. Aging is associated with a systemic length-associated transcriptome imbalance. Nat Aging 2022; 2:1191-1206. [PMID: 37118543 PMCID: PMC10154227 DOI: 10.1038/s43587-022-00317-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 03/26/2021] [Accepted: 10/21/2022] [Indexed: 12/14/2022]
Abstract
Aging is among the most important risk factors for morbidity and mortality. To contribute toward a molecular understanding of aging, we analyzed age-resolved transcriptomic data from multiple studies. Here, we show that transcript length alone explains most transcriptional changes observed with aging in mice and humans. We present three lines of evidence supporting the biological importance of the uncovered transcriptome imbalance. First, in vertebrates the length association primarily displays a lower relative abundance of long transcripts in aging. Second, eight antiaging interventions of the Interventions Testing Program of the National Institute on Aging can counter this length association. Third, we find that in humans and mice the genes with the longest transcripts enrich for genes reported to extend lifespan, whereas those with the shortest transcripts enrich for genes reported to shorten lifespan. Our study opens fundamental questions on aging and the organization of transcriptomes.
Affiliation(s)
- Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA.
- Center for Genetic Medicine, Northwestern University, Evanston, IL, USA.
| | - Rogan A Grant
- Department of Molecular Biosciences, Northwestern University, Evanston, IL, USA
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
| | | | - Kishore R Anekalla
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
| | - Sophia S Liu
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
| | | | - Benjamin D Singer
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
- Department of Biochemistry and Molecular Genetics, Northwestern University, Evanston, IL, USA
| | - Hiam Abdala-Valencia
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
| | - Michael Schwake
- Department of Neurology, Northwestern University, Evanston, IL, USA
- Faculty of Chemistry, University of Bielefeld, Bielefeld, Germany
| | - Marie-Pier Tetreault
- Division of Gastroenterology and Hepatology, Northwestern University, Evanston, IL, USA
| | - Harris Perlman
- Division of Rheumatology, Northwestern University, Evanston, IL, USA
| | | | - Navdeep S Chandel
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
| | - Karen M Ridge
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
| | - Jacob I Sznajder
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA
| | - Richard I Morimoto
- Department of Molecular Biosciences, Northwestern University, Evanston, IL, USA.
- Rice Institute for Biomedical Research, Northwestern University, Evanston, IL, USA.
| | - Alexander V Misharin
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA.
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA.
| | - G R Scott Budinger
- Division of Pulmonary and Critical Care Medicine, Northwestern University, Evanston, IL, USA.
- Simpson Querrey Lung Institute for Translational Science at Northwestern University (SQLIFTSNU), Evanston, IL, USA.
| | - Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA.
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA.
- Department of Physics and Astronomy, Northwestern University, Evanston, IL, USA.
|
20
|
Yanai I, Lercher MJ. What puzzle are you in? Genome Biol 2022; 23:179. [PMID: 36008862 PMCID: PMC9404603 DOI: 10.1186/s13059-022-02748-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Affiliation(s)
- Itai Yanai
- Institute for Computational Medicine, NYU Langone Health, New York, NY, 10016, USA.
| | - Martin J Lercher
- Institute for Computer Science & Department of Biology, Heinrich Heine University, 40225, Düsseldorf, Germany.
|
21
|
Khandia R, Saeed M, Alharbi AM, Ashraf GM, Greig NH, Kamal MA. Codon Usage Bias Correlates With Gene Length in Neurodegeneration Associated Genes. Front Neurosci 2022; 16:895607. [PMID: 35860292 PMCID: PMC9289476 DOI: 10.3389/fnins.2022.895607] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/14/2022] [Accepted: 06/08/2022] [Indexed: 11/13/2022] Open
Abstract
Codon usage analysis is a crucial part of molecular characterization and is used to determine the factors affecting the evolution of a gene. The length of a gene is an important parameter that affects the characteristics of the gene, such as codon usage, compositional parameters, and sometimes, its functions. In the present study, we investigated the association of various parameters related to codon usage with the length of genes. Gene expression is affected by nucleotide disproportion. In sixty genes related to neurodegenerative disorders, the G nucleotide was the most abundant and the T nucleotide was the least. The nucleotide T exhibited a significant association with the length of the gene at both the overall compositional level and the first and second codon positions. Codon usage bias (CUB) of these genes was affected by pyrimidine and keto skews. Gene length was found to be significantly correlated with codon bias in neurodegeneration associated genes. In gene segments with lengths below 1,200 bp and above 2,400 bp, CUB was positively associated with length. Relative synonymous CUB, which is another measure of CUB, showed that codons TTA, GTT, GTC, TCA, GGT, and GGA exhibited a positive association with length, whereas codons GTA, AGC, CGT, CGA, and GGG showed a negative association. GC-ending codons were preferred over AT-ending codons. Overall analysis indicated that the association between CUB and length varies depending on the segment size; however, CUB of 1,200–2,000 bp gene segments appeared unaffected by gene length. In synopsis, the analysis suggests that the length of the genes correlates with various imperative molecular signatures including A/T nucleotide disproportion and codon choices. In the present study we additionally evaluated various molecular features and their correlation with different indices of codon usage, like the Codon Adaptation Index (CAI) and Relative Synonymous Codon Usage (RSCU) of codons. We also considered the impact of gene fragment size on different molecular features in genes related to neurodegeneration. This analysis will aid in understanding and potentially modulating gene expression in cases of defective gene functioning in clinical settings.
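The RSCU index named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the standard genetic code, an in-frame coding sequence, and the usual definition of RSCU as a codon's observed count divided by the mean count of all synonymous codons for the same amino acid. The function name `rscu` and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper).

    RSCU = observed codon count / mean count over the codon's synonymous family.
    Stop codons and single-codon amino acids (Met, Trp) are skipped, as are
    amino acids that never occur in the sequence.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    example = rscu("GGTGGTGGAGGG")  # four glycine codons
    for codon in ("GGT", "GGC", "GGA", "GGG"):
        print(codon, example[codon])
```

A value above 1 means the codon is used more often than expected under equal synonymous usage, and a value below 1 means it is used less often.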
Affiliation(s)
- Rekha Khandia
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, India
- *Correspondence: Rekha Khandia
| | - Mohd. Saeed
- Department of Biology, College of Sciences, University of Hail, Hail, Saudi Arabia
| | - Ahmed M. Alharbi
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, University of Hail, Hail, Saudi Arabia
| | - Ghulam Md. Ashraf
- Pre-clinical Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Nigel H. Greig
- Drug Design and Development Section, Translational Gerontology Branch, Intramural Research Program National Institute on Aging, NIH, Baltimore, MD, United States
| | - Mohammad Amjad Kamal
- Institutes for Systems Genetics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh
- Enzymoics, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
|
22
|
Jin YT, Pu DK, Guo HX, Deng Z, Chen LL, Guo FB. T-G-A Deficiency Pattern in Protein-Coding Genes and Its Potential Reason. Front Microbiol 2022; 13:847325. [PMID: 35602045 PMCID: PMC9116502 DOI: 10.3389/fmicb.2022.847325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Accepted: 03/30/2022] [Indexed: 11/20/2022] Open
Abstract
If a stop codon appears within a gene, its translation will be terminated earlier than expected. Misfolding of the prematurely terminated protein will be adverse to the host; hence, all functional genes would tend to avoid intragenic stop codons. Therefore, we hypothesize that nucleotides corresponding to stop codons will occur at a lower frequency at each codon position of genes. Here, we validate this inference by investigating nucleotide frequency at a large scale; results from 19,911 prokaryote genomes revealed that nucleotides coinciding with stop codons indeed have the lowest frequency in most genomes. Interestingly, genes with three types of stop codons all tend to follow a T-G-A deficiency pattern, suggesting that the property of avoiding intragenic termination pressure is the same and the major stop codon TGA plays a dominant role in this effect. Finally, a positive correlation between the TGA deficiency extent and the base length was observed in Escherichia coli (E. coli) genes with experimentally verified starts. This strengthens the support for our hypothesis. The T-G-A deficiency pattern observed would help to understand the evolution of codon usage tactics in extant organisms.
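The per-position nucleotide frequencies that this abstract refers to can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes an in-frame coding sequence and simply counts bases at codon positions 1, 2 and 3. The function name `codon_position_frequencies` and the example sequence are hypothetical.

```python
from collections import Counter


def codon_position_frequencies(cds):
    """Return three dicts with the A/C/G/T frequencies at codon positions 1, 2 and 3.

    Assumes `cds` is in frame; a trailing partial codon is ignored and
    characters other than A, C, G, T count only toward the denominator.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    frequencies = []
    for position in range(3):
        # Every third base, starting at this codon position.
        counts = Counter(cds[position:usable:3])
        total = sum(counts.values())
        frequencies.append(
            {base: (counts[base] / total if total else 0.0) for base in "ACGT"}
        )
    return frequencies


if __name__ == "__main__":
    first, second, third = codon_position_frequencies("ATGGCCAAA")
    # Under a T-G-A deficiency pattern one would look at T in position 1,
    # G in position 2 and A in position 3.
    print(first["T"], second["G"], third["A"])
```

Comparing `first["T"]`, `second["G"]` and `third["A"]` against the other bases at the same position is one simple way to check whether a gene shows the deficiency described above.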
Affiliation(s)
- Yan-Ting Jin
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China; Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
- Dong-Kai Pu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Hai-Xia Guo
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Zixin Deng
  - Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
- Ling-Ling Chen
  - Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Feng-Biao Guo
  - Department of Respiratory and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education and School of Pharmaceutical Sciences, Wuhan University, Wuhan, China
23
Quan C, Chen G, Li S, Jia Z, Yu P, Tu J, Shen J, Yi B, Fu T, Dai C, Ma C. Transcriptome shock in interspecific F1 allotriploid hybrids between Brassica species. J Exp Bot 2022; 73:2336-2353. [PMID: 35139197 DOI: 10.1093/jxb/erac047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/02/2021] [Accepted: 02/03/2022] [Indexed: 06/14/2023]
Abstract
Interspecific hybridization drives the evolution of angiosperms and can be used to introduce novel alleles for important traits or to activate heterosis in crop breeding. Hybridization brings together gene expression networks from two different species, potentially causing global alterations of gene expression in the F1 plants which is called 'transcriptome shock'. Here, we explored such a transcriptome shock in allotriploid Brassica hybrids. We generated interspecific F1 allotriploid hybrids between the allotetraploid species Brassica napus and three accessions of the diploid species Brassica rapa. RNA-seq of the F1 hybrids and the parental plants revealed that 26.34-30.89% of genes were differentially expressed between the parents. We also analyzed expression level dominance and homoeolog expression bias between the parents and the F1 hybrids. The expression-level dominance biases of the Ar, An, and Cn subgenomes was genotype and stage dependent, whereas significant homoeolog expression bias was observed among three subgenomes from different parents. Furthermore, more genes were involved in trans regulation than in cis regulation in allotriploid F1 hybrids. Our findings provide new insights into the transcriptomic responses of cross-species hybrids and hybrids showing heterosis, as well as a new method for promoting the breeding of desirable traits in polyploid Brassica species.
Affiliation(s)
- Chengtao Quan
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
- Guoting Chen
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Sijia Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Zhibo Jia
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Pugang Yu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Jinxing Tu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Jinxiong Shen
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Bin Yi
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Tingdong Fu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Cheng Dai
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
- Chaozhi Ma
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
24
Khandia R, Ali Khan A, Alexiou A, Povetkin SN, Nikolaevna VM. Codon Usage Analysis of Pro-Apoptotic Bim Gene Isoforms. J Alzheimers Dis 2022; 86:1711-1725. [DOI: 10.3233/jad-215691] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
Background: Bim is a Bcl-2 homology 3 (BH3)-only proteins, a group of pro-apoptotic proteins involved in physiological and pathological conditions. Both the overexpression and under-expression of Bim protein are associated with the diseased condition, and various isoforms of Bim protein are present with differential apoptotic potential. Objective: The present study attempted to envisage the association of various molecular signatures with the codon choices of Bim isoforms. Methods: Molecular signatures like composition, codon usage, nucleotide skews, the free energy of mRNA transcript, physical properties of proteins, codon adaptation index, relative synonymous codon usage, and dinucleotide odds ratio were determined and analyzed for their associations with codon choices of Bim gene. Results: Skew analysis of the Bim gene indicated the preference of C nucleotide over G, A, and T and preference of G over T and A nucleotides was observed. An increase in C content at the first and third codon position increased gene expression while it decreased at the second codon position. Compositional constraints on nucleotide C at all three codon positions affected gene expression. The analysis revealed an exceptionally high usage of CpC dinucleotide in all the envisaged 31 isoforms of Bim. We correlated it with the requirement of rapid demethylation machinery to fine-tune the Bim gene expression. Also, mutational pressure played a dominant role in shaping codon usage bias in Bim isoforms. Conclusion: An exceptionally high usage of CpC dinucleotide in all the envisaged 31 isoforms of Bim indicates a high order selectional force to fine tune Bim gene expression.
Affiliation(s)
- Rekha Khandia
  - Department of Biochemistry and Genetics, Barkatullah University, Bhopal, India
- Azmat Ali Khan
  - Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Athanasios Alexiou
  - Novel Global Community Educational Foundation, Australia & AFNP Med, Austria
25
Bose D, Mukhopadhyay S. The hunt for a yet unknown: Common molecular signature in some genetically monomorphic enterobacteria. J Basic Microbiol 2021; 61:524-546. [PMID: 33991346 DOI: 10.1002/jobm.202000630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2020] [Revised: 04/04/2021] [Accepted: 04/22/2021] [Indexed: 11/09/2022]
Abstract
Mark Achtman introduced the term "genetically monomorphic bacteria" (GM bacteria) for some human and plant pathogens. They displayed a great uniformity in terms of their "genetic" properties. This "uniformity" poses a challenge to microbiologists. To address these problems, we used CodonW and IslandViewer 3 as analytical tools and took Escherichia coli, Salmonella, and Shigella strains as a model organisms. We hypothesized that GM bacterium contains a common molecular signature among them. We have found a significant correlation regarding the number of protein-coding genes, predicted highly expressed genes, and the highest length of gene in this regard. On the other hand, the correspondence analysis of pathogenicity-related genes identified by IslandViewer 3 displayed a somewhat unique pattern in GM bacteria. The probable pathogenic genes are clustered into two separate groups, which is a hallmark of some pattern. Similar genes of non-monomorphic pathogenic strain clustered almost similarly, but the clusters are joined together, they are not completely separated. These features, in our considered view, may be considered as codon usages signatures of these bacteria, and E. coli in particular.
Affiliation(s)
- Debadin Bose
  - Department of Botany, Kabi Nazrul College, Murarai, West Bengal, India
- Subhasis Mukhopadhyay
  - Distributed Information Centre for Bioinformatics, Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, Calcutta, West Bengal, India
26
Dixit S, Thakur N, Shukla A, Upadhyay SK, Verma PC. Molecular characterization of N-methyl-d-aspartate receptor from Bemisia tabaci. Insect Mol Biol 2021; 30:231-240. [PMID: 33368750 DOI: 10.1111/imb.12690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2020] [Revised: 12/13/2020] [Accepted: 12/23/2020] [Indexed: 06/12/2023]
Abstract
The N-methyl-d-aspartate receptors (NMDARs) are ionotropic ligand gated channels that are highly permeable to calcium ions. In insects, NMDARs are associated with glutamatergic neurotransmission governing diverse physiological and biological processes like vitellogenesis and ovarian development. Therefore, NMDAR may act as attractive target for insect pest control. In present study, we performed structural and functional characterization of NMDARs in Bemisia tabaci, a highly invasive crop pest and potent virus vector. We identified that NMDAR consists of three subunits each encoded by single gene in whiteflies which are highly conserved among different insect orders. Expression analysis suggests that subunit 1 (BtNR1) and subunit 2 (BtNR2) are the main functional units. External supplementation of NMDAR ligand or BtNRs silencing was lethal to insects, which suggested that NMDAR function is highly balanced in whiteflies.
Affiliation(s)
- S Dixit
  - Molecular Biology and Biotechnology, CSIR-National Botanical Research Institute, (Council of Scientific and Industrial Research), Rana Pratap Marg, Lucknow, India
  - Department of Biology, University of Western Ontario, London, Ontario, Canada
- N Thakur
  - Molecular Biology and Biotechnology, CSIR-National Botanical Research Institute, (Council of Scientific and Industrial Research), Rana Pratap Marg, Lucknow, India
  - DST-Centre for Policy Research, IIT-Delhi, New Delhi, India
- A Shukla
  - Department of Biology, University of Western Ontario, London, Ontario, Canada
- S K Upadhyay
  - Department of Botany, Panjab University, Chandigarh, India
- P C Verma
  - Molecular Biology and Biotechnology, CSIR-National Botanical Research Institute, (Council of Scientific and Industrial Research), Rana Pratap Marg, Lucknow, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
27
Telonis AG, Rigoutsos I. The transcriptional trajectories of pluripotency and differentiation comprise genes with antithetical architecture and repetitive-element content. BMC Biol 2021; 19:60. [PMID: 33765992 PMCID: PMC7995781 DOI: 10.1186/s12915-020-00928-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/18/2020] [Accepted: 11/18/2020] [Indexed: 12/12/2022]
Abstract
Background: Extensive molecular differences exist between proliferative and differentiated cells. Here, we conduct a meta-analysis of publicly available transcriptomic datasets from preimplantation and differentiation stages examining the architectural properties and content of genes whose abundance changes significantly across developmental time points. Results: Analysis of preimplantation embryos from human and mouse showed that short genes whose introns are enriched in Alu (human) and B (mouse) elements, respectively, have higher abundance in the blastocyst compared to the zygote. These highly expressed genes encode ribosomal proteins or metabolic enzymes. On the other hand, long genes whose introns are depleted in repetitive elements have lower abundance in the blastocyst and include genes from signaling pathways. Additionally, the sequences of the genes that are differentially expressed between the blastocyst and the zygote contain distinct collections of pyknon motifs that differ between up- and down-regulated genes. Further examination of the genes that participate in the stem cell-specific protein interaction network shows that their introns are short and enriched in Alu (human) and B (mouse) elements. As organogenesis progresses, in both human and mouse, we find that the primarily short and repeat-rich expressed genes make way for primarily longer, repeat-poor genes. With that in mind, we used a machine learning-based approach to identify gene signatures able to classify human adult tissues: we find that the most discriminatory genes comprising these signatures have long introns that are repeat-poor and include transcription factors and signaling-cascade genes. The introns of widely expressed genes across human tissues, on the other hand, are short and repeat-rich, and coincide with those with the highest expression at the blastocyst stage.
Conclusions: Protein-coding genes that are characteristic of each trajectory, i.e., proliferation/pluripotency or differentiation, exhibit antithetical biases in their intronic and exonic lengths and in their repetitive-element content. While the respective human and mouse gene signatures are functionally and evolutionarily conserved, their introns and exons are enriched or depleted in organism-specific repetitive elements. We posit that these organism-specific repetitive sequences found in exons and introns are used to effect the corresponding genes’ regulation. Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-020-00928-8.
Affiliation(s)
- Aristeidis G Telonis
  - Computational Medicine Center, Sidney Kimmel College of Medicine, Thomas Jefferson University, 1020 Locust Street, Suite M81, Philadelphia, PA, 19107, USA; Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL, 33136, USA
- Isidore Rigoutsos
  - Computational Medicine Center, Sidney Kimmel College of Medicine, Thomas Jefferson University, 1020 Locust Street, Suite M81, Philadelphia, PA, 19107, USA
28
Savarese M, Välipakka S, Johari M, Hackman P, Udd B. Is Gene-Size an Issue for the Diagnosis of Skeletal Muscle Disorders? J Neuromuscul Dis 2021; 7:203-216. [PMID: 32176652 PMCID: PMC7369045 DOI: 10.3233/jnd-190459] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/15/2022]
Abstract
Human genes have a variable length. Those having a coding sequence of extraordinary length and a high number of exons were almost impossible to sequence using the traditional Sanger-based gene-by-gene approach. High-throughput sequencing has partly overcome the size-related technical issues, enabling a straightforward, rapid and relatively inexpensive analysis of large genes. Several large genes (e.g. TTN, NEB, RYR1, DMD) are recognized as disease-causing in patients with skeletal muscle diseases. However, because of their sheer size, the clinical interpretation of variants in these genes is probably the most challenging aspect of the high-throughput genetic investigation in the field of skeletal muscle diseases. The main aim of this review is to discuss the technical and interpretative issues related to the diagnostic investigation of large genes and to reflect upon the current state of the art and the future advancements in the field.
Affiliation(s)
- Marco Savarese
  - Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
- Salla Välipakka
  - Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
- Mridul Johari
  - Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
- Peter Hackman
  - Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland
- Bjarne Udd
  - Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Helsinki, Finland; Neuromuscular Research Center, Tampere University and University Hospital, Tampere, Finland; Department of Neurology, Vaasa Central Hospital, Vaasa, Finland
29
Role of Gene Length in Control of Human Gene Expression: Chromosome-Specific and Tissue-Specific Effects. Int J Genomics 2021; 2021:8902428. [PMID: 33688492 PMCID: PMC7911607 DOI: 10.1155/2021/8902428] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/05/2020] [Revised: 01/12/2021] [Accepted: 02/03/2021] [Indexed: 11/19/2022]
Abstract
This study was carried out to pursue the observation that the level of gene expression is affected by gene length in the human genome. As transcription is a time-dependent process, it is expected that gene expression will be inversely related to gene length, and this is found to be the case. Here, I describe the results of studies performed to test whether the gene length/gene expression linkage is affected by two factors, the chromosome where the gene is located and the tissue where it is expressed. Studies were performed with a database of 3538 human genes that were divided into short, midlength, and long groups. Chromosome groups were then compared in the expression level of genes with the same length. A similar analysis was performed with 19 human tissues. Tissue-specific groups were compared in the expression level of genes with the same length. Both chromosome and tissue studies revealed new information about the role of gene length in control of gene expression. Chromosome studies led to the identification of two chromosome populations that differ in the expression level of short genes. A high level of expression was observed in chromosomes 2-10, 12-15, and 18 and a low level in 1, 11, 16-17, 19-20, 22, and 24. Studies with tissue-specific genes led to the identification of two tissues, brain and liver, which differ in the expression level of short genes. The results are interpreted to support the view that the level of a gene's expression can be affected by the chromosome and the tissue where the gene is transcribed.
30
Lopes I, Altab G, Raina P, de Magalhães JP. Gene Size Matters: An Analysis of Gene Length in the Human Genome. Front Genet 2021; 12:559998. [PMID: 33643374 PMCID: PMC7905317 DOI: 10.3389/fgene.2021.559998] [Citation(s) in RCA: 68] [Impact Index Per Article: 22.7] [Received: 05/07/2020] [Accepted: 01/06/2021] [Indexed: 12/23/2022]
Abstract
While it is expected for gene length to be associated with factors such as intron number and evolutionary conservation, we are yet to understand the connections between gene length and function in the human genome. In this study, we show that, as expected, there is a strong positive correlation between gene length, transcript length, and protein size as well as a correlation with the number of genetic variants and introns. Among tissue-specific genes, we find that the longest transcripts tend to be expressed in the blood vessels, nerves, thyroid, cervix uteri, and the brain, while the smallest transcripts tend to be expressed in the pancreas, skin, stomach, vagina, and testis. We report, as shown previously, that natural selection suppresses changes for genes with longer transcripts and promotes changes for genes with smaller transcripts. We also observe that genes with longer transcripts tend to have a higher number of co-expressed genes and protein-protein interactions, as well as more associated publications. In the functional analysis, we show that bigger transcripts are often associated with neuronal development, while smaller transcripts tend to play roles in skin development and in the immune system. Furthermore, pathways related to cancer, neurons, and heart diseases tend to have genes with longer transcripts, with smaller transcripts being present in pathways related to immune responses and neurodegenerative diseases. Based on our results, we hypothesize that longer genes tend to be associated with functions that are important in the early development stages, while smaller genes tend to play a role in functions that are important throughout the whole life, like the immune system, which requires fast responses.
Affiliation(s)
- João Pedro de Magalhães
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom
31
Bochtler M, Fernandes H. DNA adenine methylation in eukaryotes: Enzymatic mark or a form of DNA damage? Bioessays 2020; 43:e2000243. [PMID: 33244833 DOI: 10.1002/bies.202000243] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/04/2020] [Revised: 10/30/2020] [Accepted: 11/02/2020] [Indexed: 12/16/2022]
Abstract
6-methyladenine (6mA) is fairly abundant in nuclear DNA of basal fungi, ciliates and green algae. In these organisms, 6mA is maintained near transcription start sites in ApT context by a parental-strand instruction dependent maintenance methyltransferase and is positively associated with transcription. In animals and plants, 6mA levels are high only in organellar DNA. The 6mA levels in nuclear DNA are very low. They are attributable to nucleotide salvage and the activity of otherwise mitochondrial METTL4, and may be considered as a price that cells pay for adenine methylation in RNA and/or organellar DNA. Cells minimize this price by sanitizing dNTP pools to limit 6mA incorporation, and by converting 6mA that has been incorporated into DNA back to adenine. Hence, 6mA in nuclear DNA should be described as an epigenetic mark only in basal fungi, ciliates and green algae, but not in animals and plants.
Affiliation(s)
- Matthias Bochtler
  - International Institute of Molecular and Cell Biology, Warsaw, Poland; Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
- Humberto Fernandes
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
32
Soheili-Nezhad S, van der Linden RJ, Olde Rikkert M, Sprooten E, Poelmans G. Long genes are more frequently affected by somatic mutations and show reduced expression in Alzheimer's disease: Implications for disease etiology. Alzheimers Dement 2020; 17:489-499. [PMID: 33075204 PMCID: PMC8048495 DOI: 10.1002/alz.12211] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/30/2020] [Revised: 10/11/2020] [Accepted: 10/21/2020] [Indexed: 12/17/2022]
Abstract
Aging, the greatest risk factor for Alzheimer's disease (AD), may lead to the accumulation of somatic mutations in neurons. We investigated whether somatic mutations, specifically in longer genes, are implicated in AD etiology. First, we modeled the theoretical likelihood of genes being affected by aging‐induced somatic mutations, dependent on their length. We then tested this model and found that long genes are indeed more affected by somatic mutations and that their expression is more frequently reduced in AD brains. Furthermore, using gene‐set enrichment analysis, we investigated the potential consequences of such long gene disruption. We found that long genes are involved in synaptic adhesion and other synaptic pathways that are predicted to be inhibited in the brains of AD patients. Taken together, our findings indicate that long gene–dependent synaptic impairment may contribute to AD pathogenesis.
Affiliation(s)
- Sourena Soheili-Nezhad
  - Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Marcel Olde Rikkert
  - Department of Geriatric Medicine, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands; Radboudumc Alzheimer Center, Radboud University Medical Center, Nijmegen, The Netherlands
- Emma Sprooten
  - Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Geert Poelmans
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
33
Petibon C, Malik Ghulam M, Catala M, Abou Elela S. Regulation of ribosomal protein genes: An ordered anarchy. Wiley Interdiscip Rev RNA 2020; 12:e1632. [PMID: 33038057 PMCID: PMC8047918 DOI: 10.1002/wrna.1632] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 06/29/2020] [Revised: 09/08/2020] [Accepted: 09/23/2020] [Indexed: 02/06/2023]
Abstract
Ribosomal protein genes are among the most highly expressed genes in most cell types. Their products are generally essential for ribosome synthesis, which is the cornerstone for cell growth and proliferation. Many cellular resources are dedicated to producing ribosomal proteins and thus this process needs to be regulated in ways that carefully balance the supply of nascent ribosomal proteins with the demand for new ribosomes. Ribosomal protein genes have classically been viewed as a uniform interconnected regulon regulated in eukaryotic cells by target of rapamycin and protein kinase A pathway in response to changes in growth conditions and/or cellular status. However, recent literature depicts a more complex picture in which the amount of ribosomal proteins produced varies between genes in response to two overlapping regulatory circuits. The first includes the classical general ribosome-producing program and the second is a gene-specific feature responsible for fine-tuning the amount of ribosomal proteins produced from each individual ribosomal gene. Unlike the general pathway that is mainly controlled at the level of transcription and translation, this specific regulation of ribosomal protein genes is largely achieved through changes in pre-mRNA splicing efficiency and mRNA stability. By combining general and specific regulation, the cell can coordinate ribosome production, while allowing functional specialization and diversity. Here we review the many ways ribosomal protein genes are regulated, with special focus on the emerging role of posttranscriptional regulatory events in fine-tuning the expression of ribosomal protein genes and its role in controlling the potential variation in ribosome functions. This article is categorized under: Translation > Ribosome Biogenesis; Translation > Ribosome Structure/Function; Translation > Translation Regulation.
Affiliation(s)
- Cyrielle Petibon
  - Département de microbiologie et d'infectiologie, Universite de Sherbrooke, Faculté de Médecine et des Sciences de la Santé, Sherbrooke, Quebec, Canada
- Mustafa Malik Ghulam
  - Département de microbiologie et d'infectiologie, Universite de Sherbrooke, Faculté de Médecine et des Sciences de la Santé, Sherbrooke, Quebec, Canada
- Mathieu Catala
  - Département de microbiologie et d'infectiologie, Universite de Sherbrooke, Faculté de Médecine et des Sciences de la Santé, Sherbrooke, Quebec, Canada
- Sherif Abou Elela
  - Département de microbiologie et d'infectiologie, Universite de Sherbrooke, Faculté de Médecine et des Sciences de la Santé, Sherbrooke, Quebec, Canada
34
Levin M, Zalts H, Mostov N, Hashimshony T, Yanai I. Gene expression dynamics are a proxy for selective pressures on alternatively polyadenylated isoforms. Nucleic Acids Res 2020; 48:5926-5938. [PMID: 32421815 PMCID: PMC7293032 DOI: 10.1093/nar/gkaa359] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/09/2020] [Revised: 04/11/2020] [Accepted: 04/27/2020] [Indexed: 01/08/2023]
Abstract
Alternative polyadenylation (APA) produces isoforms with distinct 3′-ends, yet their functional differences remain largely unknown. Here, we introduce the APA-seq method to detect the expression levels of APA isoforms from 3′-end RNA-Seq data by exploiting both paired-end reads for gene isoform identification and quantification. We detected the expression levels of APA isoforms in individual Caenorhabditis elegans embryos at different stages throughout embryogenesis. Examining the correlation between the temporal profiles of isoforms led us to distinguish two classes of genes: those with highly correlated isoforms (HCI) and those with lowly correlated isoforms (LCI) across time. We hypothesized that variants with similar expression profiles may be the product of biological noise, while the LCI variants may be under tighter selection and consequently their distinct 3′ UTR isoforms are more likely to have functional consequences. Supporting this notion, we found that LCI genes have significantly more miRNA binding sites, more correlated expression profiles with those of their targeting miRNAs and a relative lack of correspondence between their transcription and protein abundances. Collectively, our results suggest that a lack of coherence among the regulation of 3′ UTR isoforms is a proxy for selective pressures acting upon APA usage and consequently for their functional relevance.
Affiliation(s)
- Michal Levin
  - Quantitative Proteomics, Institute of Molecular Biology, Mainz 55128, Germany
- Harel Zalts
  - Faculty of Biology, Technion - Israel Institute of Technology, Haifa 3200003, Israel
- Natalia Mostov
  - Faculty of Biology, Technion - Israel Institute of Technology, Haifa 3200003, Israel
- Tamar Hashimshony
  - Faculty of Biology, Technion - Israel Institute of Technology, Haifa 3200003, Israel
- Itai Yanai
  - Institute for Computational Medicine, NYU Grossman School of Medicine, New York 10016, USA
35
Singh R, Sophiarani Y. A report on DNA sequence determinants in gene expression. Bioinformation 2020; 16:422-431. [PMID: 32831525 PMCID: PMC7434957 DOI: 10.6026/97320630016422] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/23/2020] [Accepted: 04/24/2020] [Indexed: 11/26/2022]
Abstract
Biased nucleotide usage in coding sequences and its correlation with gene expression have been observed in several studies. A complex set of interactions between genes and other components of the expression system determines the amount of protein produced from coding sequences. It is known that the elongation rate of the polypeptide chain is affected by both codon usage bias and specific amino acid compositional constraints. Therefore, it is of interest to review local DNA-sequence elements and other positional as well as combinatorial constraints that play a significant role in gene expression.
Affiliation(s)
- Ravail Singh
- Indian Institute of Integrative Medicine, CSIR, Canal Road, Jammu-180001
|
36
|
McCoy MJ, Fire AZ. Intron and gene size expansion during nervous system evolution. BMC Genomics 2020; 21:360. [PMID: 32410625 PMCID: PMC7222433 DOI: 10.1186/s12864-020-6760-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 01/30/2020] [Accepted: 04/28/2020] [Indexed: 01/07/2023] Open
Abstract
Background: The evolutionary radiation of animals was accompanied by extensive expansion of gene and genome sizes, increased isoform diversity, and complexity of regulation. Results: Here we show that the longest genes are enriched for expression in neuronal tissues of diverse vertebrates and of invertebrates. Additionally, we show that neuronal gene size expansion occurred predominantly through net gains in intron size, with a positional bias toward the 5′ end of each gene. Conclusions: We find that intron and gene size expansion is a feature of many genes whose expression is enriched in nervous systems. We speculate that unique attributes of neurons may subject neuronal genes to evolutionary forces favoring net size expansion. This process could be associated with tissue-specific constraints on gene function and/or the evolution of increasingly complex gene regulation in nervous systems.
Affiliation(s)
- Matthew J McCoy
- Grass Fellowship Program, Marine Biological Laboratory, Woods Hole, MA, 02543, USA; Departments of Pathology and Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA.
| | - Andrew Z Fire
- Departments of Pathology and Genetics, Stanford University School of Medicine, Stanford, CA, 94305, USA.
|
37
|
Iancu D, Ashton E. Inherited Renal Tubulopathies-Challenges and Controversies. Genes (Basel) 2020; 11:genes11030277. [PMID: 32150856 PMCID: PMC7140864 DOI: 10.3390/genes11030277] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/11/2020] [Revised: 02/29/2020] [Accepted: 02/29/2020] [Indexed: 12/23/2022] Open
Abstract
Electrolyte homeostasis is maintained by the kidney through a complex transport function mostly performed by specialized proteins distributed along the renal tubules. Pathogenic variants in the genes encoding these proteins impair this function and have consequences on the whole organism. Establishing a genetic diagnosis in patients with renal tubular dysfunction is a challenging task given the genetic and phenotypic heterogeneity, functional characteristics of the genes involved and the number of yet unknown causes. Part of these difficulties can be overcome by gathering large patient cohorts and applying high-throughput sequencing techniques combined with experimental work to prove functional impact. This approach has led to the identification of a number of genes but also generated controversies about proper interpretation of variants. In this article, we will highlight these challenges and controversies.
Affiliation(s)
- Daniela Iancu
- UCL-Centre for Nephrology, Royal Free Campus, University College London, Rowland Hill Street, London NW3 2PF, UK
- Correspondence: ; Tel.: +44-2381204172; Fax: +44-020-74726476
| | - Emma Ashton
- Rare & Inherited Disease Laboratory, London North Genomic Laboratory Hub, Great Ormond Street Hospital for Children National Health Service Foundation Trust, Levels 4-6 Barclay House 37, Queen Square, London WC1N 3BH, UK;
|
38
|
Miao Z, Zhang T, Qi Y, Song J, Han Z, Ma C. Evolution of the RNA N6-Methyladenosine Methylome Mediated by Genomic Duplication. PLANT PHYSIOLOGY 2020; 182:345-360. [PMID: 31409695 PMCID: PMC6945827 DOI: 10.1104/pp.19.00323] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Received: 03/18/2019] [Accepted: 08/03/2019] [Indexed: 05/19/2023]
Abstract
RNA N6-methyladenosine (m6A) modification is the most abundant form of RNA epigenetic modification in eukaryotes. Given that m6A evolution is associated with the selective constraints of nucleotide sequences in mammalian genomes, we hypothesize that m6A evolution can be linked, at least in part, to genomic duplication events in complex polyploid plant genomes. To test this hypothesis, we presented the maize (Zea mays) m6A modification landscape in a transcriptome-wide manner and identified 11,968 m6A peaks carried by 5,893 and 3,811 genes from two subgenomes (maize1 and maize2, respectively). Each of these subgenomes covered over 2,200 duplicate genes. Within these duplicate genes, those carrying m6A peaks exhibited significant differences in retention rate. This biased subgenome fractionation of m6A-methylated genes is associated with multiple sequence features and is influenced by asymmetric evolutionary rates. We also characterized the coevolutionary patterns of m6A-methylated genes and transposable elements, which can be mediated by whole genome duplication and tandem duplication. We revealed the evolutionary conservation and divergence of duplicated m6A functional factors and the potential role of m6A modification in maize responses to drought stress. This study highlights complex interplays between m6A modification and gene duplication, providing a reference for understanding the mechanisms underlying m6A evolution mediated by genome duplication events.
Affiliation(s)
- Zhenyan Miao
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
| | - Ting Zhang
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
| | - Yuhong Qi
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
| | - Jie Song
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
| | - Zhaoxue Han
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
| | - Chuang Ma
- State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Shaanxi, Yangling 712100, China
- Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Shaanxi, Yangling 712100, China
|
39
|
Razban RM. Protein Melting Temperature Cannot Fully Assess Whether Protein Folding Free Energy Underlies the Universal Abundance-Evolutionary Rate Correlation Seen in Proteins. Mol Biol Evol 2019; 36:1955-1963. [PMID: 31093676 PMCID: PMC6736436 DOI: 10.1093/molbev/msz119] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 02/07/2023] Open
Abstract
The protein misfolding avoidance hypothesis explains the universal negative correlation between protein abundance and sequence evolutionary rate across the proteome by identifying protein folding free energy (ΔG) as the confounding variable. Abundant proteins resist toxic misfolding events by being more stable, and more stable proteins evolve slower because their mutations are more destabilizing. Direct supporting evidence consists only of computer simulations. A study taking advantage of a recent experimental breakthrough in measuring protein stability proteome-wide through melting temperature (Tm) (Leuenberger et al. 2017), found weak misfolding avoidance hypothesis support for the Escherichia coli proteome, and no support for the Saccharomyces cerevisiae, Homo sapiens, and Thermus thermophilus proteomes (Plata and Vitkup 2018). I find that the nontrivial relationship between Tm and ΔG and inaccuracy in Tm measurements by Leuenberger et al. 2017 can be responsible for not observing strong positive abundance-Tm and strong negative Tm-evolutionary rate correlations.
Affiliation(s)
- Rostam M Razban
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
|
40
|
Das S, Bansal M. Variation of gene expression in plants is influenced by gene architecture and structural properties of promoters. PLoS One 2019; 14:e0212678. [PMID: 30908494 PMCID: PMC6433290 DOI: 10.1371/journal.pone.0212678] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/26/2018] [Accepted: 02/07/2019] [Indexed: 12/03/2022] Open
Abstract
In higher eukaryotes, gene architecture and structural properties of promoters have emerged as significant factors influencing variation in the number of transcripts (expression level) and specificity of gene expression in a tissue (expression breadth), which eventually shape the phenotype. In this study, transcriptome data of different tissue types at various developmental stages of A. thaliana, O. sativa, S. bicolor and Z. mays have been used to understand the relationship between properties of gene components and gene expression. Our findings indicate that in plants, among all gene architecture and structural properties of promoters, compactness of genes in terms of intron content is significantly linked to gene expression level and breadth, whereas in humans exactly the opposite is seen. For the first time in plants, we have carried out a quantitative estimation of the effect of a particular trait on expression level and breadth using multiple regression analysis, which confirms that intron content of the primary transcript (as %) is a powerful determinant of expression breadth. Similarly, further regression analysis revealed that among structural properties of the promoters, stability is negatively linked to expression breadth, while DNase1 sensitivity strongly governs gene expression breadth in monocots and gene expression level in dicots. In addition, promoter regions of tissue-specific genes are found to be enriched with TATA box and Y-patch motifs. Finally, multicopy orthologous genes in plants are found to be longer, highly regulated and tissue-specific.
Affiliation(s)
- Sanjukta Das
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
| | - Manju Bansal
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
|
41
|
Sugino K, Clark E, Schulmann A, Shima Y, Wang L, Hunt DL, Hooks BM, Tränkner D, Chandrashekar J, Picard S, Lemire AL, Spruston N, Hantman AW, Nelson SB. Mapping the transcriptional diversity of genetically and anatomically defined cell populations in the mouse brain. eLife 2019; 8:38619. [PMID: 30977723 PMCID: PMC6499542 DOI: 10.7554/elife.38619] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 07/20/2018] [Accepted: 04/11/2019] [Indexed: 01/27/2023] Open
Abstract
Understanding the principles governing neuronal diversity is a fundamental goal for neuroscience. Here, we provide an anatomical and transcriptomic database of nearly 200 genetically identified cell populations. By separately analyzing the robustness and pattern of expression differences across these cell populations, we identify two gene classes contributing distinctly to neuronal diversity. Short homeobox transcription factors distinguish neuronal populations combinatorially, and exhibit extremely low transcriptional noise, enabling highly robust expression differences. Long neuronal effector genes, such as channels and cell adhesion molecules, contribute disproportionately to neuronal diversity, based on their patterns rather than robustness of expression differences. By linking transcriptional identity to genetic strains and anatomical atlases, we provide an extensive resource for further investigation of mouse neuronal cell types.
Affiliation(s)
- Ken Sugino
- Janelia Research Campus, Ashburn, United States
| | - Lihua Wang
- Janelia Research Campus, Ashburn, United States
|
42
|
Rajaraman J, Douchkov D, Lück S, Hensel G, Nowara D, Pogoda M, Rutten T, Meitzel T, Brassac J, Höfle C, Hückelhoven R, Klinkenberg J, Trujillo M, Bauer E, Schmutzer T, Himmelbach A, Mascher M, Lazzari B, Stein N, Kumlehn J, Schweizer P. Evolutionarily conserved partial gene duplication in the Triticeae tribe of grasses confers pathogen resistance. Genome Biol 2018; 19:116. [PMID: 30111359 PMCID: PMC6092874 DOI: 10.1186/s13059-018-1472-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/11/2017] [Accepted: 07/04/2018] [Indexed: 11/11/2022] Open
Abstract
Background: The large and highly repetitive genomes of the cultivated species Hordeum vulgare (barley), Triticum aestivum (wheat), and Secale cereale (rye) belonging to the Triticeae tribe of grasses appear to be particularly rich in gene-like sequences including partial duplicates. Most of them have been classified as putative pseudogenes. In this study we employ transient and stable gene silencing- and over-expression systems in barley to study the function of HvARM1 (for H. vulgare Armadillo 1), a partial gene duplicate of the U-box/armadillo-repeat E3 ligase HvPUB15 (for H. vulgare Plant U-Box 15). Results: The partial ARM1 gene is derived from a gene-duplication event in a common ancestor of the Triticeae and contributes to quantitative host as well as nonhost resistance to the biotrophic powdery mildew fungus Blumeria graminis. In barley, allelic variants of HvARM1 but not of HvPUB15 are significantly associated with levels of powdery mildew infection. Both HvPUB15 and HvARM1 proteins interact in yeast and plant cells with the susceptibility-related, plastid-localized barley homologs of THF1 (for Thylakoid formation 1) and of ClpS1 (for Clp-protease adaptor S1) of Arabidopsis thaliana. A genome-wide scan for partial gene duplicates reveals further events in barley resulting in stress-regulated, potentially neo-functionalized, genes. Conclusion: The results suggest neo-functionalization of the partial gene copy HvARM1 increases resistance against powdery mildew infection. It further links plastid function with susceptibility to biotrophic pathogen attack. These findings shed new light on a novel mechanism to employ partial duplication of protein-protein interaction domains to facilitate the expansion of immune signaling networks.
Affiliation(s)
- Jeyaraman Rajaraman
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany.
| | - Dimitar Douchkov
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany.
| | - Stefanie Lück
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Götz Hensel
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Daniela Nowara
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Maria Pogoda
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Twan Rutten
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Tobias Meitzel
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Jonathan Brassac
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Caroline Höfle
- Technische Universität München, Emil-Ramann-Straße 2, D-85354, Freising, Germany
| | - Ralph Hückelhoven
- Technische Universität München, Emil-Ramann-Straße 2, D-85354, Freising, Germany
| | - Jörn Klinkenberg
- Leibniz Institut für Pflanzenbiochemie, Weinberg 3, D-06120, Halle (Saale), Germany
| | - Marco Trujillo
- Leibniz Institut für Pflanzenbiochemie, Weinberg 3, D-06120, Halle (Saale), Germany; Albert-Ludwigs-Universität Freiburg, Institut für Biologie II, Zellbiologie, D-79104, Freiburg, Germany
| | - Eva Bauer
- Technische Universität München, Liesel-Beckmann-Straße 2, D-85354, Freising, Germany
| | - Thomas Schmutzer
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Axel Himmelbach
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Martin Mascher
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Barbara Lazzari
- Parco Technologico Padano, Via Einstein, Loc. Cascina Codazza, 26900, Lodi, Italy
| | - Nils Stein
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Jochen Kumlehn
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
| | - Patrick Schweizer
- Leibniz Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK Gatersleben), Corrensstrasse 3, D-06466, Stadt Seeland, Germany
|
43
|
Pan S, Bruford MW, Wang Y, Lin Z, Gu Z, Hou X, Deng X, Dixon A, Graves JAM, Zhan X. Transcription-Associated Mutation Promotes RNA Complexity in Highly Expressed Genes-A Major New Source of Selectable Variation. Mol Biol Evol 2018; 35:1104-1119. [PMID: 29420738 PMCID: PMC5913671 DOI: 10.1093/molbev/msy017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022] Open
Abstract
Alternatively spliced transcript isoforms are thought to play a critical role in functional diversity. However, the mechanism generating the enormous diversity of spliced transcript isoforms remains unknown, and its biological significance remains unclear. We analyzed transcriptomes in saker falcons, chickens, and mice to show that alternative splicing occurs more frequently, yielding more isoforms, in highly expressed genes. We focused on hemoglobin genes in the falcon, the most abundantly expressed genes in blood, finding that alternative splicing produces 10-fold more isoforms than expected from the number of splice junctions in the genome. These isoforms were produced mainly by alternative use of de novo splice sites generated by transcription-associated mutation (TAM), not by the RNA editing mechanism normally invoked. We found that high expression of globin genes increases mutation frequencies during transcription, especially on nontranscribed DNA strands. After DNA replication, transcribed strands inherit these somatic mutations, creating de novo splice sites, and generating multiple distinct isoforms in the cell clone. Bisulfite sequencing revealed that DNA methylation may counteract this process by suppressing TAM, suggesting DNA methylation can spatially regulate RNA complexity. RNA profiling showed that falcons living on the high Qinghai-Tibetan Plateau possess greater global gene expression levels and higher diversity of mean to high abundance isoforms (reads per kilobase per million mapped reads ≥18) than their low-altitude counterparts, and we speculate that this may enhance their oxygen transport capacity under low-oxygen environments. Thus, TAM-induced RNA diversity may be physiologically significant, providing an alternative strategy in lifestyle evolution.
Affiliation(s)
- Shengkai Pan
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; Cardiff University-Institute of Zoology Joint Laboratory for Biocomplexity Research, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Michael W Bruford
- Cardiff University-Institute of Zoology Joint Laboratory for Biocomplexity Research, Beijing, China; Organisms and Environment Division, School of Biosciences and Sustainable Place Institute, Cardiff University, Cardiff, United Kingdom
| | - Yusong Wang
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Zhenzhen Lin
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; Cardiff University-Institute of Zoology Joint Laboratory for Biocomplexity Research, Beijing, China
| | - Zhongru Gu
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; Cardiff University-Institute of Zoology Joint Laboratory for Biocomplexity Research, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Xian Hou
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Xuemei Deng
- National Engineering Laboratory for Animal Breeding and Key Laboratory of Animal Genetics, Breeding, and Reproduction of the Ministry of Agriculture, China Agricultural University, Beijing, China
| | - Andrew Dixon
- Cardiff University-Institute of Zoology Joint Laboratory for Biocomplexity Research, Beijing, China; Emirates Falconers' Club, Abu Dhabi, UAE
| | - Xiangjiang Zhan
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; Cardiff University-Institute of Zoology Joint Laboratory for Biocomplexity Research, Beijing, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
|
44
|
Qiao X, Yin H, Li L, Wang R, Wu J, Wu J, Zhang S. Different Modes of Gene Duplication Show Divergent Evolutionary Patterns and Contribute Differently to the Expansion of Gene Families Involved in Important Fruit Traits in Pear (Pyrus bretschneideri). FRONTIERS IN PLANT SCIENCE 2018; 9:161. [PMID: 29487610 PMCID: PMC5816897 DOI: 10.3389/fpls.2018.00161] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 11/29/2017] [Accepted: 01/29/2018] [Indexed: 05/21/2023]
Abstract
Pear is an important fruit crop of the Rosaceae family and has experienced two rounds of ancient whole-genome duplications (WGDs). However, whether different types of gene duplications evolved differently after duplication remains unclear in the pear genome. In this study, we identified the different modes of gene duplication in pear. Duplicate genes derived from WGD, tandem, proximal, retrotransposed, DNA-based transposed or dispersed duplications differ in genomic distribution, gene features, selection pressure, expression divergence, regulatory divergence and biological roles. Widespread sequence, expression and regulatory divergence have occurred between duplicate genes over the 30-45 million years of evolution after the recent genome duplication in pear. The retrotransposed genes show relatively higher expression and regulatory divergence than other gene duplication modes. In contrast, WGD genes underwent a slower sequence divergence and may be influenced by abundant gene conversion events. Moreover, the different classes of duplicate genes exhibited biased functional roles. We also investigated the evolution and expansion patterns of the gene families involved in sugar and organic acid metabolism pathways, which are closely related to the fruit quality and taste in pear. Single-gene duplications largely account for the extensive expansion of gene families involved in the sorbitol metabolism pathway in pear. Gene family expansion was also detected in the sucrose metabolism pathway and tricarboxylic acid cycle pathways. Thus, this study provides insights into the evolutionary fates of duplicated genes.
Affiliation(s)
- Shaoling Zhang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Centre of Pear Engineering Technology Research, Nanjing Agricultural University, Nanjing, China
|
45
|
Shelby MV. Waardenburg Syndrome Expression and Penetrance. JOURNAL OF RARE DISEASES RESEARCH & TREATMENT 2017; 2:31-40. [PMID: 30854529 PMCID: PMC6404762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Abstract
Through a combination of in silico research and reviews of previous work, mechanisms by which nonsense-mediated mRNA decay (NMD) affects the inheritance and expressivity of Waardenburg syndrome are identified. While expressivity and inheritance both relate to biochemical processes underlying a gene's function, this research explores how alternative splicing and premature termination codons (PTCs) within mRNAs mutated in the disease are either translated into deleterious proteins or decayed to minimize expression of altered proteins. The elucidation of splice variants coupled with NMD perpetuating the various symptoms and inheritance patterns of this disease represents a novel finding. By investigating nonsense mutations that lie within and outside the NMD boundary of these transcripts, we can evaluate the effects of protein truncation versus minimized protein expression on the variable expressivity found between Type I and Type III Waardenburg syndrome (PAX3), while comparatively evaluating the roles of EDN3 and SOX10 in the inheritance of Type IV subtypes of the disease. This review will demonstrate how alternative splicing perpetuates or limits NMD activity by way of PTC positioning, thereby affecting the presentation of Waardenburg syndrome.
Affiliation(s)
- Myeshia V. Shelby
- Department of Genetics and Human Genetics, Howard University Graduate School, Howard University, USA
|
46
|
Challenges and advances for transcriptome assembly in non-model species. PLoS One 2017; 12:e0185020. [PMID: 28931057 PMCID: PMC5607178 DOI: 10.1371/journal.pone.0185020] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 01/30/2017] [Accepted: 09/04/2017] [Indexed: 12/28/2022] Open
Abstract
Analyses of high-throughput transcriptome sequences of non-model organisms are based on two main approaches: de novo assembly and genome-guided assembly using mapping to assign reads prior to assembly. Given the limits of mapping reads to a reference when it is highly divergent, as is frequently the case for non-model species, we evaluate whether using blastn would outperform mapping methods for read assignment in such situations (>15% divergence). We demonstrate its high performance by using simulated reads of lengths corresponding to those generated by the most common sequencing platforms, and over a realistic range of genetic divergence (0% to 30% divergence). Here we focus on gene identification and not on resolving the whole set of transcripts (i.e. the complete transcriptome). For simulated datasets, the transcriptome-guided assembly based on blastn recovers 94.8% of genes irrespective of read length at 0% divergence; however, assignment rate of reads is negatively correlated with both increasing divergence level and reducing read lengths. Nevertheless, we still observe 92.6% of recovered genes at 30% divergence irrespective of read length. This analysis also produces a categorization of genes relative to their assignment, and suggests guidelines for data processing prior to analyses of comparative transcriptomics and gene expression to minimize potential inferential bias associated with incorrect transcript assignment. We also compare the performances of de novo assembly alone vs in combination with a transcriptome-guided assembly based on blastn both via simulation and empirically, using data from a cyprinid fish species and from an oak species. For any simulated scenario, the transcriptome-guided assembly using blastn outperforms the de novo approach alone, including when the divergence level is beyond the reach of traditional mapping methods. 
Combining de novo assembly and a related reference transcriptome for read assignment also addresses the bias/error in contigs caused by the dependence on a related reference alone. Empirical data corroborate these findings when assembling transcriptomes from the two non-model organisms: Parachondrostoma toxostoma (fish) and Quercus pubescens (plant). For the fish species, out of the 31,944 genes known from D. rerio, the guided and de novo assemblies recover respectively 20,605 and 20,032 genes but the performance of the guided assembly approach is much higher for both the contiguity and completeness metrics. For the oak, out of the 29,971 genes known from Vitis vinifera, the transcriptome-guided and de novo assemblies display similar performance, but the new guided approach detects 16,326 genes where the de novo assembly only detects 9,385 genes.
|
47
|
Bolotin E, Hershberg R. Horizontally Acquired Genes Are Often Shared between Closely Related Bacterial Species. Front Microbiol 2017; 8:1536. [PMID: 28890711 PMCID: PMC5575156 DOI: 10.3389/fmicb.2017.01536] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/21/2017] [Accepted: 07/28/2017] [Indexed: 01/11/2023] Open
Abstract
Horizontal gene transfer (HGT) serves as an important source of innovation for bacterial species. We used a pangenome-based approach to identify genes that were horizontally acquired by four closely related bacterial species, belonging to the Enterobacteriaceae family. This enabled us to examine the extent to which such closely related species tend to share horizontally acquired genes. We find that a high percent of horizontally acquired genes are shared among these closely related species. Furthermore, we demonstrate that the extent of sharing of horizontally acquired genes among these four closely related species is predictive of the extent to which these genes will be found in additional bacterial species. Finally, we show that acquired genes shared by more species tend to be better optimized for expression within the genomes of their new hosts. Combined, our results demonstrate the existence of a large pool of frequently horizontally acquired genes that have distinct characteristics from horizontally acquired genes that are less frequently shared between species.
Affiliation(s)
- Evgeni Bolotin
- Rachel and Menachem Mendelovitch Evolutionary Processes of Mutation and Natural Selection Research Laboratory, The Rappaport Family Institute for Research in the Medical Sciences, Department of Genetics and Developmental Biology, Technion-Israel Institute of Technology, Haifa, Israel
| | - Ruth Hershberg
- Rachel and Menachem Mendelovitch Evolutionary Processes of Mutation and Natural Selection Research Laboratory, The Rappaport Family Institute for Research in the Medical Sciences, Department of Genetics and Developmental Biology, Technion-Israel Institute of Technology, Haifa, Israel
| |
|
48
|
Guschanski K, Warnefors M, Kaessmann H. The evolution of duplicate gene expression in mammalian organs. Genome Res 2017; 27:1461-1474. [PMID: 28743766 PMCID: PMC5580707 DOI: 10.1101/gr.215566.116] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 09/05/2016] [Accepted: 07/18/2017] [Indexed: 12/16/2022]
Abstract
Gene duplications generate genomic raw material that allows the emergence of novel functions, likely facilitating adaptive evolutionary innovations. However, global assessments of the functional and evolutionary relevance of duplicate genes in mammals were until recently limited by the lack of appropriate comparative data. Here, we report a large-scale study of the expression evolution of DNA-based functional gene duplicates in three major mammalian lineages (placental mammals, marsupials, egg-laying monotremes) and birds, on the basis of RNA sequencing (RNA-seq) data from nine species and eight organs. We observe dynamic changes in tissue expression preference of paralogs with different duplication ages, suggesting differential contribution of paralogs to specific organ functions during vertebrate evolution. Specifically, we show that paralogs that emerged in the common ancestor of bony vertebrates are enriched for genes with brain-specific expression and provide evidence for differential forces underlying the preferential emergence of young testis- and liver-specific expressed genes. Further analyses uncovered that the overall spatial expression profiles of gene families tend to be conserved, with several exceptions of pronounced tissue specificity shifts among lineage-specific gene family expansions. Finally, we trace new lineage-specific genes that may have contributed to the specific biology of mammalian organs, including the little-studied placenta. Overall, our study provides novel and taxonomically broad evidence for the differential contribution of duplicate genes to tissue-specific transcriptomes and for their importance for the phenotypic evolution of vertebrates.
Affiliation(s)
- Katerina Guschanski
- Department of Animal Ecology, Evolutionary Biology Centre, Uppsala University, S-75105 Uppsala, Sweden
| | - Maria Warnefors
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, D-69120 Heidelberg, Germany
| | - Henrik Kaessmann
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, D-69120 Heidelberg, Germany
| |
|
49
|
Aharonovich D, Sher D. Transcriptional response of Prochlorococcus to co-culture with a marine Alteromonas: differences between strains and the involvement of putative infochemicals. THE ISME JOURNAL 2016; 10:2892-2906. [PMID: 27128996 PMCID: PMC5148192 DOI: 10.1038/ismej.2016.70] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 01/14/2016] [Revised: 03/16/2016] [Accepted: 03/22/2016] [Indexed: 11/08/2022]
Abstract
Interactions between marine microorganisms may determine the dynamics of microbial communities. Here, we show that two strains of the globally abundant marine cyanobacterium Prochlorococcus, MED4 and MIT9313, which belong to two different ecotypes, differ markedly in their response to co-culture with a marine heterotrophic bacterium, Alteromonas macleodii strain HOT1A3. HOT1A3 enhanced the growth of MIT9313 at low cell densities, yet inhibited it at a higher concentration, whereas it had no effect on MED4 growth. The early transcriptomic responses of Prochlorococcus cells after 20 h in co-culture showed no evidence of nutrient starvation, whereas the expression of genes involved in photosynthesis, protein synthesis and stress responses typically decreased in MED4 and increased in MIT9313. Differential expression of genes involved in outer membrane modification, efflux transporters and, in MIT9313, lanthipeptides (prochlorosins) suggests that Prochlorococcus mount a specific response to the presence of the heterotroph in the cultures. Intriguingly, many of the differentially-expressed genes encoded short proteins, including two new families of co-culture responsive genes: CCRG-1, which is found across the Prochlorococcus lineage and CCRG-2, which contains a sequence motif involved in the export of prochlorosins and other bacteriocin-like peptides, and are indeed released from the cells into the media.
Affiliation(s)
- Dikla Aharonovich
- Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Haifa, Israel
| | - Daniel Sher
- Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Haifa, Israel
| |
|
50
|
Ramírez-Sánchez O, Pérez-Rodríguez P, Delaye L, Tiessen A. Plant Proteins Are Smaller Because They Are Encoded by Fewer Exons than Animal Proteins. GENOMICS, PROTEOMICS & BIOINFORMATICS 2016; 14:357-370. [PMID: 27998811 PMCID: PMC5200936 DOI: 10.1016/j.gpb.2016.06.003] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 01/22/2016] [Revised: 06/03/2016] [Accepted: 06/03/2016] [Indexed: 01/27/2023]
Abstract
Protein size is an important biochemical feature since longer proteins can harbor more domains and therefore can display more biological functionalities than shorter proteins. We found remarkable differences in protein length, exon structure, and domain count among different phylogenetic lineages. While eukaryotic proteins have an average size of 472 amino acid residues (aa), average protein sizes in plant genomes are smaller than those of animals and fungi. Proteins unique to plants are ∼81aa shorter than plant proteins conserved among other eukaryotic lineages. The smaller average size of plant proteins could neither be explained by endosymbiosis nor subcellular compartmentation nor exon size, but rather due to exon number. Metazoan proteins are encoded on average by ∼10 exons of small size [∼176 nucleotides (nt)]. Streptophyta have on average only ∼5.7 exons of medium size (∼230nt). Multicellular species code for large proteins by increasing the exon number, while most unicellular organisms employ rather larger exons (>400nt). Among subcellular compartments, membrane proteins are the largest (∼520aa), whereas the smallest proteins correspond to the gene ontology group of ribosome (∼240aa). Plant genes are encoded by half the number of exons and also contain fewer domains than animal proteins on average. Interestingly, endosymbiotic proteins that migrated to the plant nucleus became larger than their cyanobacterial orthologs. We thus conclude that plants have proteins larger than bacteria but smaller than animals or fungi. Compared to the average of eukaryotic species, plants have ∼34% more but ∼20% smaller proteins. This suggests that photosynthetic organisms are unique and deserve therefore special attention with regard to the evolutionary forces acting on their genomes and proteomes.
Affiliation(s)
- Obed Ramírez-Sánchez
- Genetic Engineering Department, CINVESTAV Unidad Irapuato, Irapuato, CP 36821, Mexico
| | | | - Luis Delaye
- Genetic Engineering Department, CINVESTAV Unidad Irapuato, Irapuato, CP 36821, Mexico
| | - Axel Tiessen
- Genetic Engineering Department, CINVESTAV Unidad Irapuato, Irapuato, CP 36821, Mexico
| |
|