1
Abbasi-Vineh MA, Emadpour M. The First Introduction of an Exogenous 5' Untranslated Region for Control of Plastid Transgene Expression in Chlamydomonas reinhardtii. Mol Biotechnol 2024:10.1007/s12033-024-01279-3. [PMID: 39271617 DOI: 10.1007/s12033-024-01279-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Accepted: 08/27/2024] [Indexed: 09/15/2024]
Abstract
The utilization of heterologous 5' untranslated regions (5'UTRs) for expressing foreign proteins in the chloroplast of Chlamydomonas reinhardtii (C. reinhardtii) has posed a persistent challenge over the years. This challenge stems from the lack of a defined and comprehensive set of translational cis-elements responsible for stability, ribosome binding, and translation initiation, which are mediated by trans-acting factors native to C. reinhardtii. In the current study, we aimed to address this bottleneck by employing the 5'UTR from gene 10 of the T7 bacteriophage (T7g10 5'UTR), fused to the promoter of C. reinhardtii small subunit ribosomal RNA (rrnS), to facilitate the translation of a reporter gene, YFP. Using a chimeric construct, the YFP mRNA was efficiently translated utilizing the heterologous T7g10 5'UTR. Furthermore, the accumulation of YFP protein under the control of the T7g10 5'UTR was approximately one third of that observed under the control of the endogenous psaA promoter/5'UTR in the C. reinhardtii chloroplast. The results of computational analyses demonstrated that the T7g10 5'UTR sequence shares common elements with the endogenous 5'UTRs of the chloroplast genes. Moreover, the findings of the current study highlighted the potential of employing bacteriophage 5'UTRs for the foreign protein accumulation from the chloroplast genome of C. reinhardtii.
Affiliation(s)
- Mohammad Ali Abbasi-Vineh
- Department of Agricultural Biotechnology, Tarbiat Modares University (TMU), 1497713111, Tehran, Iran
- Masoumeh Emadpour
- Department of Agricultural Biotechnology, Tarbiat Modares University (TMU), 1497713111, Tehran, Iran
2
Cardiff RL, Faulkner I, Beall J, Carothers JM, Zalatan J. CRISPR-Cas tools for simultaneous transcription & translation control in bacteria. Nucleic Acids Res 2024; 52:5406-5419. [PMID: 38613390 PMCID: PMC11109947 DOI: 10.1093/nar/gkae275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Revised: 03/27/2024] [Accepted: 04/05/2024] [Indexed: 04/14/2024]
Abstract
Robust control over gene translation at arbitrary mRNA targets is an outstanding challenge in microbial synthetic biology. The development of tools that can regulate translation will greatly expand our ability to precisely control genes across the genome. In Escherichia coli, most genes are contained in multi-gene operons, which are subject to polar effects where targeting one gene for repression leads to silencing of other genes in the same operon. These effects pose a challenge for independently regulating individual genes in multi-gene operons. Here, we use CRISPR-dCas13 to address this challenge. We find dCas13-mediated repression exhibits up to 6-fold lower polar effects compared to dCas9. We then show that we can selectively activate single genes in a synthetic multi-gene operon by coupling dCas9 transcriptional activation of an operon with dCas13 translational repression of individual genes within the operon. We also show that dCas13 and dCas9 can be multiplexed for improved biosynthesis of a medically-relevant human milk oligosaccharide. Taken together, our findings suggest that combining transcriptional and translational control can access effects that are difficult to achieve with either mode independently. These combined tools for gene regulation will expand our abilities to precisely engineer bacteria for biotechnology and perform systematic genetic screens.
Affiliation(s)
- Ryan A L Cardiff
- Molecular Engineering & Sciences Institute and Center for Synthetic Biology, University of Washington, Seattle, WA 98195, USA
- Ian D Faulkner
- Department of Chemical Engineering, University of Washington, Seattle, WA 98195, USA
- Juliana G Beall
- Department of Chemistry, University of Washington, Seattle, WA 98195, USA
- James M Carothers
- Molecular Engineering & Sciences Institute and Center for Synthetic Biology, University of Washington, Seattle, WA 98195, USA
- Department of Chemical Engineering, University of Washington, Seattle, WA 98195, USA
- Jesse G Zalatan
- Molecular Engineering & Sciences Institute and Center for Synthetic Biology, University of Washington, Seattle, WA 98195, USA
- Department of Chemical Engineering, University of Washington, Seattle, WA 98195, USA
- Department of Chemistry, University of Washington, Seattle, WA 98195, USA
3
Bajić D. Information Theory, Living Systems, and Communication Engineering. Entropy (Basel, Switzerland) 2024; 26:430. [PMID: 38785679 PMCID: PMC11120474 DOI: 10.3390/e26050430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 05/08/2024] [Accepted: 05/17/2024] [Indexed: 05/25/2024]
Abstract
Mainstream research on information theory within the field of living systems involves the application of analytical tools to understand a broad range of life processes. This paper is dedicated to an opposite problem: it explores the information theory and communication engineering methods that have counterparts in the data transmission process by way of DNA structures and neural fibers. Considering the requirements of modern multimedia, transmission methods chosen by nature may be different, suboptimal, or even far from optimal. However, nature is known for rational resource usage, so its methods have a significant advantage: they are proven to be sustainable. Perhaps understanding the engineering aspects of methods of nature can inspire a design of alternative green, stable, and low-cost transmission.
Affiliation(s)
- Dragana Bajić
- Department of Communications and Signal Processing, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovica 6, 21000 Novi Sad, Serbia
4
López-Pérez M, Aguirre-Garrido F, Herrera-Zúñiga L, Fernández FJ. Gene as a dynamical notion: An extensive and integrative vision. Redefining the gene concept, from traditional to genic-interaction, as a new dynamical version. Biosystems 2023; 234:105060. [PMID: 37844827 DOI: 10.1016/j.biosystems.2023.105060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 09/08/2023] [Accepted: 10/10/2023] [Indexed: 10/18/2023]
Abstract
The current concept of the gene has been very useful during the 20th and 21st centuries. However, recent advances in molecular biology and bioinformatics, which have further diversified the functional and adaptive profile of genetic information and its integration with cell physiology and environmental response, have drawn attention to gene properties beyond the traditional definition. Gene expression is inherently complex, and its adaptive objective can be framed by the Tortoise-Hare model, in which two tendencies converge: one focused on rapid adaptation to achieve survival, and another that prevents over-adaptation. In this context, the gene concept must be revised to include these new mechanisms and approaches. In this paper, we propose a new conception of the gene that moves from a static and defined version of hereditary information to a dynamic one that gives priority to gene interaction (restricted to protein-protein, protein-nucleic acid, and nucleic acid-nucleic acid interactions) and the selection it exerts, as the irreducible element that works in a coordinated way in a genomic regulatory network (GRN).
Affiliation(s)
- Marcos López-Pérez
- Environmental Sciences Department, Universidad Autónoma Metropolitana (Lerma Unit), Av. de las Garzas N° 10, Col. El Panteón, Municipio de Lerma de Villada, Estado de México, C.P. 52005, Mexico
- Félix Aguirre-Garrido
- Environmental Sciences Department, Universidad Autónoma Metropolitana (Lerma Unit), Av. de las Garzas N° 10, Col. El Panteón, Municipio de Lerma de Villada, Estado de México, C.P. 52005, Mexico
- Leonardo Herrera-Zúñiga
- Chemistry Department, Universidad Autónoma Metropolitana (Iztapalapa Unit), C.P. 09340, Mexico City, Mexico
- Francisco J Fernández
- Biotechnology Department, Universidad Autónoma Metropolitana (Iztapalapa Unit), C.P. 09340, Mexico City, Mexico
5
Bukhnikashvili L. Overlaps Between CDS Regions of Protein-Coding Genes in the Human Genome: A Case Study on the NR1D1-THRA Gene Pair. J Mol Evol 2023; 91:963-975. [PMID: 38006429 DOI: 10.1007/s00239-023-10147-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 11/12/2023] [Indexed: 11/27/2023]
Abstract
For several decades, it has been known that a substantial number of genes within human DNA exhibit overlap; however, the biological and evolutionary significance of these overlaps remains poorly understood. This study focused on investigating specific instances of overlap where the overlapping DNA region encompasses the coding DNA sequences (CDSs) of protein-coding genes. The results revealed that proteins encoded by overlapping CDSs exhibit greater disorder than those from nonoverlapping CDSs. Additionally, these DNA regions were identified as GC-rich. This could be partially attributed to the absence of stop codons from two distinct reading frames rather than one. Furthermore, these regions were found to harbour fewer single-nucleotide polymorphism (SNP) sites, possibly due to constraints arising from the overlapping state where mutations could affect two genes simultaneously. While elucidating these properties, the NR1D1-THRA gene pair emerged as an exceptional case with highly structured proteins and a distinctly conserved sequence across eutherian mammals. Both NR1D1 and THRA are nuclear receptors lacking a ligand-binding domain at their C-terminus, which is the region where these gene pairs overlap. The NR1D1 gene is involved in the regulation of circadian rhythm, while the THRA gene encodes a thyroid hormone receptor, and both play crucial roles in various physiological processes. This study suggests that, in addition to their well-established functions, the specifically overlapping CDS regions of these genes may encode protein segments with additional, yet undiscovered, biological roles.
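The suggested link between stop-codon avoidance and GC-richness can be illustrated with a minimal sketch. The three standard stop codons (TAA, TAG, TGA) are AT-rich, so a region that must stay stop-free in more than one frame tends to use fewer A and T bases. The function names and toy sequences below are illustrative assumptions, not taken from the study, and the check covers only the three forward frames of one strand.

```python
# Standard-genetic-code stop codons; all three are AT-rich.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def stop_free_frames(seq: str) -> list[int]:
    """Forward reading frames (0, 1, 2) that contain no stop codon."""
    seq = seq.upper()
    frames = []
    for frame in range(3):
        # Walk complete codons only, starting at the frame offset.
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames


# Hypothetical toy sequences, chosen only to show the two cases.
gc_rich = "ATGGCCGCGGGC"   # open in all three forward frames
with_stop = "ATGTAAGCC"    # frame 0 hits TAA

print(stop_free_frames(gc_rich), round(gc_content(gc_rich), 2))
print(stop_free_frames(with_stop), round(gc_content(with_stop), 2))
```

A real analysis would also need to handle overlaps on opposite strands, which this sketch leaves out.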
6
Huber M, Vogel N, Borst A, Pfeiffer F, Karamycheva S, Wolf YI, Koonin EV, Soppa J. Unidirectional gene pairs in archaea and bacteria require overlaps or very short intergenic distances for translational coupling via termination-reinitiation and often encode subunits of heteromeric complexes. Front Microbiol 2023; 14:1291523. [PMID: 38029211 PMCID: PMC10666635 DOI: 10.3389/fmicb.2023.1291523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Accepted: 10/25/2023] [Indexed: 12/01/2023]
Abstract
Genomes of bacteria and archaea contain a much larger fraction of unidirectional (serial) gene pairs than convergent or divergent gene pairs. Many of the unidirectional gene pairs have short overlaps of -4 nt and -1 nt. As shown previously, translation of the genes in overlapping unidirectional gene pairs is tightly coupled. Two alternative models for the fate of the post-termination ribosome predict either that overlaps or very short intergenic distances are essential for translational coupling or that the undissociated post-termination ribosome can scan through long intergenic regions, up to hundreds of nucleotides. We aimed to experimentally resolve the contradiction between the two models by analyzing three native gene pairs from the model archaeon Haloferax volcanii and three native pairs from Escherichia coli. A two reporter gene system was used to quantify the reinitiation frequency, and several stop codons in the upstream gene were introduced to increase the intergenic distances. For all six gene pairs from two species, an extremely strong dependence of the reinitiation efficiency on the intergenic distance was unequivocally demonstrated, such that even short intergenic distances of about 20 nt almost completely abolished translational coupling. Bioinformatic analysis of the intergenic distances in all unidirectional gene pairs in the genomes of H. volcanii and E. coli and in 1,695 prokaryotic species representative of 49 phyla showed that intergenic distances of -4 nt or -1 nt (= short gene overlaps of 4 nt or 1 nt) were by far most common in all these groups of archaea and bacteria. A small set of genes in E. coli, but not in H. volcanii, had intergenic distances of around +10 nt. Our experimental and bioinformatic analyses clearly show that translational coupling requires short gene overlaps, whereas scanning of intergenic regions by the post-termination ribosome occurs rarely, if at all. 
Short overlaps are enriched among genes that encode subunits of heteromeric complexes, and co-translational complex formation requiring precise subunit stoichiometry likely confers an evolutionary advantage that drove the formation and conservation of overlapping gene pairs during evolution.
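The signed intergenic distances discussed in this abstract can be sketched as follows, assuming 1-based inclusive genome coordinates and counting the stop codon as part of the upstream gene. Under that convention, adjacent genes have a distance of 0, a stop/start overlap such as ATGA gives -4, and TAATG gives -1. The `Gene` tuple, the function names, and the coordinates are hypothetical and are not the authors' pipeline.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    left: int    # 1-based coordinate of the leftmost nucleotide
    right: int   # 1-based coordinate of the rightmost nucleotide (inclusive)
    strand: str  # "+" or "-"


def intergenic_distance(a: Gene, b: Gene) -> int:
    """Signed distance between neighbours a (left) and b (right).

    Negative values are overlaps: -4 means the genes share 4 nt.
    """
    return b.left - a.right - 1


def unidirectional_distances(genes: list[Gene]) -> list[int]:
    """Distances for neighbouring gene pairs that lie on the same strand."""
    ordered = sorted(genes, key=lambda g: g.left)
    return [
        intergenic_distance(a, b)
        for a, b in zip(ordered, ordered[1:])
        if a.strand == b.strand  # skip convergent and divergent pairs
    ]


# Hypothetical coordinates, for illustration only.
genes = [
    Gene("a", 1, 100, "+"),
    Gene("b", 97, 300, "+"),   # shares 4 nt with a
    Gene("c", 300, 500, "+"),  # shares 1 nt with b
    Gene("d", 521, 700, "+"),  # 20 nt gap after c
    Gene("e", 710, 900, "-"),  # opposite strand, so the d-e pair is skipped
]
print(unidirectional_distances(genes))
```

Because the distance is computed from left and right genomic coordinates, the same formula applies to pairs on the minus strand.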
Affiliation(s)
- Madeleine Huber
- Institute for Molecular Biosciences, Biocentre, Goethe-University, Frankfurt, Germany
- Nico Vogel
- Institute for Molecular Biosciences, Biocentre, Goethe-University, Frankfurt, Germany
- Andreas Borst
- Institute for Molecular Biosciences, Biocentre, Goethe-University, Frankfurt, Germany
- Friedhelm Pfeiffer
- Computational Biology Group, Max-Planck-Institute of Biochemistry, Martinsried, Germany
- Svetlana Karamycheva
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Yuri I. Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Jörg Soppa
- Institute for Molecular Biosciences, Biocentre, Goethe-University, Frankfurt, Germany
7
Ryczek N, Łyś A, Makałowska I. The Functional Meaning of 5'UTR in Protein-Coding Genes. Int J Mol Sci 2023; 24:2976. [PMID: 36769304 PMCID: PMC9917990 DOI: 10.3390/ijms24032976] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 12/28/2022] [Revised: 01/20/2023] [Accepted: 01/26/2023] [Indexed: 02/05/2023]
Abstract
As is well known, messenger RNA has many regulatory regions along its length. One of them is the 5' untranslated region (5'UTR), which itself contains many regulatory elements such as upstream ORFs (uORFs), internal ribosome entry sites (IRESs), microRNA binding sites, and structural components involved in the regulation of mRNA stability, pre-mRNA splicing, and translation initiation. Activation of an alternative, more upstream transcription start site extends the 5'UTR. One consequence of 5'UTR extension may be head-to-head gene overlap. This review describes elements in the 5'UTR of protein-coding transcripts and the functional significance of 5' overlap between protein-coding genes, with implications for transcription, translation, and disease.
Affiliation(s)
- Izabela Makałowska
- Institute of Human Biology and Evolution, Adam Mickiewicz University in Poznań, Uniwersytetu Poznańskiego 6, 61-614 Poznań, Poland
8
Mena-Bueno S, Poveda-Urkixo I, Irazoki O, Palacios L, Cava F, Zabalza-Baranguá A, Grilló MJ. Brucella melitensis Wzm/Wzt System: Changes in the Bacterial Envelope Lead to Improved Rev1Δwzm Vaccine Properties. Front Microbiol 2022; 13:908495. [PMID: 35875565 PMCID: PMC9306315 DOI: 10.3389/fmicb.2022.908495] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/30/2022] [Accepted: 05/26/2022] [Indexed: 11/13/2022]
Abstract
The lipopolysaccharide (LPS) O-polysaccharide (O-PS) is the main virulence factor in Brucella. After synthesis in the cytoplasmic membrane, O-PS is exported to the periplasm by the Wzm/Wzt system, where it is assembled into a LPS. This translocation also engages a bactoprenol carrier required for further biosynthesis pathways, such as cell wall biogenesis. Targeting O-PS export by blockage holds great potential for vaccine development, but little is known about the biological implications of each Wzm/Wzt moiety. To improve this knowledge and to elucidate its potential application as a vaccine, we constructed and studied wzm/wzt single- and double-deletion mutants, using the attenuated strain Brucella melitensis Rev1 as the parental strain. This allowed us to describe the composition of Brucella peptidoglycan for the first time. We observed that these mutants lack external O-PS yet trigger changes in genetic transcription and in phenotypic properties associated with the outer membrane and cell wall. The three mutants are highly attenuated; unexpectedly, Rev1Δwzm also excels as an immunogenic and effective vaccine against B. melitensis and Brucella ovis in mice, revealing that low persistence is not at odds with efficacy. Rev1Δwzm is attenuated in BeWo trophoblasts, does not infect mouse placentas, and is safe in pregnant ewes. Overall, these attributes and the minimal serological interference induced in sheep make Rev1Δwzm a highly promising vaccine candidate.
Affiliation(s)
- Sara Mena-Bueno
- Animal Health Department, Instituto de Agrobiotecnología (IdAB, CSIC-Gobierno de Navarra), Pamplona, Spain
- Agronomy, Biotechnology and Food Department, Universidad Pública de Navarra (UPNA), Pamplona, Spain
- Irati Poveda-Urkixo
- Animal Health Department, Instituto de Agrobiotecnología (IdAB, CSIC-Gobierno de Navarra), Pamplona, Spain
- Oihane Irazoki
- Laboratory for Molecular Infection Medicine Sweden, Department of Molecular Biology, Umeå Centre for Microbial Research, Umeå University, Umeå, Sweden
- Leyre Palacios
- Animal Health Department, Instituto de Agrobiotecnología (IdAB, CSIC-Gobierno de Navarra), Pamplona, Spain
- Felipe Cava
- Laboratory for Molecular Infection Medicine Sweden, Department of Molecular Biology, Umeå Centre for Microbial Research, Umeå University, Umeå, Sweden
- Ana Zabalza-Baranguá
- Animal Health Department, Instituto de Agrobiotecnología (IdAB, CSIC-Gobierno de Navarra), Pamplona, Spain
- María Jesús Grilló
- Animal Health Department, Instituto de Agrobiotecnología (IdAB, CSIC-Gobierno de Navarra), Pamplona, Spain
- *Correspondence: María Jesús Grilló,
9
Logel DY, Trofimova E, Jaschke PR. Codon-Restrained Method for Both Eliminating and Creating Intragenic Bacterial Promoters. ACS Synth Biol 2022; 11:689-699. [PMID: 35043622 DOI: 10.1021/acssynbio.1c00359] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Future applications of synthetic biology will require refactored genetic sequences devoid of internal regulatory elements within coding sequences. These regulatory elements include cryptic and intragenic promoters, which may constitute up to a third of the predicted Escherichia coli promoters. The promoter activity is dependent on the structural interaction of core bases with a σ factor. Rational engineering can be used to alter key promoter element nucleotides interacting with σ factors and eliminate downstream transcriptional activity. In this paper, we present codon-restrained promoter silencing (CORPSE), a system for removing intragenic promoters. CORPSE exploits the DNA-σ factor structural relationship to disrupt σ70 promoters embedded within gene coding sequences with a minimum of synonymous codon changes. Additionally, we present an inverted CORPSE system, iCORPSE, which can create highly active promoters within a gene sequence while not perturbing the function of the modified gene.
Affiliation(s)
- Dominic Y. Logel
- School of Natural Sciences, ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney 2109, New South Wales, Australia
- Ellina Trofimova
- School of Natural Sciences, ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney 2109, New South Wales, Australia
- Paul R. Jaschke
- School of Natural Sciences, ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney 2109, New South Wales, Australia
10
Nguyen-Vo TP, Ryu H, Sauer M, Park S. Improvement of 3-hydroxypropionic acid tolerance in Klebsiella pneumoniae by novel transporter YohJK. Bioresource Technology 2022; 346:126613. [PMID: 34954352 DOI: 10.1016/j.biortech.2021.126613] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/04/2021] [Revised: 12/17/2021] [Accepted: 12/18/2021] [Indexed: 06/14/2023]
Abstract
3-Hydroxypropionic acid (3-HP) is a platform chemical with potential applications in the cosmetic and polymer industries. Microbial production of 3-HP is hampered by its toxicity at high concentrations (>300 mM). In this study, the effect of yohJK overexpression (via yieP deletion or episomal overexpression) on 3-HP tolerance was investigated in Klebsiella pneumoniae, Pseudomonas denitrificans and P. asiatica. Deletion of the yieP homolog could improve 3-HP tolerance in K. pneumoniae. Transcriptional analysis suggested that, of the two yohJK homologs in K. pneumoniae, expression of yohJK1, but not yohJK2, was under the negative control of YieP. Furthermore, deletion of yieP significantly reduced the cytoplasmic 3-HP concentration, as determined with a 3-HP biosensor, and enhanced 3-HP tolerance and production. This study demonstrates that YohJK1 functions as a 3-HP transporter in K. pneumoniae and that its overexpression through yieP deletion is a good strategy to enhance 3-HP tolerance and production.
Affiliation(s)
- Thuan Phu Nguyen-Vo
- School of Energy and Chemical Engineering, UNIST, Ulsan 44919, Republic of Korea
- Huichang Ryu
- School of Energy and Chemical Engineering, UNIST, Ulsan 44919, Republic of Korea
- Michael Sauer
- Institute of Microbiology and Microbial Biotechnology, Department of Biotechnology, BOKU-University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, 1190 Vienna, Austria
- Sunghoon Park
- School of Energy and Chemical Engineering, UNIST, Ulsan 44919, Republic of Korea; School of Chemical and Biomolecular Engineering, Pusan National University, Busan 46241, Republic of Korea
11
Das A, Banik BK. Advances in heterocycles as DNA intercalating cancer drugs. Physical Sciences Reviews 2022. [DOI: 10.1515/psr-2021-0065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
The insertion of a molecule between the bases of DNA is known as intercalation. A molecule is able to interact with DNA in different ways. DNA intercalators are generally aromatic, planar, and polycyclic. In chemotherapeutic treatment, to suppress DNA replication in cancer cells, intercalators are used. In this article, we discuss the anticancer activity of 10 intensively studied DNA intercalators as drugs. The list includes proflavine, ethidium bromide, doxorubicin, dactinomycin, bleomycin, epirubicin, mitoxantrone, ellipticine, elinafide, and echinomycin. Considerable structural diversities are seen in these molecules. Besides, some examples of the metallo-intercalators are presented at the end of the chapter. These molecules have other crucial properties that are also useful in the treatment of cancers. The successes and limitations of these molecules are also presented.
Affiliation(s)
- Aparna Das
- Department of Mathematics and Natural Sciences, College of Sciences and Human Studies, Prince Mohammad Bin Fahd University, Al Khobar 31952, Kingdom of Saudi Arabia
- Bimal Krishna Banik
- Department of Mathematics and Natural Sciences, College of Sciences and Human Studies, Prince Mohammad Bin Fahd University, Al Khobar 31952, Kingdom of Saudi Arabia
12
Shoemaker WR, Chen D, Garud NR. Comparative Population Genetics in the Human Gut Microbiome. Genome Biol Evol 2022; 14:evab116. [PMID: 34028530 PMCID: PMC8743038 DOI: 10.1093/gbe/evab116] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Accepted: 05/22/2021] [Indexed: 11/13/2022]
Abstract
Genetic variation in the human gut microbiome is responsible for conferring a number of crucial phenotypes like the ability to digest food and metabolize drugs. Yet, our understanding of how this variation arises and is maintained remains relatively poor. Thus, the microbiome remains a largely untapped resource, as the large number of coexisting species in the microbiome presents a unique opportunity to compare and contrast evolutionary processes across species to identify universal trends and deviations. Here we outline features of the human gut microbiome that, while not unique in isolation, as an assemblage make it a system with unparalleled potential for comparative population genomics studies. We consciously take a broad view of comparative population genetics, emphasizing how sampling a large number of species allows researchers to identify universal evolutionary dynamics in addition to new genes, which can then be leveraged to identify exceptional species that deviate from general patterns. To highlight the potential power of comparative population genetics in the microbiome, we reanalyze patterns of purifying selection across ∼40 prevalent species in the human gut microbiome to identify intriguing trends which highlight functional categories in the microbiome that may be under more or less constraint.
Affiliation(s)
- William R Shoemaker
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Daisy Chen
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Nandita R Garud
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Department of Human Genetics, University of California, Los Angeles, California, USA
13
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
14
Watson AK, Lopez P, Bapteste E. Hundreds of out-of-frame remodelled gene families in the E. coli pangenome. Mol Biol Evol 2021; 39:6430988. [PMID: 34792602 PMCID: PMC8788219 DOI: 10.1093/molbev/msab329] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/01/2023]
Abstract
All genomes include gene families with very limited taxonomic distributions that potentially represent new genes and innovations in protein-coding sequence, raising questions on the origins of such genes. Some of these genes are hypothesized to have formed de novo, from noncoding sequences, and recent work has begun to elucidate the processes by which de novo gene formation can occur. A special case of de novo gene formation, overprinting, describes the origin of new genes from noncoding alternative reading frames of existing open reading frames (ORFs). We argue that additionally, out-of-frame gene fission/fusion events of alternative reading frames of ORFs and out-of-frame lateral gene transfers could contribute to the origin of new gene families. To demonstrate this, we developed an original pattern-search in sequence similarity networks, enhancing the use of these graphs, commonly used to detect in-frame remodeled genes. We applied this approach to gene families in 524 complete genomes of Escherichia coli. We identified 767 gene families whose evolutionary history likely included at least one out-of-frame remodeling event. These genes with out-of-frame components represent ∼2.5% of all genes in the E. coli pangenome, suggesting that alternative reading frames of existing ORFs can contribute to a significant proportion of de novo genes in bacteria.
Affiliation(s)
- Andrew K Watson
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
- Philippe Lopez
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
- Eric Bapteste
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d'Histoire Naturelle, EPHE, Université des Antilles, 7, quai Saint Bernard, Paris, 75005, France
15
Larson MR, Biddle K, Gorman A, Boutom S, Rosenshine I, Saper MA. Escherichia coli O127 group 4 capsule proteins assemble at the outer membrane. PLoS One 2021; 16:e0259900. [PMID: 34780538 PMCID: PMC8592465 DOI: 10.1371/journal.pone.0259900] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Accepted: 10/28/2021] [Indexed: 12/26/2022]
Abstract
Enteropathogenic Escherichia coli O127 is encapsulated by a protective layer of polysaccharide made of the same strain specific O-antigen as the serotype lipopolysaccharide. Seven genes encoding capsule export functions comprise the group 4 capsule (gfc) operon. Genes gfcE, etk and etp encode homologs of the group 1 capsule secretion system but the upstream gfcABCD genes encode unknown functions specific to group 4 capsule export. We have developed an expression system for the large-scale production of the outer membrane protein GfcD. Contrary to annotations, we find that GfcD is a non-acylated integral membrane protein. Circular dichroism spectroscopy, light-scattering data, and the HHomp server suggested that GfcD is a monomeric β-barrel with 26 β-strands and an internal globular domain. We identified a set of novel protein-protein interactions between GfcB, GfcC, and GfcD, both in vivo and in vitro, and quantified the binding properties with isothermal calorimetry and biolayer interferometry. GfcC and GfcB form a high-affinity heterodimer with a KD near 100 nM. This heterodimer binds to GfcD (KD = 28 μM) significantly better than either GfcB or GfcC alone. These gfc proteins may form a complex at the outer membrane for group 4 capsule secretion or for a yet unknown function.
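To put the two reported dissociation constants in perspective, the sketch below computes the fraction of a protein bound at equilibrium, assuming simple 1:1 binding with the partner in excess so that its free concentration is approximately its total concentration. The KD values (about 100 nM and 28 μM) come from the abstract. The function name and the 1 μM partner concentration are arbitrary choices for illustration, not experimental conditions from the paper.

```python
def fraction_bound(free_partner_nM: float, kd_nM: float) -> float:
    """Equilibrium fraction bound for 1:1 binding: [L] / (KD + [L])."""
    return free_partner_nM / (kd_nM + free_partner_nM)


partner_nM = 1_000.0  # arbitrary 1 uM partner concentration (assumption)

# KD values as stated in the abstract, converted to nM.
kd_gfcb_gfcc_nM = 100.0        # GfcB-GfcC heterodimer, KD near 100 nM
kd_dimer_gfcd_nM = 28_000.0    # heterodimer binding to GfcD, KD = 28 uM

print(f"GfcB-GfcC:        {fraction_bound(partner_nM, kd_gfcb_gfcc_nM):.3f}")
print(f"heterodimer-GfcD: {fraction_bound(partner_nM, kd_dimer_gfcd_nM):.3f}")
```

When the partner concentration equals KD, the fraction bound is 0.5, which is a quick sanity check on the formula.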
Affiliation(s)
- Matthew R. Larson
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
- Kassia Biddle
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
- Adam Gorman
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
- Sarah Boutom
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
- Program in Biophysics, University of Michigan, Ann Arbor, Michigan, United States of America
- Ilan Rosenshine
- Dept of Microbiology and Molecular Genetics, Hebrew University Faculty of Medicine, Ein Kerem, Jerusalem, Israel
- Mark A. Saper
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
- Program in Biophysics, University of Michigan, Ann Arbor, Michigan, United States of America
|
16
|
Katyal G, Ebanks B, Lucassen M, Papetti C, Chakrabarti L. Sequence and structure comparison of ATP synthase F0 subunits 6 and 8 in notothenioid fish. PLoS One 2021; 16:e0245822. [PMID: 34613983 PMCID: PMC8494342 DOI: 10.1371/journal.pone.0245822] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/07/2021] [Accepted: 09/09/2021] [Indexed: 11/20/2022]
Abstract
Mitochondrial changes such as tight coupling of the mitochondria have facilitated sustained oxygen and respiratory activity in haemoglobin-less icefish of the Channichthyidae family. We aimed to characterise features in the sequence and structure of the proteins directly involved in proton transport, which have potential physiological implications. ATP synthase subunit a (ATP6) and subunit 8 (ATP8) are proteins that function as part of the F0 component (proton pump) of the F0F1 complex. Both proteins are encoded by the mitochondrial genome and involved in oxidative phosphorylation. To explore mitochondrial sequence variation for ATP6 and ATP8, we analysed sequences from C. gunnari and C. rastrospinosus and compared them with their closely related red-blooded species and eight other vertebrate species. Our comparison of the amino acid sequences of these proteins reveals important differences that could underlie aspects of the unique physiology of the icefish. In this study we find a change in the sequence of subunit a of the icefish C. gunnari at position 35, where there is a hydrophobic alanine that is not seen in the other notothenioids we analysed. An amino acid change of this type is significant since it may have a structural impact. The biology of the haemoglobin-less icefish is necessarily unique, and any insights about these animals will help to generate a better overall understanding of important physiological pathways.
Affiliation(s)
- Gunjan Katyal
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Brad Ebanks
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Lisa Chakrabarti
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- MRC-Versus Arthritis Centre for Musculoskeletal Ageing Research, Nottingham, United Kingdom
|
17
|
Sun S, Zhou J, Jiang J, Dai Y, Sheng M. Nitrile Hydratases: From Industrial Application to Acetamiprid and Thiacloprid Degradation. J Agric Food Chem 2021; 69:10440-10449. [PMID: 34469128 DOI: 10.1021/acs.jafc.1c03496] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
The widespread application of neonicotinoid insecticides (NEOs) in agriculture causes a series of environmental and ecological problems. Microbial remediation is a popular approach to relieve these negative impacts, but the associated molecular mechanisms are rarely explored. Nitrile hydratase (NHase), an enzyme commonly used in industry for amide production, was discovered to be responsible for the degradation of acetamiprid (ACE) and thiacloprid (THI) by microbes. Since then, research into NHases in NEO degradation has attracted increasing attention. In this review, microbial degradation of ACE and THI is briefly described. We then focus on NHase evolution, gene composition, maturation mechanisms, expression, and biochemical properties with regard to application of NHases in NEO degradation for bioremediation.
Affiliation(s)
- Shilei Sun
- The Key Laboratory of Biotechnology for Medicinal Plants of Jiangsu Province and School of Life Science, Jiangsu Normal University, Xuzhou 221116, People's Republic of China
- Jiangsheng Zhou
- The Key Laboratory of Biotechnology for Medicinal Plants of Jiangsu Province and School of Life Science, Jiangsu Normal University, Xuzhou 221116, People's Republic of China
- Jihong Jiang
- The Key Laboratory of Biotechnology for Medicinal Plants of Jiangsu Province and School of Life Science, Jiangsu Normal University, Xuzhou 221116, People's Republic of China
- Yijun Dai
- Jiangsu Key Laboratory for Microbes and Functional Genomics, Jiangsu Engineering and Technology Research Center for Industrialization of Microbial Resources, College of Life Science, Nanjing Normal University, Nanjing 210023, People's Republic of China
- Miaomiao Sheng
- College of Pharmacy, Zhejiang Chinese Medical University, Hangzhou 310053, People's Republic of China
|
18
|
Khan MSI, Gao X, Liang K, Mei S, Zhan J. Virulent Drexlervirial Bacteriophage MSK, Morphological and Genome Resemblance With Rtp Bacteriophage Inhibits the Multidrug-Resistant Bacteria. Front Microbiol 2021; 12:706700. [PMID: 34504479 PMCID: PMC8421802 DOI: 10.3389/fmicb.2021.706700] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/07/2021] [Accepted: 06/14/2021] [Indexed: 11/13/2022]
Abstract
Phage-host interactions are likely the most critical aspect of phage biology. Phages are the most abundant and ubiquitous infectious acellular entities in the biosphere, yet their presence remains elusive. Here, a novel lytic Escherichia coli bacteriophage, named MSK, was isolated from a lysed culture of E. coli C (the φX174 host). The genome of phage MSK was sequenced and comprises 45,053 bp with a G + C content of 44.8%. In total, 73 open reading frames (ORFs) were predicted, of which 24 showed close homology with known functional proteins, including one tRNA-Arg; the other 49 proteins, with no proven function in the genome database, were designated hypothetical. Electron microscopy and genome characterization revealed that MSK phage has a rosette-like tail tip. In total, 46 ORFs were homologous to the Rtp genome. Among these ORFs, the tail fiber protein with locus tag MSK_000019 was homologous to the Rtp 43 protein, which determines host specificity. Another protein, MSK_000046, is a lipoprotein (cor gene) that resembles Rtp 45, which is responsible for preventing adsorption during cell lysis. Thirteen MSK structural proteins were identified by SDS-PAGE analysis; 12 were vital structural proteins and one was a hypothetical protein. Among these, the large terminase subunit (MSK_000072) may be involved in DNA packaging, and the proposed packaging strategy for the MSK genome is headful packaging using pac sites. Genome analysis for biosafety assessment of the highly stable phage MSK revealed that it does not possess virulence genes, which supports its suitability for phage therapy. MSK phage could potentially be used to inhibit multidrug-resistant bacteria, including those resistant to AMP, TCN, and colistin. Further, a comparative genome and lifestyle study of MSK phage confirmed the highest similarity level (87.18% ANI). These findings suggest that it is a newly isolated lytic phage species. Finally, BLAST and phylogenetic analysis of the large terminase subunit and tail fiber protein place it in the Rtp virus genus of the family Drexlerviridae.
Affiliation(s)
- Muhammad Saleem Iqbal Khan
- Department of Biochemistry, Cancer Institute of the Second Affiliated Hospital (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education), School of Medicine, Zhejiang University, Hangzhou, China
- Xiangzheng Gao
- Department of Biochemistry, Cancer Institute of the Second Affiliated Hospital (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education), School of Medicine, Zhejiang University, Hangzhou, China
- Keying Liang
- Department of Biochemistry, Cancer Institute of the Second Affiliated Hospital (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education), School of Medicine, Zhejiang University, Hangzhou, China
- Shengsheng Mei
- Department of Biochemistry, Cancer Institute of the Second Affiliated Hospital (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education), School of Medicine, Zhejiang University, Hangzhou, China
- Jinbiao Zhan
- Department of Biochemistry, Cancer Institute of the Second Affiliated Hospital (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education), School of Medicine, Zhejiang University, Hangzhou, China
|
19
|
A workflow to identify novel proteins based on the direct mapping of peptide-spectrum-matches to genomic locations. BMC Bioinformatics 2021; 22:277. [PMID: 34039272 PMCID: PMC8157683 DOI: 10.1186/s12859-021-04159-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/02/2021] [Accepted: 04/27/2021] [Indexed: 02/06/2023]
Abstract
Background: Small proteins have received increasing attention in recent years. They have in particular been implicated as signals contributing to the coordination of bacterial communities. In genome annotations they are often missing or hidden among large numbers of hypothetical proteins because genome annotation pipelines often exclude short open reading frames or over-predict hypothetical proteins based on simple models. The validation of novel proteins, and in particular of small proteins (sProteins), therefore requires additional evidence. Proteogenomics is considered the gold standard for this purpose. It extends beyond established annotations and includes all possible open reading frames (ORFs) as potential sources of peptides, thus allowing the discovery of novel, unannotated proteins. Typically this results in large numbers of putative novel small proteins fraught with large fractions of false-positive predictions.
Results: We observe that the number and quality of the peptide-spectrum matches (PSMs) that map to a candidate ORF can be highly informative for the purpose of distinguishing proteins from spurious ORF annotations. We report here on a workflow that aggregates PSM quality information and local context into simple descriptors and reliably separates likely proteins from the large pool of false-positive, i.e., most likely untranslated ORFs. We investigated the artificial gut microbiome model SIHUMIx, comprising eight different species, for which we validate 5114 proteins that have previously been annotated only as hypothetical ORFs. In addition, we identified 37 non-annotated protein candidates for which we found evidence at the proteomic and transcriptomic level. Half (19) of these candidates have close functional homologs in other species. Another 12 candidates have homologs designated as hypothetical proteins in other species. The remaining six candidates are short (< 100 AA) and are most likely bona fide novel proteins.
Conclusions: The aggregation of PSM quality information for predicted ORFs provides a robust and efficient method to identify novel proteins in proteomics data. The workflow is in particular capable of identifying small proteins and frameshift variants. Since PSMs are explicitly mapped to genomic locations, it furthermore facilitates the integration of transcriptomics data and other sources of genome-level information.
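As a rough illustration of the aggregation idea described in the abstract above, the following Python sketch groups peptide-spectrum matches by candidate ORF and derives a few simple descriptors. The `PSM` fields, the score scale, and the thresholds in `likely_proteins` are assumptions made for this example and are not taken from the published workflow.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class PSM:
    orf_id: str    # hypothetical identifier of the candidate ORF
    peptide: str   # matched peptide sequence
    score: float   # hypothetical quality score in [0, 1]; higher is better


def summarize_orfs(psms):
    """Aggregate PSMs per ORF into simple descriptors."""
    by_orf = defaultdict(list)
    for psm in psms:
        by_orf[psm.orf_id].append(psm)
    summary = {}
    for orf_id, hits in by_orf.items():
        summary[orf_id] = {
            "n_psms": len(hits),
            "n_unique_peptides": len({h.peptide for h in hits}),
            "mean_score": mean(h.score for h in hits),
            "best_score": max(h.score for h in hits),
        }
    return summary


def likely_proteins(summary, min_psms=3, min_mean_score=0.8):
    """Keep ORFs with enough PSMs of sufficient quality (illustrative cut-offs)."""
    return sorted(
        orf_id
        for orf_id, d in summary.items()
        if d["n_psms"] >= min_psms and d["mean_score"] >= min_mean_score
    )


# Invented example data: one well-supported ORF and one spurious ORF.
psms = [
    PSM("orf_A", "PEPTIDEK", 0.95),
    PSM("orf_A", "PEPTIDEK", 0.90),
    PSM("orf_A", "ANOTHERR", 0.85),
    PSM("orf_B", "SPURIOUSK", 0.40),
]
summary = summarize_orfs(psms)
candidates = likely_proteins(summary)
print(candidates)
```

In practice the descriptors and cut-offs would need to be calibrated against a set of known proteins; the fixed thresholds here only show where such a decision would sit.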
|
20
|
Fijalkowska D, Fijalkowski I, Willems P, Van Damme P. Bacterial riboproteogenomics: the era of N-terminal proteoform existence revealed. FEMS Microbiol Rev 2020; 44:418-431. [PMID: 32386204 DOI: 10.1093/femsre/fuaa013] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/19/2019] [Accepted: 05/07/2020] [Indexed: 12/17/2022]
Abstract
With the rapid increase in the number of sequenced prokaryotic genomes, relying on automated gene annotation became a necessity. Multiple lines of evidence, however, suggest that current bacterial genome annotations may contain inconsistencies and are incomplete, even for so-called well-annotated genomes. We here discuss underexplored sources of protein diversity and new methodologies for high-throughput genome reannotation. The expression of multiple molecular forms of proteins (proteoforms) from a single gene, particularly driven by alternative translation initiation, is gaining interest as a prominent contributor to bacterial protein diversity. In consequence, riboproteogenomic pipelines were proposed to comprehensively capture proteoform expression in prokaryotes by the complementary use of (positional) proteomics and the direct readout of translated genomic regions using ribosome profiling. To complement these discoveries, tailored strategies are required for the functional characterization of newly discovered bacterial proteoforms.
Affiliation(s)
- Daria Fijalkowska
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Igor Fijalkowski
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Patrick Willems
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Petra Van Damme
- Department of Biochemistry and Microbiology, Ghent University, K. L. Ledeganckstraat 35, B-9000 Ghent, Belgium
|
21
|
Wright BW, Ruan J, Molloy MP, Jaschke PR. Genome Modularization Reveals Overlapped Gene Topology Is Necessary for Efficient Viral Reproduction. ACS Synth Biol 2020; 9:3079-3090. [PMID: 33044064 DOI: 10.1021/acssynbio.0c00323] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/08/2023]
Abstract
Sequence overlap between two genes is common across all genomes, with viruses having high proportions of these gene overlaps. Genome modularization and refactoring is the process of disrupting natural gene overlaps to separate coding sequences to enable their individual manipulation. The biological function and fitness effects of gene overlaps are not fully understood, and their effects on gene cluster and genome-level refactoring are unknown. The bacteriophage φX174 genome has ∼26% of nucleotides involved in encoding more than one gene. In this study we use an engineered φX174 phage containing a genome with all gene overlaps removed to show that gene overlap is critical to maintaining optimal viral fecundity. Through detailed phenotypic measurements we reveal that genome modularization in φX174 causes virion replication, stability, and attachment deficiencies. Quantitation of the complete phage proteome across an infection cycle reveals 30% of proteins display abnormal expression patterns. Taken together, we have for the first time comprehensively demonstrated that gene modularization severely perturbs the coordinated functioning of a bacteriophage replication cycle. This work highlights the biological importance of gene overlap in natural genomes and that reducing gene overlap disruption should be an integral part of future genome engineering projects.
Affiliation(s)
- Bradley W. Wright
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
- Juanfang Ruan
- Electron Microscope Unit, Mark Wainwright Analytical Centre, The University of New South Wales, Sydney, NSW 2052, Australia
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
- Mark P. Molloy
- Kolling Institute, Northern Clinical School, The University of Sydney, Sydney, NSW 2006, Australia
- Paul R. Jaschke
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
|
22
|
Expression, characterization and structural profile of a heterodimeric β-galactosidase from the novel strain Lactobacillus curieae M2011381. Process Biochem 2020. [DOI: 10.1016/j.procbio.2020.06.025] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
|
23
|
Weber M, Burgos R, Yus E, Yang J, Lluch‐Senar M, Serrano L. Impact of C-terminal amino acid composition on protein expression in bacteria. Mol Syst Biol 2020; 16:e9208. [PMID: 32449593 PMCID: PMC7246954 DOI: 10.15252/msb.20199208] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 08/28/2019] [Revised: 04/07/2020] [Accepted: 04/09/2020] [Indexed: 11/30/2022]
Abstract
The C-terminal sequence of a protein is involved in processes such as efficiency of translation termination and protein degradation. However, the general relationship between features of this C-terminal sequence and levels of protein expression remains unknown. Here, we identified C-terminal amino acid biases that are ubiquitous across the bacterial taxonomy (1,582 genomes). We showed that the frequency is higher for positively charged amino acids (lysine, arginine), while it is lower for hydrophobic amino acids and threonine. We then studied the impact of C-terminal composition on protein levels in a library of Mycoplasma pneumoniae mutants, covering all possible combinations of the last two codons. We found that charged and polar residues, in particular lysine, led to higher expression, while hydrophobic and aromatic residues led to lower expression, with a difference in protein levels of up to fourfold. We further showed that modulation of protein degradation rate could be one of the main mechanisms driving these differences. Our results demonstrate that the identity of the last amino acids has a strong influence on protein expression levels.
Affiliation(s)
- Marc Weber
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Raul Burgos
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Eva Yus
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jae‐Seong Yang
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Maria Lluch‐Senar
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- ICREA, Barcelona, Spain
|
24
|
Saund K, Lapp Z, Thiede SN, Pirani A, Snitkin ES. prewas: data pre-processing for more informative bacterial GWAS. Microb Genom 2020; 6. [PMID: 32310745 PMCID: PMC7371116 DOI: 10.1099/mgen.0.000368] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/30/2023]
Abstract
While variant identification pipelines are becoming increasingly standardized, less attention has been paid to the pre-processing of variants prior to their use in bacterial genome-wide association studies (bGWAS). Three nuances of variant pre-processing that impact downstream identification of genetic associations include the separation of variants at multiallelic sites, separation of variants in overlapping genes, and referencing of variants relative to ancestral alleles. Here we demonstrate the importance of these variant pre-processing steps on diverse bacterial genomic datasets and present prewas, an R package, that standardizes the pre-processing of multiallelic sites, overlapping genes, and reference alleles before bGWAS. This package facilitates improved reproducibility and interpretability of bGWAS results. prewas enables users to extract maximal information from bGWAS by implementing multi-line representation for multiallelic sites and variants in overlapping genes. prewas outputs a binary SNP matrix that can be used for SNP-based bGWAS and will prevent the masking of minor alleles during bGWAS analysis. The optional binary gene matrix output can be used for gene-based bGWAS, which will enable users to maximize the power and evolutionary interpretability of their bGWAS studies. prewas is available for download from GitHub.
Affiliation(s)
- Katie Saund
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
- Zena Lapp
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
- Stephanie N Thiede
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
- Ali Pirani
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
- Evan S Snitkin
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
- Department of Internal Medicine/Division of Infectious Diseases, University of Michigan, Ann Arbor, Michigan, USA
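The multi-line representation of multiallelic sites mentioned in the prewas abstract can be pictured with a small Python sketch. This is not the prewas API (prewas is an R package); the function name, the input layout, and the use of a user-supplied reference allele per site are illustrative assumptions.

```python
def split_multiallelic(sites, reference):
    """Expand each variant site into one binary row per alternate allele.

    sites:     {position: {sample: allele}}  (hypothetical input layout)
    reference: {position: reference_allele}  (assumed to be supplied by the user)
    returns:   {(position, alt_allele): {sample: 0 or 1}}
    """
    rows = {}
    for pos, calls in sites.items():
        ref = reference[pos]
        # Every distinct non-reference allele gets its own row, so a minor
        # allele is not masked by a more common alternate allele.
        alts = sorted({allele for allele in calls.values() if allele != ref})
        for alt in alts:
            rows[(pos, alt)] = {s: int(a == alt) for s, a in calls.items()}
    return rows


# Invented example: position 101 is triallelic, position 250 is biallelic.
sites = {
    101: {"s1": "A", "s2": "C", "s3": "T"},
    250: {"s1": "G", "s2": "G", "s3": "A"},
}
reference = {101: "A", 250: "G"}
matrix = split_multiallelic(sites, reference)
for key in sorted(matrix):
    print(key, matrix[key])
```

The triallelic site yields two rows, one for each alternate allele, which is the property the abstract credits with preventing minor alleles from being masked.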
|
25
|
Blazejewski T, Ho HI, Wang HH. Synthetic sequence entanglement augments stability and containment of genetic information in cells. Science 2019; 365:595-598. [PMID: 31395784 DOI: 10.1126/science.aav5477] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 09/28/2018] [Revised: 06/21/2019] [Accepted: 07/15/2019] [Indexed: 12/28/2022]
Abstract
In synthetic biology, methods for stabilizing genetically engineered functions and confining recombinant DNA to intended hosts are necessary to cope with natural mutation accumulation and pervasive lateral gene flow. We present a generalizable strategy to preserve and constrain genetic information through the computational design of overlapping genes. Overlapping a sequence with an essential gene altered its fitness landscape and produced a constrained evolutionary path, even for synonymous mutations. Embedding a toxin gene in a gene of interest restricted its horizontal propagation. We further demonstrated a multiplex and scalable approach to build and test >7500 overlapping sequence designs, yielding functional yet highly divergent variants from natural homologs. This work enables deeper exploration of natural and engineered overlapping genes and facilitates enhanced genetic stability and biocontainment in emerging applications.
Affiliation(s)
- Tomasz Blazejewski
- Department of Systems Biology, Columbia University, New York, NY, USA
- Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
- Hsing-I Ho
- Department of Systems Biology, Columbia University, New York, NY, USA
- Harris H Wang
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
|
26
|
Zehentner B, Ardern Z, Kreitmeier M, Scherer S, Neuhaus K. A Novel pH-Regulated, Unusual 603 bp Overlapping Protein Coding Gene pop Is Encoded Antisense to ompA in Escherichia coli O157:H7 (EHEC). Front Microbiol 2020; 11:377. [PMID: 32265854 PMCID: PMC7103648 DOI: 10.3389/fmicb.2020.00377] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/13/2019] [Accepted: 02/20/2020] [Indexed: 12/23/2022]
Abstract
Antisense transcription is well known in bacteria. However, translation of antisense RNAs is typically not considered, as the implied overlapping coding at a DNA locus is assumed to be highly improbable. Therefore, such overlapping genes are systematically excluded in prokaryotic genome annotation. Here we report an exceptional 603 bp long open reading frame completely embedded in antisense to the gene of the outer membrane protein ompA. An active σ70 promoter, transcription start site (TSS), Shine-Dalgarno motif and rho-independent terminator were experimentally validated, providing evidence that this open reading frame has all the structural features of a functional gene. Furthermore, ribosomal profiling revealed translation of the mRNA, the protein was detected in Western blots and a pH-dependent phenotype conferred by the protein was shown in competitive overexpression growth experiments of a translationally arrested mutant versus wild type. We designate this novel gene pop (pH-regulated overlapping protein-coding gene), thus adding another example to the growing list of overlapping, protein coding genes in bacteria.
Affiliation(s)
- Barbara Zehentner
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Zachary Ardern
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Michaela Kreitmeier
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Siegfried Scherer
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- ZIEL – Institute for Food & Health, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
- ZIEL – Institute for Food & Health, Technical University of Munich, Freising, Germany
- Core Facility Microbiome, ZIEL – Institute for Food & Health, Technical University of Munich, Freising, Germany
|
27
|
Abstract
Overlapping genes are commonplace in viruses and play an important role in their function and evolution. However, aside from studies on specific groups of viruses, relatively little is known about the extent and nature of gene overlap and its determinants in viruses as a whole. Here, we present an extensive characterisation of gene overlap in viruses through an analysis of reference genomes present in the NCBI virus genome database. We find that over half the instances of gene overlap are very small, covering <10 nt, and 84 per cent are <50 nt in length. Despite this, 53 per cent of all viruses still contained a gene overlap of 50 nt or larger. We also investigate several predictors of gene overlap such as genome structure (single- and double-stranded RNA and DNA), virus family, genome length, and genome segmentation. This revealed that gene overlap occurs more frequently in DNA viruses than in RNA viruses, and more frequently in single-stranded viruses than in double-stranded viruses. Genome segmentation is also associated with gene overlap, particularly in single-stranded DNA viruses. Notably, we observed a large range of overlap frequencies across families of all genome types, suggesting that it is a common evolutionary trait that provides flexible genome structures in all virus families.
Affiliation(s)
- Timothy E Schlub
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, NSW, 2006, Australia
- Edward C Holmes
- School of Life and Environmental Sciences and School of Medical Sciences, Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, Sydney, NSW 2006, Australia
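To make the overlap-length categories in the abstract above concrete (<10 nt, <50 nt, 50 nt or larger), here is a minimal Python sketch that measures pairwise gene overlaps from coordinates. The gene names and coordinates are invented, and the sketch assumes 1-based inclusive coordinates on a single linear sequence; it ignores strand, reading frame, and circular genomes, which a real analysis of viral reference genomes would have to handle.

```python
def overlap_lengths(genes):
    """Return (gene_a, gene_b, overlap_nt) for every overlapping gene pair.

    genes: list of (name, start, end) with 1-based inclusive coordinates.
    """
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    overlaps = []
    for i, (name_a, start_a, end_a) in enumerate(ordered):
        for name_b, start_b, end_b in ordered[i + 1:]:
            if start_b > end_a:
                break  # sorted by start, so no later gene can overlap gene a
            length = min(end_a, end_b) - start_b + 1
            overlaps.append((name_a, name_b, length))
    return overlaps


# Invented example: a 4 nt overlap, a 51 nt overlap, and one isolated gene.
genes = [("g1", 1, 100), ("g2", 97, 300), ("g3", 250, 400), ("g4", 500, 600)]
overlaps = overlap_lengths(genes)
short = [o for o in overlaps if o[2] < 10]
long_ = [o for o in overlaps if o[2] >= 50]
print(overlaps, len(short), len(long_))
```

Tallying the resulting lengths into bins across many genomes would give summary proportions of the kind reported in the abstract.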
|
28
|
Mikhailov KV, Efeykin BD, Panchin AY, Knorre DA, Logacheva MD, Penin AA, Muntyan MS, Nikitin MA, Popova OV, Zanegina ON, Vyssokikh MY, Spiridonov SE, Aleoshin VV, Panchin YV. Coding palindromes in mitochondrial genes of Nematomorpha. Nucleic Acids Res 2019; 47:6858-6870. [PMID: 31194871 PMCID: PMC6649704 DOI: 10.1093/nar/gkz517] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/28/2019] [Revised: 05/29/2019] [Accepted: 06/01/2019] [Indexed: 12/11/2022]
Abstract
Inverted repeats are common DNA elements, but they rarely overlap with protein-coding sequences due to the ensuing conflict with the structure and function of the encoded protein. We discovered numerous perfect inverted repeats of considerable length (up to 284 bp) embedded within the protein-coding genes in mitochondrial genomes of four Nematomorpha species. Strikingly, both arms of the inverted repeats encode conserved regions of the amino acid sequence. We confirmed enzymatic activity of the respiratory complex I encoded by inverted repeat-containing genes. The nucleotide composition of inverted repeats suggests strong selection at the amino acid level in these regions. We conclude that the inverted repeat-containing genes are transcribed and translated into functional proteins. The survey of available mitochondrial genomes reveals that several other organisms possess similar albeit shorter embedded repeats. Mitochondrial genomes of Nematomorpha demonstrate an extraordinary evolutionary compromise where protein function and stringent secondary structure elements within the coding regions are preserved simultaneously.
Affiliation(s)
- Kirill V Mikhailov
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Boris D Efeykin
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Severtsov Institute of Ecology and Evolution, Moscow 119071, Russian Federation
- Alexander Y Panchin
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Dmitry A Knorre
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Institute of Molecular Medicine, Sechenov First Moscow State Medical University, Moscow 119991, Russian Federation
- Maria D Logacheva
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Moscow 143028, Russian Federation
- Aleksey A Penin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Maria S Muntyan
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Mikhail A Nikitin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Olga V Popova
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Olga N Zanegina
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Mikhail Y Vyssokikh
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Sergei E Spiridonov
- Severtsov Institute of Ecology and Evolution, Moscow 119071, Russian Federation
- Vladimir V Aleoshin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
- Yuri V Panchin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Leninskiye Gory 1-40, Moscow 119991, Russian Federation
- Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127994, Russian Federation
| |
Collapse
|
29
|
Translational coupling via termination-reinitiation in archaea and bacteria. Nat Commun 2019; 10:4006. [PMID: 31488843 PMCID: PMC6728339 DOI: 10.1038/s41467-019-11999-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 12/17/2018] [Accepted: 08/12/2019] [Indexed: 11/18/2022]
Abstract
The genomes of many prokaryotes contain substantial fractions of gene pairs with overlapping stop and start codons (ATGA or TGATG). A potential benefit of overlapping gene pairs is translational coupling. In 720 genomes of archaea and bacteria representing all major phyla, we identify substantial, albeit highly variable, fractions of co-directed overlapping gene pairs. Various patterns are observed for the utilization of the Shine-Dalgarno (SD) motif for de novo initiation at upstream genes versus reinitiation at overlapping gene pairs. We experimentally test the predicted coupling in 9 gene pairs from the archaeon Haloferax volcanii and 5 gene pairs from the bacterium Escherichia coli. In 13 of 14 cases, translation of both genes is strictly coupled. Mutational analysis of SD motifs located upstream of the downstream genes indicates that the contribution of the SD motif to translational coupling varies widely from gene to gene. The nearly universal, abundant occurrence of overlapping gene pairs suggests that tight translational coupling is widespread in archaea and bacteria. Archaea and bacteria often have gene pairs with overlapping stop and start codons, suggesting translational coupling. Here, Huber et al. analyse overlapping gene pairs from 720 genomes, and validate translational coupling via termination-reinitiation for 14 gene pairs in Haloferax volcanii and Escherichia coli.
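The two overlap motifs named in this abstract can be located with a simple string scan. In ATGA the start codon ATG and the stop codon TGA share three bases (a 4 nt overlap between the genes), and in TGATG the stop codon TGA and the start codon ATG share one base (a 1 nt overlap). The sketch below is only a motif search on a single strand, with a hypothetical function name. It does not check reading frames or gene annotations, so it is not the analysis performed in the study.

```python
import re


def find_stop_start_overlaps(seq: str) -> list[tuple[int, str]]:
    """Return sorted (0-based position, motif) pairs for ATGA and TGATG in seq."""
    seq = seq.upper()
    hits = []
    for motif in ("ATGA", "TGATG"):
        # A lookahead pattern reports occurrences that overlap each other.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), motif))
    return sorted(hits)
```

A real survey would additionally require that each motif sits at the annotated end of one gene and the annotated start of the next gene on the same strand.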
|
30
|
Landscape of Overlapping Gene Expression in the Equine Placenta. Genes (Basel) 2019; 10:genes10070503. [PMID: 31269762 PMCID: PMC6678446 DOI: 10.3390/genes10070503] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/21/2019] [Revised: 06/26/2019] [Accepted: 06/28/2019] [Indexed: 02/07/2023]
Abstract
Increasing evidence suggests that overlapping genes are much more common in eukaryotic genomes than previously thought. These different-strand overlapping genes are potential sense–antisense (SAS) pairs, which might have regulatory effects on each other. In the present study, we identified the SAS loci in the equine genome using previously generated stranded, paired-end RNA sequencing data from the equine chorioallantois. We identified a total of 1261 overlapping loci. The ratio of the number of overlapping regions to chromosomal length was numerically highest on chromosome 11, followed by chromosomes 13 and 12. These results show that overlapping transcription is distributed throughout the equine genome, but that the distribution differs between chromosomes. Next, we evaluated the expression patterns of SAS pairs during the course of gestation. Expression of the sense and antisense genes within pairs showed an overall positive correlation. We further provide a list of SAS pairs with both positive and negative correlation in their expression patterns throughout gestation. This study characterizes the landscape of sense and antisense gene expression in the placenta for the first time and provides a resource that will enable researchers to elucidate the mechanisms of sense/antisense regulation during pregnancy.
|
31
|
Hücker SM, Vanderhaeghen S, Abellan-Schneyder I, Scherer S, Neuhaus K. The Novel Anaerobiosis-Responsive Overlapping Gene ano Is Overlapping Antisense to the Annotated Gene ECs2385 of Escherichia coli O157:H7 Sakai. Front Microbiol 2018; 9:931. [PMID: 29867840 PMCID: PMC5960689 DOI: 10.3389/fmicb.2018.00931] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/01/2018] [Accepted: 04/23/2018] [Indexed: 12/26/2022]
Abstract
The current notion presumes that only one protein is encoded at a given bacterial genetic locus. However, transcription and translation of an overlapping open reading frame (ORF) 186 bp in length were discovered by RNAseq and RIBOseq experiments. This ORF is almost completely embedded in the annotated L,D-transpeptidase gene ECs2385 of Escherichia coli O157:H7 Sakai in the antisense reading frame -3. The ORF is transcribed as part of a bicistronic mRNA, which includes the annotated upstream gene ECs2384, encoding a murein lipoprotein. The transcriptional start site of the operon resides 38 bp upstream of the ECs2384 start codon and is driven by a predicted σ70 promoter, which is constitutively active under different growth conditions. The bicistronic operon contains a ρ-independent terminator just upstream of the novel gene, significantly decreasing its transcription. The novel gene can be stably expressed as an EGFP-fusion protein, and a translationally arrested mutant of ano, unable to produce the protein, shows a growth advantage over the wild type in competitive growth experiments under anaerobiosis. Therefore, the novel antisense overlapping gene is named ano (anaerobiosis responsive overlapping gene). A phylostratigraphic analysis indicates that ano originated very recently de novo by overprinting after the Escherichia/Shigella clade separated from other enterobacteria. Therefore, ano is one of the very rare cases of overlapping genes known in the genus Escherichia.
Affiliation(s)
- Sarah M Hücker
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Sonja Vanderhaeghen
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany
- Siegfried Scherer
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany; Institute for Food & Health, Technical University of Munich, Freising, Germany
- Klaus Neuhaus
- Chair for Microbial Ecology, Technical University of Munich, Freising, Germany; Core Facility Microbiome/NGS, Institute for Food & Health, Technical University of Munich, Freising, Germany
|
32
|
Hücker SM, Vanderhaeghen S, Abellan-Schneyder I, Wecko R, Simon S, Scherer S, Neuhaus K. A novel short L-arginine responsive protein-coding gene (laoB) antiparallel overlapping to a CadC-like transcriptional regulator in Escherichia coli O157:H7 Sakai originated by overprinting. BMC Evol Biol 2018; 18:21. [PMID: 29433444 PMCID: PMC5810103 DOI: 10.1186/s12862-018-1134-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/22/2017] [Accepted: 01/31/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND Due to the DNA triplet code, it is possible that the sequences of two or more protein-coding genes overlap to a large degree. However, such non-trivial overlaps are usually excluded by genome annotation pipelines and, thus, only a few overlapping gene pairs have been described in bacteria. In contrast, transcriptome and translatome sequencing reveals many signals originating from the antisense strand of annotated genes, of which we analyzed an example gene pair in more detail. RESULTS A small open reading frame of Escherichia coli O157:H7 strain Sakai (EHEC), designated laoB (L-arginine responsive overlapping gene), is embedded in reading frame −2 in the antisense strand of ECs5115, encoding a CadC-like transcriptional regulator. This overlapping gene shows evidence of transcription and translation in Luria-Bertani (LB) and brain-heart infusion (BHI) medium based on RNA sequencing (RNAseq) and ribosomal-footprint sequencing (RIBOseq). The transcriptional start site is 289 base pairs (bp) upstream of the start codon and transcription termination is 155 bp downstream of the stop codon. Overexpression of LaoB fused to an enhanced green fluorescent protein (EGFP) reporter was possible. The sequence upstream of the transcriptional start site displayed strong promoter activity under different conditions, whereas promoter activity was significantly decreased in the presence of L-arginine. A strand-specific translationally arrested mutant of laoB provided a significant growth advantage in competitive growth experiments in the presence of L-arginine compared to the wild type, which returned to wild type level after complementation of laoB in trans. A phylostratigraphic analysis indicated that the novel gene is restricted to the Escherichia/Shigella clade and might have originated recently by overprinting leading to the expression of part of the antisense strand of ECs5115.
CONCLUSIONS Here, we present evidence of a novel small protein-coding gene laoB encoded in the antisense frame −2 of the annotated gene ECs5115. Clearly, laoB is evolutionarily young and it originated in the Escherichia/Shigella clade by overprinting, a process which may cause the de novo evolution of bacterial genes like laoB.
Affiliation(s)
- Sarah M Hücker
- Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany; Fraunhofer ITEM-R, Am Biopark 9, 93053, Regensburg, Germany
- Sonja Vanderhaeghen
- Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Isabel Abellan-Schneyder
- Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany; Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Romy Wecko
- Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Svenja Simon
- Department of Computer and Information Science, University of Konstanz, Box 78, 78457, Konstanz, Germany
- Siegfried Scherer
- Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany; ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
- Klaus Neuhaus
- Chair for Microbial Ecology, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany; Core Facility Microbiome/NGS, ZIEL - Institute for Food & Health, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany
|
33
|
Sunil M, Hariharan N, Dixit S, Choudhary B, Srinivasan S. Differential genomic arrangements in Caryophyllales through deep transcriptome sequencing of A. hypochondriacus. PLoS One 2017; 12:e0180528. [PMID: 28786999 PMCID: PMC5546567 DOI: 10.1371/journal.pone.0180528] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/11/2016] [Accepted: 06/17/2017] [Indexed: 12/28/2022]
Abstract
A genome duplication event that was common among edible dicots of the rosids and asterids during the Oligocene period is missing in species of the order Caryophyllales. Despite this, grain amaranths not only survived this period but also display many desirable traits missing in rosid and asterid species. For example, grain amaranths display traits like C4 photosynthesis, high-lysine seeds, high yield, drought resistance, tolerance to infection and resilience to stress. It is, therefore, of interest to look for minor genome rearrangements with potential functional implications that are unique to grain amaranths. Here, by deep sequencing and assembly of 16 transcriptomes (86.8 billion bases), we have interrogated differential genome rearrangements unique to Amaranthus hypochondriacus with potential links to these phenotypes. We have predicted 125,581 non-redundant transcripts, including 44,529 protein-coding transcripts identified based on homology to known proteins and 13,529 predicted as novel/amaranth-specific coding transcripts. Of the protein-coding de novo assembled transcripts, we have identified 1810 chimeric transcripts. More than 30% and 19% of the gene pairs within the chimeric transcripts are found within the same loci in the genomes of A. hypochondriacus and Beta vulgaris, respectively, and are considered real positives. Interestingly, one of the chimeric transcripts comprises two important genes, namely DHDPS1, a key enzyme implicated in the biosynthesis of lysine, and alpha-glucosidase, an enzyme involved in sucrose catabolism, which lie in a convergent configuration in the genome of A. hypochondriacus, separated by 612 bases. We have experimentally validated that transcripts of these two genes also overlap in the 3' UTR and that their expression is negatively correlated from bud to mature seed, suggesting a potential link between the high seed lysine trait and unique genome organization.
Affiliation(s)
- Meeta Sunil
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, Karnataka, India
- Manipal University, Manipal, Karnataka, India
- Nivedita Hariharan
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, Karnataka, India
- Shubham Dixit
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, Karnataka, India
- Bibha Choudhary
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, Karnataka, India
- Subhashini Srinivasan
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, Karnataka, India
|
34
|
The Evolution and Expression Pattern of Human Overlapping lncRNA and Protein-coding Gene Pairs. Sci Rep 2017; 7:42775. [PMID: 28344339 PMCID: PMC5366806 DOI: 10.1038/srep42775] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/16/2016] [Accepted: 01/13/2017] [Indexed: 12/27/2022]
Abstract
A long non-coding RNA overlapping with a protein-coding gene (lncRNA-coding pair) is a special type of overlapping gene pair. Protein-coding overlapping genes have been well studied, and increasing attention has been paid to lncRNAs. By studying lncRNA-coding pairs in the human genome, we showed that lncRNA-coding pairs were more likely to be generated by overprinting and that genes in lncRNA-coding pairs were retained preferentially over non-overlapping genes. In addition, which overlapping configurations were preferentially preserved during evolution depended on the origin of the lncRNA-coding pairs. Further investigations showed that the promotion of splicing by lncRNAs in their embedded protein-coding partners was a unilateral interaction, whereas the improvement in gene expression associated with having an overlapping partner was bidirectional and decreased with the evolutionary age of the genes. Additionally, the expression of lncRNA-coding pairs showed an overall positive correlation, and the expression correlation was associated with their overlapping configurations, local genomic environment and evolutionary age of genes. Comparison of the expression correlation of lncRNA-coding pairs between normal and cancer samples found that the lineage-specific pairs including old protein-coding genes may play an important role in tumorigenesis. This work presents a systematic and comprehensive view of the evolution and the expression pattern of human lncRNA-coding pairs.
|
35
|
Kremer FS, Eslabão MR, Dellagostin OA, Pinto LDS. Genix: a new online automated pipeline for bacterial genome annotation. FEMS Microbiol Lett 2016; 363:fnw263. [DOI: 10.1093/femsle/fnw263] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Revised: 04/30/2016] [Accepted: 11/15/2016] [Indexed: 12/23/2022]
|
36
|
Amarillas L, Chaidez C, González-Robles A, Lugo-Melchor Y, León-Félix J. Characterization of novel bacteriophage phiC119 capable of lysing multidrug-resistant Shiga toxin-producing Escherichia coli O157:H7. PeerJ 2016; 4:e2423. [PMID: 27672499 PMCID: PMC5028729 DOI: 10.7717/peerj.2423] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/15/2016] [Accepted: 08/09/2016] [Indexed: 12/22/2022]
Abstract
BACKGROUND Shiga toxin-producing Escherichia coli (STEC) is one of the most common and widely distributed foodborne pathogens and has frequently been implicated in gastrointestinal and urinary tract infections. Moreover, high rates of multiple antibiotic-resistant E. coli strains have been reported worldwide. Due to the emergence of antibiotic-resistant strains, bacteriophages are considered an attractive alternative for the biocontrol of pathogenic bacteria. Characterization is a preliminary step towards designing a phage for biocontrol. METHODS In this study, we describe the characterization of a bacteriophage designated phiC119, which can infect and lyse several multidrug-resistant STEC strains and some Salmonella strains. The phage genome was screened for stx genes by PCR; morphological analysis, host range determination, and genome sequencing were carried out, as well as an analysis of the cohesive ends and identification of the type of genetic material through enzymatic digestion of the genome. RESULTS Analysis of the bacteriophage particles by transmission electron microscopy showed that the phage had an icosahedral head and a long tail, characteristic of the family Siphoviridae. The phage exhibits a broad host range against multidrug-resistant and highly virulent E. coli isolates. One-step growth experiments revealed that the phiC119 phage presented a large burst size (210 PFU/cell) and a latent period of 20 min. Based on genomic analysis, the phage contains a linear double-stranded DNA genome with a size of 47,319 bp. The phage encodes 75 putative proteins, but lysogeny and virulence genes were not found in the phiC119 genome. CONCLUSION These results suggest that phage phiC119 may be a good biological control agent. However, further studies are required to ensure its control of STEC and to confirm the safety of phage use.
Affiliation(s)
- Luis Amarillas
- Laboratorio de Biología Molecular y Genómica Funcional, Centro de Investigación en Alimentación y Desarrollo, A. C., Culiacán, Sinaloa, México; Laboratorio de Genética, Instituto de Investigación Lightbourn, A. C., Cd. Jiménez, Chihuahua, México
- Cristóbal Chaidez
- Inocuidad Alimentaria, Centro de Investigación en Alimentación y Desarrollo, A. C., Culiacán, Sinaloa, México
- Arturo González-Robles
- Departamento de Infectómica y Patogénesis Molecular, Centro de Investigación y de Estudios Avanzados, Instituto Politécnico Nacional, Ciudad de México, México
- Yadira Lugo-Melchor
- Laboratorio de Biología Molecular de la Unidad de Servicios Analíticos y Metrológicos, Centro de Investigación y Asistencia en Tecnología y Diseño del Estado de Jalisco A. C., Guadalajara, Jalisco, México
- Josefina León-Félix
- Laboratorio de Biología Molecular y Genómica Funcional, Centro de Investigación en Alimentación y Desarrollo, A. C., Culiacán, Sinaloa, México
|
37
|
Sheshukova EV, Shindyapina AV, Komarova TV, Dorokhov YL. “Matreshka” genes with alternative reading frames. RUSS J GENET+ 2016. [DOI: 10.1134/s1022795416020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
38
|
Nicotine Dehydrogenase Complexed with 6-Hydroxypseudooxynicotine Oxidase Involved in the Hybrid Nicotine-Degrading Pathway in Agrobacterium tumefaciens S33. Appl Environ Microbiol 2016; 82:1745-1755. [PMID: 26729714 DOI: 10.1128/aem.03909-15] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/05/2015] [Accepted: 12/29/2015] [Indexed: 01/04/2023]
Abstract
Nicotine, a major toxic alkaloid in tobacco wastes, is degraded by bacteria, mainly via pyridine and pyrrolidine pathways. Previously, we discovered a new hybrid of the pyridine and pyrrolidine pathways in Agrobacterium tumefaciens S33 and characterized its key enzyme 6-hydroxy-3-succinoylpyridine (HSP) hydroxylase. Here, we purified from the strain the nicotine dehydrogenase that initiates nicotine degradation and found that it forms a complex with a novel 6-hydroxypseudooxynicotine oxidase. The purified complex is composed of three different subunits encoded by ndhAB and pno, where ndhA and ndhB overlap by 4 bp and are ∼26 kb away from pno. As predicted from the gene sequences and from chemical analyses, NdhA (82.4 kDa) and NdhB (17.1 kDa) harbor a molybdopterin cofactor and two [2Fe-2S] clusters, respectively, whereas Pno (73.3 kDa) harbors a flavin mononucleotide and a [4Fe-4S] cluster. Mutants with disrupted ndhA or ndhB genes did not grow on nicotine but grew well on 6-hydroxynicotine and HSP, whereas the pno mutant did not grow on nicotine or 6-hydroxynicotine but grew well on HSP, indicating that NdhA and NdhB are responsible for the initiation of nicotine oxidation. We successfully expressed pno in Escherichia coli and found that the recombinant Pno presented 2,6-dichlorophenolindophenol reduction activity when it was coupled with 6-hydroxynicotine oxidation. The determination of reaction products catalyzed by the purified enzymes or mutants indicated that NdhAB catalyzed nicotine oxidation to 6-hydroxynicotine, whereas Pno oxidized 6-hydroxypseudooxynicotine to 6-hydroxy-3-succinoylsemialdehyde pyridine. These results provide new insights into this novel hybrid pathway of nicotine degradation in A. tumefaciens S33.
|
39
|
Fellner L, Simon S, Scherling C, Witting M, Schober S, Polte C, Schmitt-Kopplin P, Keim DA, Scherer S, Neuhaus K. Evidence for the recent origin of a bacterial protein-coding, overlapping orphan gene by evolutionary overprinting. BMC Evol Biol 2015; 15:283. [PMID: 26677845 PMCID: PMC4683798 DOI: 10.1186/s12862-015-0558-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/19/2015] [Accepted: 12/06/2015] [Indexed: 01/18/2023]
Abstract
BACKGROUND Gene duplication is believed to be the classical way to form novel genes, but overprinting may be an important alternative. Overprinting allows entirely novel proteins to evolve de novo, i.e., formerly non-coding open reading frames within functional genes become expressed. Only three cases have been described for Escherichia coli. Here, a fourth example is presented. RESULTS RNA sequencing revealed an open reading frame weakly transcribed in cow dung, coding for 101 residues and embedded completely in the -2 reading frame of citC in enterohemorrhagic E. coli. This gene is designated novel overlapping gene, nog1. The promoter region fused to gfp exhibits specific activities and 5' rapid amplification of cDNA ends indicated the transcriptional start 40-bp upstream of the start codon. nog1 was strand-specifically arrested in translation by a nonsense mutation silent in citC. This Nog1-mutant showed a phenotype in competitive growth against wild type in the presence of MgCl2. Small differences in metabolite concentrations were also found. Bioinformatic analyses propose Nog1 to be inner membrane-bound and to possess at least one membrane-spanning domain. A phylogenetic analysis suggests that the orphan gene nog1 arose by overprinting after Escherichia/Shigella separated from the other γ-proteobacteria. CONCLUSIONS Since nog1 is of recent origin, non-essential, short, weakly expressed and only marginally involved in E. coli's central metabolism, we propose that this gene is in an initial stage of evolution. While we present specific experimental evidence for the existence of a fourth overlapping gene in enterohemorrhagic E. coli, we believe that this may be an initial finding only and overlapping genes in bacteria may be more common than is currently assumed by microbiologists.
Affiliation(s)
- Lea Fellner
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany
- Svenja Simon
- Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Constance, Germany
- Christian Scherling
- Lehrstuhl für Ernährungsphysiologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Gregor-Mendel-Straße 2, D-85354, Freising, Germany
- Michael Witting
- Research Unit Analytical BioGeoChemistry, Deutsches Forschungszentrum für Gesundheit und Umwelt GmbH, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85754, Neuherberg, Germany
- Steffen Schober
- Institute of Communications Engineering, Universität Ulm, Albert-Einstein-Allee 43, 89081, Ulm, Germany; Present address: Blue Yonder GmbH, Ohiostraße 8, Karlsruhe, Germany
- Christine Polte
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany; Present address: Institut für Biochemie und Molekularbiologie, Universität Hamburg, Martin-Luther-King Platz 6, 20146, Hamburg, Germany
- Philippe Schmitt-Kopplin
- Research Unit Analytical BioGeoChemistry, Deutsches Forschungszentrum für Gesundheit und Umwelt GmbH, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85754, Neuherberg, Germany
- Daniel A Keim
- Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Constance, Germany
- Siegfried Scherer
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany
- Klaus Neuhaus
- Lehrstuhl für Mikrobielle Ökologie, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85350, Freising, Germany
|
40
|
Zhang YC, Lin K. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions. Evol Bioinform Online 2015; 11:1-9. [PMID: 26715828 PMCID: PMC4686347 DOI: 10.4137/ebo.s33491] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/23/2015] [Revised: 11/10/2015] [Accepted: 11/16/2015] [Indexed: 11/25/2022]
Abstract
Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may perform worse in phylogenomic analyses of genomes that are too closely or too distantly related. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriales genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms.
Affiliation(s)
- Yan-Cong Zhang
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing, China; MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
- Kui Lin
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing, China; MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
|
41
|
Evaluation and application of the strand-specific protocol for next-generation sequencing. BIOMED RESEARCH INTERNATIONAL 2015; 2015:182389. [PMID: 25893191 PMCID: PMC4393923 DOI: 10.1155/2015/182389] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/11/2014] [Accepted: 02/03/2015] [Indexed: 12/02/2022]
Abstract
Next-generation sequencing (NGS) has become a powerful sequencing tool, applied in a wide range of biological studies. However, the traditional sample preparation protocol for NGS is non-strand-specific (NSS), leading to biased estimates of expression for transcripts that overlap on the antisense strand. Strand-specific (SS) protocols have recently been developed. In this study, we prepared the same RNA sample using the SS and NSS protocols, followed by sequencing on the Illumina HiSeq platform. Using real-time quantitative PCR as a standard, we first demonstrated that the SS protocol estimates gene expression more precisely than the NSS protocol, particularly for genes overlapped on the antisense strand. In addition, we showed that the sequence reads from the SS protocol are comparable with those from conventional NSS protocols in many aspects. Finally, we mapped a fraction of sequence reads to the antisense strand of known genes, in regions where no genes had previously been annotated. Using sequence assembly and PCR validation, we succeeded in identifying and characterizing novel antisense genes. Our results show that the SS protocol performs more accurately than the traditional NSS protocol and can be applied in future studies.
|
42
|
Overlapping genes: a new strategy of thermophilic stress tolerance in prokaryotes. Extremophiles 2014; 19:345-53. [PMID: 25503326 DOI: 10.1007/s00792-014-0720-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/09/2014] [Accepted: 12/01/2014] [Indexed: 12/29/2022]
Abstract
Overlapping genes (OGs) have drawn the attention of recent research. However, the significance of OGs in prokaryotic genomes has remained unexplored. As an adaptation to high temperature, thermophiles have been shown to eliminate their intergenic regions. Therefore, prokaryotes might increase their OG content to adapt to high temperature. To test this hypothesis, we carried out a comparative study of OG frequency in 256 prokaryotic genomes comprising both thermophiles and non-thermophiles. Thermophiles were found to exhibit a higher frequency of overlapping genes than non-thermophiles. Moreover, overlap frequency was found to correlate with optimal growth temperature (OGT) in prokaryotes. Long overlap frequency was positively correlated with OGT, resulting in an abundance of long overlaps in thermophiles compared to non-thermophiles. On the other hand, short overlap (1-4 nucleotides) frequency (SOF) did not show any direct correlation with OGT. However, the correlations of SOF with CAIavg (extent of variation of codon usage bias measured as the mean of codon adaptation index of all genes in a given genome) and IG% (proportion of intergenic regions) indicate that short overlaps might upregulate these factors, which are already known to be vital forces for thermophilic adaptation. From this evidence, we propose that OG content bears a strong link to thermophily. Long overlaps are important for genome compaction and short overlaps are important to uphold high CAIavg. Our findings should help in better understanding the significance of overlapping gene content in prokaryotic genomes.
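The short/long overlap distinction used in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 1-based inclusive coordinate convention, the restriction to adjacent genes in sorted order, and the per-gene normalization of the frequencies are all assumptions made for illustration. Only the 1-4 nucleotide threshold for short overlaps comes from the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (start, end), 1-based inclusive coordinates.
Gene = Tuple[int, int]


def overlap_lengths(genes: List[Gene]) -> List[int]:
    """Return the overlap length in nucleotides for each pair of
    neighbouring genes (in coordinate order) that share positions."""
    ordered = sorted(genes)
    lengths = []
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        # Number of shared positions; zero or negative means no overlap.
        shared = min(e1, e2) - s2 + 1
        if shared > 0:
            lengths.append(shared)
    return lengths


def overlap_frequencies(genes: List[Gene], short_max: int = 4) -> Dict[str, float]:
    """Split overlaps into short (<= short_max nt) and long, and
    normalize by the number of genes (an assumed normalization)."""
    if not genes:
        return {"short": 0.0, "long": 0.0}
    lengths = overlap_lengths(genes)
    short = sum(1 for n in lengths if n <= short_max)
    long_ = len(lengths) - short
    return {"short": short / len(genes), "long": long_ / len(genes)}


# Toy annotation: one 4-nt overlap, one 51-nt overlap, one isolated gene.
toy_genes = [(1, 100), (97, 200), (150, 300), (400, 500)]
toy_freqs = overlap_frequencies(toy_genes)
```

Strand and reading frame are ignored here; a fuller treatment would also record whether each overlap is unidirectional, convergent, or divergent.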
|
43
|
Abstract
Overlapping genes are two protein-coding sequences sharing a significant part of the same DNA locus in different reading frames. Although an increasing number of examples have recently been found in bacteria, the underlying mechanisms of their evolution are unknown. In this work we explore how selective pressure in a protein-coding sequence influences its overlapping genes in alternative reading frames. We model evolution using a continuous-time Markov process and derive the corresponding model for the remaining frames to quantify selection pressure and genetic noise. Our findings lead to the presumption that, once information is embedded in the reverse reading frame −2 (relative to the mother gene in +1), purifying selection in the protein-coding reading frame automatically protects the sequences in both frames. We also found that this coincides with the fact that the genetic noise, measured using the conditional entropy, is minimal in frame −2 under selection in the coding frame.
Affiliation(s)
- Katharina Mir
- Institute of Communications Engineering, Ulm University, Ulm, Germany
- Steffen Schober
- Institute of Communications Engineering, Ulm University, Ulm, Germany
|
44
|
Junier I. Conserved patterns in bacterial genomes: a conundrum physically tailored by evolutionary tinkering. Comput Biol Chem 2014; 53 Pt A:125-33. [PMID: 25239779 DOI: 10.1016/j.compbiolchem.2014.08.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Accepted: 07/11/2014] [Indexed: 11/17/2022]
Abstract
The proper functioning of bacteria is encoded in their genome at multiple levels or scales, each of which is constrained by specific physical forces. At the smallest spatial scales, interatomic forces dictate the folding and function of proteins and nucleic acids. On longer length scales, stochastic forces emerging from the thermal jiggling of proteins and RNAs impose strong constraints on the organization of genes along chromosomes, more particularly in the context of the building of nucleoprotein complexes and the operational mode of regulatory agents. At the cellular level, transcription, replication and cell division activities generate forces that act on both the internal structure and cellular location of chromosomes. The overall result is a complex multi-scale organization of genomes that reflects the evolutionary tinkering of bacteria. The goal of this review is to highlight avenues for deciphering this complexity by focusing on patterns that are conserved among evolutionarily distant bacteria. To this end, I discuss three different organizational scales: the protein structures, the chromosomal organization of genes and the global structure of chromosomes.
Affiliation(s)
- Ivan Junier
- Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain.
|
45
|
Huvet M, Stumpf MPH. Overlapping genes: a window on gene evolvability. BMC Genomics 2014; 15:721. [PMID: 25159814 PMCID: PMC4161906 DOI: 10.1186/1471-2164-15-721] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 11/21/2013] [Accepted: 08/18/2014] [Indexed: 11/13/2022] Open
Abstract
Background: The forces underlying genome architecture and organization are still only poorly understood in detail. Overlapping genes (genes partially or entirely overlapping) represent a genomic feature that is shared widely across biological organisms ranging from viruses to multi-cellular organisms. In bacteria, a third of the annotated genes are involved in an overlap. Despite the widespread nature of this arrangement, its evolutionary origins and biological ramifications have so far eluded explanation. Results: Here we present a comparative approach using information from 699 bacterial genomes that sheds light on the evolutionary dynamics of overlapping genes. We show that these structures exhibit high levels of plasticity. Conclusions: We propose a simple model allowing us to explain the observed properties of overlapping genes based on the importance of initiation and termination of transcriptional and translational processes. We believe that taking into account the processes leading to the expression of protein-coding genes holds the key to understanding overlapping gene structures.
Affiliation(s)
- Maxime Huvet
- Theoretical Systems Biology Group, Department of life sciences, Imperial College London, London SW7 2AZ, UK.
|
46
|
Gogoleva NE, Shlykova LV, Gorshkov VY, Daminova AG, Gogolev YV. Effect of topology of quorum sensing-related genes in Pectobacterium atrosepticum on their expression. Mol Biol 2014. [DOI: 10.1134/s0026893314040049] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
|
47
|
Sigurgeirsson B, Emanuelsson O, Lundeberg J. Analysis of stranded information using an automated procedure for strand specific RNA sequencing. BMC Genomics 2014; 15:631. [PMID: 25070246 PMCID: PMC4247151 DOI: 10.1186/1471-2164-15-631] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 02/25/2014] [Accepted: 07/10/2014] [Indexed: 01/19/2023] Open
Abstract
Background: Strand-specific RNA sequencing is rapidly replacing conventional cDNA sequencing as an approach for assessing information about the transcriptome. Alongside improved laboratory protocols, the development of bioinformatics tools is steadily progressing. In the current procedure, the Illumina TruSeq library preparation kit is used, along with additional reagents, to make stranded libraries in an automated fashion, which are then sequenced on the Illumina HiSeq 2000. Using freely available bioinformatics tools, we show through quality metrics that the protocol is robust and reproducible. We further highlight the practicality of strand-specific libraries by comparing expression in strand-specific libraries to that in non-stranded libraries, by looking at known antisense transcription of pseudogenes, and by identifying novel transcription. Furthermore, two ribosomal depletion kits, RiboMinus and RiboZero, and two sequence aligners, Tophat2 and STAR, are compared. Results: The non-stranded Illumina TruSeq kit can be adapted to generate strand-specific libraries and can be used to access detailed information on the transcriptome. The RiboZero kit is very effective in removing ribosomal RNA from total RNA, and the STAR aligner produces a high mapping yield in a short time. Strand-specific data give more detailed and correct results than non-stranded data, as we show when estimating expression values and assembling transcripts. Even well-annotated genomes need improvements and corrections, which can be achieved using strand-specific data. Conclusions: Researchers in the field should strive to use strand-specific data; it allows for more confidence in the data analysis and is less likely to lead to false conclusions. If faced with analysing non-stranded data, researchers should be well aware of the caveats of that approach.
Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-631) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Joakim Lundeberg
- Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), Tomtebodavägen 23A, 17165 Solna, Stockholm, Sweden.
|
48
|
Dogsa I, Choudhary KS, Marsetic Z, Hudaiberdiev S, Vera R, Pongor S, Mandic-Mulec I. ComQXPA quorum sensing systems may not be unique to Bacillus subtilis: a census in prokaryotic genomes. PLoS One 2014; 9:e96122. [PMID: 24788106 PMCID: PMC4008528 DOI: 10.1371/journal.pone.0096122] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 03/09/2014] [Accepted: 04/03/2014] [Indexed: 11/19/2022] Open
Abstract
The comQXPA locus of Bacillus subtilis encodes a quorum sensing (QS) system typical of Gram-positive bacteria. It encodes four proteins: the ComQ isoprenyl transferase, the ComX pre-peptide signal, the ComP histidine kinase, and the ComA response regulator. These are encoded by four adjacent genes, all situated on the same chromosome strand. Here we present results of a comprehensive census of comQXPA-like gene arrangements in 2620 complete and 6970 draft prokaryotic genomes (sequenced by the end of 2013). After manually checking the data for false-positive and false-negative hits, we found 39 novel com-like predictions. The census data show that, in addition to B. subtilis and close relatives, 20 comQXPA-like loci are predicted to occur outside the B. subtilis clade. These include some species of the order Clostridiales, but none outside the phylum Firmicutes. Characteristic gene-overlap patterns were observed in comQXPA loci, which were different for the B. subtilis-like and non-B. subtilis-like clades. The pronounced sequence variability associated with the ComX peptide in the B. subtilis clade is also evident in the non-B. subtilis clade, suggesting grossly similar evolutionary constraints in the underlying quorum sensing systems.
Affiliation(s)
- Iztok Dogsa
- Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Kumari Sonal Choudhary
- Group of Protein Structure and Bioinformatics, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Ziva Marsetic
- Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Sanjarbek Hudaiberdiev
- Group of Protein Structure and Bioinformatics, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Roberto Vera
- Group of Protein Structure and Bioinformatics, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- Sándor Pongor
- Group of Protein Structure and Bioinformatics, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- * E-mail: (SP); (IMM)
- Ines Mandic-Mulec
- Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- * E-mail: (SP); (IMM)
|
49
|
Khara P, Roy M, Chakraborty J, Ghosal D, Dutta TK. Functional characterization of diverse ring-hydroxylating oxygenases and induction of complex aromatic catabolic gene clusters in Sphingobium sp. PNB. FEBS Open Bio 2014; 4:290-300. [PMID: 24918041 PMCID: PMC4048848 DOI: 10.1016/j.fob.2014.03.001] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 02/20/2014] [Revised: 03/03/2014] [Accepted: 03/03/2014] [Indexed: 11/27/2022] Open
Abstract
Sphingobium sp. PNB, like other sphingomonads, has multiple ring-hydroxylating oxygenase (RHO) genes. Three different fosmid clones have been sequenced to identify the putative genes responsible for the degradation of various aromatics in this bacterial strain. Comparison of the map of the catabolic genes with that of different sphingomonads revealed a similar arrangement of gene clusters that harbors seven sets of RHO terminal components and a sole set of electron transport (ET) proteins. The presence of distinctly conserved amino acid residues in ferredoxin and in silico molecular docking analyses of ferredoxin with the well characterized terminal oxygenase components indicated the structural uniqueness of the ET component in sphingomonads. The predicted substrate specificities, derived from the phylogenetic relationship of each of the RHOs, were examined based on transformation of putative substrates and their structural homologs by the recombinant strains expressing each of the oxygenases and the sole set of available ET proteins. The RHO AhdA1bA2b was functionally characterized for the first time and was found to be capable of transforming ethylbenzene, propylbenzene, cumene, p-cymene and biphenyl, in addition to a number of polycyclic aromatic hydrocarbons. Overexpression of aromatic catabolic genes in strain PNB, revealed by real-time PCR analyses, is a way forward to understand the complex regulation of degradative genes in sphingomonads.
Affiliation(s)
- Tapan K. Dutta
- Department of Microbiology, Bose Institute, P-1/12 C.I.T. Scheme VII M, Kolkata 700054, India
|
50
|
Origin and length distribution of unidirectional prokaryotic overlapping genes. G3: Genes, Genomes, Genetics 2014; 4:19-27. [PMID: 24192837 PMCID: PMC3887535 DOI: 10.1534/g3.113.005652] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
Abstract
Prokaryotic unidirectional overlapping genes can originate through the disruption of the start or stop codon of one protein-coding gene and its replacement with another start or stop codon within the adjacent gene. However, the probability of disruption and replacement of a start or stop codon may differ significantly depending on the number and redundancy of the start and stop codon sets. Here, we performed a simulation study of the formation of unidirectional overlapping genes using a simple model of nucleotide change and contrasted it with empirical data. Our results suggest that overlaps originated by an elongation of the 3′-end of the upstream gene are significantly more frequent than those originated by an elongation of the 5′-end of the downstream gene. Accordingly, we propose a model for the creation of unidirectional overlaps that is based on the disruption probabilities of start codon and stop codon sets and on the different probabilities of phase 1 and phase 2 overlaps. Additionally, our results suggest that phase 2 overlaps are formed at higher rates than phase 1 overlaps, given the same evolutionary time. Finally, we propose that there is no need to invoke selection to explain the prevalence of long phase 1 unidirectional overlaps. Rather, the overrepresentation of long phase 1 relative to long phase 2 overlaps might occur because it is highly probable that phase 2 overlaps are retained as short overlaps by chance. Such a pattern is stronger if selection against very long overlaps is included in the model. Our model as a whole is able to explain to a large extent the empirical length distribution of unidirectional overlaps in prokaryotic genomes.
|