301
Gucwa K, Wons E, Wisniewska A, Jakalski M, Dubiak Z, Kozlowski LP, Mruk I. Lethal perturbation of an Escherichia coli regulatory network is triggered by a restriction-modification system's regulator and can be mitigated by excision of the cryptic prophage Rac. Nucleic Acids Res 2024; 52:2942-2960. [PMID: 38153127 PMCID: PMC11014345 DOI: 10.1093/nar/gkad1234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 12/08/2023] [Accepted: 12/13/2023] [Indexed: 12/29/2023]
Abstract
Bacterial gene regulatory networks orchestrate responses to environmental challenges. Horizontal gene transfer can bring in genes with regulatory potential, such as new transcription factors (TFs), and this can disrupt existing networks. Serious regulatory perturbations may even result in cell death. Here, we show the impact on Escherichia coli of importing a promiscuous TF that has adventitious transcriptional effects within the cryptic Rac prophage. A cascade of regulatory network perturbations occurred on a global level. The TF, a C regulatory protein, normally controls a Type II restriction-modification system, but in E. coli K-12 interferes with expression of the RacR repressor gene, resulting in de-repression of the normally-silent Rac ydaT gene. YdaT is a prophage-encoded TF with pleiotropic effects on E. coli physiology. In turn, YdaT alters expression of a variety of bacterial regulons normally controlled by the RcsA TF, resulting in deficient lipopolysaccharide biosynthesis and cell division. At the same time, insufficient RacR repressor results in Rac DNA excision, halting Rac gene expression due to loss of the replication-defective Rac prophage. Overall, Rac induction appears to counteract the lethal toxicity of YdaT. We show here that E. coli rewires its regulatory network, so as to minimize the adverse regulatory effects of the imported C TF. This complex set of interactions may reflect the ability of bacteria to protect themselves by having robust mechanisms to maintain their regulatory networks, and/or suggest that regulatory C proteins from mobile operons are under selection to manipulate their host's regulatory networks for their own benefit.
Affiliation(s)
- Katarzyna Gucwa
  - Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland
- Ewa Wons
  - Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland
- Aleksandra Wisniewska
  - Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland
- Marcin Jakalski
  - 3P-Medicine Laboratory, Medical University of Gdansk, Debinki 7, 80-211 Gdansk, Poland
- Zuzanna Dubiak
  - Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland
- Lukasz Pawel Kozlowski
  - Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
- Iwona Mruk
  - Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland
302
Zhu Y, Wang Z, Zhou Z, Liu Y, Gao X, Guo W, Shi J. HEMU: An integrated comparative genomics database and analysis platform for Andropogoneae grasses. Plant Communications 2024; 5:100786. [PMID: 38155575 PMCID: PMC11009152 DOI: 10.1016/j.xplc.2023.100786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 12/05/2023] [Accepted: 12/26/2023] [Indexed: 12/30/2023]
Abstract
This study reports an online database and analysis platform HEMU, which integrates 75 genome assemblies from 20 unique species, large amounts of multi-omics data, and six sophisticated analysis toolkits. HEMU will facilitate comparative genomics analysis within the tribe Andropogoneae.
Affiliation(s)
- Yuzhi Zhu
  - School of Agriculture, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen 518107, China
- Zijie Wang
  - School of Agriculture, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen 518107, China
- Zanchen Zhou
  - School of Agriculture, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen 518107, China
- Yuting Liu
  - School of Agriculture, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen 518107, China
- Xiang Gao
  - School of Agriculture, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen 518107, China
- Weilong Guo
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Junpeng Shi
  - School of Agriculture, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen 518107, China
303
Ryan D, Bornet E, Prezza G, Alampalli SV, Franco de Carvalho T, Felchle H, Ebbecke T, Hayward RJ, Deutschbauer AM, Barquist L, Westermann AJ. An expanded transcriptome atlas for Bacteroides thetaiotaomicron reveals a small RNA that modulates tetracycline sensitivity. Nat Microbiol 2024; 9:1130-1144. [PMID: 38528147 PMCID: PMC10994844 DOI: 10.1038/s41564-024-01642-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 02/07/2024] [Indexed: 03/27/2024]
Abstract
Plasticity in gene expression allows bacteria to adapt to diverse environments. This is particularly relevant in the dynamic niche of the human intestinal tract; however, transcriptional networks remain largely unknown for gut-resident bacteria. Here we apply differential RNA sequencing (RNA-seq) and conventional RNA-seq to the model gut bacterium Bacteroides thetaiotaomicron to map transcriptional units and profile their expression levels across 15 in vivo-relevant growth conditions. We infer stress- and carbon source-specific transcriptional regulons and expand the annotation of small RNAs (sRNAs). Integrating this expression atlas with published transposon mutant fitness data, we predict conditionally important sRNAs. These include MasB, which downregulates tetracycline tolerance. Using MS2 affinity purification and RNA-seq, we identify a putative MasB target and assess its role in the context of the MasB-associated phenotype. These data, publicly available through the Theta-Base web browser (http://micromix.helmholtz-hiri.de/bacteroides/), constitute a valuable resource for the microbiome community.
Affiliation(s)
- Daniel Ryan
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Elise Bornet
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Gianluca Prezza
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Shuba Varshini Alampalli
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Taís Franco de Carvalho
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Hannah Felchle
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
  - Department of Radiation Oncology, Technical University of Munich, School of Medicine, Klinikum rechts der Isar, Munich, Germany
- Titus Ebbecke
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Regan J Hayward
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
- Adam M Deutschbauer
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, USA
- Lars Barquist
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
  - Faculty of Medicine, University of Würzburg, Würzburg, Germany
  - Department of Biology, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Alexander J Westermann
  - Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, Germany
  - Institute of Molecular Infection Biology, University of Würzburg, Würzburg, Germany
  - Department of Microbiology, Biocentre, University of Würzburg, Würzburg, Germany
304
Lawaetz AC, Cowley LA, Denham EL. Genome-wide annotation of transcript boundaries using bacterial Rend-seq datasets. Microb Genom 2024; 10. [PMID: 38668652 DOI: 10.1099/mgen.0.001239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Accurate annotation to single-nucleotide resolution of the transcribed regions in genomes is key to optimally analysing RNA-seq data, understanding regulatory events, and designing experiments. However, most genome annotations currently provided by GenBank lack information about untranslated regions. Additionally, information regarding genomic locations of non-coding RNAs, such as sRNAs, or anti-sense RNAs is frequently missing. To provide such information, diverse RNA-seq technologies, such as Rend-seq, have been developed and applied to many bacterial species. However, incorporating this vast amount of information into annotation files has been limited and is bioinformatically challenging, resulting in UTRs and other non-coding elements being overlooked or misrepresented. To overcome this problem, we present pyRAP (python Rend-seq Annotation Pipeline), a software package that analyses Rend-seq datasets to accurately resolve transcript boundaries genome-wide. We report the use of pyRAP to find novel transcripts, transcript isoforms, and RNase-dependent sRNA processing events. In Bacillus subtilis we uncovered 63 novel transcripts and provide genomic coordinates with single-nucleotide resolution for 2218 5'UTRs, 1864 3'UTRs and 161 non-coding RNAs. In Escherichia coli, we report 117 novel transcripts, 2429 5'UTRs, 1619 3'UTRs and 91 non-coding RNAs, and in Staphylococcus aureus, 16 novel transcripts, 664 5'UTRs, 696 3'UTRs, and 81 non-coding RNAs. Finally, we use pyRAP to produce updated annotation files for B. subtilis 168, E. coli K-12 MG1655, and S. aureus 8325 for use in the wider microbial genomics research community.
Affiliation(s)
- Andreas C Lawaetz
  - Life Sciences Department, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Lauren A Cowley
  - Life Sciences Department, University of Bath, Claverton Down, Bath, BA2 7AY, UK
  - Milner Centre for Evolution, Life Sciences Department, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Emma L Denham
  - Life Sciences Department, University of Bath, Claverton Down, Bath, BA2 7AY, UK
305
Wei X, Tan H, Lobb B, Zhen W, Wu Z, Parks DH, Neufeld JD, Moreno-Hagelsieb G, Doxey AC. AnnoView enables large-scale analysis, comparison, and visualization of microbial gene neighborhoods. Brief Bioinform 2024; 25:bbae229. [PMID: 38747283 PMCID: PMC11094555 DOI: 10.1093/bib/bbae229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 04/02/2024] [Accepted: 04/26/2024] [Indexed: 05/19/2024]
Abstract
The analysis and comparison of gene neighborhoods is a powerful approach for exploring microbial genome structure, function, and evolution. Although numerous tools exist for genome visualization and comparison, genome exploration across large genomic databases or user-generated datasets remains a challenge. Here, we introduce AnnoView, a web server designed for interactive exploration of gene neighborhoods across the bacterial and archaeal tree of life. Our server offers users the ability to identify, compare, and visualize gene neighborhoods of interest from 30 238 bacterial genomes and 1672 archaeal genomes, through integration with the comprehensive Genome Taxonomy Database and AnnoTree databases. Identified gene neighborhoods can be visualized using pre-computed functional annotations from different sources such as KEGG, Pfam and TIGRFAM, or clustered based on similarity. Alternatively, users can upload and explore their own custom genomic datasets in GBK, GFF or CSV format, or use AnnoView as a genome browser for relatively small genomes (e.g. viruses and plasmids). Ultimately, we anticipate that AnnoView will catalyze biological discovery by enabling user-friendly search, comparison, and visualization of genomic data. AnnoView is available at http://annoview.uwaterloo.ca.
Affiliation(s)
- Xin Wei
  - Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
- Huagang Tan
  - Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
- Briallen Lobb
  - Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
- William Zhen
  - Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
- Zijing Wu
  - Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
- Donovan H Parks
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Brisbane, Australia
- Josh D Neufeld
  - Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
- Gabriel Moreno-Hagelsieb
  - Department of Biology, Wilfrid Laurier University, 75 University Avenue West, Waterloo, ON, Canada
- Andrew C Doxey
  - Department of Biology and Waterloo Centre for Microbial Research, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
306
Luthringer R, Raphalen M, Guerra C, Colin S, Martinho C, Zheng M, Hoshino M, Badis Y, Lipinska AP, Haas FB, Barrera-Redondo J, Alva V, Coelho SM. Repeated co-option of HMG-box genes for sex determination in brown algae and animals. Science 2024; 383:eadk5466. [PMID: 38513029 DOI: 10.1126/science.adk5466] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/10/2023] [Accepted: 01/31/2024] [Indexed: 03/23/2024]
Abstract
In many eukaryotes, genetic sex determination is not governed by XX/XY or ZW/ZZ systems but by a specialized region on the poorly studied U (female) or V (male) sex chromosomes. Previous studies have hinted at the existence of a dominant male-sex factor on the V chromosome in brown algae, a group of multicellular eukaryotes distantly related to animals and plants. The nature of this factor has remained elusive. Here, we demonstrate that an HMG-box gene acts as the male-determining factor in brown algae, mirroring the role HMG-box genes play in sex determination in animals. Over a billion-year evolutionary timeline, these lineages have independently co-opted the HMG box for male determination, representing a paradigm for evolution's ability to recurrently use the same genetic "toolkit" to accomplish similar tasks.
Affiliation(s)
- Rémy Luthringer
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Morgane Raphalen
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Carla Guerra
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Sébastien Colin
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Claudia Martinho
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Min Zheng
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Masakazu Hoshino
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
  - Research Center for Inland Seas, Kobe University, Kobe 658-0022, Japan
- Yacine Badis
  - Roscoff Biological Station, CNRS-Sorbonne University, Place Georges Teissier, 29680 Roscoff, France
- Agnieszka P Lipinska
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Fabian B Haas
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Josué Barrera-Redondo
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Vikram Alva
  - Department of Protein Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Susana M Coelho
  - Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
307
Beijen EPW, Ohm RA. Genome annotations for the ascomycete fungi Trichoderma harzianum, Trichoderma aggressivum, and Purpureocillium lilacinum. Microbiol Resour Announc 2024; 13:e0115323. [PMID: 38385672 DOI: 10.1128/mra.01153-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 02/11/2024] [Indexed: 02/23/2024]
Abstract
We sequenced and annotated the genomes of the ascomycete fungi Trichoderma harzianum, Trichoderma aggressivum f. europaeum, and Purpureocillium lilacinum. Moreover, we developed a website to allow users to interactively analyze the assemblies, gene predictions, and functional annotations of these species and 70+ previously sequenced fungi.
Affiliation(s)
- Erik P W Beijen
  - Department of Biology, Microbiology, Faculty of Science, Utrecht University, Utrecht, the Netherlands
- Robin A Ohm
  - Department of Biology, Microbiology, Faculty of Science, Utrecht University, Utrecht, the Netherlands
308
Boyes D, Holland PWH. The genome sequence of the Elephant Hawk-moth, Deilephila elpenor (Linnaeus, 1758). Wellcome Open Res 2024; 9:104. [PMID: 39239169 PMCID: PMC11375399 DOI: 10.12688/wellcomeopenres.21012.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/02/2024] [Indexed: 09/07/2024]
Abstract
We present a genome assembly from an individual female Deilephila elpenor (the Elephant Hawk-moth; Arthropoda; Insecta; Lepidoptera; Sphingidae). The genome sequence is 414.1 megabases in span. Most of the assembly is scaffolded into 30 chromosomal pseudomolecules, including the Z and W sex chromosomes. The mitochondrial genome has also been assembled and is 15.37 kilobases in length. Gene annotation of this assembly on Ensembl identified 11,748 protein coding genes.
Affiliation(s)
- Douglas Boyes
  - UK Centre for Ecology & Hydrology, Wallingford, England, UK
309
Thieme M, Minadakis N, Himber C, Keller B, Xu W, Rutowicz K, Matteoli C, Böhrer M, Rymen B, Laudencia-Chingcuanco D, Vogel JP, Sibout R, Stritt C, Blevins T, Roulin AC. Transposition of HOPPLA in siRNA-deficient plants suggests a limited effect of the environment on retrotransposon mobility in Brachypodium distachyon. PLoS Genet 2024; 20:e1011200. [PMID: 38470914 PMCID: PMC10959353 DOI: 10.1371/journal.pgen.1011200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 03/22/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024]
Abstract
Long terminal repeat retrotransposons (LTR-RTs) are powerful mutagens regarded as a major source of genetic novelty and important drivers of evolution. Yet, the uncontrolled and potentially selfish proliferation of LTR-RTs can lead to deleterious mutations and genome instability, with large fitness costs for their host. While population genomics data suggest that an ongoing LTR-RT mobility is common in many species, the understanding of their dual role in evolution is limited. Here, we harness the genetic diversity of 320 sequenced natural accessions of the Mediterranean grass Brachypodium distachyon to characterize how genetic and environmental factors influence plant LTR-RT dynamics in the wild. When combining a coverage-based approach to estimate global LTR-RT copy number variations with mobilome-sequencing of nine accessions exposed to eight different stresses, we find little evidence for a major role of environmental factors in LTR-RT accumulations in B. distachyon natural accessions. Instead, we show that loss of RNA polymerase IV (Pol IV), which mediates RNA-directed DNA methylation in plants, results in high transcriptional and transpositional activities of RLC_BdisC024 (HOPPLA) LTR-RT family elements, and that these effects are not stress-specific. This work supports findings indicating an ongoing mobility in B. distachyon and reveals that host RNA-directed DNA methylation rather than environmental factors controls their mobility in this wild grass model.
Affiliation(s)
- Michael Thieme
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Nikolaos Minadakis
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Christophe Himber
  - Institut de Biologie Moléculaire des Plantes, Centre National de la Recherche Scientifique, Université de Strasbourg, Strasbourg, France
- Bettina Keller
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Wenbo Xu
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Kinga Rutowicz
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Calvin Matteoli
  - Institut de Biologie Moléculaire des Plantes, Centre National de la Recherche Scientifique, Université de Strasbourg, Strasbourg, France
- Marcel Böhrer
  - Institut de Biologie Moléculaire des Plantes, Centre National de la Recherche Scientifique, Université de Strasbourg, Strasbourg, France
- Bart Rymen
  - Institut de Biologie Moléculaire des Plantes, Centre National de la Recherche Scientifique, Université de Strasbourg, Strasbourg, France
- Debbie Laudencia-Chingcuanco
  - United States Department of Agriculture Agricultural Research Service Western Regional Research Center, Albany, California, United States of America
- John P. Vogel
  - United States Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Richard Sibout
  - Institut National de la Recherche Agronomique Unité BIA- 1268 Biopolymères Interactions Assemblages Equipe Paroi Végétale et Polymères Pariétaux (PVPP), Nantes, France
- Christoph Stritt
  - Swiss Tropical and Public Health Institute (Swiss TPH), Allschwil, Switzerland
- Todd Blevins
  - Institut de Biologie Moléculaire des Plantes, Centre National de la Recherche Scientifique, Université de Strasbourg, Strasbourg, France
- Anne C. Roulin
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
310
Roces V, Guerrero S, Álvarez A, Pascual J, Meijón M. PlantFUNCO: Integrative Functional Genomics Database Reveals Clues into Duplicates Divergence Evolution. Mol Biol Evol 2024; 41:msae042. [PMID: 38411627 PMCID: PMC10917205 DOI: 10.1093/molbev/msae042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 02/08/2024] [Accepted: 02/16/2024] [Indexed: 02/28/2024]
Abstract
Evolutionary epigenomics and, more generally, evolutionary functional genomics are emerging fields that study how non-DNA-encoded alterations in gene expression regulation are an important form of plasticity and adaptation. Previous evidence analyzing plants' comparative functional genomics has mostly focused on comparing assay-matched experiments, missing the power of heterogeneous datasets for conservation inference. To fill this gap, we developed the PlantFUN(ctional)CO(nservation) database, which consists of several tools and two main resources: interspecies chromatin states and functional genomics conservation scores, presented and analyzed in this work for three well-established plant models (Arabidopsis thaliana, Oryza sativa, and Zea mays). Overall, PlantFUNCO elucidated evolutionary information in terms of cross-species functional agreement, thereby providing a new complementary comparative-genomics source for evolutionary studies. To illustrate the potential applications of this database, we replicated two previously published models predicting genetic redundancy in A. thaliana and found that chromatin states are a determinant of paralogs' degree of functional divergence. These predictions were validated based on the phenotypes of mitochondrial alternative oxidase knockout mutants under two different stressors. Taking all the above into account, PlantFUNCO aims to leverage data diversity and extrapolate molecular mechanism findings from different model organisms to determine the extent of functional conservation, thus deepening our understanding of how the plant epigenome and functional noncoding genome have evolved. PlantFUNCO is available at https://rocesv.github.io/PlantFUNCO.
Affiliation(s)
- Víctor Roces
  - Plant Physiology, Department of Organisms and Systems Biology, Faculty of Biology and Biotechnology Institute of Asturias, University of Oviedo, Asturias, Spain
- Sara Guerrero
  - Plant Physiology, Department of Organisms and Systems Biology, Faculty of Biology and Biotechnology Institute of Asturias, University of Oviedo, Asturias, Spain
- Ana Álvarez
  - Plant Physiology, Department of Organisms and Systems Biology, Faculty of Biology and Biotechnology Institute of Asturias, University of Oviedo, Asturias, Spain
- Jesús Pascual
  - Plant Physiology, Department of Organisms and Systems Biology, Faculty of Biology and Biotechnology Institute of Asturias, University of Oviedo, Asturias, Spain
- Mónica Meijón
  - Plant Physiology, Department of Organisms and Systems Biology, Faculty of Biology and Biotechnology Institute of Asturias, University of Oviedo, Asturias, Spain
311
Chen Y, Wang W, Yang Z, Peng H, Ni Z, Sun Q, Guo W. Innovative computational tools provide new insights into the polyploid wheat genome. ABIOTECH 2024; 5:52-70. [PMID: 38576428 PMCID: PMC10987449 DOI: 10.1007/s42994-023-00131-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 12/14/2023] [Indexed: 04/06/2024]
Abstract
Bread wheat (Triticum aestivum) is an important crop and serves as a significant source of protein and calories for humans, worldwide. Nevertheless, its large and allopolyploid genome poses constraints on genetic improvement. The complex reticulate evolutionary history and the intricacy of genomic resources make the deciphering of the functional genome considerably more challenging. Recently, we have developed a comprehensive list of versatile computational tools with the integration of statistical models for dissecting the polyploid wheat genome. Here, we summarize the methodological innovations and applications of these tools and databases. A series of step-by-step examples illustrates how these tools can be utilized for dissecting wheat germplasm resources and unveiling functional genes associated with important agronomic traits. Furthermore, we outline future perspectives on new advanced tools and databases, taking into consideration the unique features of bread wheat, to accelerate genomic-assisted wheat breeding.
Affiliation(s)
- Yongming Chen
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
- Wenxi Wang
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
- Zhengzhao Yang
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
- Huiru Peng
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
- Zhongfu Ni
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
- Qixin Sun
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
- Weilong Guo
  - Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
312
Sivell O, Crowley LM, Natural History Museum Genome Acquisition Lab, University of Oxford and Wytham Woods Genome Acquisition Lab, Darwin Tree of Life Barcoding collective, Wellcome Sanger Institute Tree of Life Management, Samples and Laboratory team, Wellcome Sanger Institute Scientific Operations: Sequencing Operations, Wellcome Sanger Institute Tree of Life Core Informatics team, Tree of Life Core Informatics collective, Darwin Tree of Life Consortium. The genome sequence of a hoverfly, Merodon equestris (Fabricius, 1794). Wellcome Open Res 2024; 9:67. [PMID: 38911901 PMCID: PMC11192019 DOI: 10.12688/wellcomeopenres.20654.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/13/2023] [Indexed: 06/25/2024]
Abstract
We present a genome assembly from an individual female Merodon equestris (hoverfly; Arthropoda; Insecta; Diptera; Syrphidae). The genome sequence is 873.0 megabases in span. Most of the assembly is scaffolded into 6 chromosomal pseudomolecules. The mitochondrial genome has also been assembled and is 15.95 kilobases in length.
|
313
|
Wang Y, Xu S. A high-quality genome assembly of the waterlily aphid Rhopalosiphum nymphaeae. Sci Data 2024; 11:194. [PMID: 38351256 PMCID: PMC10864314 DOI: 10.1038/s41597-024-03043-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/02/2023] [Accepted: 02/03/2024] [Indexed: 02/16/2024] Open
Abstract
Waterlily aphid, Rhopalosiphum nymphaeae (Linnaeus), is a host-alternating aphid known to feed on both terrestrial and aquatic hosts. It causes damage through direct herbivory and by acting as a vector for plant viruses, affecting Prunus spp. fruits and aquatic plants worldwide. Interestingly, R. nymphaeae's ability to thrive in both aquatic and terrestrial conditions sets it apart from other aphids, offering a unique perspective on adaptation. We present the first high-quality R. nymphaeae genome assembly, with a size of 324.4 Mb, generated using PacBio long-read sequencing. The resulting assembly is highly contiguous, with a contig N50 of 12.7 Mb. BUSCO evaluation indicated 97.5% completeness. The R. nymphaeae genome consists of 16.9% repetitive elements and 16,834 predicted protein-coding genes. Phylogenetic analysis positioned R. nymphaeae within the Aphidini tribe, closely related to R. maidis and R. padi. The high-quality R. nymphaeae reference genome provides a unique resource for understanding genome evolution in aphids and lays the foundation for understanding host plant adaptation mechanisms and developing pest control strategies.
Affiliation(s)
- Yangzi Wang
- Institute of Organismic and Molecular Evolution (iomE), Johannes Gutenberg University Mainz, 55128, Mainz, Germany
- Institute for Evolution and Biodiversity, University of Münster, 48161, Münster, Germany
| | - Shuqing Xu
- Institute of Organismic and Molecular Evolution (iomE), Johannes Gutenberg University Mainz, 55128, Mainz, Germany.
|
314
|
Boyes D, Holland PWH, University of Oxford and Wytham Woods Genome Acquisition Lab, Darwin Tree of Life Barcoding collective, Wellcome Sanger Institute Tree of Life Management, Samples and Laboratory team, Wellcome Sanger Institute Scientific Operations: Sequencing Operations, Wellcome Sanger Institute Tree of Life Core Informatics team, Tree of Life Core Informatics collective, Darwin Tree of Life Consortium. The genome sequence of the Scarlet Tiger moth, Callimorpha dominula (Linnaeus, 1758). Wellcome Open Res 2024; 9:31. [PMID: 39233899 PMCID: PMC11372354 DOI: 10.12688/wellcomeopenres.20833.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/08/2024] [Indexed: 09/06/2024] Open
Abstract
We present a genome assembly from an individual male Callimorpha dominula (the Scarlet Tiger moth; Arthropoda; Insecta; Lepidoptera; Erebidae). The genome sequence is 658.1 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 15.45 kilobases in length. Gene annotation of this assembly on Ensembl identified 20,234 protein coding genes.
Affiliation(s)
- Douglas Boyes
- UK Centre for Ecology & Hydrology, Wallingford, England, UK
|
315
|
Feng X, Liu S, Li K, Bu F, Yuan H. NCAD v1.0: a database for non-coding variant annotation and interpretation. J Genet Genomics 2024; 51:230-242. [PMID: 38142743 DOI: 10.1016/j.jgg.2023.12.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/30/2023] [Revised: 12/15/2023] [Accepted: 12/18/2023] [Indexed: 12/26/2023]
Abstract
The application of whole genome sequencing is expanding in clinical diagnostics across various genetic disorders, and the significance of non-coding variants in penetrant diseases is increasingly being demonstrated. Therefore, it is urgent to improve the diagnostic yield by exploring the pathogenic mechanisms of variants in non-coding regions. However, the interpretation of non-coding variants remains a significant challenge, due to the complex functional regulatory mechanisms of non-coding regions and the current limitations of available databases and tools. Hence, we develop the non-coding variant annotation database (NCAD, http://www.ncawdb.net/), encompassing comprehensive insights into 665,679,194 variants, regulatory elements, and element interaction details. Integrating data from 96 sources, spanning both GRCh37 and GRCh38 versions, NCAD v1.0 provides vital information to support the genetic diagnosis of non-coding variants, including allele frequencies of 12 diverse populations, with a particular focus on the population frequency information for 230,235,698 variants in 20,964 Chinese individuals. Moreover, it offers prediction scores for variant functionality, five categories of regulatory elements, and four types of non-coding RNAs. With its rich data and comprehensive coverage, NCAD serves as a valuable platform, empowering researchers and clinicians with profound insights into non-coding regulatory mechanisms while facilitating the interpretation of non-coding variants.
Affiliation(s)
- Xiaoshu Feng
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Sihan Liu
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Ke Li
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Fengxiao Bu
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China.
| | - Huijun Yuan
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China.
|
316
|
Estevez-Castro CF, Rodrigues MF, Babarit A, Ferreira FV, de Andrade EG, Marois E, Cogni R, Aguiar ERGR, Marques JT, Olmo RP. Neofunctionalization driven by positive selection led to the retention of the loqs2 gene encoding an Aedes specific dsRNA binding protein. BMC Biol 2024; 22:14. [PMID: 38273313 PMCID: PMC10809485 DOI: 10.1186/s12915-024-01821-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Accepted: 01/10/2024] [Indexed: 01/27/2024] Open
Abstract
BACKGROUND: Mosquito-borne viruses, such as dengue, Zika, yellow fever and Chikungunya, cause millions of infections every year. These viruses are mostly transmitted by two urban-adapted mosquito species, Aedes aegypti and Aedes albopictus. Although the mechanisms remain largely unknown, Aedes mosquitoes may have unique adaptations that lower the impact of viral infection. Recently, we reported the identification of an Aedes-specific double-stranded RNA binding protein (dsRBP), named Loqs2, that is involved in the control of infection by dengue and Zika viruses in mosquitoes. Preliminary analyses suggested that the loqs2 gene is a paralog of loquacious (loqs) and r2d2, two co-factors of the RNA interference (RNAi) pathway, a major antiviral mechanism in insects. RESULTS: Here we analyzed the origin and evolution of loqs2. Our data suggest that loqs2 originated from two independent duplications of the first double-stranded RNA binding domain of loqs that occurred before the origin of the Aedes Stegomyia subgenus, around 31 million years ago. We show that the loqs2 gene is evolving under relaxed purifying selection at a faster pace than loqs, with evidence of neofunctionalization driven by positive selection. Accordingly, we observed that Loqs2 is localized mainly in the nucleus, unlike R2D2 and both isoforms of Loqs, which are cytoplasmic. In contrast to r2d2 and loqs, loqs2 expression is stage- and tissue-specific, restricted mostly to reproductive tissues in adult Ae. aegypti and Ae. albopictus. Transgenic mosquitoes engineered to express loqs2 ubiquitously undergo developmental arrest at larval stages that correlates with massive dysregulation of gene expression without major effects on microRNAs or other endogenous small RNAs, classically associated with RNA interference. CONCLUSIONS: Our results uncover the peculiar origin and neofunctionalization of loqs2 driven by positive selection. This study shows an example of unique adaptations in Aedes mosquitoes that could ultimately help explain their effectiveness as virus vectors.
Affiliation(s)
- Carlos F Estevez-Castro
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
| | - Murillo F Rodrigues
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR, 97403-5289, USA
| | - Antinéa Babarit
- CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
| | - Flávia V Ferreira
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
| | - Elisa G de Andrade
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
| | - Eric Marois
- CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
| | - Rodrigo Cogni
- Department of Ecology, Institute of Biosciences, University of São Paulo, São Paulo, 05508-090, Brazil
| | - Eric R G R Aguiar
- Department of Biological Science, Center of Biotechnology and Genetics, State University of Santa Cruz, Ilhéus, 45662-900, Brazil
| | - João T Marques
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil.
- CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France.
| | - Roenick P Olmo
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil.
- CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France.
|
317
|
Peng B, Weintraub SJ, Lu Z, Evans S, Shen Q, McDonnell L, Plan M, Collier T, Cheah LC, Ji L, Howard CB, Anderson W, Trau M, Dumsday G, Bredeweg EL, Young EM, Speight R, Vickers CE. Integration of Yeast Episomal/Integrative Plasmid Causes Genotypic and Phenotypic Diversity and Improved Sesquiterpene Production in Metabolically Engineered Saccharomyces cerevisiae. ACS Synth Biol 2024; 13:141-156. [PMID: 38084917 DOI: 10.1021/acssynbio.3c00363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2024]
Abstract
The variability in phenotypic outcomes among biological replicates in engineered microbial factories presents a captivating mystery. Establishing the association between phenotypic variability and genetic drivers is important to solve this intricate puzzle. We applied a previously developed auxin-inducible depletion of hexokinase 2 as a metabolic engineering strategy for improved nerolidol production in Saccharomyces cerevisiae, and biological replicates exhibited a dichotomy in nerolidol production of either 3.5 or 2.5 g L⁻¹. Harnessing Oxford Nanopore's long-read genomic sequencing, we reveal a potential genetic cause: the chromosomal integration of a 2μ sequence-based yeast episomal plasmid encoding the expression cassettes for nerolidol synthetic enzymes. This finding was reinforced through chromosome integration revalidation, engineering nerolidol and valencene production strains, and generating a diverse pool of yeast clones, each uniquely fingerprinted by gene copy numbers, plasmid integrations, other genomic rearrangements, protein expression levels, growth rate, and target product productivities. The best clone in two strains produced 3.5 g L⁻¹ nerolidol and ∼0.96 g L⁻¹ valencene. Comparable genotypic and phenotypic variations were also generated through the integration of a yeast integrative plasmid lacking 2μ sequences. Our work shows that multiple factors, including plasmid integration status, subchromosomal location, gene copy number, sesquiterpene synthase expression level, and genome rearrangement, together play a complex role in determining sesquiterpene productivity. Integration of yeast episomal/integrative plasmids may be used as a versatile method for increasing the diversity and optimizing the efficiency of yeast cell factories, thereby uncovering metabolic control mechanisms.
Affiliation(s)
- Bingyin Peng
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Centre of Agriculture and the Bioeconomy, School of Biology and Environmental Science, Faculty of Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
| | - Sarah J Weintraub
- Bioinformatics and Computational Biology, Worcester Polytechnic Institute, Worcester, Massachusetts 01609, United States of America
| | - Zeyu Lu
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Centre of Agriculture and the Bioeconomy, School of Biology and Environmental Science, Faculty of Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
| | - Samuel Evans
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Centre of Agriculture and the Bioeconomy, School of Biology and Environmental Science, Faculty of Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
| | - Qianyi Shen
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Centre of Agriculture and the Bioeconomy, School of Biology and Environmental Science, Faculty of Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
- School of Chemistry and Molecular Biosciences (SCMB), The University of Queensland, Brisbane, QLD 4072, Australia
| | - Liam McDonnell
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Centre of Agriculture and the Bioeconomy, School of Biology and Environmental Science, Faculty of Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
| | - Manuel Plan
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
- Metabolomics Australia (Queensland Node), Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
| | - Thomas Collier
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- School of Natural Sciences, Macquarie University, Sydney, NSW 2109, Australia
| | - Li Chen Cheah
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
| | - Lei Ji
- Shandong Provincial Key Laboratory of Applied Microbiology, Ecology Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250103, PR China
| | - Christopher B Howard
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
| | - Will Anderson
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
| | - Matt Trau
- Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Brisbane, QLD 4072, Australia
- School of Chemistry and Molecular Biosciences (SCMB), The University of Queensland, Brisbane, QLD 4072, Australia
| | | | - Erin L Bredeweg
- Functional and Systems Biology Group, Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Eric M Young
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, Massachusetts 01609, United States
| | - Robert Speight
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Centre of Agriculture and the Bioeconomy, School of Biology and Environmental Science, Faculty of Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Advanced Engineering Biology Future Science Platform, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Black Mountain, ACT 2601, Australia
| | - Claudia E Vickers
- ARC Centre of Excellence in Synthetic Biology, Sydney, NSW 2109, Australia
- Centre of Agriculture and the Bioeconomy, School of Biology and Environmental Science, Faculty of Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
|
318
|
Schwarzl T, Sahadevan S, Lang B, Miladi M, Backofen R, Huber W, Hentze MW, Tartaglia GG. Improved discovery of RNA-binding protein binding sites in eCLIP data using DEWSeq. Nucleic Acids Res 2024; 52:e1. [PMID: 37962298 PMCID: PMC10783507 DOI: 10.1093/nar/gkad998] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/17/2023] [Revised: 09/04/2023] [Accepted: 10/18/2023] [Indexed: 11/15/2023] Open
Abstract
Enhanced crosslinking and immunoprecipitation (eCLIP) sequencing is a method for transcriptome-wide detection of binding sites of RNA-binding proteins (RBPs). However, identified crosslink sites can deviate from experimentally established functional elements of even well-studied RBPs. Current peak-calling strategies result in low replication and high false positive rates. Here, we present the R/Bioconductor package DEWSeq that makes use of replicate information and size-matched input controls. We benchmarked DEWSeq on 107 RBPs for which both eCLIP data and RNA sequence motifs are available and were able to more than double the number of motif-containing binding regions relative to standard eCLIP processing. The improvement relates not only to the number of binding sites (3.1-fold with known motifs for RBFOX2), but also to their subcellular localization (1.9-fold of mitochondrial genes for FASTKD2) and structural targets (2.2-fold increase of stem-loop regions for SLBP). On several orthogonal CLIP-seq datasets, DEWSeq recovers a larger number of motif-containing binding sites (3.3-fold). DEWSeq is a well-documented R/Bioconductor package, scalable to adequate numbers of replicates, and tends to substantially increase the proportion and total number of RBP binding sites containing biologically relevant features.
Affiliation(s)
- Thomas Schwarzl
- European Molecular Biology Laboratory (EMBL), Meyerhofstraße 1, 69117 Heidelberg, Germany
| | - Sudeep Sahadevan
- European Molecular Biology Laboratory (EMBL), Meyerhofstraße 1, 69117 Heidelberg, Germany
| | - Benjamin Lang
- Department of Structural Biology and Center of Excellence for Data-Driven Discovery, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105, USA
| | - Milad Miladi
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79098 Freiburg im Breisgau, Germany
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79098 Freiburg im Breisgau, Germany
| | - Wolfgang Huber
- European Molecular Biology Laboratory (EMBL), Meyerhofstraße 1, 69117 Heidelberg, Germany
| | - Matthias W Hentze
- European Molecular Biology Laboratory (EMBL), Meyerhofstraße 1, 69117 Heidelberg, Germany
| | - Gian Gaetano Tartaglia
- Center for Life Nano & Neuroscience, Italian Institute of Technology, 00161 Rome, Italy and Department of Biology, Sapienza University of Rome, 00185 Rome, Italy
|
319
|
Wang K, Perera BPU, Morgan RK, Sala-Hamrick K, Geron V, Svoboda LK, Faulk C, Dolinoy DC, Sartor MA. piOxi database: a web resource of germline and somatic tissue piRNAs identified by chemical oxidation. Database (Oxford) 2024; 2024:baad096. [PMID: 38204359 PMCID: PMC10782149 DOI: 10.1093/database/baad096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 11/27/2023] [Accepted: 12/27/2023] [Indexed: 01/12/2024]
Abstract
PIWI-interacting RNAs (piRNAs) are a class of small non-coding RNAs that are highly expressed and extensively studied in the germline. piRNAs associate with PIWI proteins to maintain DNA methylation for transposon silencing and transcriptional gene regulation for genomic stability. Mature germline piRNAs have distinct characteristics including a 24- to 32-nucleotide length and a 2'-O-methylation signature at the 3' end. Although recent studies have identified piRNAs in somatic tissues, they remain poorly characterized. For example, we recently demonstrated notable expression of piRNA in the murine soma, and while overall expression was lower than that of the germline, unique characteristics suggested tissue-specific functions of this class. While currently available databases commonly use length and association with PIWI proteins to identify piRNA, few have included a chemical oxidation method that detects piRNA based on its 3' modification. This method leads to reproducible and rigorous data processing when coupled with next-generation sequencing and bioinformatics analysis. Here, we introduce piOxi DB, a user-friendly web resource that provides a comprehensive analysis of piRNA, generated exclusively through sodium periodate treatment of small RNA. The current version of piOxi DB includes 435 749 germline and 9828 somatic piRNA sequences robustly identified from M. musculus, M. fascicularis and H. sapiens. The database provides species- and tissue-specific data that are further analyzed according to chromosome location and correspondence to gene and repetitive elements. piOxi DB is an informative tool to assist broad research applications in the fields of RNA biology, cancer biology, environmental toxicology and beyond. Database URL: https://pioxidb.dcmb.med.umich.edu/.
Affiliation(s)
| | - Bambarendage P U Perera
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
| | - Rachel K Morgan
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
| | - Kimberley Sala-Hamrick
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
| | - Viviana Geron
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
| | - Laurie K Svoboda
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
- Department of Pharmacology, School of Medicine, University of Michigan, 1150 W. Medical Center Drive, Ann Arbor, MI 48109, USA
| | - Christopher Faulk
- Department of Animal Science, College of Food, Agricultural and Natural Resource Sciences, University of Minnesota, 1988 Fitch Avenue, Saint Paul, MN 55108, USA
| | - Dana C Dolinoy
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
- Department of Nutritional Sciences, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
- Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
| | - Maureen A Sartor
- Department of Computational Medicine and Bioinformatics, School of Medicine, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Department of Biostatistics, School of Public Health, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, USA
|
320
|
Tadaka S, Kawashima J, Hishinuma E, Saito S, Okamura Y, Otsuki A, Kojima K, Komaki S, Aoki Y, Kanno T, Saigusa D, Inoue J, Shirota M, Takayama J, Katsuoka F, Shimizu A, Tamiya G, Shimizu R, Hiratsuka M, Motoike I, Koshiba S, Sasaki M, Yamamoto M, Kinoshita K. jMorp: Japanese Multi-Omics Reference Panel update report 2023. Nucleic Acids Res 2024; 52:D622-D632. [PMID: 37930845 PMCID: PMC10767895 DOI: 10.1093/nar/gkad978] [Citation(s) in RCA: 45] [Impact Index Per Article: 45.0] [Received: 09/13/2023] [Revised: 10/06/2023] [Accepted: 10/17/2023] [Indexed: 11/08/2023] Open
Abstract
Modern medicine is increasingly focused on personalized medicine, and multi-omics data are crucial in understanding biological phenomena and disease mechanisms. Each ethnic group has its unique genetic background with specific genomic variations influencing disease risk and drug response. Therefore, multi-omics data from specific ethnic populations are essential for the effective implementation of personalized medicine. Various prospective cohort studies, such as the UK Biobank, All of Us and Lifelines, have been conducted worldwide. The Tohoku Medical Megabank project was initiated after the Great East Japan Earthquake in 2011. It collects biological specimens and conducts genome and omics analyses to build a basis for personalized medicine. Summary statistical data from these analyses are available in the jMorp web database (https://jmorp.megabank.tohoku.ac.jp), which provides a multidimensional approach to the diversity of the Japanese population. jMorp was launched in 2015 as a public database for plasma metabolome and proteome analyses and has been continuously updated. The current update will significantly expand the scale of the data (metabolome, genome, transcriptome, and metagenome). In addition, the user interface and backend server implementations were rewritten to improve the connectivity between the items stored in jMorp. This paper provides an overview of the new version of jMorp.
Affiliation(s)
- Shu Tadaka
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Junko Kawashima
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Eiji Hishinuma
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Sakae Saito
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Yasunobu Okamura
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Akihito Otsuki
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Graduate School of Medicine, Tohoku University, Sendai, Miyagi 980-8575, Japan
| | - Kaname Kojima
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Shohei Komaki
- Iwate Tohoku Medical Megabank Organization, Iwate Medical University, Shiwa-gun, Iwate 028-3609, Japan
| | - Yuichi Aoki
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi 980-8579, Japan
| | - Takanari Kanno
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Daisuke Saigusa
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Faculty of Pharma-Science, Teikyo University, Tokyo 173-8605, Japan
| | - Jin Inoue
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
| | - Matsuyuki Shirota
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Graduate School of Medicine, Tohoku University, Sendai, Miyagi 980-8575, Japan
| | - Jun Takayama
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Graduate School of Medicine, Tohoku University, Sendai, Miyagi 980-8575, Japan
- RIKEN Center for Advanced Intelligence Project, Tokyo 103-0027, Japan
- Fumiki Katsuoka
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Atsushi Shimizu
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Iwate Tohoku Medical Megabank Organization, Iwate Medical University, Shiwa-gun, Iwate 028-3609, Japan
- Gen Tamiya
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Graduate School of Medicine, Tohoku University, Sendai, Miyagi 980-8575, Japan
  - RIKEN Center for Advanced Intelligence Project, Tokyo 103-0027, Japan
- Ritsuko Shimizu
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Graduate School of Medicine, Tohoku University, Sendai, Miyagi 980-8575, Japan
- Masahiro Hiratsuka
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Graduate School of Pharmaceutical Sciences, Tohoku University, Sendai, Miyagi 980-8578, Japan
- Ikuko N Motoike
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi 980-8579, Japan
- Seizo Koshiba
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Makoto Sasaki
  - Iwate Tohoku Medical Megabank Organization, Iwate Medical University, Shiwa-gun, Iwate 028-3609, Japan
- Masayuki Yamamoto
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
- Kengo Kinoshita
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, Sendai, Miyagi 980-8573, Japan
  - Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi 980-8579, Japan
321
Liang Y, Yuan Q, Zheng Q, Mei Z, Song Y, Yan H, Yang J, Wu S, Yuan J, Wu W. DNA Damage Atlas: an atlas of DNA damage and repair. Nucleic Acids Res 2024; 52:D1218-D1226. [PMID: 37831087 PMCID: PMC10767978 DOI: 10.1093/nar/gkad845] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/12/2023] [Revised: 09/06/2023] [Accepted: 09/21/2023] [Indexed: 10/14/2023] Open
Abstract
DNA damage and its improper repair are the major source of genomic alterations responsible for many human diseases, particularly cancer. To aid researchers in understanding the underlying mechanisms of genome instability, a number of genome-wide profiling approaches have been developed to monitor DNA damage and repair events. The rapid accumulation of published datasets underscores the critical necessity of a comprehensive database to curate sequencing data on DNA damage and repair intermediates. Here, we present DNA Damage Atlas (DDA, http://www.bioinformaticspa.com/DDA/), the first large-scale repository of DNA damage and repair information. Currently, DDA comprises 6,030 samples from 262 datasets by 59 technologies, covering 16 species, 10 types of damage and 135 treatments. Data collected in DDA were processed through a standardized workflow, including quality checks, hotspot identification and a series of feature characterizations for the hotspots. Notably, DDA encompasses analyses of highly repetitive regions, ribosomal DNA and telomeres. DDA offers a user-friendly interface that facilitates browsing, searching, genome browser visualization, hotspot comparison and data downloading, enabling convenient and thorough exploration of datasets of interest. In summary, DDA will stand as a valuable resource for research in genome instability and its association with diseases.
Affiliation(s)
- Yu Liang
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Qingqing Yuan
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Qijie Zheng
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Zilv Mei
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Yawei Song
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Huan Yan
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Jiajie Yang
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Shuheng Wu
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Jiao Yuan
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Wei Wu
  - State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
322
Lu K, Pan Y, Shen J, Yang L, Zhan C, Liang S, Tai S, Wan L, Li T, Cheng T, Ma B, Pan G, He N, Lu C, Westhof E, Xiang Z, Han MJ, Tong X, Dai F. SilkMeta: a comprehensive platform for sharing and exploiting pan-genomic and multi-omic silkworm data. Nucleic Acids Res 2024; 52:D1024-D1032. [PMID: 37941143 PMCID: PMC10767832 DOI: 10.1093/nar/gkad956] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/27/2023] [Revised: 10/03/2023] [Accepted: 10/13/2023] [Indexed: 11/10/2023] Open
Abstract
The silkworm Bombyx mori is a domesticated insect that serves as an animal model for research and agriculture. The silkworm super-pan-genome dataset, which we published last year, is a unique resource for the study of global genomic diversity and phenotype-genotype association. Here we present SilkMeta (http://silkmeta.org.cn), a comprehensive database covering the available silkworm pan-genome and multi-omics data. The database contains 1082 short-read genomes, 546 long-read assembled genomes, 1168 transcriptomes, 294 phenotype characterizations (phenome), tens of millions of variations (variome), 7253 long non-coding RNAs (lncRNAs), 18 717 full-length transcripts and a set of population statistics. We have compiled publications on functional genomics research and genetic stock deciphering (mutant map). A range of bioinformatics tools is also provided for data visualization and retrieval. These omics data and tools are integrated into twelve functional modules that provide useful strategies and data for comparative and functional genomics research. The interactive bioinformatics platform SilkMeta will benefit not only the silkworm community but also the wider insect biology community.
Affiliation(s)
- Kunpeng Lu
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
  - Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Yifei Pan
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Jianghong Shen
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Lin Yang
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Chengyu Zhan
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Shubo Liang
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Linrong Wan
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Tian Li
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Tingcai Cheng
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Bi Ma
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Guoqing Pan
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Ningjia He
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Cheng Lu
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Eric Westhof
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
  - Architecture et Réactivité de l’ARN, Institut de Biologie Moléculaire et Cellulaire, UPR9002 CNRS, Université de Strasbourg, Strasbourg 67084, France
- Zhonghuai Xiang
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
- Min-Jin Han
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
  - Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Xiaoling Tong
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
  - Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
- Fangyin Dai
  - State Key Laboratory of Resource Insects, Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
  - Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
323
Kang H, Huang T, Duan G, Meng Y, Chen X, He S, Xia Z, Zhou X, Chao J, Tang B, Wang Z, Zhu J, Du Z, Sun Y, Zhang S, Xiao J, Tian W, Wang W, Zhao W. TCOD: an integrated resource for tropical crops. Nucleic Acids Res 2024; 52:D1651-D1660. [PMID: 37843152 PMCID: PMC10767838 DOI: 10.1093/nar/gkad870] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/15/2023] [Revised: 09/25/2023] [Accepted: 09/29/2023] [Indexed: 10/17/2023] Open
Abstract
Tropical crops are vital for tropical agriculture, with resource scarcity, functional diversity and extensive market demand, providing considerable economic benefits for the world's tropical agriculture-producing countries. The rapid development of sequencing technology has marked a milestone in tropical crop research, generating massive amounts of data that urgently need an effective platform for integration and sharing. However, existing databases cannot fully satisfy researchers' requirements because of their relatively limited integration and infrequent updates. Here, we present the Tropical Crop Omics Database (TCOD, https://ngdc.cncb.ac.cn/tcod), a comprehensive multi-omics data platform for tropical crops. TCOD integrates diverse omics data from 15 species, encompassing 34 chromosome-level de novo assemblies, 1 255 004 genes with functional annotations, 282 436 992 unique variants from 2048 WGS samples, 88 transcriptomic profiles from 1997 RNA-Seq samples and 13 381 germplasm items. Additionally, TCOD not only employs genes as a bridge to interconnect multi-omics data, enabling cross-species comparisons based on homology relationships, but also offers user-friendly online tools for efficient data mining and visualization. In short, TCOD integrates multi-species, multi-omics data and online tools, which will facilitate research on genomic selective breeding and trait biology of tropical crops.
Affiliation(s)
- Hailong Kang
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Tianhao Huang
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Guangya Duan
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yuyan Meng
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaoning Chen
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Shuang He
  - Sanya Nanfan Research Institute, Hainan University, Sanya 572025, China
- Zhiqiang Xia
  - Sanya Nanfan Research Institute, Hainan University, Sanya 572025, China
- Xincheng Zhou
  - Institute of Tropical Biosciences and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Jinquan Chao
  - Rubber Research Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Bixia Tang
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Zhonghuang Wang
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Junwei Zhu
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Zhenglin Du
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Yanlin Sun
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Sisi Zhang
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Jingfa Xiao
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Weimin Tian
  - Rubber Research Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Wenquan Wang
  - Sanya Nanfan Research Institute, Hainan University, Sanya 572025, China
- Wenming Zhao
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
324
Gu X, Wang M, Zhang XO. TE-TSS: an integrated data resource of human and mouse transposable element (TE)-derived transcription start site (TSS). Nucleic Acids Res 2024; 52:D322-D333. [PMID: 37956335 PMCID: PMC10767810 DOI: 10.1093/nar/gkad1048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 10/21/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023] Open
Abstract
Transposable elements (TEs) are abundant in the genome and serve as crucial regulatory elements. Some TEs function as epigenetically regulated promoters, and these TE-derived transcription start sites (TSSs) play a crucial role in regulating genes associated with specific functions, such as cancer and embryogenesis. However, the lack of an accessible database that systematically gathers TE-derived TSS data is a current research gap. To address this, we established TE-TSS, an integrated data resource of human and mouse TE-derived TSSs (http://xozhanglab.com/TETSS). TE-TSS has compiled 2681 RNA sequencing datasets, spanning various tissues, cell lines and developmental stages. From these, we identified 5768 human TE-derived TSSs and 2797 mouse TE-derived TSSs, with 47% and 38% being experimentally validated, respectively. TE-TSS enables comprehensive exploration of TSS usage in diverse samples, providing insights into tissue-specific gene expression patterns and transcriptional regulatory elements. Furthermore, TE-TSS compares TE-derived TSS regions across 15 mammalian species, enhancing our understanding of their evolutionary and functional aspects. The establishment of TE-TSS facilitates further investigations into the roles of TEs in shaping the transcriptomic landscape and offers valuable resources for comprehending their involvement in diverse biological processes.
Affiliation(s)
- Xiaobing Gu
  - Shanghai Key Laboratory of Maternal and Fetal Medicine, Clinical and Translational Research Center of Shanghai First Maternity and Infant Hospital, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Mingdong Wang
  - Shanghai Key Laboratory of Maternal and Fetal Medicine, Clinical and Translational Research Center of Shanghai First Maternity and Infant Hospital, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Xiao-Ou Zhang
  - Shanghai Key Laboratory of Maternal and Fetal Medicine, Clinical and Translational Research Center of Shanghai First Maternity and Infant Hospital, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
325
Lavikka K, Oikkonen J, Li Y, Muranen T, Micoli G, Marchi G, Lahtinen A, Huhtinen K, Lehtonen R, Hietanen S, Hynninen J, Virtanen A, Hautaniemi S. Deciphering cancer genomes with GenomeSpy: a grammar-based visualization toolkit. Gigascience 2024; 13:giae040. [PMID: 39101783 PMCID: PMC11299109 DOI: 10.1093/gigascience/giae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 05/13/2024] [Accepted: 06/19/2024] [Indexed: 08/06/2024] Open
Abstract
BACKGROUND: Visualization is an indispensable facet of genomic data analysis. Despite the abundance of specialized visualization tools, there remains a distinct need for tailored solutions. However, their implementation typically requires extensive programming expertise from bioinformaticians and software developers, especially when building interactive applications. Toolkits based on visualization grammars offer a more accessible, declarative way to author new visualizations. Yet, current grammar-based solutions fall short in adequately supporting the interactive analysis of large datasets with extensive sample collections, a pivotal task often encountered in cancer research.
FINDINGS: We present GenomeSpy, a grammar-based toolkit for authoring tailored, interactive visualizations for genomic data analysis. By using combinatorial building blocks and a declarative language, users can implement new visualization designs easily and embed them in web pages or end-user-oriented applications. A distinctive element of GenomeSpy's architecture is its effective use of the graphics processing unit in all rendering, enabling a high frame rate and smoothly animated interactions, such as navigation within a genome. We demonstrate the utility of GenomeSpy by characterizing the genomic landscape of 753 ovarian cancer samples from patients in the DECIDER clinical trial. Our results expand the understanding of the genomic architecture in ovarian cancer, particularly the diversity of chromosomal instability.
CONCLUSIONS: GenomeSpy is a visualization toolkit applicable to a wide range of tasks pertinent to genome analysis. It offers high flexibility and exceptional performance in interactive analysis. The toolkit is open source with an MIT license, implemented in JavaScript, and available at https://genomespy.app/.
Affiliation(s)
- Kari Lavikka
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Jaana Oikkonen
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Yilin Li
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Taru Muranen
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Giulia Micoli
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Giovanni Marchi
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Alexandra Lahtinen
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Kaisa Huhtinen
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
  - Cancer Research Unit, Institute of Biomedicine and FICAN West Cancer Centre, University of Turku, 20521 Turku, Finland
- Rainer Lehtonen
  - Applied Tumor Genomics Research Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
- Sakari Hietanen
  - Department of Obstetrics and Gynecology, University of Turku and Turku University Hospital, 20521 Turku, Finland
- Johanna Hynninen
  - Department of Obstetrics and Gynecology, University of Turku and Turku University Hospital, 20521 Turku, Finland
- Anni Virtanen
  - Department of Pathology, University of Helsinki and HUS Diagnostic Center, Helsinki University Hospital, 00260 Helsinki, Finland
- Sampsa Hautaniemi
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
326
Harb OS, McDowell MA, Roos DS. VEuPathDB Resources: A Platform for Free Online Data Exploration, Integration, and Analysis. Methods Mol Biol 2024; 2802:573-586. [PMID: 38819572 DOI: 10.1007/978-1-0716-3838-5_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
The Eukaryotic Pathogen, Vector and Host Informatics Resources (VEuPathDB.org) provide free online access to omic data from eukaryotic protozoan and fungal pathogens, arthropod vectors of disease, and host responses to pathogen infection. The goal of VEuPathDB is to make data easily accessible, findable, and importantly, re-usable by laboratory scientists. All integrated data and analyses follow standard workflows and methods to ensure data accuracy and enable data interoperability. Integrated data include genomes and annotation, transcriptomic (e.g., single-cell/bulk RNA-sequence and microarray data), proteomic (e.g., mass spectrometry evidence and quantitative data), isolate sequencing data used for variant calling and copy number variation determination, epigenomics, whole-genome phenotyping data (e.g., CRISPR screens and large-scale imaging and subcellular localization data), etc. Standard analyses provide additional data such as InterPro domains, signal peptide and transmembrane domain predictions, and metabolic pathways. Comparative genomic analysis in VEuPathDB is facilitated by leveraging orthology to enable the transformation of results between organisms and the identification of genes with specific phyletic patterns. In addition, synteny between genomes is facilitated by shading orthologs across species and strains. Accessibility to and re-usability of the data are made possible through specialized searches and a graphical search strategy system that enables scientists to build in silico experiments combining results from multiple experiments with diverse data types.
Affiliation(s)
- Omar S Harb
  - University of Pennsylvania, Philadelphia, PA, USA
- David S Roos
  - University of Pennsylvania, Philadelphia, PA, USA
327
Sharma SP, Purcell CM, Hyde JR, Severin AJ. Spirochaete genome identified in red abalone sample represents a novel genus Candidatus Haliotispira gen. nov. within the order Spirochaetales. Int J Syst Evol Microbiol 2024; 74. [PMID: 38179990 DOI: 10.1099/ijsem.0.006198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024] Open
Abstract
A fully assembled spirochaete genome was identified as a contaminating scaffold in our red abalone (Haliotis rufescens) genome assembly. In this paper, we describe the analysis of this bacterial genome. The assembled spirochaete genome is 3.25 Mb in size with 48.5 mol% G+C content. The proteomes of 38 species were compared with the spirochaete genome, which was found to form an independent branch within the family Spirochaetaceae on the phylogenetic tree. The comparison of 16S rRNA sequences and average nucleotide identity scores between the spirochaete genome and known species of different families in Spirochaetia indicates that it is an unknown species. Further, the percentage of conserved proteins compared to neighbouring taxa confirms that it does not belong to a known genus within Spirochaetaceae. We propose the name Candidatus Haliotispira prima gen. nov., sp. nov. based on its taxonomic placement and origin. We also tested for the presence of this species in different species of abalone and found that it is also present in white abalone (Haliotis sorenseni). In addition, we highlight the need for better classification of taxa within the class Spirochaetia.
Affiliation(s)
- Catherine M Purcell
  - NOAA Fisheries Southwest Fisheries Science Center, La Jolla, California, USA
- John R Hyde
  - NOAA Fisheries Southwest Fisheries Science Center, La Jolla, California, USA
- Andrew J Severin
  - Genome Informatics Facility, Iowa State University, Ames, Iowa, USA
328
Pogoda CS, Keepers KG, Reinert S, Talukder ZI, Smart BC, Attia Z, Corwin JA, Money KL, Collier-Zans ECE, Underwood W, Gulya TJ, Quandt CA, Kane NC, Hulke BS. Heritable differences in abundance of bacterial rhizosphere taxa are correlated with fungal necrotrophic pathogen resistance. Mol Ecol 2024; 33:e17218. [PMID: 38038696 DOI: 10.1111/mec.17218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2020] [Revised: 11/09/2023] [Accepted: 11/13/2023] [Indexed: 12/02/2023]
Abstract
Host-microbe interactions are increasingly recognized as important drivers of organismal health, growth, longevity and community-scale ecological processes. However, less is known about how genetic variation affects hosts' associated microbiomes and downstream phenotypes. We demonstrate that sunflower (Helianthus annuus) harbours substantial, heritable variation in microbial communities under field conditions. We show that microbial communities co-vary with heritable variation in resistance to root infection caused by the necrotrophic pathogen Sclerotinia sclerotiorum and that plants grown in autoclaved soil showed almost complete elimination of pathogen resistance. Association mapping suggests at least 59 genetic locations with effects on both microbial relative abundance and Sclerotinia resistance. Although the genetic architecture appears quantitative, we have elucidated previously unexplained genetic variation for resistance to this pathogen. We identify new targets for plant breeding and demonstrate the potential for heritable microbial associations to play important roles in defence in natural and human-altered environments.
Affiliation(s)
- Cloe S Pogoda
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- Kyle G Keepers
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- Stephan Reinert
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- Zahirul I Talukder
  - USDA-ARS Sunflower and Plant Biology Research Unit, Edward T Schafer Agricultural Research Center, Fargo, North Dakota, USA
  - Department of Plant Sciences, North Dakota State University, Fargo, North Dakota, USA
- Brian C Smart
  - Department of Plant Sciences, North Dakota State University, Fargo, North Dakota, USA
- Ziv Attia
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- Jason A Corwin
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- Kennedy L Money
  - Department of Plant Sciences, North Dakota State University, Fargo, North Dakota, USA
- Erin C E Collier-Zans
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- William Underwood
  - USDA-ARS Sunflower and Plant Biology Research Unit, Edward T Schafer Agricultural Research Center, Fargo, North Dakota, USA
- Thomas J Gulya
  - USDA-ARS Sunflower and Plant Biology Research Unit, Edward T Schafer Agricultural Research Center, Fargo, North Dakota, USA
- C Alisha Quandt
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- Nolan C Kane
  - Ecology and Evolutionary Biology Department, University of Colorado, Boulder, Colorado, USA
- Brent S Hulke
  - USDA-ARS Sunflower and Plant Biology Research Unit, Edward T Schafer Agricultural Research Center, Fargo, North Dakota, USA
|
329
|
Li D, Zhuo X, Harrison JK, Liu S, Wang T. Modbed track: Visualization of modified bases in single-molecule sequencing. CELL GENOMICS 2023; 3:100455. [PMID: 38116122 PMCID: PMC10726485 DOI: 10.1016/j.xgen.2023.100455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 09/28/2023] [Accepted: 11/04/2023] [Indexed: 12/21/2023]
Abstract
Recent advances in long-read sequencing technologies have not only dramatically increased sequencing read length but also have improved the accuracy of detecting chemical modifications to the canonical nucleotide bases, thus opening exciting avenues to investigate the epigenome. Currently, the ability to visualize modified bases from long-read sequencing data in genome browsers is still limited, preventing users from easily and fully exploring these types of data. To address this limitation, the WashU Epigenome Browser introduces the modbed track type, which provides visualization of modification details in each single read as well as aggregated modifications of individual or multiple molecules across a dynamic range of resolutions. The modbed file can be uploaded for visualization as a local track or viewed with an accessible URL freely on the WashU Epigenome Browser at https://epigenomegateway.wustl.edu/.
Affiliation(s)
- Daofeng Li
- Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St. Louis, MO, USA.
| | - Xiaoyu Zhuo
- Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Jessica K Harrison
- Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St. Louis, MO, USA
| | - Shane Liu
- Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St. Louis, MO, USA; Computer Science and Engineering Division, University of Michigan, Ann Arbor, MI, USA
| | - Ting Wang
- Department of Genetics, The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, St. Louis, MO, USA; McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA.
|
330
|
Younuskunju S, Mohamoud YA, Mathew LS, Mayer KFX, Suhre K, Malek JA. Genome-wide association of dry (Tamar) date palm fruit color. THE PLANT GENOME 2023; 16:e20373. [PMID: 37621134 DOI: 10.1002/tpg2.20373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Revised: 06/05/2023] [Accepted: 07/03/2023] [Indexed: 08/26/2023]
Abstract
Date palm (Phoenix dactylifera) fruit (dates) are an economically and culturally significant crop in the Middle East and North Africa. There are hundreds of different commercial cultivars producing dates with distinctive shapes, colors, and sizes. Genetic studies of some date palm traits have been performed, including sex determination, sugar content, and fresh fruit color. In this study, we used genome sequences and image data of 199 dry dates (Tamar) collected from 14 countries to identify genetic loci associated with the color of this fruit stage. Here, we find loci across multiple linkage groups (LG) associated with dry fruit color phenotype. We recover both the previously identified VIRESCENS (VIR) genotype associated with fresh fruit yellow or red color and new associations with the lightness and darkness of dry fruit. This study will add resolution to our understanding of date color phenotype, especially at the most commercially important Tamar stage.
Affiliation(s)
- Shameem Younuskunju
- Genomics Laboratory, Weill Cornell Medicine-Qatar, Doha, Qatar
- School of Life Sciences, Technical University of Munich, Munich, Germany
| | | | - Lisa S Mathew
- Clinical Genomics Laboratory, Sidra Medicine, Doha, Qatar
| | - Klaus F X Mayer
- School of Life Sciences, Technical University of Munich, Munich, Germany
- Plant Genome and Systems Biology, Helmholtz Center Munich, Munich, Germany
| | - Karsten Suhre
- Department of Physiology, Weill Cornell Medicine-Qatar, Doha, Qatar
| | - Joel A Malek
- Genomics Laboratory, Weill Cornell Medicine-Qatar, Doha, Qatar
- Department of Genetic Medicine, Weill Cornell Medicine-Qatar, Doha, Qatar
|
331
|
Sánchez-Adriá IE, Sanmartín G, Prieto JA, Estruch F, Fortis E, Randez-Gil F. Adaptive laboratory evolution for acetic acid-tolerance matches sourdough challenges with yeast phenotypes. Microbiol Res 2023; 277:127487. [PMID: 37713908 DOI: 10.1016/j.micres.2023.127487] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/28/2023] [Revised: 09/05/2023] [Accepted: 09/06/2023] [Indexed: 09/17/2023]
Abstract
Acetic acid tolerance of Saccharomyces cerevisiae is an important trait in sourdough fermentation processes, where the accumulation of acid by the growth of lactic acid bacteria reduces the yeast metabolic activity. In this work, we have carried out adaptive laboratory evolution (ALE) experiments in two sourdough isolates of S. cerevisiae exposed to acetic acid, or alternatively to acetic acid and myriocin, an inhibitor of sphingolipid biosynthesis that sped up the evolutionary adaptation. Evolution approaches resulted in acetic tolerance and, surprisingly, increased lactic susceptibility. Four evolved clones, one from each parental strain and evolutionary scheme, were selected on the basis of their potential for CO2 production in sourdough conditions. Among them, two showed phenotypic instability characterized by strong lactic sensitivity after several rounds of growth under unstressed conditions, while two others displayed increased constitutive acetic tolerance with no loss of growth in lactic medium. Genome sequencing and ploidy level analysis of all strains revealed aneuploidies, which could account for phenotypic heterogeneity. In addition, copy number variations (CNVs), especially affecting genes involved in ion transport or flocculation, and single nucleotide polymorphisms (SNPs) were identified. Mutations in several genes, ARG82, KEX1, CTK1, SPT20, IRA2, ASG1 or GIS4, were confirmed as involved in acetic and/or lactic tolerance, and new determinants of these phenotypes, MSN5 and PSP2, were identified.
Affiliation(s)
- Isabel E Sánchez-Adriá
- Department of Biotechnology, Instituto de Agroquímica y Tecnología de los Alimentos, Consejo Superior de Investigaciones Científicas, Avda. Agustín Escardino, 7, Paterna, 46980 Valencia, Spain
| | - Gemma Sanmartín
- Department of Biotechnology, Instituto de Agroquímica y Tecnología de los Alimentos, Consejo Superior de Investigaciones Científicas, Avda. Agustín Escardino, 7, Paterna, 46980 Valencia, Spain
| | - Jose A Prieto
- Department of Biotechnology, Instituto de Agroquímica y Tecnología de los Alimentos, Consejo Superior de Investigaciones Científicas, Avda. Agustín Escardino, 7, Paterna, 46980 Valencia, Spain
| | - Francisco Estruch
- Department of Biochemistry and Molecular Biology, Universitat de València, Dr. Moliner 50, 46100 Burjassot, Spain
| | - Estefanía Fortis
- Cereal (Center for Research Europastry Advanced Lab), Europastry S.A., Marie Curie, 6, Sant Joan Despí, 08970 Barcelona, Spain
| | - Francisca Randez-Gil
- Department of Biotechnology, Instituto de Agroquímica y Tecnología de los Alimentos, Consejo Superior de Investigaciones Científicas, Avda. Agustín Escardino, 7, Paterna, 46980 Valencia, Spain.
|
332
|
Poretsky E, Andorf CM, Sen TZ. PhosBoost: Improved phosphorylation prediction recall using gradient boosting and protein language models. PLANT DIRECT 2023; 7:e554. [PMID: 38124705 PMCID: PMC10732782 DOI: 10.1002/pld3.554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 11/20/2023] [Accepted: 11/26/2023] [Indexed: 12/23/2023]
Abstract
Protein phosphorylation is a dynamic and reversible post-translational modification that regulates a variety of essential biological processes. The regulatory role of phosphorylation in cellular signaling pathways, protein-protein interactions, and enzymatic activities has motivated extensive research efforts to understand its functional implications. Experimental protein phosphorylation data in plants remains limited to a few species, necessitating a scalable and accurate prediction method. Here, we present PhosBoost, a machine-learning approach that leverages protein language models and gradient-boosting trees to predict protein phosphorylation from experimentally derived data. Trained on data obtained from a comprehensive plant phosphorylation database, qPTMplants, we compared the performance of PhosBoost to existing protein phosphorylation prediction methods, PhosphoLingo and DeepPhos. For serine and threonine prediction, PhosBoost achieved higher recall than PhosphoLingo and DeepPhos (.78, .56, and .14, respectively) while maintaining a competitive area under the precision-recall curve (.54, .56, and .42, respectively). PhosphoLingo and DeepPhos failed to predict any tyrosine phosphorylation sites, while PhosBoost achieved a recall score of .6. Despite the precision-recall tradeoff, PhosBoost offers improved performance when recall is prioritized while consistently providing more confident probability scores. A sequence-based pairwise alignment step improved prediction results for all classifiers by effectively increasing the number of inferred positive phosphosites. We provide evidence to show that PhosBoost models are transferable across species and scalable for genome-wide protein phosphorylation predictions. PhosBoost is freely and publicly available on GitHub.
Affiliation(s)
- Elly Poretsky
- Agricultural Research Service, Crop Improvement and Genetics Research Unit, U.S. Department of Agriculture, Albany, CA, United States
| | - Carson M. Andorf
- Agricultural Research Service, Corn Insects and Crop Genetics Research, U.S. Department of Agriculture, Ames, IA, United States
- Department of Computer Science, Iowa State University, Ames, IA, United States
| | - Taner Z. Sen
- Agricultural Research Service, Crop Improvement and Genetics Research Unit, U.S. Department of Agriculture, Albany, CA, United States
- Department of Bioengineering, University of California, Berkeley, CA, United States
|
333
|
L'Yi S, Maziec D, Stevens V, Manz T, Veit A, Berselli M, Park PJ, Głodzik D, Gehlenborg N. Chromoscope: interactive multiscale visualization for structural variation in human genomes. Nat Methods 2023; 20:1834-1835. [PMID: 37914857 PMCID: PMC10883074 DOI: 10.1038/s41592-023-02056-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Affiliation(s)
- Sehi L'Yi
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Dominika Maziec
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Victoria Stevens
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Trevor Manz
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Alexander Veit
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Michele Berselli
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| | - Dominik Głodzik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| | - Nils Gehlenborg
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
|
334
|
Rangwala SH, Rudnev DV, Ananiev VV, Asztalos A, Benica B, Borodin EA, Bouk N, Evgeniev VI, Kodali VK, Lotov V, Mozes E, Oh DH, Omelchenko MV, Savkina S, Sukharnikov E, Virothaisakun J, Murphy TD, Pruitt KD, Schneider VA. Interactive visualization of whole eukaryote genome alignments using NCBI's Comparative Genome Viewer (CGV). BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.30.564672. [PMID: 38077029 PMCID: PMC10705539 DOI: 10.1101/2023.10.30.564672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
We report a new visualization tool for analysis of whole genome assembly-assembly alignments, the Comparative Genome Viewer (CGV) (https://ncbi.nlm.nih.gov/genome/cgv/). CGV visualizes pairwise same-species and cross-species alignments provided by NCBI using assembly alignment algorithms developed by us and others. Researchers can examine the alignments between the two assemblies using two alternate views: a chromosome ideogram-based view or a 2D genome dotplot. Whole genome alignment views expose large structural differences spanning chromosomes, such as inversions or translocations. Users can also navigate to regions of interest, where they can detect and analyze smaller-scale deletions and rearrangements within specific chromosome or gene regions. RefSeq or user-provided gene annotation is displayed in the ideogram view where available. CGV currently provides approximately 700 alignments from over 300 animal, plant, and fungal species. CGV and related NCBI viewers are undergoing active development to further meet the needs of the research community in comparative genome visualization.
Affiliation(s)
- Sanjida H Rangwala
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Dmitry V Rudnev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Victor V Ananiev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Andrea Asztalos
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Barrett Benica
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Evgeny A Borodin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Nathan Bouk
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Vladislav I Evgeniev
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Vamsi K Kodali
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Vadim Lotov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Eyal Mozes
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Dong-Ha Oh
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Marina V Omelchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Sofya Savkina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Ekaterina Sukharnikov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Joël Virothaisakun
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Terence D. Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
| | - Valerie A. Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD 20894, USA
|
335
|
Sharma PK, Ahmed HI, Heuberger M, Koo DH, Quiroz-Chavez J, Adhikari L, Raupp J, Cauet S, Rodde N, Cravero C, Callot C, Yadav IS, Kathiresan N, Athiyannan N, Ramirez-Gonzalez RH, Uauy C, Wicker T, Abrouk M, Gu YQ, Poland J, Krattinger SG, Lazo GR, Tiwari VK. An online database for einkorn wheat to aid in gene discovery and functional genomics studies. Database (Oxford) 2023; 2023:baad079. [PMID: 37971714 PMCID: PMC10653128 DOI: 10.1093/database/baad079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 10/02/2023] [Accepted: 10/26/2023] [Indexed: 11/19/2023]
Abstract
Diploid A-genome wheat (einkorn wheat) presents a nutrition-rich option as an ancient grain crop and a resource for the improvement of bread wheat against abiotic and biotic stresses. Realizing the importance of this wheat species, reference-level assemblies of two einkorn wheat accessions were generated (wild and domesticated). This work reports an einkorn genome database that provides an interface to the cereals research community to perform comparative genomics, applied genetics and breeding research. It features queries for annotated genes, the use of a recent genome browser release, and the ability to search for sequence alignments using a modern BLAST interface. Other features include a comparison of reference einkorn assemblies with other wheat cultivars through genomic synteny visualization and an alignment visualization tool for BLAST results. Altogether, this resource will help wheat research and breeding. Database URL https://wheat.pw.usda.gov/GG3/pangenome.
Affiliation(s)
- Parva Kumar Sharma
- Department of Plant Science and Landscape Architecture, University of Maryland, Fieldhouse Dr. College Park, MD 20742, USA
| | - Hanin Ibrahim Ahmed
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955-6900, Saudi Arabia
- Center for Desert Agriculture, KAUST, 4700 KAUST, Thuwal, Kingdom of Saudi Arabia 23955-6900, Saudi Arabia
| | - Matthias Heuberger
- Department of Plant and Microbial Biology, University of Zurich, 107, Zurich, Zollikerstrasse CH-8008, Switzerland
| | - Dal-Hoe Koo
- Wheat Genetics Resource Center and Department of Plant Pathology, Kansas State University, 4024 Throckmorton, 1712 Claflin Road, Manhattan, KS 66506, USA
| | - Jesus Quiroz-Chavez
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Laxman Adhikari
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955-6900, Saudi Arabia
- Center for Desert Agriculture, KAUST, 4700 KAUST, Thuwal, Kingdom of Saudi Arabia 23955-6900, Saudi Arabia
| | - John Raupp
- Wheat Genetics Resource Center and Department of Plant Pathology, Kansas State University, 4024 Throckmorton, 1712 Claflin Road, Manhattan, KS 66506, USA
| | - Stéphane Cauet
- INRAE, CNRGV French Plant Genomic Resource Center, 24 Chemin de Borde Rouge, Castanet Tolosan F-31320, France
| | - Nathalie Rodde
- INRAE, CNRGV French Plant Genomic Resource Center, 24 Chemin de Borde Rouge, Castanet Tolosan F-31320, France
| | - Charlotte Cravero
- INRAE, CNRGV French Plant Genomic Resource Center, 24 Chemin de Borde Rouge, Castanet Tolosan F-31320, France
| | - Caroline Callot
- INRAE, CNRGV French Plant Genomic Resource Center, 24 Chemin de Borde Rouge, Castanet Tolosan F-31320, France
| | - Inderjit Singh Yadav
- Department of Plant Science and Landscape Architecture, University of Maryland, Fieldhouse Dr. College Park, MD 20742, USA
| | - Nagarajan Kathiresan
- Supercomputing Core Lab, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955-6900, Saudi Arabia
| | - Naveenkumar Athiyannan
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955-6900, Saudi Arabia
- Center for Desert Agriculture, KAUST, 4700 KAUST, Thuwal, Kingdom of Saudi Arabia 23955-6900, Saudi Arabia
| | | | - Cristobal Uauy
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Thomas Wicker
- Department of Plant and Microbial Biology, University of Zurich, 107, Zurich, Zollikerstrasse CH-8008, Switzerland
| | - Michael Abrouk
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955-6900, Saudi Arabia
- Center for Desert Agriculture, KAUST, 4700 KAUST, Thuwal, Kingdom of Saudi Arabia 23955-6900, Saudi Arabia
| | - Yong Q Gu
- United States Department of Agriculture—Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, 800 Buchanan St., Albany, CA 94710, USA
| | - Jesse Poland
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955-6900, Saudi Arabia
- Center for Desert Agriculture, KAUST, 4700 KAUST, Thuwal, Kingdom of Saudi Arabia 23955-6900, Saudi Arabia
| | - Simon G Krattinger
- Plant Science Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955-6900, Saudi Arabia
- Center for Desert Agriculture, KAUST, 4700 KAUST, Thuwal, Kingdom of Saudi Arabia 23955-6900, Saudi Arabia
| | - Gerard R Lazo
- United States Department of Agriculture—Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, 800 Buchanan St., Albany, CA 94710, USA
| | - Vijay K Tiwari
- Department of Plant Science and Landscape Architecture, University of Maryland, Fieldhouse Dr. College Park, MD 20742, USA
|
336
|
Böhne A, Thiel-Bender C, Kukowka S, Darwin Tree of Life Barcoding collective, Wellcome Sanger Institute Tree of Life programme, Wellcome Sanger Institute Scientific Operations: DNA Pipelines collective, Tree of Life Core Informatics collective, Darwin Tree of Life Consortium. The genome sequence of the hazel dormouse, Muscardinus avellanarius (Linnaeus, 1758). Wellcome Open Res 2023; 8:514. [PMID: 38911281 PMCID: PMC11190645 DOI: 10.12688/wellcomeopenres.20360.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/02/2023] [Indexed: 06/25/2024] Open
Abstract
We present a genome assembly from an individual male Muscardinus avellanarius (the hazel dormouse; Chordata; Mammalia; Rodentia; Gliridae). The genome sequence is 2,497.5 megabases in span. Most of the assembly is scaffolded into 24 chromosomal pseudomolecules, including the X and Y sex chromosomes. The mitochondrial genome has also been assembled and is 16.73 kilobases in length.
Affiliation(s)
- Astrid Böhne
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn, Bonn, Germany
| | - Christine Thiel-Bender
- Bund für Umwelt und Naturschutz Deutschland (BUND) Landesverband NRW e.V., Friends of the Earth Germany, Düsseldorf, Germany
| | - Sandra Kukowka
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn, Bonn, Germany
| | - Darwin Tree of Life Barcoding collective
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn, Bonn, Germany
- Bund für Umwelt und Naturschutz Deutschland (BUND) Landesverband NRW e.V., Friends of the Earth Germany, Düsseldorf, Germany
| | - Wellcome Sanger Institute Tree of Life programme
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn, Bonn, Germany
- Bund für Umwelt und Naturschutz Deutschland (BUND) Landesverband NRW e.V., Friends of the Earth Germany, Düsseldorf, Germany
| | | | - Tree of Life Core Informatics collective
- Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn, Bonn, Germany
- Bund für Umwelt und Naturschutz Deutschland (BUND) Landesverband NRW e.V., Friends of the Earth Germany, Düsseldorf, Germany
|
337
|
Broad GR, Crowley LM, McCulloch J, Natural History Museum Genome Acquisition Lab, University of Oxford and Wytham Woods Genome Acquisition Lab, Darwin Tree of Life Barcoding collective, Wellcome Sanger Institute Tree of Life programme, Wellcome Sanger Institute Scientific Operations: DNA Pipelines collective, Tree of Life Core Informatics collective, Darwin Tree of Life Consortium. The genome sequence of the Black Spongefly, Sisyra nigra (Retzius, 1783). Wellcome Open Res 2023; 8:511. [PMID: 38855724 PMCID: PMC11162525 DOI: 10.12688/wellcomeopenres.20295.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/02/2023] [Indexed: 06/11/2024] Open
Abstract
We present a genome assembly from an individual female Sisyra nigra (the Black Spongefly; Arthropoda; Insecta; Neuroptera; Sisyridae). The genome sequence is 372.6 megabases in span. Most of the assembly is scaffolded into 7 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 16.34 kilobases in length.
Affiliation(s)
| | - Liam M. Crowley
- Department of Biology, University of Oxford, Oxford, England, UK
| | - James McCulloch
- Department of Biology, University of Oxford, Oxford, England, UK
|
338
|
de Abreu CG, Roesch LFW, Andreote FD, Silva SR, de Moraes TSJ, Zied DC, de Siqueira FG, Dias ES, Varani AM, Pylro VS. Decoding the chromosome-scale genome of the nutrient-rich Agaricus subrufescens: a resource for fungal biology and biotechnology. Res Microbiol 2023; 174:104116. [PMID: 37573924 DOI: 10.1016/j.resmic.2023.104116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/31/2023] [Accepted: 08/07/2023] [Indexed: 08/15/2023]
Abstract
Agaricus subrufescens, also known as the "sun mushroom," has significant nutritional and medicinal value. However, its short shelf life due to the browning process results in post-harvest losses unless it's quickly dehydrated. This restricts its availability to consumers in the form of capsules. A genome sequence of A. subrufescens may lead to new cultivation alternatives or the application of gene editing strategies to delay the browning process. We assembled a chromosome-scale genome using a hybrid approach combining Illumina and Nanopore sequencing. The genome was assembled into 13 chromosomes and 31 unplaced scaffolds, totaling 44.5 Mb with 96.5% completeness and 47.24% GC content. 14,332 protein-coding genes were identified, with 64.6% of the genome covered by genes and 23.41% transposable elements. The mitogenome was circularized and encoded fourteen typical mitochondrial genes. Four polyphenol oxidase (PPO) genes and the Mating-type locus were identified. Phylogenomic analysis supports the placement of A. subrufescens in the Agaricomycetes clade. This is the first available genome sequence of a strain of the "sun mushroom." Results are available through a Genome Browser (https://plantgenomics.ncc.unesp.br/gen.php?id=Asub) and can support further fungal biological and genomic studies.
Affiliation(s)
| | | | - Fernando Dini Andreote
- Department of Soil Science, "Luiz de Queiroz" College of Agriculture, University of São Paulo, Piracicaba, SP, Brazil
| | - Saura Rodrigues Silva
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL, USA
| | | | - Diego Cunha Zied
- Department of Crop Production, School of Agricultural and Technological Sciences, São Paulo State University (UNESP), Dracena, São Paulo, Brazil
| | | | - Eustáquio Souza Dias
- Department of Biology, Federal University of Lavras - UFLA, Lavras, Minas Gerais, Brazil
| | - Alessandro M Varani
- UNESP - São Paulo State University, School of Agricultural and Veterinarian Sciences, Department of Agricultural and Environmental Biotechnology, Campus Jaboticabal, CEP 14884-900, SP, Brazil.
| | - Victor Satler Pylro
- Department of Biology, Federal University of Lavras - UFLA, Lavras, Minas Gerais, Brazil.
|
339
|
Luo H, Zhang P, Zhang W, Zheng Y, Hao D, Shi Y, Niu Y, Song T, Li Y, Zhao S, Chen H, Xu T, He S. Recent positive selection signatures reveal phenotypic evolution in the Han Chinese population. Sci Bull (Beijing) 2023; 68:2391-2404. [PMID: 37661541 DOI: 10.1016/j.scib.2023.08.027] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/07/2023] [Revised: 05/08/2023] [Accepted: 08/10/2023] [Indexed: 09/05/2023]
Abstract
Characterizing natural selection signatures and relationships with phenotype spectra is important for understanding human evolution and both biological and pathological mechanisms. Here, we identified 24 genetic loci under recent selection by analyzing rare singletons in 3946 high-depth whole-genome sequencing data of Han Chinese. The loci include immune-related gene regions (MHC cluster, IGH cluster, STING1, and PSG), alcohol metabolism-related gene regions (ADH1B, ALDH2, and ALDH3B2), and the olfactory perception gene OR4C16, in which the MHC cluster, ADH1B, and ALDH2 were also identified by TOPMed and WestLake Biobank. Among the signals, the IGH cluster is particularly interesting, in which the favored allele of variant 14_105737776_C_T (rs117518546, IgG1-G396R) promotes immune response, but also increases the risk of an autoimmune disease systemic lupus erythematosus (SLE). It is also surprising that our newly discovered ALDH3B2 evolved in the opposite direction to ALDH2 for alcohol metabolism. Besides monogenic traits, we found that multiple complex traits experienced polygenic adaptation. Particularly, multi-methods consistently revealed that lower blood pressure was favored in natural selection. Finally, we built a database named RePoS (recent positive selection, http://bigdata.ibp.ac.cn/RePoS/) to integrate and display multi-population selection signals. Our study extended our understanding of natural evolution and phenotype adaptation in Han Chinese as well as other populations.
Affiliation(s)
- Huaxia Luo
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Department of Pediatrics, Peking University First Hospital, Beijing 100034, China
| | - Peng Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Wanyu Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yu Zheng
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Di Hao
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yirong Shi
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yiwei Niu
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Tingrui Song
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yanyan Li
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Shilei Zhao
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China
| | - Hua Chen
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Shandong First Medical University & Shandong Academy of Medical Sciences, Taian 271016, China
| | - Shunmin He
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
|
340
|
Clawson H, Lee BT, Raney BJ, Barber GP, Casper J, Diekhans M, Fischer C, Gonzalez JN, Hinrichs AS, Lee CM, Nassar LR, Perez G, Wick B, Schmelter D, Speir ML, Armstrong J, Zweig AS, Kuhn RM, Kirilenko BM, Hiller M, Haussler D, Kent WJ, Haeussler M. GenArk: towards a million UCSC genome browsers. Genome Biol 2023; 24:217. [PMID: 37784172 PMCID: PMC10544498 DOI: 10.1186/s13059-023-03057-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 09/11/2023] [Indexed: 10/04/2023] Open
Abstract
Interactive graphical genome browsers are essential tools in genomics, but they do not contain all the recent genome assemblies. We create the Genome Archive (GenArk) collection of UCSC Genome Browsers from NCBI assemblies. Built on our established track hub system, this enables fast visualization of annotations. Assemblies come with gene models, repeat masks, BLAT, and in silico PCR. Users can add annotations via track hubs and custom tracks. We can bulk-import third-party resources, demonstrated with TOGA and Ensembl gene models for hundreds of assemblies. Three thousand two hundred sixty-nine GenArk assemblies are listed at https://hgdownload.soe.ucsc.edu/hubs/ and can be searched for on the Genome Browser gateway page.
Affiliation(s)
- Hiram Clawson
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Brian T Lee
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Galt P Barber
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Jonathan Casper
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Mark Diekhans
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Clay Fischer
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | | | - Angie S Hinrichs
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Christopher M Lee
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Luis R Nassar
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Gerardo Perez
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Brittney Wick
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Daniel Schmelter
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Matthew L Speir
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Joel Armstrong
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - Bogdan M Kirilenko
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, 60325, Frankfurt, Germany
- Senckenberg Research Institute, Senckenberganlage 25, 60325, Frankfurt, Germany
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438, Frankfurt, Germany
| | - Michael Hiller
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, 60325, Frankfurt, Germany
- Senckenberg Research Institute, Senckenberganlage 25, 60325, Frankfurt, Germany
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438, Frankfurt, Germany
| | - David Haussler
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
| | - W James Kent
- Genomics Institute, University of California, Santa Cruz, CA, 95064, USA
|
341
|
Song C, Zhang Y, Huang H, Wang Y, Zhao X, Zhang G, Yin M, Feng C, Wang Q, Qian F, Shang D, Zhang J, Liu J, Li C, Tang H. Cis-Cardio: A comprehensive analysis platform for cardiovascular-relavant cis-regulation in human and mouse. MOLECULAR THERAPY. NUCLEIC ACIDS 2023; 33:655-667. [PMID: 37637211 PMCID: PMC10458290 DOI: 10.1016/j.omtn.2023.07.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 07/25/2023] [Indexed: 08/29/2023]
Abstract
Cis-regulatory elements are important molecular switches in controlling gene expression and are regarded as determinant hubs in the transcriptional regulatory network. Collection and processing of large-scale cis-regulatory data are urgent to decipher the potential mechanisms of cardiovascular diseases from a cis-regulatory element aspect. Here, we developed a novel web server, Cis-Cardio, which aims to document a large number of available cardiovascular-related cis-regulatory data and to provide analysis for unveiling the comprehensive mechanisms at a cis-regulation level. The current version of Cis-Cardio catalogs a total of 45,382,361 genomic regions from 1,013 human and mouse epigenetic datasets, including ATAC-seq, DNase-seq, Histone ChIP-seq, TF/TcoF ChIP-seq, RNA polymerase ChIP-seq, and Cohesin ChIP-seq. Importantly, Cis-Cardio provides six analysis tools, including region overlap analysis, element upstream/downstream analysis, transcription regulator enrichment analysis, variant interpretation, and protein-protein interaction-based co-regulatory analysis. Additionally, Cis-Cardio provides detailed and abundant (epi-) genetic annotations in cis-regulatory regions, such as super-enhancers, enhancers, transcription factor binding sites (TFBSs), methylation sites, common SNPs, risk SNPs, expression quantitative trait loci (eQTLs), motifs, DNase I hypersensitive sites (DHSs), and 3D chromatin interactions. In summary, Cis-Cardio is a valuable resource for elucidating and analyzing regulatory cues of cardiovascular-specific cis-regulatory elements. The platform is freely available at http://www.licpathway.net/Cis-Cardio/index.html.
Affiliation(s)
- Chao Song
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
| | - Yuexin Zhang
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
| | - Hong Huang
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
- Clinical Research Center for Myocardial Injury in Hunan Province, Hengyang, Hunan 421001, China
| | - Yuezhu Wang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Xilong Zhao
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Guorui Zhang
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
| | - Mingxue Yin
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
| | - Chenchen Feng
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Qiuyu Wang
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
| | - Fengcui Qian
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
| | - Desi Shang
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
| | - Jian Zhang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Jiaqi Liu
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
| | - Chunquan Li
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- National Health Commission Key Laboratory of Birth Defect Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, Hunan 410008, China
- Key Laboratory of Rare Pediatric Diseases, Ministry of Education, University of South China, Hengyang, Hunan 421001, China
| | - Huifang Tang
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases, University of South China, Hengyang, Hunan 421001, China
- School of Computer, University of South China, Hengyang, Hunan 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
- Clinical Research Center for Myocardial Injury in Hunan Province, Hengyang, Hunan 421001, China
|
342
|
Chang CH, Chen SP, Poelchau M, Childers C. Exploring Genetic Information with Ease: The Linkout Plugin for JBrowse 2. MICROPUBLICATION BIOLOGY 2023; 2023:10.17912/micropub.biology.000906. [PMID: 37662054 PMCID: PMC10474478 DOI: 10.17912/micropub.biology.000906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Revised: 07/25/2023] [Accepted: 08/17/2023] [Indexed: 09/05/2023]
Abstract
JBrowse 2 is a next-generation genome browser that can be run as a web or desktop application. We describe a new plugin, the Linkout Plugin, that enables users to link features to external databases based on their IDs and the remote URLs on JBrowse 2 desktop or web. As a result, genome analysis time and effort are reduced, enabling researchers to gain insights more quickly. The Linkout Plugin fills a common need scientists have: looking for more information on a gene. Overall, the Linkout Plugin is a valuable and practical addition to the JBrowse functionality.
Affiliation(s)
- Chi-Hsien Chang
- Department of Electrical Engineering, National Taiwan University
| | | | - Monica Poelchau
- USDA, Agricultural Research Service, National Agricultural Library
|
343
|
Jimenez Gonzalez A, Baranasic D, Müller F. Zebrafish regulatory genomic resources for disease modelling and regeneration. Dis Model Mech 2023; 16:dmm050280. [PMID: 37529920 PMCID: PMC10417509 DOI: 10.1242/dmm.050280] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/03/2023] Open
Abstract
In the past decades, the zebrafish has become a disease model with increasing popularity owing to its advantages that include fast development, easy genetic manipulation, simplicity for imaging, and sharing conserved disease-associated genes and pathways with those of human. In parallel, studies of disease mechanisms are increasingly focusing on non-coding mutations, which require genome annotation maps of regulatory elements, such as enhancers and promoters. In line with this, genomic resources for zebrafish research are expanding, producing a variety of genomic data that help in defining regulatory elements and their conservation between zebrafish and humans. Here, we discuss recent developments in generating functional annotation maps for regulatory elements of the zebrafish genome and how this can be applied to human diseases. We highlight community-driven developments, such as DANIO-CODE, in generating a centralised and standardised catalogue of zebrafish genomics data and functional annotations; consider the advantages and limitations of current annotation maps; and offer considerations for interpreting and integrating existing maps with comparative genomics tools. We also discuss the need for developing standardised genomics protocols and bioinformatic pipelines and provide suggestions for the development of analysis and visualisation tools that will integrate various multiomic bulk sequencing data together with fast-expanding data on single-cell methods, such as single-cell assay for transposase-accessible chromatin with sequencing. Such integration tools are essential to exploit the multiomic chromatin characterisation offered by bulk genomics together with the cell-type resolution offered by emerging single-cell methods. Together, these advances will build an expansive toolkit for interrogating the mechanisms of human disease in zebrafish.
Affiliation(s)
- Ada Jimenez Gonzalez
- Institute of Cancer and Genomic Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
| | - Damir Baranasic
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, London SW7 2AZ, UK
- MRC London Institute of Medical Sciences, London W12 0NN, UK
- Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia
| | - Ferenc Müller
- Institute of Cancer and Genomic Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
|
344
|
Merkulov P, Egorova E, Kirov I. Composition and Structure of Arabidopsis thaliana Extrachromosomal Circular DNAs Revealed by Nanopore Sequencing. PLANTS (BASEL, SWITZERLAND) 2023; 12:2178. [PMID: 37299157 PMCID: PMC10255303 DOI: 10.3390/plants12112178] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/02/2023] [Revised: 05/19/2023] [Accepted: 05/29/2023] [Indexed: 06/12/2023]
Abstract
Extrachromosomal circular DNAs (eccDNAs) are enigmatic DNA molecules that have been detected in a range of organisms. In plants, eccDNAs have various genomic origins and may be derived from transposable elements. The structures of individual eccDNA molecules and their dynamics in response to stress are poorly understood. In this study, we showed that nanopore sequencing is a useful tool for the detection and structural analysis of eccDNA molecules. Applying nanopore sequencing to the eccDNA molecules of epigenetically stressed Arabidopsis plants grown under various stress treatments (heat, abscisic acid, and flagellin), we showed that TE-derived eccDNA quantity and structure vary dramatically between individual TEs. Epigenetic stress alone did not cause eccDNA up-regulation, whereas its combination with heat stress triggered the generation of full-length and various truncated eccDNAs of the ONSEN element. We showed that the ratio between full-length and truncated eccDNAs is TE- and condition-dependent. Our work paves the way for further elucidation of the structural features of eccDNAs and their connections with various biological processes, such as eccDNA transcription and eccDNA-mediated TE silencing.
Affiliation(s)
- Pavel Merkulov
- Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
- All-Russia Research Institute of Agricultural Biotechnology, 127550 Moscow, Russia
| | - Ekaterina Egorova
- All-Russia Research Institute of Agricultural Biotechnology, 127550 Moscow, Russia
| | - Ilya Kirov
- Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
- All-Russia Research Institute of Agricultural Biotechnology, 127550 Moscow, Russia
|
345
|
Cleary AM, Farmer AD. Genome Context Viewer (GCV) version 2: enhanced visual exploration of multiple annotated genomes. Nucleic Acids Res 2023:7173788. [PMID: 37207325 DOI: 10.1093/nar/gkad391] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/13/2023] [Revised: 04/21/2023] [Accepted: 05/08/2023] [Indexed: 05/21/2023] Open
Abstract
The Genome Context Viewer is a web application for identifying, aligning, and visualizing genomic regions based on their micro and macrosyntenic structures. By using functional elements such as gene annotations as the unit of search and comparison, the Genome Context Viewer can compute and display relationships between regions across many assemblies from federated data sources in real-time, enabling users to rapidly explore multiple annotated genomes and identify divergence and structural events that can help provide insight into evolutionary mechanisms associated with functional consequences. In this work, we introduce version 2 of the Genome Context Viewer and highlight new features that enhance usability, performance, and ease of deployment.
Affiliation(s)
- Alan M Cleary
- National Center for Genome Resources, 2935 Rodeo Park Dr E, Santa Fe, NM 87505, USA
| | - Andrew D Farmer
- National Center for Genome Resources, 2935 Rodeo Park Dr E, Santa Fe, NM 87505, USA
|
346
|
Zhuo X, Hsu S, Purushotham D, Kuntala PK, Harrison JK, Du AY, Chen S, Li D, Wang T. Comparing genomic and epigenomic features across species using the WashU Comparative Epigenome Browser. Genome Res 2023; 33:824-835. [PMID: 37156621 PMCID: PMC10317122 DOI: 10.1101/gr.277550.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 05/03/2023] [Indexed: 05/10/2023]
Abstract
Genome browsers have become an intuitive and critical tool to visualize and analyze genomic features and data. Conventional genome browsers display data/annotations on a single reference genome/assembly; there are also genomic alignment viewer/browsers that help users visualize alignment, mismatch, and rearrangement between syntenic regions. However, there is a growing need for a comparative epigenome browser that can display genomic and epigenomic data sets across different species and enable users to compare them between syntenic regions. Here, we present the WashU Comparative Epigenome Browser. It allows users to load functional genomic data sets/annotations mapped to different genomes and display them over syntenic regions simultaneously. The browser also displays genetic differences between the genomes from single-nucleotide variants (SNVs) to structural variants (SVs) to visualize the association between epigenomic differences and genetic differences. Instead of anchoring all data sets to the reference genome coordinates, it creates independent coordinates of different genome assemblies to faithfully present features and data mapped to different genomes. It uses a simple, intuitive genome-align track to illustrate the syntenic relationship between different species. It extends the widely used WashU Epigenome Browser infrastructure and can be expanded to support multiple species. This new browser function will greatly facilitate comparative genomic/epigenomic research, as well as support the recent growing needs to directly compare and benchmark the T2T CHM13 assembly and other human genome assemblies.
Affiliation(s)
- Xiaoyu Zhuo
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Silas Hsu
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Deepak Purushotham
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Prashant Kumar Kuntala
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Jessica K Harrison
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Alan Y Du
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Samuel Chen
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Daofeng Li
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Ting Wang
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63110, USA
|