1
|
Gtari M, Maaoui R, Ghodhbane-Gtari F, Ben Slama K, Sbissi I. MAGs-centric crack: how long will, spore-positive Frankia and most Protofrankia, microsymbionts remain recalcitrant to axenic growth? Front Microbiol 2024; 15:1367490. [PMID: 39144212 PMCID: PMC11323853 DOI: 10.3389/fmicb.2024.1367490] [Received: 01/08/2024] [Accepted: 07/04/2024] [Indexed: 08/16/2024]
Abstract
Nearly 50 years after the ground-breaking isolation of the primary Comptonia peregrina microsymbiont under axenic conditions, efforts to isolate a substantial number of Protofrankia and Frankia strains continue with enduring challenges and complexities. This study aimed to streamline genomic insights through comparative and predictive tools to extract traits crucial for isolating specific Frankia in axenic conditions. Pangenome analysis unveiled significant genetic diversity, suggesting untapped potential for cultivation strategies. Shared metabolic strategies in cellular components, central metabolic pathways, and resource acquisition traits offered promising avenues for cultivation. Ecological trait extraction indicated that most uncultured strains exhibit no apparent barriers to axenic growth. Despite ongoing challenges, potential caveats, and errors that could bias predictive analyses, this study provides a nuanced perspective. It highlights potential breakthroughs and guides refined cultivation strategies for these yet-uncultured strains. We advocate for tailored media formulations enriched with simple carbon sources in aerobic environments, with atmospheric nitrogen optionally sufficient to minimize contamination risks. Temperature adjustments should align with strain preferences (28-29°C for Frankia and 32-35°C for Protofrankia) while maintaining an alkaline pH. Given potential extended incubation periods (predicted doubling times ranging from 3.26 to 9.60 days, possibly up to 21.98 days), patience and rigorous contamination monitoring are crucial for optimizing cultivation conditions.
Affiliation(s)
- Maher Gtari
  - Department of Biological and Chemical Engineering, USCR Molecular Bacteriology and Genomics, National Institute of Applied Sciences and Technology, University of Carthage, Tunis, Tunisia
- Radhi Maaoui
  - Department of Biological and Chemical Engineering, USCR Molecular Bacteriology and Genomics, National Institute of Applied Sciences and Technology, University of Carthage, Tunis, Tunisia
- Faten Ghodhbane-Gtari
  - Department of Biological and Chemical Engineering, USCR Molecular Bacteriology and Genomics, National Institute of Applied Sciences and Technology, University of Carthage, Tunis, Tunisia
  - Higher Institute of Biotechnology Sidi Thabet, University of La Manouba, Tunisia
- Karim Ben Slama
  - LR Bioresources, Environment, and Biotechnology (LR22ES04), Higher Institute of Applied Biological Sciences of Tunis, University of Tunis El Manar, Tunis, Tunisia
- Imed Sbissi
  - LR Pastoral Ecology, Arid Regions Institute, University of Gabes, Medenine, Tunisia
2
|
Gao Z, Lu Y, Chong Y, Li M, Hong J, Wu J, Wu D, Xi D, Deng W. Beef Cattle Genome Project: Advances in Genome Sequencing, Assembly, and Functional Genes Discovery. Int J Mol Sci 2024; 25:7147. [PMID: 39000250 PMCID: PMC11240973 DOI: 10.3390/ijms25137147] [Received: 06/03/2024] [Revised: 06/23/2024] [Accepted: 06/26/2024] [Indexed: 07/16/2024]
Abstract
Beef is a major global source of protein, playing an essential role in the human diet. The worldwide production and consumption of beef continue to rise, reflecting a significant trend. However, despite the critical importance of beef cattle resources in agriculture, the diversity of cattle breeds faces severe challenges, with many breeds at risk of extinction. The initiation of the Beef Cattle Genome Project is crucial. By constructing a high-precision functional annotation map of their genome, it becomes possible to analyze the genetic mechanisms underlying important traits in beef cattle, laying a solid foundation for breeding more efficient and productive cattle breeds. This review details advances in genome sequencing and assembly technologies, iterative upgrades of the beef cattle reference genome, and its application in pan-genome research. Additionally, it summarizes relevant studies on the discovery of functional genes associated with key traits in beef cattle, such as growth, meat quality, reproduction, polled traits, disease resistance, and environmental adaptability. Finally, the review explores the potential of telomere-to-telomere (T2T) genome assembly, structural variations (SVs), and multi-omics techniques in future beef cattle genetic breeding. These advancements collectively offer promising avenues for enhancing beef cattle breeding and improving genetic traits.
Affiliation(s)
- Zhendong Gao
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Ying Lu
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Yuqing Chong
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Mengfei Li
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Jieyun Hong
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Jiao Wu
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Dongwang Wu
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Dongmei Xi
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- Weidong Deng
  - Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
  - State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Kunming 650201, China
3
|
Chao KH, Heinz JM, Hoh C, Mao A, Shumate A, Pertea M, Salzberg SL. Combining DNA and protein alignments to improve genome annotation with LiftOn. bioRxiv 2024:2024.05.16.593026. [PMID: 38798552 PMCID: PMC11118573 DOI: 10.1101/2024.05.16.593026] [Indexed: 05/29/2024]
Abstract
As the number and variety of assembled genomes continues to grow, the number of annotated genomes is falling behind, particularly for eukaryotes. DNA-based mapping tools help to address this challenge, but they are only able to transfer annotation between closely-related species. Here we introduce LiftOn, a homology-based software tool that integrates DNA and protein alignments to enhance the accuracy of genome-scale annotation and to allow mapping between relatively distant species. LiftOn's protein-centric algorithm considers both types of alignments, chooses optimal open reading frames, resolves overlapping gene loci, and finds additional gene copies where they exist. LiftOn can reliably transfer annotation between genomes representing members of the same species, as we demonstrate on human, mouse, honey bee, rice, and Arabidopsis thaliana. It can further map annotation effectively across species pairs as far apart as mouse and rat or Drosophila melanogaster and D. erecta.
Affiliation(s)
- Kuan-Hao Chao
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Jakob M. Heinz
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Celine Hoh
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Alan Mao
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Alaina Shumate
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Mihaela Pertea
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Steven L Salzberg
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21211, USA
4
|
Chen Z, Ain NU, Zhao Q, Zhang X. From tradition to innovation: conventional and deep learning frameworks in genome annotation. Brief Bioinform 2024; 25:bbae138. [PMID: 38581418 PMCID: PMC10998533 DOI: 10.1093/bib/bbae138] [Received: 12/01/2023] [Revised: 03/08/2024] [Accepted: 03/10/2024] [Indexed: 04/08/2024]
Abstract
Following the milestone success of the Human Genome Project, the 'Encyclopedia of DNA Elements (ENCODE)' initiative was launched in 2003 to unearth information about the numerous functional elements within the genome. This endeavor coincided with the emergence of numerous novel technologies, accompanied by the provision of vast amounts of whole-genome sequences and high-throughput data such as ChIP-Seq and RNA-Seq. Extracting biologically meaningful information from this massive dataset has become a critical aspect of many recent studies, particularly in annotating and predicting the functions of unknown genes. The core idea behind genome annotation is to identify genes and various functional elements within the genome sequence and infer their biological functions. Traditional wet-lab experimental methods still rely on extensive efforts for functional verification. However, early bioinformatics algorithms and software primarily employed shallow learning techniques; thus, their capacity for data characterization and feature learning was limited. With the widespread adoption of RNA-Seq technology, scientists from the biological community began to harness the potential of machine learning and deep learning approaches for gene structure prediction and functional annotation. In this context, we reviewed both conventional methods and contemporary deep learning frameworks, and highlighted novel perspectives on the challenges arising during annotation, underscoring the dynamic nature of this evolving scientific landscape.
Affiliation(s)
- Zhaojia Chen
  - National Key Laboratory for Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangzhou 518120, China
  - College of Biomedical Engineering, Taiyuan University of Technology, Jinzhong 030600, China
- Noor ul Ain
  - National Key Laboratory for Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangzhou 518120, China
- Qian Zhao
  - State Key Laboratory for Ecological Pest Control of Fujian/Taiwan Crops and College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Xingtan Zhang
  - National Key Laboratory for Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangzhou 518120, China
5
|
Ebu SM, Ray L, Panda AN, Gouda SK. De novo assembly and comparative genome analysis for polyhydroxyalkanoates-producing Bacillus sp. BNPI-92 strain. J Genet Eng Biotechnol 2023; 21:132. [PMID: 37991636 PMCID: PMC10665291 DOI: 10.1186/s43141-023-00578-7] [Received: 10/13/2022] [Accepted: 10/26/2023] [Indexed: 11/23/2023]
Abstract
BACKGROUND Certain Bacillus species play a vital role in polyhydroxyalkanoate (PHA) production. However, most of these isolates were not properly identified to species level when they were reported. RESULTS From NGS analysis, 5719 genes were predicted in the de novo genome assembly. Based on genome annotation using the RAST server, a sequence of 5,527,513 bp was predicted with 5679 protein-coding sequences. The genome sequence has a GC content of 35.1% and comprises 156 contigs. In the RAST server analysis, subsystem coverage (43%) and non-subsystem coverage (57%) were generated. OrthoVenn comparative genome analysis indicated that Bacillus sp. BNPI-92 shared 2930 gene clusters (core genes) with B. cereus ATCC 14579 T (AE016877), B. paranthracis Mn5T (MACE01000012), B. thuringiensis ATCC 10792 T (ACNF01000156), and B. anthracis Ames T (AE016879) strains. For our strain, the largest number of gene clusters (190) was shared with B. cereus ATCC 14579 T (AE016877). In the OrthoVenn pairwise analysis, the maximum number of overlapping gene clusters (5414) was detected between Bacillus sp. BNPI-92 and B. cereus ATCC 14579 T. Average nucleotide identity (ANI) measures such as OriginalANI and OrthoANI, in silico digital DNA-DNA hybridization (isDDH), the Type (Strain) Genome Server (TYGS), and the Genome-to-Genome Distance Calculator (GGDC) all related Bacillus sp. BNPI-92 most closely to the B. cereus ATCC 14579 T strain. Therefore, based on the combination of RAST annotation, OrthoVenn server, ANI, and isDDH results, the Bacillus sp. BNPI-92 strain was strongly confirmed to be a B. cereus type strain. It was designated as B. cereus BNPI-92 strain. In the B. cereus BNPI-92 whole genome sequence, PHA biosynthesis genes such as phaP, phaQ, phaR (the PHA synthesis repressor gene), phaB/phbB, and phaC were predicted on the same operon. This gene cluster was designated as phaPQRBC. However, phaA was located on other operons.
CONCLUSIONS This newly obtained isolate was found to be a new strain based on comparative genomic analysis, and it was also identified as a potential candidate for PHA biosynthesis.
Affiliation(s)
- Seid Mohammed Ebu
  - Department of Applied Biology, SoANS, Adama Science and Technology University, Oromia, Ethiopia.
- Lopamudra Ray
  - School of Law, Campus -16 Adjunct Faculty, School of Biotech, Campus-11 KIIT University, Bhubaneswar, Odisha, 751024, India
- Ananta N Panda
  - School of Biotechnology, Campus-11 KIIT University, Bhubaneswar, Odisha, 751024, India
- Sudhansu K Gouda
  - School of Biotechnology, Campus-11 KIIT University, Bhubaneswar, Odisha, 751024, India
6
|
Boruta T. Computation-aided studies related to the induction of specialized metabolite biosynthesis in microbial co-cultures: An introductory overview. Comput Struct Biotechnol J 2023; 21:4021-4029. [PMID: 37649711 PMCID: PMC10462793 DOI: 10.1016/j.csbj.2023.08.011] [Received: 05/14/2023] [Revised: 08/14/2023] [Accepted: 08/14/2023] [Indexed: 09/01/2023]
Abstract
Co-cultivation is an effective method of inducing the production of specialized metabolites (SMs) in microbial strains. By mimicking the ecological interactions that take place in the natural environment, this approach makes it possible to trigger the biosynthesis of molecules that are not formed under monoculture conditions. Importantly, microbial co-cultivation may lead to the discovery of novel chemical entities of pharmaceutical interest. The experimental efforts aimed at the induction of SMs are greatly facilitated by computational techniques. The aim of this overview is to highlight the relevance of computational methods for the investigation of SM induction via microbial co-cultivation. The concepts related to the induction of SMs in microbial co-cultures are briefly introduced by addressing four areas associated with the SM induction workflows, namely the detection of SMs formed exclusively under co-culture conditions, the annotation of induced SMs, the identification of SM producer strains, and the optimization of fermentation conditions. The computational infrastructure associated with these areas, including the tools of multivariate data analysis, molecular networking, genome mining and mathematical optimization, is discussed in relation to the experimental results described in recent literature. A perspective on future developments in the field, mainly in relation to microbiome-related research, is also provided.
Affiliation(s)
- Tomasz Boruta
  - Lodz University of Technology, Faculty of Process and Environmental Engineering, Department of Bioprocess Engineering, ul. Wólczańska 213, 93-005 Łódź, Poland
7
|
Abstract
Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes: the Earth's code of life.
Affiliation(s)
- Roderic Guigó
  - Bioinformatics and Genomics, Center for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Catalonia
  - Universitat Pompeu Fabra (UPF), Barcelona, Catalonia
8
|
Nayarisseri A, Singh SK. Genome analysis of biosurfactant producing bacterium, Bacillus tequilensis. PLoS One 2023; 18:e0285994. [PMID: 37267268 DOI: 10.1371/journal.pone.0285994] [Received: 07/09/2022] [Accepted: 05/06/2023] [Indexed: 06/04/2023]
Abstract
Bioremediation is crucial for recuperating polluted water and soil. By expanding the surface area of substrates, biosurfactants play a vital role in bioremediation. Biosurfactant-producing microbes release certain biosurfactant compounds, which are promoted for oil spill remediation. In the present investigation, a biosurfactant-producing bacterium, Bacillus tequilensis, was isolated from Chilika Lake, Odisha, India (latitude and longitude: 19.8450 N, 85.4788 E). Whole-Genome Sequencing (WGS) of Bacillus tequilensis was carried out using Illumina NextSeq 500. The size of the whole genome of Bacillus tequilensis was 4.47 Mb, consisting of 4,478,749 base pairs forming a circular chromosome with 528 scaffolds, 4492 protein-encoding genes (ORFs), 81 tRNA genes, and 114 ribosomal RNA transcription units. The total raw reads were 4,209,415, and the processed reads were 4,058,238 with 4492 genes. The whole genome obtained from the present investigation was used for genome annotation, variant calling, variant annotation, and comparative genome analysis with other existing Bacillus species. In this study, a pathway was constructed which describes the biosurfactant metabolism of Bacillus tequilensis. The study identified that genes such as SrfAD, SrfAC, SrfAA and SrfAB are involved in biosurfactant synthesis. The sequences of the genes SrfAD, SrfAC, SrfAA, and SrfAB were deposited in the GenBank database with accessions MUG02427.1, MUG02428.1, MUG02429.1, and MUG03515.1, respectively. The whole genome sequence was submitted to GenBank with the accession RMVO00000000, and the raw fastq reads were submitted to the SRA, NCBI repository with the accession SRX5023292.
Affiliation(s)
- Anuraj Nayarisseri
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
  - In silico Research Laboratory, Eminent Biosciences, Indore, Madhya Pradesh, India
  - Bioinformatics Research Laboratory, LeGene Biosciences Pvt Ltd., Indore, Madhya Pradesh, India
- Sanjeev Kumar Singh
  - Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi, Tamil Nadu, India
  - Department of Data Sciences, Centre of Biomedical Research, Lucknow, India
9
|
Berger B, Yu YW. Navigating bottlenecks and trade-offs in genomic data analysis. Nat Rev Genet 2023; 24:235-250. [PMID: 36476810 PMCID: PMC10204111 DOI: 10.1038/s41576-022-00551-z] [Accepted: 10/27/2022] [Indexed: 12/12/2022]
Abstract
Genome sequencing and analysis allow researchers to decode the functional information hidden in DNA sequences as well as to study cell to cell variation within a cell population. Traditionally, the primary bottleneck in genomic analysis pipelines has been the sequencing itself, which has been much more expensive than the computational analyses that follow. However, an important consequence of the continued drive to expand the throughput of sequencing platforms at lower cost is that often the analytical pipelines are struggling to keep up with the sheer amount of raw data produced. Computational cost and efficiency have thus become of ever increasing importance. Recent methodological advances, such as data sketching, accelerators and domain-specific libraries/languages, promise to address these modern computational challenges. However, despite being more efficient, these innovations come with a new set of trade-offs, both expected, such as accuracy versus memory and expense versus time, and more subtle, including the human expertise needed to use non-standard programming interfaces and set up complex infrastructure. In this Review, we discuss how to navigate these new methodological advances and their trade-offs.
Affiliation(s)
- Bonnie Berger
  - Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA.
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Yun William Yu
  - Department of Computer and Mathematical Sciences, University of Toronto Scarborough, Toronto, Ontario, Canada
  - Tri-Campus Department of Mathematics, University of Toronto, Toronto, Ontario, Canada
10
|
Li J, Guan D, Halstead MM, Islas-Trejo AD, Goszczynski DE, Ernst CW, Cheng H, Ross P, Zhou H. Transcriptome annotation of 17 porcine tissues using nanopore sequencing technology. Anim Genet 2023; 54:35-44. [PMID: 36385508 DOI: 10.1111/age.13274] [Received: 01/28/2022] [Revised: 10/20/2022] [Accepted: 11/01/2022] [Indexed: 11/18/2022]
Abstract
The annotation of animal genomes plays an important role in elucidating molecular mechanisms behind the genetic control of economically important traits. Here, we employed long-read sequencing technology, Oxford Nanopore Technology, to annotate the pig transcriptome across 17 tissues from two Yorkshire littermate pigs. More than 9.8 million reads were obtained from a single flow cell, and 69 781 unique transcripts at 50 108 loci were identified. Of these transcripts, 16 255 were found to be novel isoforms, and 22 344 were found at loci that were novel and unannotated in the Ensembl (release 102) and NCBI (release 106) annotations. Novel transcripts were mostly expressed in cerebellum, followed by lung, liver, spleen, and hypothalamus. By comparing the unannotated transcripts to existing databases, there were 21 285 (95.3%) transcripts matched to the NT database (v5) and 13 676 (61.2%) matched to the NR database (v5). Moreover, there were 4324 (19.4%) transcripts matched to the SwissProt database (v5), corresponding to 11 356 proteins. Tissue-specific gene expression analyses showed that 9749 transcripts were highly tissue-specific, and cerebellum contained the most tissue-specific transcripts. As the same samples were used for the annotation of cis-regulatory elements in the pig genome, the transcriptome annotation generated by this study provides an additional and complementary annotation resource for the Functional Annotation of Animal Genomes effort to comprehensively annotate the pig genome.
Affiliation(s)
- Jinghui Li
  - Department of Animal Science, University of California Davis, Davis, California, USA
- Dailu Guan
  - Department of Animal Science, University of California Davis, Davis, California, USA
- Michelle M Halstead
  - Department of Animal Science, University of California Davis, Davis, California, USA
- Alma D Islas-Trejo
  - Department of Animal Science, University of California Davis, Davis, California, USA
- Daniel E Goszczynski
  - Department of Animal Science, University of California Davis, Davis, California, USA
- Catherine W Ernst
  - Department of Animal Science, Michigan State University, East Lansing, Michigan, USA
- Hao Cheng
  - Department of Animal Science, University of California Davis, Davis, California, USA
- Pablo Ross
  - Department of Animal Science, University of California Davis, Davis, California, USA
- Huaijun Zhou
  - Department of Animal Science, University of California Davis, Davis, California, USA
11
|
Analyzing Prokaryotic Transcriptomics in the Light of Genome Data with the MicroScope Platform. Methods Mol Biol 2022; 2605:241-270. [PMID: 36520398 DOI: 10.1007/978-1-0716-2871-3_13] [Indexed: 12/23/2022]
Abstract
Large-scale genome sequencing and the increasingly massive use of high-throughput approaches produce a vast amount of new information that completely transforms our understanding of thousands of microbial species occurring in our environment. However, despite the development of powerful bioinformatics approaches, full interpretation of the content of these genomes remains a difficult task. To address this challenge, the MicroScope platform has been developed. It is an integrated Web platform for management, annotation, comparative analysis, and visualization of microbial genomes (https://mage.genoscope.cns.fr/microscope). Launched in 2005, the platform has been under continuous development and provides analyses for complete and ongoing genome projects together with metabolic network reconstruction and transcriptomic experiments, allowing users to improve the understanding of gene functions. The MicroScope platform is widely used by microbiologists from academia and industry all around the world for collaborative studies and expert annotation. It enables collaborative work in a rich comparative genomic context and improves community-based curation efforts. Here, we describe the protocol to follow for the integration and analysis of transcriptomics data in the MicroScope platform. The chapter reviews each key step from the experimental design to the analysis and interpretation of the experiment data and results. The integration of transcriptomics data gives a dynamic view of the genome by allowing the users to improve the understanding of gene functions by interpreting them in the light of regulatory cell processes. Moreover, these data can also contribute to the refinement of genome annotation through the discovery of new genes and help to fill metabolic gaps.
12
|
Maia GA, Filho VB, Kawagoe EK, Teixeira Soratto TA, Moreira RS, Grisard EC, Wagner G. AnnotaPipeline: An integrated tool to annotate eukaryotic proteins using multi-omics data. Front Genet 2022; 13:1020100. [DOI: 10.3389/fgene.2022.1020100] [Received: 08/15/2022] [Accepted: 11/11/2022] [Indexed: 11/23/2022]
Abstract
Assignment of gene function has been a crucial, laborious, and time-consuming step in genomics. Due to a variety of sequencing platforms that generate increasing amounts of data, manual annotation is no longer feasible. Thus, the need for an integrated, automated pipeline allowing the use of experimental data towards validation of in silico prediction of gene function is of utmost relevance. Here, we present a computational workflow named AnnotaPipeline that integrates distinct software and data types in a proteogenomic approach to annotate and validate predicted features in genomic sequences. Based on FASTA (i) nucleotide or (ii) protein sequences or (iii) structural annotation files (GFF3), users can input FASTQ RNA-seq data and MS/MS data from mzXML or similar formats, as the pipeline uses both transcriptomic and proteomic information to corroborate annotations and validate gene prediction, providing transcription and expression evidence for functional annotation. Reannotation of the available Arabidopsis thaliana, Caenorhabditis elegans, Candida albicans, Trypanosoma cruzi, and Trypanosoma rangeli genomes was performed using the AnnotaPipeline, resulting in a higher proportion of annotated proteins and a reduced proportion of hypothetical proteins when compared to the annotations publicly available for these organisms. AnnotaPipeline is a Unix-based pipeline developed using Python and is available at: https://github.com/bioinformatics-ufsc/AnnotaPipeline.
|
13
|
Grimplet J. Genomic and Bioinformatic Resources for Perennial Fruit Species. Curr Genomics 2022; 23:217-233. [PMID: 36777875 PMCID: PMC9875543 DOI: 10.2174/1389202923666220428102632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Revised: 03/12/2022] [Accepted: 03/12/2022] [Indexed: 11/22/2022] Open
Abstract
In the post-genomic era, data management and development of bioinformatic tools are critical for the adequate exploitation of genomics data. In this review, we address the current situation for the subset of crops represented by the perennial fruit species. The agronomical singularity of these species compared to plant and crop model species provides significant challenges on the implementation of good practices generally not addressed in other species. Studies are usually performed over several years in non-controlled environments, usage of rootstock is common, and breeders heavily rely on vegetative propagation. A reference genome is now available for all the major species as well as many members of the economically important genera for breeding purposes. Development of pangenomes for these species is beginning to gain momentum, which will require a substantial effort in terms of bioinformatic tool development. The available tools for genome annotation and functional analysis will also be presented.
Affiliation(s)
- Jérôme Grimplet
  - Centro de Investigación y Tecnología Agroalimentaria de Aragón (CITA), Unidad de Hortofruticultura, Gobierno de Aragón, Avda. Montañana, Zaragoza, Spain
  - Instituto Agroalimentario de Aragón–IA2 (CITA-Universidad de Zaragoza), Calle Miguel Servet, Zaragoza, Spain
|
14
|
Mandal K, Dutta S, Upadhyay A, Panda A, Tripathy S. Comparative Genome Analysis Across 128 Phytophthora Isolates Reveal Species-Specific Microsatellite Distribution and Localized Evolution of Compartmentalized Genomes. Front Microbiol 2022; 13:806398. [PMID: 35369471 PMCID: PMC8967354 DOI: 10.3389/fmicb.2022.806398] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/31/2021] [Accepted: 01/04/2022] [Indexed: 11/13/2022] Open
Abstract
Phytophthora spp. are an invasive group of pathogens belonging to the class Oomycetes. In order to contain and control them, a deep knowledge of their biology and infection strategy is imperative. With the availability of large-scale sequencing data, it has been possible to look directly into their genetic material and understand the strategies adopted by them for becoming successful pathogens. Here, we have studied the genomes of 128 Phytophthora species available publicly with reasonable quality. Our analysis reveals that the simple sequence repeats (SSRs) of all Phytophthora sp. follow distinct isolate-specific patterns. We further show that TG/CA dinucleotide repeats are far more abundant in Phytophthora sp. than other classes of repeats. Among tri- and tetranucleotide SSRs as well, TG/CA-containing motifs always dominate over others. The GC content of the SSRs is stable without much variation across the isolates of Phytophthora. Telomeric repeats of Phytophthora follow a pattern of (TTTAGGG)n or (TTAGGGT)n rather than the canonical (TTAGGG)n. Effectors containing RxLR (arginine-any amino acid-leucine-arginine) motifs diverge rapidly in Phytophthora and do not show any core common group. The RxLR effectors of some Phytophthora isolates tend to cluster with RxLRs from other species rather than with those of the same species. An analysis of the flanking intergenic distance clearly indicates a two-speed genome organization for all the Phytophthora isolates. Apart from effectors and the transposons, a large number of other virulence genes such as carbohydrate-active enzymes (CAZymes), transcriptional regulators, signal transduction genes, ATP-binding cassette transporters (ABC), and ubiquitins are also present in the repeat-rich compartments. This indicates a rapid co-evolution of this powerful arsenal for successful pathogenicity. Whole genome duplication studies indicate that the pattern followed is more specific to a geographic location. To conclude, the large-scale genomic studies of Phytophthora have shed light on their adaptive evolution, which is largely guided by the localized host-mediated selection pressure.
Affiliation(s)
- Kajal Mandal
  - Computational Genomics Laboratory, Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology, Kolkata, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Subhajeet Dutta
  - Computational Genomics Laboratory, Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology, Kolkata, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Aditya Upadhyay
  - Computational Genomics Laboratory, Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology, Kolkata, India
- Arijit Panda
  - Department of Quantitative Health Science, Mayo Clinic, Rochester, MN, United States
- Sucheta Tripathy
  - Computational Genomics Laboratory, Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology, Kolkata, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
|
15
|
Industrially Important Genes from Trichoderma. Fungal Biol 2022. [DOI: 10.1007/978-3-030-91650-3_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
16
|
Genome assembly and annotation. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00013-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
|
17
|
Bardou P, Laguerre S, Maman Haddad S, Legoueix Rodriguez S, Laville E, Dumon C, Potocki-Veronese G, Klopp C. MINTIA: a metagenomic INserT integrated assembly and annotation tool. PeerJ 2021; 9:e11885. [PMID: 34692239 PMCID: PMC8483015 DOI: 10.7717/peerj.11885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2020] [Accepted: 07/09/2021] [Indexed: 11/29/2022] Open
Abstract
The earth harbors trillions of bacterial species adapted to very diverse ecosystems thanks to specific metabolic function acquisition. Most of the genes responsible for these functions belong to uncultured bacteria and are still to be discovered. Functional metagenomics based on activity screening is a classical way to retrieve these genes from microbiomes. This approach is based on the insertion of large metagenomic DNA fragments into a vector and transformation of a host to express heterologous genes. Metagenomic libraries are then screened for activities of interest, and the metagenomic DNA inserts of active clones are extracted to be sequenced and analysed to identify genes that are responsible for the detected activity. Hundreds of metagenomics sequences found using this strategy have already been published in public databases. Here we present the MINTIA software package enabling biologists to easily generate and analyze large metagenomic sequence sets, retrieved after activity-based screening. It filters reads, performs assembly, removes cloning vector, annotates open reading frames and generates user friendly reports as well as files ready for submission to international sequence repositories. The software package can be downloaded from https://github.com/Bios4Biol/MINTIA.
Affiliation(s)
- Philippe Bardou
  - Sigenae, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- Sarah Maman Haddad
  - Sigenae, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France
- Claire Dumon
  - TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse, France
- Christophe Klopp
  - Sigenae, Genotoul Bioinfo, MIAT UR875, INRAE, Castanet Tolosan, France
|
18
|
Vu TTD, Jung J. Protein function prediction with gene ontology: from traditional to deep learning models. PeerJ 2021; 9:e12019. [PMID: 34513334 PMCID: PMC8395570 DOI: 10.7717/peerj.12019] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/13/2021] [Accepted: 07/29/2021] [Indexed: 11/25/2022] Open
Abstract
Protein function prediction is a crucial part of genome annotation. Prediction methods have recently witnessed rapid development, owing to the emergence of high-throughput sequencing technologies. Among the available databases for identifying protein function terms, Gene Ontology (GO) is an important resource that describes the functional properties of proteins. Researchers are employing various approaches to efficiently predict the GO terms. Meanwhile, deep learning, a fast-evolving discipline among data-driven approaches, exhibits impressive potential with respect to assigning GO terms to amino acid sequences. Herein, we reviewed the currently available computational GO annotation methods for proteins, ranging from conventional to deep learning approaches. Further, we selected some suitable predictors from among the reviewed tools and conducted a mini comparison of their performance using a worldwide challenge dataset. Finally, we discussed the remaining major challenges in the field, and emphasized the future directions for protein function prediction with GO.
Affiliation(s)
- Thi Thuy Duong Vu
  - Department of Information and Communication Engineering, Myongji University, Yongin-si, Gyeonggi-do, South Korea
- Jaehee Jung
  - Department of Information and Communication Engineering, Myongji University, Yongin-si, Gyeonggi-do, South Korea
|
19
|
Queirós P, Novikova P, Wilmes P, May P. Unification of functional annotation descriptions using text mining. Biol Chem 2021; 402:983-990. [PMID: 33984880 DOI: 10.1515/hsz-2021-0125] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/27/2021] [Accepted: 05/03/2021] [Indexed: 02/06/2023]
Abstract
A common approach to genome annotation involves the use of homology-based tools for the prediction of the functional role of proteins. The quality of functional annotations is dependent on the reference data used; as such, choosing the appropriate sources is crucial. Unfortunately, no single reference data source can be universally considered the gold standard, thus using multiple references could potentially increase annotation quality and coverage. However, this comes with challenges, particularly due to the introduction of redundant and exclusive annotations. Through text mining it is possible to identify highly similar functional descriptions, thus strengthening the confidence of the final protein functional annotation and providing a redundancy-free output. Here we present UniFunc, a text mining approach that is able to detect similar functional descriptions with high precision. UniFunc was built as a small module and can be independently used or integrated into protein function annotation pipelines. By removing the need to individually analyse and compare annotation results, UniFunc streamlines the complementary use of multiple reference datasets.
Affiliation(s)
- Paul Wilmes
  - Systems Ecology, Esch-sur-Alzette, Luxembourg
- Patrick May
  - Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 4362, Esch-sur-Alzette, Luxembourg
|
20
|
DeVoe E, Oliver GR, Zenka R, Blackburn PR, Cousin MA, Boczek NJ, Kocher JPA, Urrutia R, Klee EW, Zimmermann MT. P2T2: Protein Panoramic annoTation Tool for the interpretation of protein coding genetic variants. JAMIA Open 2021; 4:ooab065. [PMID: 34377961 PMCID: PMC8346652 DOI: 10.1093/jamiaopen/ooab065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2021] [Revised: 07/06/2021] [Accepted: 07/17/2021] [Indexed: 11/29/2022] Open
Abstract
MOTIVATION Genomic data are prevalent, leading to frequent encounters with uninterpreted variants or mutations with unknown mechanisms of effect. Researchers must manually aggregate data from multiple sources and across related proteins, mentally translating effects between the genome and proteome, to attempt to understand mechanisms. MATERIALS AND METHODS P2T2 presents diverse data and annotation types in a unified protein-centric view, facilitating the interpretation of coding variants and hypothesis generation. Information from primary sequence, domain, motif, and structural levels is presented and also organized into the first Paralog Annotation Analysis across the human proteome. RESULTS Our tool assists research efforts to interpret genomic variation by aggregating diverse, relevant, and proteome-wide information into a unified interactive web-based interface. Additionally, we provide a REST API enabling automated data queries, or repurposing data for other studies. CONCLUSION The unified protein-centric interface presented in P2T2 will help researchers interpret novel variants identified through next-generation sequencing. Code and server link available at github.com/GenomicInterpretation/p2t2.
Affiliation(s)
- Elias DeVoe
  - Clinical and Translational Sciences Institute, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA
- Gavin R Oliver
  - Department of Health Science Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
  - Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Roman Zenka
  - Department of Health Science Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
- Patrick R Blackburn
  - Clinical and Translational Sciences Institute, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA
  - Center for Individualized Medicine, Mayo Clinic, Jacksonville, Florida, USA
- Margot A Cousin
  - Department of Health Science Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
  - Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Nicole J Boczek
  - Department of Health Science Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
  - Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Jean-Pierre A Kocher
  - Department of Health Science Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
  - Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Raul Urrutia
  - Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
  - Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA
  - Department of Surgery, Medical College of Wisconsin, Milwaukee, Wisconsin, 53226, USA
- Eric W Klee
  - Department of Health Science Research, Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA
  - Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Michael T Zimmermann
  - Clinical and Translational Sciences Institute, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA
  - Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
  - Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA
|
21
|
Genomic Surveillance and Improved Molecular Typing of Bordetella pertussis Using wgMLST. J Clin Microbiol 2021; 59:JCM.02726-20. [PMID: 33627319 DOI: 10.1128/jcm.02726-20] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/28/2020] [Accepted: 02/18/2021] [Indexed: 01/03/2023] Open
Abstract
Multilocus sequence typing (MLST) provides allele-based characterization of bacterial pathogens in a standardized framework. However, classical MLST schemes for Bordetella pertussis, the causative agent of whooping cough, seldom reveal diversity among the small number of gene targets and thereby fail to delineate population structure. To improve the discriminatory power of allele-based molecular typing of B. pertussis, we have developed a whole-genome MLST (wgMLST) scheme from 225 reference-quality genome assemblies. Iterative refinement and allele curation resulted in a scheme of 3,506 coding sequences covering 81.4% of the B. pertussis genome. This wgMLST scheme was further evaluated with data from a convenience sample of 2,389 B. pertussis isolates sequenced on Illumina instruments, including isolates from known outbreaks and epidemics previously characterized by existing molecular assays, as well as replicates collected from individual patients. wgMLST demonstrated concordance with whole-genome single nucleotide polymorphism (SNP) profiles, accurately resolved outbreak and sporadic cases in a retrospective comparison, and clustered replicate isolates collected from individual patients during diagnostic confirmation. Additionally, a reanalysis of isolates from two statewide epidemics using wgMLST reconstructed the population structures of circulating strains with increased resolution, revealing new clusters of related cases. Comparison with an existing core genome (cgMLST) scheme highlights the stable gene content of this bacterium and forms the initial foundation for necessary standardization. These results demonstrate the utility of wgMLST for improving B. pertussis characterization and genomic surveillance during the current pertussis disease resurgence.
|
22
|
Jensen-Ryan D, Murren CJ, Rutter MT, Thompson JJ. Advancing Science while Training Undergraduates: Recommendations from a Collaborative Biology Research Network. CBE Life Sciences Education 2020; 19:es13. [PMID: 33215973 PMCID: PMC8693944 DOI: 10.1187/cbe.20-05-0090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2023]
Abstract
Biology research is becoming increasingly dependent on large-scale, "big data," networked research initiatives. At the same time, there has been a corresponding effort to expand undergraduate participation in research to benefit student learning and persistence in science. This essay examines the confluence of this trend through eight years of a collaboration within a successful biology research network that explicitly incorporates undergraduates into large-scale scientific research. We draw upon interviews with faculty in this network to consider the interplay of scientific and pedagogical objectives at the heart of this undergraduate-focused network research project. We identify ways that this network has expanded and diversified access to scientific knowledge production for faculty and students and examine a goal conflict that emerged around the dual objectives of mentoring emerging scientists while producing high-quality scientific data for the larger biology community. Based on lessons learned within this network, we provide three recommendations that can support institutions and faculty engaging in networked research projects with undergraduates: (1) establish rigorous protocols to ensure data and database quality, (2) protect personnel time to coordinate network and scientific processes, and (3) select appropriate partners and establish explicit expectations for specific collaborations.
Affiliation(s)
- Danielle Jensen-Ryan
  - Department of Math and Sciences, Laramie County Community College, Cheyenne, WY 82007
- Jennifer Jo Thompson
  - Department of Crop and Soil Sciences, University of Georgia, Athens, GA 30602
  - *Address correspondence to: Jennifer Jo Thompson
|
23
|
Ejigu GF, Jung J. Review on the Computational Genome Annotation of Sequences Obtained by Next-Generation Sequencing. Biology 2020; 9:E295. [PMID: 32962098 PMCID: PMC7565776 DOI: 10.3390/biology9090295] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 08/21/2020] [Revised: 09/13/2020] [Accepted: 09/16/2020] [Indexed: 12/16/2022]
Abstract
Next-Generation Sequencing (NGS) has made it easier to obtain genome-wide sequence data and it has shifted the research focus into genome annotation. The challenging tasks involved in annotation rely on the currently available tools and techniques to decode the information contained in nucleotide sequences. This information will improve our understanding of general aspects of life and evolution and improve our ability to diagnose genetic disorders. Here, we present a summary of both structural and functional annotations, as well as the associated comparative annotation tools and pipelines. We highlight visualization tools that immensely aid the annotation process and the contributions of the scientific community to the annotation. Further, we discuss quality-control practices and the need for re-annotation, and highlight the future of annotation.
Affiliation(s)
- Jaehee Jung
  - Department of Information and Communication Engineering, Myongji University, Yongin-si 17058, Gyeonggi-do, Korea
|
24
|
Understanding the proteome encoded by "non-coding RNAs": new insights into human genome. Science China Life Sciences 2020; 63:986-995. [PMID: 32318910 DOI: 10.1007/s11427-019-1677-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/29/2019] [Accepted: 03/12/2020] [Indexed: 01/19/2023]
Abstract
A great number of non-coding RNAs (ncRNAs) account for the majority of the genome. The translation of these ncRNAs has been noted but seriously underestimated due to both technological and theoretical limitations. Based on the development of ribosome profiling (Ribo-seq), full-length translating RNA analysis (RNC-seq) and mass spectrometry technology, more and more ncRNAs are being found to be translated in different organisms, and some of them can produce functional peptides. Recently, not only individual new functional proteins but also a new proteome has been experimentally discovered to be encoded by endogenous lncRNAs and circRNAs. These new proteins are of biological significance, suggesting the connection of the translation of ncRNAs to human physiology and diseases. Therefore, an in-depth and systematic understanding of the coding capabilities of ncRNAs is necessary for basic biology and medicine. In this review, we summarize the advances in the field of discovering this new proteome, i.e. "ncRNA-coded" proteins.
|
25
|
Khodabandelou G, Routhier E, Mozziconacci J. Genome annotation across species using deep convolutional neural networks. PeerJ Comput Sci 2020; 6:e278. [PMID: 33816929 PMCID: PMC7924482 DOI: 10.7717/peerj-cs.278] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/15/2019] [Accepted: 05/18/2020] [Indexed: 06/12/2023]
Abstract
The application of deep neural networks is a rapidly expanding field now reaching many disciplines including genomics. In particular, convolutional neural networks have been exploited for identifying the functional role of short genomic sequences. These approaches rely on gathering large sets of sequences with known functional role, extracting those sequences from whole-genome annotations. These sets are then split into learning, test and validation sets in order to train the networks. While the obtained networks perform well on validation sets, they often perform poorly when applied on whole genomes in which the ratio of positive over negative examples can be very different from that in the training set. We here address this issue by assessing the genome-wide performance of networks trained with sets exhibiting different ratios of positive to negative examples. As a case study, we use sequences encompassing gene starts from the RefGene database as positive examples and random genomic sequences as negative examples. We then demonstrate that models trained using data from one organism can be used to predict gene-start sites in a related species, when using training sets providing good genome-wide performance. This cross-species application of convolutional neural networks provides a new way to annotate any genome from existing high-quality annotations in a related reference species. It also provides a way to determine whether the sequence motifs recognised by chromatin-associated proteins in different species are conserved or not.
Affiliation(s)
- Ghazaleh Khodabandelou
  - Laboratoire de Physique Théorique de la Matière Condensée (LPTMC), Sorbonne Université, Paris, France
  - Laboratoire Images, Signaux et Systèmes Intelligents (LISSI), Université Val-de-Marne (Paris XII), Paris, France
- Etienne Routhier
  - Laboratoire de Physique Théorique de la Matière Condensée (LPTMC), Sorbonne Université, Paris, France
- Julien Mozziconacci
  - Laboratoire de Physique Théorique de la Matière Condensée (LPTMC), Sorbonne Université, Paris, France
  - CNRS UMR 7196 / INSERM U1154 - Sorbonne Université, Museum national d’Histoire naturelle (MNHN), Paris, France
  - Institut Universitaire de France, Paris, France
|
26
|
Mikalsen SO, Tausen M, Í Kongsstovu S. Phylogeny of teleost connexins reveals highly inconsistent intra- and interspecies use of nomenclature and misassemblies in recent teleost chromosome assemblies. BMC Genomics 2020; 21:223. [PMID: 32160866 PMCID: PMC7066803 DOI: 10.1186/s12864-020-6620-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/06/2019] [Accepted: 02/25/2020] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND Based on an initial collection of database sequences from the gap junction protein gene family (also called connexin genes) in a few teleosts, the naming of these sequences appeared variable. The reasons could be (i) that the structure in this family is variable across teleosts, or (ii) unfortunate naming. Rather clear rules for the naming of genes in fish and mammals have been outlined by nomenclature committees, including the naming of orthologous and ohnologous genes. We therefore analyzed the connexin gene family in teleosts in more detail. We covered the range of divergence times in teleosts (eel, Atlantic herring, zebrafish, Atlantic cod, three-spined stickleback, Japanese pufferfish and spotted pufferfish; listed from early divergence to late divergence). RESULTS The gene family pattern of connexin genes is similar across the analyzed teleosts. However, (i) several nomenclature systems are used, (ii) specific orthologous groups contain genes that are named differently in different species, (iii) several distinct genes have the same name in a species, and (iv) some genes have incorrect names. The latter includes a human connexin pseudogene, claimed as GJA4P, but which in reality is Cx39.2P (a delta subfamily gene often called GJD2like). We point out the ohnologous pairs of genes in teleosts, and we suggest a more consistent nomenclature following the outlined rules from the nomenclature committees. We further show that connexin sequences can indicate some errors in two high-quality chromosome assemblies that became available very recently. CONCLUSIONS Minimal consistency exists in the present practice of naming teleost connexin genes. A consistent and unified nomenclature would be an advantage for future automatic annotations and would make various types of subsequent genetic analyses easier. Additionally, roughly 5% of the connexin sequences point out misassemblies in the new high-quality chromosome assemblies from herring and cod.
Affiliation(s)
- Svein-Ole Mikalsen
  - Faculty of Science and Technology, University of the Faroe Islands, Vestara Bryggja 15, FO-100, Tórshavn, Faroe Islands
- Marni Tausen
  - Faculty of Science and Technology, University of the Faroe Islands, Vestara Bryggja 15, FO-100, Tórshavn, Faroe Islands
  - Present affiliation: Bioinformatics Research Centre, Aarhus University, C. F. Møllers Allé 8, 8000, Aarhus C, Denmark
- Sunnvør Í Kongsstovu
  - Faculty of Science and Technology, University of the Faroe Islands, Vestara Bryggja 15, FO-100, Tórshavn, Faroe Islands
  - Amplexa Genetics A/S, Hoyvíksvegur 51, FO-100, Tórshavn, Faroe Islands
|
27
|
Pedro H, Yates AD, Kersey PJ, De Silva NH. Collaborative Annotation Redefines Gene Sets for Crucial Phytopathogens. Front Microbiol 2019; 10:2477. [PMID: 31787936 PMCID: PMC6854995 DOI: 10.3389/fmicb.2019.02477] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/01/2019] [Accepted: 10/15/2019] [Indexed: 11/15/2022] Open
Abstract
Accurate and comprehensive annotation of genomic sequences underpins advances in managing plant disease. However, important plant pathogens still have incomplete and inconsistent gene sets and lack dedicated funding or teams to improve this annotation. This paper describes a collaborative approach to gene curation to address this shortcoming. In the first instance, over 40 members of the Botrytis cinerea community from eight countries, with training and infrastructural support from Ensembl Fungi, used the gene editing tool Apollo to systematically review the entire gene set (11,707 protein coding genes) in 6-7 months. This has subsequently been checked and disseminated. Following this, a similar project for another pathogen, Blumeria graminis f. sp. hordei, also led to a completely redefined gene set. Currently, we are working with the Zymoseptoria tritici community to enable them to achieve the same. While the tangible outcome of these projects is improved gene sets, it is apparent that the inherent agreement and ownership of a single gene set by research teams as they undergo this curation process are consequential to the acceleration of research in the field. With the generation of large data sets increasingly affordable, there is value in unifying both the divergent data sets and their associated research teams, pooling time, expertise, and resources. Community-driven annotation efforts can pave the way for a new kind of collaboration among pathogen research communities to generate well-annotated reference data sets, beneficial not just for the genome being examined but for related species and the refinement of automatic gene prediction tools.
Affiliation(s)
- Nishadi H. De Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, United Kingdom
28
Mostajo NF, Lataretu M, Krautwurst S, Mock F, Desirò D, Lamkiewicz K, Collatz M, Schoen A, Weber F, Marz M, Hölzer M. A comprehensive annotation and differential expression analysis of short and long non-coding RNAs in 16 bat genomes. NAR Genom Bioinform 2019; 2:lqz006. [PMID: 32289119 PMCID: PMC7108008 DOI: 10.1093/nargab/lqz006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/03/2019] [Revised: 08/21/2019] [Accepted: 09/10/2019] [Indexed: 12/25/2022]
Abstract
Although bats are increasingly becoming the focus of scientific studies due to their unique properties, these exceptional animals are still among the least studied mammals. Assembly quality and completeness of bat genomes vary a lot and especially non-coding RNA (ncRNA) annotations are incomplete or simply missing. Accordingly, standard bioinformatics pipelines for gene expression analysis often ignore ncRNAs such as microRNAs or long antisense RNAs. The main cause of this problem is the use of incomplete genome annotations. We present a complete screening for ncRNAs within 16 bat genomes. NcRNAs affect a remarkable variety of vital biological functions, including gene expression regulation, RNA processing, RNA interference and, as recently described, regulatory processes in viral infections. Within all investigated bat assemblies, we annotated 667 ncRNA families including 162 snoRNAs and 193 miRNAs as well as rRNAs, tRNAs, several snRNAs and lncRNAs, and other structural ncRNA elements. We validated our ncRNA candidates by six RNA-Seq data sets and show significant expression patterns that have never been described before in a bat species on such a large scale. Our annotations will be usable as a resource (rna.uni-jena.de/supplements/bats) for deeper studying of bat evolution, ncRNAs repertoire, gene expression and regulation, ecology and important host–virus interactions.
Affiliation(s)
- Nelly F Mostajo
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; Institute of Virology, Philipps-University Marburg, Hans-Meerwein-Straße 2, 35043 Marburg, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Marie Lataretu
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Sebastian Krautwurst
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Florian Mock
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Daniel Desirò
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Kevin Lamkiewicz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Maximilian Collatz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
- Andreas Schoen
- Institute for Virology, FB10-Veterinary Medicine, Justus-Liebig University, 35392 Gießen, Germany; German Center for Infection Research (DZIF), partner sites 35043 Marburg and 35392 Gießen, Germany
- Friedemann Weber
- Institute of Virology, Philipps-University Marburg, Hans-Meerwein-Straße 2, 35043 Marburg, Germany; Institute for Virology, FB10-Veterinary Medicine, Justus-Liebig University, 35392 Gießen, Germany; German Center for Infection Research (DZIF), partner sites 35043 Marburg and 35392 Gießen, Germany
- Manja Marz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; FLI Leibniz Institute for Age Research, Beutenbergstraße 11, 07745 Jena, Germany
- Martin Hölzer
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany; European Virus Bioinformatics Center, Friedrich Schiller University Jena, Leutragraben 1, 07743 Jena, Germany
29
Kutralam-Muniasamy G, Pérez-Guevara F. Comparative genome analysis of completely sequenced Cupriavidus genomes provides insights into the biosynthetic potential and versatile applications of Cupriavidus alkaliphilus ASC-732. Can J Microbiol 2019; 65:575-595. [PMID: 31022352 DOI: 10.1139/cjm-2019-0027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
The genome analysis of microorganisms provides valuable information to endorse more extensive research on their potential applications. In this paper, the genome of Cupriavidus alkaliphilus ASC-732, isolated from agave rhizosphere in northeastern Mexico, was analyzed and compared with the genomes of other Cupriavidus species to gain better insight into the parts of the genetic makeup responsible for essential metabolic pathways and others of biotechnological importance. Here, the key genes related to glycolysis, pentose phosphate, and the Entner-Doudoroff and tricarboxylic acid cycle pathways were predicted. Comparative genome analysis revealed that the key genes for hydrogenotrophic growth and the carbon fixation pathway, i.e., those coding for hydrogenase and the enzymes of the Calvin-Benson-Bassham cycle, are absent in C. alkaliphilus ASC-732. Furthermore, capabilities for producing polyhydroxyalkanoates and extracellular polysaccharide matrix and degrading xenobiotics were found, and the related pathways are explained. Moreover, biofilm formation and the production of exopolysaccharides and polyhydroxyalkanoates were corroborated with crystal violet staining, calcofluor, and Nile red fluorochromes, confirming the presence of the products of the active genes in these pathways and their related metabolic routes, respectively. Additionally, a large group of genes essential for the resistance and detoxification of several heavy metals was also found. Thus, the present study demonstrates that this strain can respond to various environmental signals, such as energy source, nutrient limitations, virulence, and extreme metals concentration, indicating the possibility to foster C. alkaliphilus ASC-732 in diverse biotechnological applications.
Affiliation(s)
- Gurusamy Kutralam-Muniasamy
- Department of Biotechnology and Bioengineering, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Ciudad de Mexico, Mexico
- Fermín Pérez-Guevara
- Department of Biotechnology and Bioengineering, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Ciudad de Mexico, Mexico; Nanoscience and Nanotechnology Program, Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional, Ciudad de Mexico, Mexico
30
Caputo A, Fournier PE, Raoult D. Genome and pan-genome analysis to classify emerging bacteria. Biol Direct 2019; 14:5. [PMID: 30808378 PMCID: PMC6390601 DOI: 10.1186/s13062-019-0234-0] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 04/05/2018] [Accepted: 02/14/2019] [Indexed: 12/21/2022]
Abstract
Background: In recent years, genomic and pan-genomic studies have become increasingly important. Culturomics allows the study of human microbiota through the use of different culture conditions, coupled with a method of rapid identification by MALDI-TOF or 16S rRNA. Bacterial taxonomy is undergoing many changes as a consequence. With the help of pan-genomic analyses, species can be redefined, and new species definitions generated. Results: Genomics, coupled with culturomics, has led to the discovery of many novel bacterial species or genera, including Akkermansia muciniphila and Microvirga massiliensis. Using the genome to define species has been applied within the genus Klebsiella. A discontinuity or an abrupt break in the core/pan-genome ratio can uncover novel species. Conclusions: Applying genomic and pan-genomic analyses to the reclassification of other bacterial species or genera will be important in the future of medical microbiology. The pan-genome is one of many new innovative tools in bacterial taxonomy. Reviewers: This article was reviewed by William Martin, Eric Bapteste and James Mcinerney.
Affiliation(s)
- Aurélia Caputo
- Aix Marseille Univ, IRD, APHM, MEPHI, IHU-Méditerranée Infection, Marseille, France
- Didier Raoult
- Aix Marseille Univ, IRD, APHM, MEPHI, IHU-Méditerranée Infection, Marseille, France.
31
Aslebagh R, Wormwood KL, Channaveerappa D, Wetie AGN, Woods AG, Darie CC. Identification of Posttranslational Modifications (PTMs) of Proteins by Mass Spectrometry. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2019; 1140:199-224. [DOI: 10.1007/978-3-030-15950-4_11] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/28/2022]
32
Ngounou Wetie AG, Sokolowska I, Channaveerappa D, Dupree EJ, Jayathirtha M, Woods AG, Darie CC. Proteomics and Non-proteomics Approaches to Study Stable and Transient Protein-Protein Interactions. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2019; 1140:121-142. [DOI: 10.1007/978-3-030-15950-4_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
33
Lowe JWE. Sequencing through thick and thin: Historiographical and philosophical implications. STUDIES IN HISTORY AND PHILOSOPHY OF BIOLOGICAL AND BIOMEDICAL SCIENCES 2018; 72:10-27. [PMID: 30337139 DOI: 10.1016/j.shpsc.2018.10.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/09/2017] [Revised: 07/11/2018] [Accepted: 10/01/2018] [Indexed: 06/08/2023]
Abstract
DNA sequencing has been characterised by scholars and life scientists as an example of 'big', 'fast' and 'automated' science in biology. This paper argues, however, that these characterisations are a product of a particular interpretation of what sequencing is, what I call 'thin sequencing'. The 'thin sequencing' perspective focuses on the determination of the order of bases in a particular stretch of DNA. Based upon my research on the pig genome mapping and sequencing projects, I provide an alternative 'thick sequencing' perspective, which also includes a number of practices that enable the sequence to travel across and be used in wider communities. If we take sequencing in the thin manner to be an event demarcated by the determination of sequences in automated sequencing machines and computers, this has consequences for the historical analysis of sequencing projects, as it focuses attention on those parts of the work of sequencing that are more centralised, fast (and accelerating) and automated. I argue instead that sequencing can be interpreted as a more open-ended process including activities such as the generation of a minimum tile path or annotation, and detail the historiographical and philosophical consequences of this move.
Affiliation(s)
- James W E Lowe
- Science, Technology and Innovation Studies, University of Edinburgh, Old Surgeons' Hall, High School Yards, Edinburgh, EH1 1LZ, UK.
34
Ruegg K, Bay RA, Anderson EC, Saracco JF, Harrigan RJ, Whitfield M, Paxton EH, Smith TB. Ecological genomics predicts climate vulnerability in an endangered southwestern songbird. Ecol Lett 2018; 21:1085-1096. [DOI: 10.1111/ele.12977] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 11/09/2017] [Revised: 12/11/2017] [Accepted: 03/15/2018] [Indexed: 12/13/2022]
Affiliation(s)
- Kristen Ruegg
- Center for Tropical Research; Institute for the Environment and Sustainability; University of California Los Angeles; Los Angeles CA 90095 USA
- Department of Ecology and Evolutionary Biology; University of California Santa Cruz; Santa Cruz CA 95060 USA
- Rachael A. Bay
- Center for Tropical Research; Institute for the Environment and Sustainability; University of California Los Angeles; Los Angeles CA 90095 USA
- Department of Evolution and Ecology; University of California Davis; One Shields Ave Davis CA 95616 USA
- Southwest Fisheries Science Center; National Marine Fisheries Service; 110 Shaffer Road Santa Cruz CA 95060 USA
- Eric C. Anderson
- Department of Evolution and Ecology; University of California Davis; One Shields Ave Davis CA 95616 USA
- Southwest Fisheries Science Center; National Marine Fisheries Service; 110 Shaffer Road Santa Cruz CA 95060 USA
- James F. Saracco
- The Institute for Bird Populations; PO Box 1346 Point Reyes Station CA 94956 USA
- Ryan J. Harrigan
- Center for Tropical Research; Institute for the Environment and Sustainability; University of California Los Angeles; Los Angeles CA 90095 USA
- Mary Whitfield
- Southern Sierra Research Station; P.O. Box 1316 Weldon CA 932883 USA
- Eben H. Paxton
- U.S. Geological Survey Pacific Island Ecosystems Research Center; Hawaii Volcano National Park; HI 96718
- Thomas B. Smith
- Center for Tropical Research; Institute for the Environment and Sustainability; University of California Los Angeles; Los Angeles CA 90095 USA
- Department of Ecology and Evolutionary Biology; University of California, Los Angeles; 621 Charles E. Young Drive South Los Angeles CA 90095 USA
35
Pasipoularides A. Implementing genome-driven personalized cardiology in clinical practice. J Mol Cell Cardiol 2018; 115:142-157. [PMID: 29343412 PMCID: PMC5820118 DOI: 10.1016/j.yjmcc.2018.01.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/16/2017] [Revised: 01/04/2018] [Accepted: 01/12/2018] [Indexed: 12/18/2022]
Abstract
Genomics designates the coordinated investigation of a large number of genes in the context of a biological process or disease. It may be long before we attain comprehensive understanding of the genomics of common complex cardiovascular diseases (CVDs) such as inherited cardiomyopathies, valvular diseases, primary arrhythmogenic conditions, congenital heart syndromes, hypercholesterolemia and atherosclerotic heart disease, hypertensive syndromes, and heart failure with preserved/reduced ejection fraction. Nonetheless, as genomics is evolving rapidly, it is constructive to survey now pertinent concepts and breakthroughs. Today, clinical multimodal electronic medical/health records (EMRs/EHRs) incorporating genomic information establish a continuously-learning, vast knowledge-network with seamless cycling between clinical application and research. It can inform insights into specific pathogenetic pathways, guide biomarker-assisted precise diagnoses and individualized treatments, and stratify prognoses. Complex CVDs blend multiple interacting genomic variants, epigenetics, and environmental risk-factors, engendering progressions of multifaceted disease-manifestations, including clinical symptoms and signs. There is no straight-line linkage between genetic cause(s) or causal gene-variant(s) and disease phenotype(s). Because of interactions involving modifier-gene influences, (micro)-environmental, and epigenetic effects, the same variant may actually produce dissimilar abnormalities in different individuals. Implementing genome-driven personalized cardiology in clinical practice reveals that the study of CVDs at the level of molecules and cells can yield crucial clinical benefits. 
Complementing evidence-based medicine guidelines from large ("one-size fits all") randomized controlled trials, genomics-based personalized or precision cardiology is a most-creditable paradigm: It provides customizable approaches to prevent, diagnose, and manage CVDs with treatments directly/precisely aimed at causal defects identified by high-throughput genomic technologies. They encompass stem cell and gene therapies exploiting CRISPR-Cas9-gene-editing, and metabolomic-pharmacogenomic therapeutic modalities, precisely fine-tuned for the individual patient. Following the Human Genome Project, many expected genomics technology to provide imminent solutions to intractable medical problems, including CVDs. This eagerness has reaped some disappointment that advances have not yet materialized to the degree anticipated. Undoubtedly, personalized genetic/genomics testing is an emergent technology that should not be applied without supplementary phenotypic/clinical information: Genotype≠Phenotype. However, forthcoming advances in genomics will naturally build on prior attainments and, combined with insights into relevant epigenetics and environmental factors, can plausibly eradicate intractable CVDs, improving human health and well-being.
Affiliation(s)
- Ares Pasipoularides
- Consulting Professor of Surgery, Emeritus Faculty of Surgery and of Biomedical Engineering, Duke University School of Medicine and Graduate School, Durham, NC 27710, USA.
36
BluePharmTrain: Biology and Biotechnology of Marine Sponges. GRAND CHALLENGES IN MARINE BIOTECHNOLOGY 2018. [DOI: 10.1007/978-3-319-69075-9_13] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023]
37
38
Abstract
Microbial communities are widespread in the environment, and to isolate and identify species or to determine relations among microorganisms, some 'omics methods like metagenomics, proteomics, and metabolomics have been used. When combined with various 'omics data, models known as artificial microbial ecosystems (AME) are powerful methods that can make functional predictions about microbial communities. Reconstruction of an AME model is the first step for model analysis. Many techniques have been applied to the construction of AME models, e.g., the compartmentalization approach, community objectives method, and dynamic analysis approach. Of these approaches, species compartmentalization is the most relevant to genetics. Besides, some algorithms have been developed for the analysis of AME models. In this chapter, we present a general protocol for the use of the species compartmentalization method to reconstruct a model of microbial communities. Then, the analysis of an AME is discussed.
39
40
Kim S, Jeong H, Kim EY, Kim JF, Lee SY, Yoon SH. Genomic and transcriptomic landscape of Escherichia coli BL21(DE3). Nucleic Acids Res 2017; 45:5285-5293. [PMID: 28379538 PMCID: PMC5435950 DOI: 10.1093/nar/gkx228] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 01/13/2017] [Accepted: 03/26/2017] [Indexed: 11/23/2022]
Abstract
Escherichia coli BL21(DE3) has long served as a model organism for scientific research, as well as a workhorse for biotechnology. Here we present the most current genome annotation of E. coli BL21(DE3) based on the transcriptome structure of the strain that was determined for the first time. The genome was annotated using multiple automated pipelines and compared to the current genome annotation of the closely related strain, E. coli K-12. High-resolution tiling array data of E. coli BL21(DE3) from several different stages of cell growth in rich and minimal media were analyzed to characterize the transcriptome structure and to provide supporting evidence for open reading frames. This new integrated analysis of the genomic and transcriptomic structure of E. coli BL21(DE3) has led to the correction of translation initiation sites for 88 coding DNA sequences and provided updated information for most genes. Additionally, 37 putative genes and 66 putative non-coding RNAs were also identified. The panoramic landscape of the genome and transcriptome of E. coli BL21(DE3) revealed here will allow us to better understand the fundamental biology of the strain and also advance biotechnological applications in industry.
Affiliation(s)
- Sinyeon Kim
- Department of Bioscience and Biotechnology, Konkuk University, Seoul 05029, Republic of Korea
- Haeyoung Jeong
- Infectious Disease Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon 34141, Republic of Korea
- Eun-Youn Kim
- School of Basic Sciences, Hanbat National University, Daejeon 34158, Republic of Korea
- Jihyun F Kim
- Department of Systems Biology and Division of Life Sciences, Yonsei University, Seoul 03722, Republic of Korea
- Sang Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Plus Program), BioProcess Engineering Research Center, Center for Systems and Synthetic Biotechnology, and Institute for the BioCentury, KAIST, Daejeon 34141, Republic of Korea
- Sung Ho Yoon
- Department of Bioscience and Biotechnology, Konkuk University, Seoul 05029, Republic of Korea
41
Saha S, Hosmani PS, Villalobos-Ayala K, Miller S, Shippy T, Flores M, Rosendale A, Cordola C, Bell T, Mann H, DeAvila G, DeAvila D, Moore Z, Buller K, Ciolkevich K, Nandyal S, Mahoney R, Van Voorhis J, Dunlevy M, Farrow D, Hunter D, Morgan T, Shore K, Guzman V, Izsak A, Dixon DE, Cridge A, Cano L, Cao X, Jiang H, Leng N, Johnson S, Cantarel BL, Richards S, English A, Shatters RG, Childers C, Chen MJ, Hunter W, Cilia M, Mueller LA, Munoz-Torres M, Nelson D, Poelchau MF, Benoit JB, Wiersma-Koch H, D’Elia T, Brown SJ. Improved annotation of the insect vector of citrus greening disease: biocuration by a diverse genomics community. Database (Oxford) 2017; 2017:3917099. [PMID: 29220441 PMCID: PMC5502364 DOI: 10.1093/database/bax032] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 01/05/2017] [Revised: 03/03/2017] [Accepted: 03/25/2017] [Indexed: 01/08/2023]
Abstract
Database URL https://citrusgreening.org/.
Affiliation(s)
- Sherry Miller
- Division of Biology, Kansas State University, Manhattan, KS
- Teresa Shippy
- Division of Biology, Kansas State University, Manhattan, KS
- David Hunter
- Division of Biology, Kansas State University, Manhattan, KS
- Taylar Morgan
- Division of Biology, Kansas State University, Manhattan, KS
- Kayla Shore
- Division of Biology, Kansas State University, Manhattan, KS
- Danielle E Dixon
- Boyce Thompson Institute, Ithaca, NY
- University of Puget Sound, Tacoma, WA, USA
- Andrew Cridge
- University of Otago, North Dunedin, Dunedin, New Zealand
- Liliana Cano
- Plant Pathology, University of Florida/IFAS Indian River Research and Education Center, Ft. Pierce, FL
- Haobo Jiang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
- Nan Leng
- Department of Bioinformatics, UT Southwestern Medical Center, Bioinformatics Core Facility, Dallas, TX
- Brandi L Cantarel
- Department of Entomology and Plant Pathology, Oklahoma State University, Stillwater, OK
- Stephen Richards
- Illumina Inc., San Diego, CA
- Los Alamos National Laboratory, Los Alamos, NM
- Adam English
- Illumina Inc., San Diego, CA
- Los Alamos National Laboratory, Los Alamos, NM
- Chris Childers
- USDA ARS, U.S. Horticultural Research Laboratory, Ft. Pierce, FL
- Mei-Ju Chen
- USDA Agricultural Research Service, National Agricultural Library, Beltsville, MD, USA
- Wayne Hunter
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
- Michelle Cilia
- USDA ARS, Emerging Pests and Pathogens Research Unit, Ithaca, NY
- Plant Pathology and Plant-Microbe Biology Section
- Lukas A Mueller
- Boyce Thompson Institute, Ithaca, NY
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY
- Monica Munoz-Torres
- Lawrence Berkeley National Laboratory, Environmental Genomics and Systems Biology, Berkeley, CA
- David Nelson
- Department of Microbiology, Immunology and Biochemistry, The University of Tennessee Health Science Center, Memphis, TN, USA
- Tom D’Elia
- Indian River State College, Fort Pierce, FL
- Susan J Brown
- Division of Biology, Kansas State University, Manhattan, KS
42
Chan CX, Beiko RG, Ragan MA. Scaling Up the Phylogenetic Detection of Lateral Gene Transfer Events. Methods Mol Biol 2017; 1525:421-432. [PMID: 27896730 DOI: 10.1007/978-1-4939-6622-6_16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/25/2023]
Abstract
Lateral genetic transfer (LGT) is the process by which genetic material moves between organisms (and viruses) in the biosphere. Among the many approaches developed for the inference of LGT events from DNA sequence data, methods based on the comparison of phylogenetic trees remain the gold standard for many types of problem. Identifying LGT events from sequenced genomes typically involves a series of steps in which homologous sequences are identified and aligned, phylogenetic trees are inferred, and their topologies are compared to identify unexpected or conflicting relationships. These types of approach have been used to elucidate the nature and extent of LGT and its physiological and ecological consequences throughout the Tree of Life. Advances in DNA sequencing technology have led to enormous increases in the number of sequenced genomes, including ultra-deep sampling of specific taxonomic groups and single cell-based sequencing of unculturable "microbial dark matter." Environmental shotgun sequencing enables the study of LGT among organisms that share the same habitat. This abundance of genomic data offers new opportunities for scientific discovery, but poses two key problems. As ever more genomes are generated, the assembly and annotation of each individual genome receives less scrutiny; and with so many genomes available it is tempting to include them all in a single analysis, but thousands of genomes and millions of genes can overwhelm key algorithms in the analysis pipeline. Identifying LGT events of interest therefore depends on choosing the right dataset, and on algorithms that appropriately balance speed and accuracy given the size and composition of the chosen set of genomes.
Affiliation(s)
- Cheong Xin Chan
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, 4072, Australia
- Robert G Beiko
- Faculty of Computer Science, Dalhousie University, Halifax, NS, B3H 4R2, Canada
- Mark A Ragan
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, 4072, Australia.
43
Abstract
Collaborations between the scientific community and members of the Gene Ontology (GO) Consortium have led to an increase in the number and specificity of GO terms, as well as an increase in the number of GO annotations. A variety of approaches have been taken to encourage research scientists to contribute to the GO, but the success of these approaches has been variable. This chapter reviews both the successes and failures of engaging the scientific community in GO development and annotation, and provides motivation and advice to encourage individual researchers to contribute to GO.
Affiliation(s)
- Ruth C Lovering
- Functional Gene Annotation Initiative, Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, 5 University Street, London, WC1E 6JF, UK.
44
Abstract
During the journey from the discovery of DNA as the source of genetic information and the elucidation of the double-helical nature of the DNA molecule to the assembly of the human genome sequence and thereafter, bioinformatics has become an integral part of modern biology. Bioinformatics relies substantially on significant contributions made by scientists in various fields, including but not limited to linguistics, biology, mathematics, computer science, and statistics. There is an ever-increasing amount of data to elucidate toxic mechanisms and/or adverse effects of xenobiotics in the field of toxicogenomics. Annotation in combination with various bioinformatics analytical tools can play a crucial role in the understanding of genes and proteins, and can potentially help draw meaningful conclusions from various data sources. This article attempts to present a simple overview of bioinformatics, and an effort is made to discuss annotation.
Affiliation(s)
- Deepak K Rajpal
- GlaxoSmithKline, Research Triangle Park, North Carolina 27709-3398, USA.
45
Making sense of genomes of parasitic worms: Tackling bioinformatic challenges. Biotechnol Adv 2016; 34:663-686. [DOI: 10.1016/j.biotechadv.2016.03.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 11/05/2015] [Revised: 02/25/2016] [Accepted: 03/01/2016] [Indexed: 01/25/2023]
46
Leelananda SP, Kloczkowski A, Jernigan RL. Fold-specific sequence scoring improves protein sequence matching. BMC Bioinformatics 2016; 17:328. [PMID: 27578239 PMCID: PMC5006591 DOI: 10.1186/s12859-016-1198-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/21/2016] [Accepted: 08/24/2016] [Indexed: 11/10/2022]
Abstract
Background: Sequence matching is extremely important for applications throughout biology, particularly for discovering information such as functional and evolutionary relationships, and also for discriminating between unimportant and disease mutants. At present the functions of a large fraction of genes are unknown; improvements in sequence matching will improve gene annotations. Universal amino acid substitution matrices such as Blosum62 are used to measure sequence similarities and to identify distant homologues, regardless of the structure class. However, such single matrices do not take into account important structural information evident within the different topologies of proteins and treat substitutions within all protein folds identically. Others have suggested that the use of structural information can lead to significant improvements in sequence matching but this has not yet been very effective. Here we develop novel substitution matrices that include not only general sequence information but also have a topology specific component that is unique for each CATH topology. This novel feature of using a combination of sequence and structure information for each protein topology significantly improves the sequence matching scores for the sequence pairs tested. We have used a novel multi-structure alignment method for each homology level of CATH in order to extract topological information. Results: We obtain statistically significant improvements in sequence matching scores for 73% of the alpha helical test cases. On average, 61% of the test cases showed improvements in homology detection when structure information was incorporated into the substitution matrices. On average z-scores for homology detection are improved by more than 54% for all cases, and some individual cases have z-scores more than twice those obtained using generic matrices.
Our topology specific similarity matrices also outperform other traditional similarity matrices and single matrix based structure methods. When the default amino acid substitution matrix in the Psi-blast algorithm is replaced by our structure-based matrices, the structure matching is significantly improved over conventional Psi-blast. It also outperforms results obtained for the corresponding HMM profiles generated for each topology. Conclusions: We show that by incorporating topology-specific structure information in addition to sequence information into specific amino acid substitution matrices, the sequence matching scores and homology detection are significantly improved. Our topology specific similarity matrices outperform other traditional similarity matrices and single matrix based structure methods, and also show improvement over conventional Psi-blast and HMM profile based methods in sequence matching. The results support the discriminatory ability of the new amino acid similarity matrices to distinguish between distant homologs and structurally dissimilar pairs. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1198-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sumudu P Leelananda
- Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA; Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA; Present Address: 2120 Newman and Wolfrom Laboratory, The Ohio State University, 100 W 18th Ave, Columbus, OH, 43210, USA; Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
- Andrzej Kloczkowski
- Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA; Present Address: Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA
- Robert L Jernigan
- Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA; Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
|
47
|
An F, Chen T, Stéphanie DMA, Li K, Li QX, Carvalho LJCB, Tomlins K, Li J, Gu B, Chen S. Domestication Syndrome Is Investigated by Proteomic Analysis between Cultivated Cassava (Manihot esculenta Crantz) and Its Wild Relatives. PLoS One 2016; 11:e0152154. [PMID: 27023871 PMCID: PMC4811587 DOI: 10.1371/journal.pone.0152154] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/06/2015] [Accepted: 03/09/2016] [Indexed: 11/19/2022] Open
Abstract
Cassava (Manihot esculenta Crantz) wild relatives remain a largely untapped potential for genetic improvement. However, the domestication syndrome phenomena from wild species to cultivated cassava remain poorly understood. Leaf anatomy and photosynthetic activity differed significantly between the cassava cultivars SC205 and SC8 and the wild relative M. esculenta ssp. flabellifolia (W14). The dry matter, starch and amylose contents in the storage roots of the cassava cultivars were significantly higher than those in the wild species. To further reveal the differences in photosynthesis and starch accumulation between cultivars and wild species, the global differential proteins among SC205, SC8 and W14 were analyzed using 2-DE in combination with MALDI-TOF tandem mass spectrometry. A total of 175 and 304 proteins were identified in leaves and storage roots, respectively. Of these, 122 and 127 proteins in leaves and storage roots, respectively, were common to SC205, SC8 and W14. There were 11, 2 and 2 unique proteins in leaves, as well as 58, 9 and 12 unique proteins in storage roots, for W14, SC205 and SC8, respectively, indicating proteomic changes in leaves and storage roots between cultivated cassava and its wild relatives. These proteins and their differential regulation across plants of contrasting leaf morphology, leaf anatomy, photosynthesis-related parameters and starch content could contribute to the footprinting of the cassava domestication syndrome. We conclude that these global protein data would be of great value for detecting the key gene groups related to cassava selection in the domestication syndrome.
Affiliation(s)
- Feifei An
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences/Key Laboratory of Ministry of Agriculture for Germplasm Resources Conservation and Utilization of Cassava, Danzhou 571737, China
- Ting Chen
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences/Key Laboratory of Ministry of Agriculture for Germplasm Resources Conservation and Utilization of Cassava, Danzhou 571737, China
- Djabou Mouafi Astride Stéphanie
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences/Key Laboratory of Ministry of Agriculture for Germplasm Resources Conservation and Utilization of Cassava, Danzhou 571737, China
- Laboratory of Plant Physiology, Higher Teacher’s Training College, University of Yaounde I, P. O. Box 47, Yaounde, Cameroon
- Kaimian Li
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences/Key Laboratory of Ministry of Agriculture for Germplasm Resources Conservation and Utilization of Cassava, Danzhou 571737, China
- Qing X. Li
- Department of Molecular Biosciences and Bioengineering, University of Hawaii at Manoa, Manoa, HI 96822, United States of America
- * E-mail: (SBC); (QXL)
- Keith Tomlins
- Natural Resources Institute, University of Greenwich, Chatham ME4 4TB, United Kingdom
- Jun Li
- Analysis and Testing Center, Jiangsu University, Zhenjiang 212013, China
- Bi Gu
- Chemical Starch Institute, Guangxi University, Nanning 300004, China
- Songbi Chen
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences/Key Laboratory of Ministry of Agriculture for Germplasm Resources Conservation and Utilization of Cassava, Danzhou 571737, China
- * E-mail: (SBC); (QXL)
|
48
|
Abstract
Y. pestis exhibits dramatically different traits of pathogenicity and transmission despite its close genetic relationship with its ancestor, Y. pseudotuberculosis, a self-limiting gastroenteric pathogen. Y. pestis has evolved into a deadly pathogen that is transmitted to mammals, including humans, by the bite of infected fleas or by direct contact with infected animals. Various kinds of environmental changes are involved in its complex life cycle and pathogenesis. Dynamic regulation of gene expression is critical for environmental adaptation and survival, and is primarily mediated by transcription factors and small regulatory RNAs at the transcriptional and posttranscriptional levels, respectively. Genetic regulation has been shown to profoundly influence Y. pestis physiology and pathogenesis, including stress resistance, biofilm formation, intracellular survival, and replication. In this chapter, we mainly summarize progress on popular methods for studying genetic regulation and on the regulatory patterns and consequences of many key transcriptional and posttranscriptional regulators, with a particular emphasis on how genetic regulation influences the biofilm formation and virulence of Y. pestis.
|
49
|
An efficient transcriptome analysis pipeline to accelerate venom peptide discovery and characterisation. Toxicon 2015; 107:282-9. [DOI: 10.1016/j.toxicon.2015.09.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 07/31/2015] [Revised: 08/26/2015] [Accepted: 09/10/2015] [Indexed: 01/04/2023]
|
50
|
Lapatas V, Stefanidakis M, Jimenez RC, Via A, Schneider MV. Data integration in biological research: an overview. Journal of Biological Research (Thessalonike, Greece) 2015; 22:9. [PMID: 26336651 PMCID: PMC4557916 DOI: 10.1186/s40709-015-0032-5] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 04/20/2015] [Accepted: 08/10/2015] [Indexed: 11/16/2022]
Abstract
Data sharing, integration and annotation are essential to ensure the reproducibility of the analysis and interpretation of experimental findings. Often these activities are perceived as a role that bioinformaticians and computer scientists have to take on with little or no input from the experimental biologist. On the contrary, biological researchers, being the producers and often the end users of such data, have a big role in enabling biological data integration. The quality and usefulness of data integration depend on the existence and adoption of standards, shared formats, and mechanisms that are suitable for biological researchers to submit and annotate the data, so that they can be easily searched, conveniently linked and consequently used for further biological analysis and discovery. Here, we provide background on what data integration is from a computational science point of view, how it has been applied to biological research, which key aspects contributed to its success, and future directions.
Affiliation(s)
- Vasileios Lapatas
- Department of Informatics, Ionian University, 7 Tsirigoti Square, Corfu, 49100 Greece
- Michalis Stefanidakis
- Department of Informatics, Ionian University, 7 Tsirigoti Square, Corfu, 49100 Greece
- Allegra Via
- Biocomputing Group, Sapienza University, Piazzale Aldo Moro 5, Rome, 00185 Italy
|