101
|
orthoFind Facilitates the Discovery of Homologous and Orthologous Proteins. PLoS One 2015; 10:e0143906. [PMID: 26624019 PMCID: PMC4666658 DOI: 10.1371/journal.pone.0143906] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/08/2015] [Accepted: 10/07/2015] [Indexed: 11/19/2022] Open
Abstract
Finding homologous and orthologous protein sequences is often the first step in evolutionary studies, annotation projects, and experiments of functional complementation. Despite all currently available computational tools, there is a requirement for easy-to-use tools that provide functional information. Here, a new web application called orthoFind is presented, which allows a quick search for homologous and orthologous proteins given one or more query sequences, allowing a recurrent and exhaustive search against reference proteomes, and being able to include user databases. It addresses the protein multidomain problem, searching for homologs with the same domain architecture, and gives a simple functional analysis of the results to help in the annotation process. orthoFind is easy to use and has been proven to provide accurate results with different datasets. Availability: http://www.bioinfocabd.upo.es/orthofind/.
|
102
|
Wang Y, Coleman-Derr D, Chen G, Gu YQ. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species. Nucleic Acids Res 2015. [PMID: 25964301 DOI: 10.1093/nar/gkv487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023] Open
Abstract
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn.
Affiliation(s)
- Yi Wang
- USDA-ARS, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA; Department of Plant Sciences, University of California, Davis, CA 95616, USA; Bioengineering College, Campus A, Chongqing University, Chongqing 400030, China
- Guoping Chen
- Bioengineering College, Campus A, Chongqing University, Chongqing 400030, China
- Yong Q Gu
- USDA-ARS, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA
|
103
|
Guna A, Butcher NJ, Bassett AS. Comparative mapping of the 22q11.2 deletion region and the potential of simple model organisms. J Neurodev Disord 2015; 7:18. [PMID: 26137170 PMCID: PMC4487986 DOI: 10.1186/s11689-015-9113-x] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 02/23/2015] [Accepted: 05/26/2015] [Indexed: 01/18/2023] Open
Abstract
Background 22q11.2 deletion syndrome (22q11.2DS) is the most common micro-deletion syndrome. The associated 22q11.2 deletion conveys the strongest known molecular risk for schizophrenia. Neurodevelopmental phenotypes, including intellectual disability, are also prominent though variable in severity. Other developmental features include congenital cardiac and craniofacial anomalies. Whereas existing mouse models have been helpful in determining the role of some genes overlapped by the hemizygous 22q11.2 deletion in phenotypic expression, much remains unknown. Simple model organisms remain largely unexploited in exploring these genotype-phenotype relationships. Methods We first developed a comprehensive map of the human 22q11.2 deletion region, delineating gene content, and brain expression. To identify putative orthologs, standard methods were used to interrogate the proteomes of the zebrafish (D. rerio), fruit fly (D. melanogaster), and worm (C. elegans), in addition to the mouse. Spatial locations of conserved homologues were mapped to examine syntenic relationships. We systematically cataloged available knockout and knockdown models of all conserved genes across these organisms, including a comprehensive review of associated phenotypes. Results There are 90 genes overlapped by the typical 2.5 Mb deletion 22q11.2 region. Of the 46 protein-coding genes, 41 (89.1 %) have documented expression in the human brain. Identified homologues in the zebrafish (n = 37, 80.4 %) were comparable to those in the mouse (n = 40, 86.9 %) and included some conserved gene cluster structures. There were 22 (47.8 %) putative homologues in the fruit fly and 17 (37.0 %) in the worm involving multiple chromosomes. Individual gene knockdown mutants were available for the simple model organisms, but not for mouse. 
Although phenotypic data were relatively limited for knockout and knockdown models of the 17 genes conserved across all species, there was some evidence for roles in neurodevelopmental phenotypes, including four of the six mitochondrial genes in the 22q11.2 deletion region. Conclusions Simple model organisms represent a powerful but underutilized means of investigating the molecular mechanisms underlying the elevated risk for neurodevelopmental disorders in 22q11.2DS. This comparative multi-species study provides novel resources and support for the potential utility of non-mouse models in expression studies and high-throughput drug screening. The approach has implications for other recurrent copy number variations associated with neurodevelopmental phenotypes. Electronic supplementary material The online version of this article (doi:10.1186/s11689-015-9113-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alina Guna
- Clinical Genetics Research Program and Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON Canada
- Nancy J Butcher
- Clinical Genetics Research Program and Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON Canada; Institute of Medical Science, University of Toronto, Toronto, ON Canada
- Anne S Bassett
- Clinical Genetics Research Program and Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON Canada; Institute of Medical Science, University of Toronto, Toronto, ON Canada; Dalglish Family Hearts and Minds Clinic for Adults with 22q11.2 Deletion Syndrome, Division of Cardiology, Department of Medicine, Department of Psychiatry, and Toronto General Research Institute, University Health Network, Toronto, ON Canada; Department of Psychiatry, University of Toronto, Toronto, ON Canada; Centre for Addiction and Mental Health, 33 Russell Street, Room 1100, M5S 2S1 Toronto, ON Canada
|
104
|
Li L, Ji G, Ye C, Shu C, Zhang J, Liang C. PlantOrDB: a genome-wide ortholog database for land plants and green algae. BMC Plant Biol 2015; 15:161. [PMID: 26112452 PMCID: PMC4481079 DOI: 10.1186/s12870-015-0531-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/01/2015] [Accepted: 05/21/2015] [Indexed: 05/07/2023]
Abstract
BACKGROUND Genes with different functions are originally generated from some ancestral genes by gene duplication, mutation and functional recombination. It is widely accepted that orthologs are homologous genes that evolved from speciation events, while paralogs are homologous genes that resulted from gene duplication events. With the rapid increase of genomic data, identifying and distinguishing these genes among different species is becoming an important part of functional genomics research. DESCRIPTION Using 35 plant and 6 green algal genomes from Phytozome v9, we clustered 1,291,670 peptide sequences into 49,355 homologous gene families in terms of sequence similarity. For each gene family, we have generated a peptide sequence alignment and phylogenetic tree, and identified the speciation/duplication events for every node within the tree. For each node, we also identified and highlighted diagnostic characters that facilitate appropriate addition of a new query sequence into the existing phylogenetic tree and sequence alignment of its best matched gene family. Based on a desired species or subgroup of all species, users can selectively view the phylogenetic tree, sequence alignment and diagnostic characters for a given gene family. PlantOrDB not only allows users to identify orthologs or paralogs from phylogenetic trees, but also provides all orthologs that are built using the Reciprocal Best Hit (RBH) pairwise alignment method. Users can upload their own sequences to find the best matched gene families, and visualize their query sequences within the relevant phylogenetic trees and sequence alignments. CONCLUSION PlantOrDB (http://bioinfolab.miamioh.edu/plantordb) is a genome-wide ortholog database for land plants and green algae. PlantOrDB offers highly interactive visualization, accurate query classification and powerful search functions useful for functional genomic research.
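The Reciprocal Best Hit idea mentioned in the abstract can be illustrated with a short sketch. This is not PlantOrDB's pipeline: the function names, the `(query, subject, score)` tuple layout, and the gene identifiers below are all assumptions made for illustration. Two genes are called an RBH pair when each is the other's top-scoring hit in searches run in both directions.

```python
def best_hits(hits):
    """Map each query ID to its top-scoring subject ID.

    `hits` is an iterable of (query_id, subject_id, score) tuples, for
    example parsed from a tabular similarity search (hypothetical layout).
    On ties, the first hit seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a_gene, b_gene) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)   # species A genes searched against species B
    reverse = best_hits(b_vs_a)   # species B genes searched against species A
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Made-up hits between two species, A and B.
a_vs_b = [("A1", "B1", 250.0), ("A1", "B2", 90.0), ("A2", "B2", 180.0), ("A3", "B2", 60.0)]
b_vs_a = [("B1", "A1", 245.0), ("B2", "A2", 175.0), ("B2", "A3", 55.0)]

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

In this toy input, `A3` has `B2` as its best hit, but `B2` prefers `A2`, so `A3` is left without an RBH partner. Pairwise RBH of this kind does not distinguish speciation from duplication events, which is what the tree-based approach described in the abstract is for.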
Affiliation(s)
- Lei Li
- Department of Automation, Xiamen University, Fujian, 361005, China.
- Department of Biology, Miami University, Oxford, OH, 45056, USA.
- Guoli Ji
- Department of Automation, Xiamen University, Fujian, 361005, China.
- Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, Fujian, 361005, China.
- Congting Ye
- Department of Automation, Xiamen University, Fujian, 361005, China.
- Department of Biology, Miami University, Oxford, OH, 45056, USA.
- Changlong Shu
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
- Jie Zhang
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
- Chun Liang
- Department of Biology, Miami University, Oxford, OH, 45056, USA.
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
|
105
|
Wang Y, Coleman-Derr D, Chen G, Gu YQ. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species. Nucleic Acids Res 2015; 43:W78-84. [PMID: 25964301 PMCID: PMC4489293 DOI: 10.1093/nar/gkv487] [Citation(s) in RCA: 313] [Impact Index Per Article: 34.8] [Received: 02/08/2015] [Accepted: 05/02/2015] [Indexed: 01/19/2023] Open
Abstract
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn.
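The "disjunction and intersection of clusters shared between species" can be pictured as counting clusters per Venn region. The sketch below is not OrthoVenn's code: the data layout (cluster ID mapped to a list of `(species, gene)` members), the function names, and the example species are assumptions for illustration only.

```python
from collections import Counter


def venn_regions(clusters):
    """Count clusters for each exact combination of species (one Venn region each)."""
    return Counter(
        frozenset(species for species, _gene in members)
        for members in clusters.values()
    )


def single_copy_clusters(clusters, species):
    """Return IDs of clusters holding exactly one gene from every listed species."""
    wanted = set(species)
    result = []
    for cluster_id, members in clusters.items():
        counts = Counter(sp for sp, _gene in members)
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            result.append(cluster_id)
    return sorted(result)


# Made-up clusters over three hypothetical species.
clusters = {
    "c1": [("rice", "g1"), ("maize", "g2"), ("wheat", "g3")],
    "c2": [("rice", "g4"), ("rice", "g5"), ("maize", "g6")],
    "c3": [("wheat", "g7"), ("wheat", "g8")],
    "c4": [("rice", "g9"), ("maize", "g10")],
}

regions = venn_regions(clusters)
for combo, count in regions.items():
    print(sorted(combo), count)
print(single_copy_clusters(clusters, ["rice", "maize", "wheat"]))
```

Keying each region by a `frozenset` of species means a cluster is counted only in the region for exactly the species it contains, which is how the areas of a Venn diagram are normally read.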
Affiliation(s)
- Yi Wang
- USDA-ARS, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA; Department of Plant Sciences, University of California, Davis, CA 95616, USA; Bioengineering College, Campus A, Chongqing University, Chongqing 400030, China
- Guoping Chen
- Bioengineering College, Campus A, Chongqing University, Chongqing 400030, China
- Yong Q Gu
- USDA-ARS, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA
|
106
|
Abstract
The human spliceosome is a large ribonucleoprotein complex that catalyzes pre-mRNA splicing. It consists of five snRNAs and more than 200 proteins. Because of this complexity, much work has focused on the Saccharomyces cerevisiae spliceosome, viewed as a highly simplified system with fewer than half as many splicing factors as humans. Nevertheless, it has been difficult to ascribe a mechanistic function to individual splicing factors or even to discern which are critical for catalyzing the splicing reaction. We have identified and characterized the splicing machinery from the red alga Cyanidioschyzon merolae, which has been reported to harbor only 26 intron-containing genes. The U2, U4, U5, and U6 snRNAs contain expected conserved sequences and have the ability to adopt secondary structures and form intermolecular base-pairing interactions, as in other organisms. C. merolae has a highly reduced set of 43 identifiable core splicing proteins, compared with ∼90 in budding yeast and ∼140 in humans. Strikingly, we have been unable to find a U1 snRNA candidate or any predicted U1-associated proteins, suggesting that splicing in C. merolae may occur without the U1 small nuclear ribonucleoprotein particle. In addition, based on mapping the identified proteins onto the known splicing cycle, we propose that there is far less compositional variability during splicing in C. merolae than in other organisms. The observed reduction in splicing factors is consistent with the elimination of spliceosomal components that play a peripheral or modulatory role in splicing, presumably retaining those with a more central role in organization and catalysis.
|
107
|
Ream DC, Bankapur AR, Friedberg I. An event-driven approach for studying gene block evolution in bacteria. Bioinformatics 2015; 31:2075-83. [PMID: 25717195 PMCID: PMC4481853 DOI: 10.1093/bioinformatics/btv128] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/03/2014] [Accepted: 02/20/2015] [Indexed: 11/24/2022]
Abstract
Motivation: Gene blocks are genes co-located on the chromosome. In many cases, gene blocks are conserved between bacterial species, sometimes as operons, when genes are co-transcribed. The conservation is rarely absolute: gene loss, gain, duplication, block splitting and block fusion are frequently observed. An open question in bacterial molecular evolution is that of the formation and breakup of gene blocks, for which several models have been proposed. These models, however, are not generally applicable to all types of gene blocks, and consequently cannot be used to broadly compare and study gene block evolution. To address this problem, we introduce an event-based method for tracking gene block evolution in bacteria. Results: We show here that the evolution of gene blocks in proteobacteria can be described by a small set of events. Those include the insertion of genes into, or the splitting of genes out of a gene block, gene loss, and gene duplication. We show how the event-based method of gene block evolution allows us to determine the evolutionary rate and may be used to trace the ancestral states of their formation. We conclude that the event-based method can be used to help us understand the formation of these important bacterial genomic structures. Availability and implementation: The software is available under the GPLv3 license at http://github.com/reamdc1/gene_block_evolution.git. Supplementary online material: http://iddo-friedberg.net/operon-evolution Contact: i.friedberg@miamioh.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- David C Ream
- Department of Microbiology, Miami University, Oxford, OH, USA and Department of Computer Science and Software Engineering, Miami University, Oxford, OH, USA
- Asma R Bankapur
- Department of Microbiology, Miami University, Oxford, OH, USA and Department of Computer Science and Software Engineering, Miami University, Oxford, OH, USA
- Iddo Friedberg
- Department of Microbiology, Miami University, Oxford, OH, USA and Department of Computer Science and Software Engineering, Miami University, Oxford, OH, USA
|
108
|
Predicting Functional Interactions Among Genes in Prokaryotes by Genomic Context. Adv Exp Med Biol 2015; 883:97-106. [PMID: 26621463 DOI: 10.1007/978-3-319-23603-2_5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/19/2022]
Abstract
Genomic context methods for finding functions of unannotated genes were implemented very early after the publication of the first few prokaryotic genomes. The ideas behind these methods include gene fusions, conservation of gene adjacency, and the patterns of co-occurrence of genes across available genomes. A later addition was the prediction of features related to functional organization, such as operons, stretches of genes co-transcribed into a single messenger RNA. The ideas behind these methods tend to be easy to understand, while the strategies for transforming those basic ideas into predictions can vary in complexity, mostly because genes whose products are known to functionally interact vary in the way they relate to those basic ideas. We present here a view of genomic context methods for predicting functional interactions, with simple examples of their implementation as compared and evaluated using genes whose products are known to functionally interact.
|
109
|
del Grande M, Moreno-Hagelsieb G. The loose evolutionary relationships between transcription factors and other gene products across prokaryotes. BMC Res Notes 2014; 7:928. [PMID: 25515977 PMCID: PMC4300776 DOI: 10.1186/1756-0500-7-928] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/25/2014] [Accepted: 12/10/2014] [Indexed: 11/20/2022] Open
Abstract
Background Tests for the evolutionary conservation of associations between genes coding for transcription factors (TFs) and other genes have been limited to a few model organisms due to the lack of experimental information of functional associations in other organisms. We aimed at surmounting this limitation by using the most co-occurring gene pairs as proxies for the most conserved functional interactions available for each gene in a genome. We then used genes predicted to code for TFs to compare their most conserved interactions against the most conserved interactions for the rest of the genes within each prokaryotic genome available. Results We plotted profiles of phylogenetic profiles, p-cubic, to compare the maximally scoring interactions of TFs against those of other genes. In most prokaryotes, genes coding for TFs showed lower co-occurrences when compared to other genes. We also show that genes coding for TFs tend to have lower Codon Adaptation Indexes compared to other genes. Conclusions The co-occurrence tests suggest that transcriptional regulation evolves quickly in most, if not all, prokaryotes. The Codon Adaptation Index analyses suggest quick gene exchange and rewiring of transcriptional regulation across prokaryotes. Electronic supplementary material The online version of this article (doi:10.1186/1756-0500-7-928) contains supplementary material, which is available to authorized users.
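To make the idea of "most co-occurring gene pairs" concrete, the sketch below scores co-occurrence between phylogenetic profiles and keeps each gene's best-scoring partner. It is an illustration only: the Jaccard index is used here as a simple stand-in and is not claimed to be the scoring used in the paper, and the function names, gene names, and genome labels are hypothetical.

```python
def jaccard(profile_a, profile_b):
    """Co-occurrence of two genes as shared genomes over all genomes with either gene."""
    union = profile_a | profile_b
    return len(profile_a & profile_b) / len(union) if union else 0.0


def best_partner_scores(profiles):
    """For every gene, return its highest co-occurrence score with any other gene.

    `profiles` maps a gene ID to the set of genomes in which an ortholog
    of that gene was detected (its phylogenetic profile).
    """
    scores = {}
    for gene, profile in profiles.items():
        others = (jaccard(profile, p) for g, p in profiles.items() if g != gene)
        scores[gene] = max(others, default=0.0)
    return scores


# Made-up profiles: two hypothetical TF genes and two other genes.
profiles = {
    "tfA": {"genome1", "genome2"},
    "enzB": {"genome1", "genome2", "genome3", "genome4"},
    "enzC": {"genome1", "genome2", "genome3", "genome4"},
    "tfD": {"genome5"},
}

scores = best_partner_scores(profiles)
print(scores)
```

Comparing the distribution of these maximal scores for genes annotated as TFs against the distribution for all other genes mirrors, in a very reduced form, the kind of comparison the abstract describes.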
Affiliation(s)
- Gabriel Moreno-Hagelsieb
- Department of Biology, Wilfrid Laurier University, 75 University Ave. W, N2L 3C5 Waterloo, Ontario, Canada.
|
110
|
Kriventseva EV, Tegenfeldt F, Petty TJ, Waterhouse RM, Simão FA, Pozdnyakov IA, Ioannidis P, Zdobnov EM. OrthoDB v8: update of the hierarchical catalog of orthologs and the underlying free software. Nucleic Acids Res 2014; 43:D250-6. [PMID: 25428351 PMCID: PMC4383991 DOI: 10.1093/nar/gku1220] [Citation(s) in RCA: 241] [Impact Index Per Article: 24.1] [Indexed: 12/30/2022] Open
Abstract
Orthology, refining the concept of homology, is the cornerstone of evolutionary comparative studies. With the ever-increasing availability of genomic data, inference of orthology has become instrumental for generating hypotheses about gene functions crucial to many studies. This update of the OrthoDB hierarchical catalog of orthologs (http://www.orthodb.org) covers 3027 complete genomes, including the most comprehensive set of 87 arthropods, 61 vertebrates, 227 fungi and 2627 bacteria (sampling the most complete and representative genomes from over 11,000 available). In addition to the most extensive integration of functional annotations from UniProt, InterPro, GO, OMIM, model organism phenotypes and COG functional categories, OrthoDB uniquely provides evolutionary annotations including rates of ortholog sequence divergence, copy-number profiles, sibling groups and gene architectures. We re-designed the entirety of the OrthoDB website from the underlying technology to the user interface, enabling the user to specify species of interest and to select the relevant orthology level by the NCBI taxonomy. The text searches allow use of complex logic with various identifiers of genes, proteins, domains, ontologies or annotation keywords and phrases. Gene copy-number profiles can also be queried. This release comes with the freely available underlying ortholog clustering pipeline (http://www.orthodb.org/software).
Affiliation(s)
- Evgenia V Kriventseva
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Fredrik Tegenfeldt
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Tom J Petty
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Robert M Waterhouse
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Felipe A Simão
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Igor A Pozdnyakov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Panagiotis Ioannidis
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
|