1
Ahirwar SS, Rizwan R, Sethi S, Shahid Z, Malviya S, Khandia R, Agarwal A, Kotnis A. Comparative Analysis of Published Database Predicting MicroRNA Binding in 3'UTR of mRNA in Diverse Species. Microrna 2024; 13:2-13. [PMID: 37929739 DOI: 10.2174/0122115366261005231018070640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Revised: 09/03/2023] [Accepted: 09/15/2023] [Indexed: 11/07/2023]
Abstract
BACKGROUND: Micro-RNAs are endogenous non-coding RNA moieties of 22-27 nucleotides that play a crucial role in the regulation of various biological processes, which makes them useful prognostic and diagnostic biomarkers. Discovery and experimental validation of miRNA is a laborious and time-consuming process. For early prediction, multiple bioinformatics databases are available for miRNA target prediction; however, choosing among them can confuse amateur researchers looking for the most appropriate tools for their study. OBJECTIVE: This descriptive review aimed to analyse the usability of the existing databases based on the following criteria: accessibility, efficiency, interpretability, updatability, and flexibility for miRNA target prediction in the 3'UTR of mRNA in diverse species, so that researchers can use the database most appropriate to their research. METHODS: A systematic literature search was performed in the PubMed, Google Scholar and Scopus databases up to November 2022. The search found ≥10,000 articles online, including ~130 miRNA tools, which contain various information on miRNA. Of these, 31 databases that provide information on validated 3'UTR miRNA targets were included and analysed in this review. RESULTS: These miRNA database tools are being used in varied areas of biological research to select the most suitable miRNA for experimental validation. These databases, updated until the year 2021, consist of miRNA-related data from humans, animals, mice, plants, viruses etc. They contain between 525 and 29,806,351 data entries, and information from most databases is freely available online. CONCLUSION: The reviewed databases provide significant information, but not all of it is accurate or up-to-date. Among them, Diana-TarBase and miRWalk are the most comprehensive and up-to-date databases.
Affiliation(s)
- Sonu Singh Ahirwar
- Department of Biochemistry, All India Institute of Medical Sciences Bhopal, AIIMS Bhopal, Saket Nagar, Bhopal, MP, India
- Rehma Rizwan
- Department of Biochemistry, All India Institute of Medical Sciences Bhopal, AIIMS Bhopal, Saket Nagar, Bhopal, MP, India
- Samdish Sethi
- Department of Biochemistry, All India Institute of Medical Sciences Bhopal, AIIMS Bhopal, Saket Nagar, Bhopal, MP, India
- Zainab Shahid
- Department of Biochemistry, All India Institute of Medical Sciences Bhopal, AIIMS Bhopal, Saket Nagar, Bhopal, MP, India
- Shivani Malviya
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, Madhya Pradesh, 462026, India
- Rekha Khandia
- Department of Biochemistry and Genetics, Barkatullah University, Bhopal, Madhya Pradesh, 462026, India
- Amit Agarwal
- Department of Neurosurgery, All India Institute of Medical Sciences Bhopal, Bhopal MP, 462020, India
- Ashwin Kotnis
- Department of Biochemistry, All India Institute of Medical Sciences Bhopal, AIIMS Bhopal, Saket Nagar, Bhopal, MP, India
2
Birzu G, Muralidharan HS, Goudeau D, Malmstrom RR, Fisher DS, Bhaya D. Hybridization breaks species barriers in long-term coevolution of a cyanobacterial population. bioRxiv: the preprint server for biology 2023:2023.06.06.543983. [PMID: 37333348 PMCID: PMC10274767 DOI: 10.1101/2023.06.06.543983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Bacterial species often undergo rampant recombination yet maintain cohesive genomic identity. Ecological differences can generate recombination barriers between species and sustain genomic clusters in the short term. But can these forces prevent genomic mixing during long-term coevolution? Cyanobacteria in Yellowstone hot springs comprise several diverse species that have coevolved for hundreds of thousands of years, providing a rare natural experiment. By analyzing more than 300 single-cell genomes, we show that despite each species forming a distinct genomic cluster, much of the diversity within species is the result of hybridization driven by selection, which has mixed their ancestral genotypes. This widespread mixing is contrary to the prevailing view that ecological barriers can maintain cohesive bacterial species and highlights the importance of hybridization as a source of genomic diversity.
Affiliation(s)
- Gabriel Birzu
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Danielle Goudeau
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Rex R. Malmstrom
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Daniel S. Fisher
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Devaki Bhaya
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
3
Kuznetsov D, Tegenfeldt F, Manni M, Seppey M, Berkeley M, Kriventseva E, Zdobnov EM. OrthoDB v11: annotation of orthologs in the widest sampling of organismal diversity. Nucleic Acids Res 2022; 51:D445-D451. [PMID: 36350662 PMCID: PMC9825584 DOI: 10.1093/nar/gkac998] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 09/15/2022] [Revised: 10/15/2022] [Accepted: 10/26/2022] [Indexed: 11/10/2022] Open
Abstract
OrthoDB provides evolutionary and functional annotations of genes in a diverse sampling of eukaryotes, prokaryotes, and viruses. Genomics continues to accelerate our exploration of gene diversity and orthology is the most precise way of bridging gene functional knowledge with the rapidly expanding universe of genomic sequences. OrthoDB samples the most diverse organisms with the best quality genomics data to provide the leading coverage of species diversity. This update of the underlying data to over 18 000 prokaryotes and almost 2000 eukaryotes with over 100 million genes propels the coverage to another level. This achievement also demonstrates the scalability of the underlying OrthoLoger software for delineation of orthologs, freely available from https://orthologer.ezlab.org. In addition to the ab-initio computations of gene orthology used for the OrthoDB release, the OrthoLoger software allows mapping of novel gene sets to precomputed orthologs and thereby links to their annotations. The LEMMI-style benchmarking of OrthoLoger ensures its state-of-the-art performance and is available from https://lemortho.ezlab.org. The OrthoDB web interface has been further developed to include a pairwise orthology view from any gene to any other sampled species. OrthoDB-computed evolutionary annotations as well as extensively collated functional annotations can be accessed via REST API or SPARQL/RDF, downloaded or browsed online from https://www.orthodb.org.
Affiliation(s)
- Mosè Manni
- Department of Genetic Medicine and Development, University of Geneva Medical School, Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Mathieu Seppey
- Department of Genetic Medicine and Development, University of Geneva Medical School, Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Matthew Berkeley
- Department of Genetic Medicine and Development, University of Geneva Medical School, Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Evgeny M Zdobnov
- To whom correspondence should be addressed. Tel: +41 22 379 59 73;
4
Mylarshchikov DE, Mironov AA. ortho2align: a sensitive approach for searching for orthologues of novel lncRNAs. BMC Bioinformatics 2022; 23:384. [PMID: 36123626 PMCID: PMC9487038 DOI: 10.1186/s12859-022-04929-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2022] [Accepted: 09/13/2022] [Indexed: 11/12/2022] Open
Abstract
Background: Many novel long noncoding RNAs have been discovered in recent years due to advances in high-throughput sequencing experiments. Finding orthologues of these novel lncRNAs might facilitate clarification of their functional role in living organisms. However, lncRNAs exhibit low sequence conservation, so specific methods for enhancing the signal-to-noise ratio were developed. Nevertheless, current methods such as transcriptome comparison approaches or searches for conserved secondary structures are not applicable to novel, previously unannotated lncRNAs by design. Results: We present ortho2align, a versatile, sensitive, synteny-based lncRNA orthologue search tool with statistical assessment of sequence conservation. This tool allows control of the specificity of the search process and optional annotation of found orthologues. ortho2align shows similar performance in terms of sensitivity and resource usage to the state-of-the-art method for aligning orthologous lncRNAs but also enables scientists to predict unannotated orthologous sequences for lncRNAs in question. Using ortho2align, we predicted orthologues of three distinct classes of novel human lncRNAs in six Vertebrata species to estimate their degree of conservation. Conclusions: Being designed for the discovery of unannotated orthologues of novel lncRNAs in distant species, ortho2align is a versatile tool applicable to any genomic regions, especially weakly conserved ones. A small number of input files makes ortho2align easy to use in orthology studies as a single tool or bundled with other steps that researchers consider sensible. ortho2align is available as an Anaconda package with its source code hosted at https://github.com/dmitrymyl/ortho2align. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04929-y.
Affiliation(s)
- Andrey Alexandrovich Mironov
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russian Federation, 119234; Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russian Federation, 127994
5
Lohse K, García-Berro A, Talavera G. The genome sequence of the red admiral, Vanessa atalanta (Linnaeus, 1758). Wellcome Open Res 2021. [DOI: 10.12688/wellcomeopenres.17524.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual female Vanessa atalanta (the red admiral; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 370 megabases in span. The majority of the assembly (99.44%) is scaffolded into 32 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,493 protein coding genes.
6
Hayward A, Wright C. The genome sequence of the holly blue, Celastrina argiolus (Linnaeus, 1758). Wellcome Open Res 2021; 6:340. [PMID: 35028429 PMCID: PMC8729184 DOI: 10.12688/wellcomeopenres.17478.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/07/2021] [Indexed: 11/22/2022] Open
Abstract
We present a genome assembly from an individual male Celastrina argiolus (the holly blue; Arthropoda; Insecta; Lepidoptera; Lycaenidae). The genome sequence is 499 megabases in span. The majority (99.99%) of the assembly is scaffolded into 26 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,199 protein coding genes.
Affiliation(s)
- Alex Hayward
- College of Life and Environmental Sciences, Department of Biosciences, University of Exeter, Penryn, UK
- Charlotte Wright
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
- Darwin Tree of Life Barcoding collective
- College of Life and Environmental Sciences, Department of Biosciences, University of Exeter, Penryn, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
- Tree of Life Core Informatics collective
- College of Life and Environmental Sciences, Department of Biosciences, University of Exeter, Penryn, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK
7
Lohse K, Wright C, Talavera G, García-Berro A. The genome sequence of the painted lady, Vanessa cardui Linnaeus 1758. Wellcome Open Res 2021; 6:324. [PMID: 37008186 PMCID: PMC10061037 DOI: 10.12688/wellcomeopenres.17358.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 11/02/2021] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual female Vanessa cardui (the painted lady; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 425 megabases in span. The majority of the assembly is scaffolded into 32 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,821 protein coding genes.
Affiliation(s)
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Gerard Talavera
- Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain
- Aurora García-Berro
- Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain
- Darwin Tree of Life Barcoding collective
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain
- Wellcome Sanger Institute Tree of Life programme
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain
- Tree of Life Core Informatics collective
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain
8
Hayward A, Vila R, Laetsch DR, Lohse K, Baril T. The genome sequence of the heath fritillary, Melitaea athalia (Rottemburg, 1775). Wellcome Open Res 2021; 6:304. [PMID: 35136843 PMCID: PMC8796007 DOI: 10.12688/wellcomeopenres.17280.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/06/2021] [Indexed: 11/23/2022] Open
Abstract
We present a genome assembly from an individual female Melitaea athalia (also known as Mellicta athalia; the heath fritillary; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 610 megabases in span. In total, 99.98% of the assembly is scaffolded into 32 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,824 protein coding genes.
Affiliation(s)
- Roger Vila
- Institut de Biologia Evolutiva (CSIC - Universitat Pompeu Fabra), Barcelona, Spain
- Dominik R. Laetsch
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
9
Lohse K, Weir J. The genome sequence of the meadow brown, Maniola jurtina (Linnaeus, 1758). Wellcome Open Res 2021; 6:296. [PMID: 36866280 PMCID: PMC9971652 DOI: 10.12688/wellcomeopenres.17304.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 10/22/2021] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual female Maniola jurtina (the meadow brown; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 402 megabases in span. The complete assembly is scaffolded into 30 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,502 protein coding genes.
Affiliation(s)
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Jamie Weir
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
10
Lohse K, Laetsch DR, Vila R. The genome sequence of the small copper, Lycaena phlaeas (Linnaeus, 1760). Wellcome Open Res 2021. [DOI: 10.12688/wellcomeopenres.17289.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual male Lycaena phlaeas (the small copper; Arthropoda; Insecta; Lepidoptera; Lycaenidae). The genome sequence is 420 megabases in span. The whole of the assembly is scaffolded into 24 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,147 protein coding genes.
11
Lohse K, Hayward A, Ebdon S. The genome sequences of the male and female green-veined white, Pieris napi (Linnaeus, 1758). Wellcome Open Res 2021; 6:288. [PMID: 35846179 PMCID: PMC9257262 DOI: 10.12688/wellcomeopenres.17277.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 10/06/2021] [Indexed: 11/20/2022] Open
Abstract
We present genome assemblies from a male and female Pieris napi (the green-veined white; Arthropoda; Insecta; Lepidoptera; Pieridae). The genome sequences of the male and female are 320 and 319 megabases in span, respectively. The majority of the assembly (99.79% of the male assembly, 99.88% of the female) is scaffolded into 24 autosomal pseudomolecules, with the Z sex chromosome assembled for the male and Z and W chromosomes assembled for the female. Gene annotation of the male assembly on Ensembl has identified 13,221 protein coding genes.
Affiliation(s)
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Sam Ebdon
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
12
Lohse K, Taylor-Cox E. The genome sequence of the speckled wood butterfly, Pararge aegeria (Linnaeus, 1758). Wellcome Open Res 2021. [DOI: 10.12688/wellcomeopenres.17278.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023] Open
Abstract
We present a genome assembly from an individual female Pararge aegeria (the speckled wood butterfly; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 517 megabases in span. The majority of the assembly (99.68%) is scaffolded into 29 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,288 protein coding genes.
13
Ebdon S, Mackintosh A, Hayward A, Wotton K. The genome sequence of the clouded yellow, Colias crocea (Geoffroy, 1785). Wellcome Open Res 2021; 6:284. [PMID: 36157970 PMCID: PMC9490288 DOI: 10.12688/wellcomeopenres.17292.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/13/2021] [Indexed: 11/28/2022] Open
Abstract
We present a genome assembly from an individual female Colias crocea (also known as Colias croceus; the clouded yellow; Arthropoda; Insecta; Lepidoptera; Pieridae). The genome sequence is 325 megabases in span. The complete assembly is scaffolded into 32 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 13,803 protein coding genes.
Affiliation(s)
- Sam Ebdon
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Alex Mackintosh
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
14
Lohse K, Ebdon S, Vila R. The genome sequence of the small white, Pieris rapae (Linnaeus, 1758). Wellcome Open Res 2021. [DOI: 10.12688/wellcomeopenres.17288.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual female Pieris rapae (the small white; Arthropoda; Insecta; Lepidoptera; Pieridae). The genome sequence is 256 megabases in span. The majority of the assembly is scaffolded into 26 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,390 protein coding genes.
15
Boyes D, Holland PW. The genome sequence of the peach blossom moth, Thyatira batis (Linnaeus, 1758). Wellcome Open Res 2021; 6:267. [PMID: 35252591 PMCID: PMC8874031 DOI: 10.12688/wellcomeopenres.17268.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/01/2021] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual male Thyatira batis (the peach-blossom moth; Arthropoda; Insecta; Lepidoptera; Drepanidae). The genome sequence is 315 megabases in span. The majority of the assembly (99.68%) is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. The mitochondrial genome was also assembled and is 15.4 kilobases in length. Gene annotation of this assembly on Ensembl has identified 12,238 protein coding genes.
Affiliation(s)
- Douglas Boyes
- UK Centre for Ecology & Hydrology, Wallingford, OX10 8BB, UK
16
Vila R, Hayward A, Lohse K, Wright C. The genome sequence of the Glanville fritillary, Melitaea cinxia (Linnaeus, 1758). Wellcome Open Res 2021; 6:266. [PMID: 36873711 PMCID: PMC9975429 DOI: 10.12688/wellcomeopenres.17283.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/07/2021] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual male Melitaea cinxia (the Glanville fritillary; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 499 megabases in span. The complete assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 13,666 protein coding genes.
Affiliation(s)
- Roger Vila
- Institut de Biologia Evolutiva (CSIC - Universitat Pompeu Fabra), Barcelona, Spain
- Konrad Lohse
- Institute for Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Darwin Tree of Life Barcoding collective
- Institut de Biologia Evolutiva (CSIC - Universitat Pompeu Fabra), Barcelona, Spain
- University of Exeter, Penryn, UK
- Institute for Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Wellcome Sanger Institute Tree of Life programme
- Institut de Biologia Evolutiva (CSIC - Universitat Pompeu Fabra), Barcelona, Spain
- University of Exeter, Penryn, UK
- Institute for Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Wellcome Sanger Institute Scientific Operations: DNA Pipelines collective
- Institut de Biologia Evolutiva (CSIC - Universitat Pompeu Fabra), Barcelona, Spain
- University of Exeter, Penryn, UK
- Institute for Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Tree of Life Core Informatics collective
- Institut de Biologia Evolutiva (CSIC - Universitat Pompeu Fabra), Barcelona, Spain
- University of Exeter, Penryn, UK
- Institute for Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
17
Lohse K, Mackintosh A. The genome sequence of the large white, Pieris brassicae (Linnaeus, 1758). Wellcome Open Res 2021; 6:262. [PMID: 36312456 PMCID: PMC9608253 DOI: 10.12688/wellcomeopenres.17274.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/06/2021] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual female Pieris brassicae (the large white; Arthropoda; Insecta; Lepidoptera; Pieridae). The genome sequence is 292 megabases in span. The majority of the assembly is scaffolded into 16 chromosomal pseudomolecules, with the W and Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 12,229 protein coding genes.
Affiliation(s)
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
18
Lohse K, Mackintosh A, Vila R. The genome sequence of the European peacock butterfly, Aglais io (Linnaeus, 1758). Wellcome Open Res 2021; 6:258. [PMID: 36072556 PMCID: PMC9372638 DOI: 10.12688/wellcomeopenres.17204.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/07/2021] [Indexed: 11/20/2022] Open
Abstract
We present a genome assembly from an individual male Aglais io (also known as Inachis io and Nymphalis io) (the European peacock; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 384 megabases in span. The majority (99.91%) of the assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 11,420 protein coding genes.
Affiliation(s)
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Roger Vila
- Institut de Biologia Evolutiva (CSIC - Universitat Pompeu Fabra), Barcelona, Spain
19
Harris CD, Torrance EL, Raymann K, Bobay LM. CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets. Mol Biol Evol 2021; 38:727-734. [PMID: 32886787 PMCID: PMC7826169 DOI: 10.1093/molbev/msaa224] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/03/2023] Open
Abstract
The core genome represents the set of genes shared by all, or nearly all, strains of a given population or species of prokaryotes. Inferring the core genome is integral to many genomic analyses; however, most methods rely on the comparison of all the pairs of genomes, a step that is becoming increasingly difficult given the massive accumulation of genomic data. Here, we present CoreCruncher, a program that robustly and rapidly constructs core genomes across hundreds or thousands of genomes. CoreCruncher does not compute all pairwise genome comparisons and uses a heuristic based on the distributions of identity scores to classify sequences as orthologs or paralogs/xenologs. Although it is much faster than current methods, our results indicate that our approach is more conservative than other tools and less sensitive to the presence of paralogs and xenologs. CoreCruncher is freely available from: https://github.com/lbobay/CoreCruncher. CoreCruncher is written in Python 3.7 and can also run on Python 2.7 without modification. It requires the Python library NumPy and either USEARCH or BLAST. Certain options require the programs MUSCLE or MAFFT.
Affiliation(s)
- Connor D Harris
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
- Ellis L Torrance
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
- Kasie Raymann
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
- Louis-Marie Bobay
- Department of Biology, University of North Carolina Greensboro, Greensboro, NC
20
Liedtke HC, Harney E, Gomez-Mestre I. Cross-species transcriptomics uncovers genes underlying genetic accommodation of developmental plasticity in spadefoot toads. Mol Ecol 2021; 30:2220-2234. [PMID: 33730392 DOI: 10.1111/mec.15883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2020] [Revised: 01/29/2021] [Accepted: 02/26/2021] [Indexed: 10/21/2022]
Abstract
That hardcoded genomes can manifest as plastic phenotypes responding to environmental perturbations is a fascinating feature of living organisms. How such developmental plasticity is regulated at the molecular level is beginning to be uncovered, aided by the development of -omic techniques. Here, we compare the transcriptome-wide responses of two species of spadefoot toads with differing capacity for developmental acceleration of their larvae in the face of a shared environmental risk: pond drying. By comparing gene expression profiles over time and performing cross-species network analyses, we identified orthologues and functional gene pathways whose environmental sensitivity in expression has diverged between species. Genes related to lipid, cholesterol and steroid biosynthesis and metabolism make up most of a module of genes environmentally responsive in one species, but canalized in the other. The evolutionary changes in the regulation of the genes identified through these analyses may have been key in the genetic accommodation of developmental plasticity in this system.
Affiliation(s)
- Hans Christoph Liedtke
- Ecology, Evolution and Development Group, Department of Wetland Ecology, Estación Biológica de Doñana, CSIC, Seville, Spain
- Ewan Harney
- Department of Evolution, Ecology and Behaviour, Institute of Infection, Veterinary & Ecological Sciences, University of Liverpool, Liverpool, UK
- Ivan Gomez-Mestre
- Ecology, Evolution and Development Group, Department of Wetland Ecology, Estación Biológica de Doñana, CSIC, Seville, Spain
21
Hao Y, Lee HJ, Baraboo M, Burch K, Maurer T, Somarelli JA, Conant GC. Baby Genomics: Tracing the Evolutionary Changes That Gave Rise to Placentation. Genome Biol Evol 2021; 12:35-47. [PMID: 32053193 PMCID: PMC7144826 DOI: 10.1093/gbe/evaa026] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Accepted: 02/02/2020] [Indexed: 12/12/2022] Open
Abstract
It has long been challenging to uncover the molecular mechanisms behind striking morphological innovations such as mammalian pregnancy. We studied the power of a robust comparative orthology pipeline based on gene synteny to address such problems. We inferred orthology relations between human genes and genes from each of 43 other vertebrate genomes, resulting in ∼18,000 orthologous pairs for each genome comparison. By identifying genes that first appear coincident with origin of the placental mammals, we hypothesized that we would define a subset of the genome enriched for genes that played a role in placental evolution. We thus pinpointed orthologs that appeared before and after the divergence of eutherian mammals from marsupials. Reinforcing previous work, we found instead that much of the genetic toolkit of mammalian pregnancy evolved through the repurposing of preexisting genes to new roles. These genes acquired regulatory controls for their novel roles from a group of regulatory genes, many of which did in fact originate at the appearance of the eutherians. Thus, orthologs appearing at the origin of the eutherians are enriched in functions such as transcriptional regulation by Krüppel-associated box-zinc-finger proteins, innate immune responses, keratinization, and the melanoma-associated antigen protein class. Because the cellular mechanisms of invasive placentae are similar to those of metastatic cancers, we then used our orthology inferences to explore the association between placenta invasion and cancer metastasis. Again echoing previous work, we find that genes that are phylogenetically older are more likely to be implicated in cancer development.
Affiliation(s)
- Yue Hao
- Bioinformatics Research Center, North Carolina State University
- Hyuk Jin Lee
- Division of Biological Sciences, University of Missouri-Columbia
- Jason A Somarelli
- Duke Cancer Institute, Duke University Medical Center; Department of Medicine, Duke University School of Medicine
- Gavin C Conant
- Bioinformatics Research Center, North Carolina State University; Division of Animal Sciences, University of Missouri-Columbia; Program in Genetics, North Carolina State University; Department of Biological Sciences, North Carolina State University
|
22
|
Zdobnov EM, Kuznetsov D, Tegenfeldt F, Manni M, Berkeley M, Kriventseva EV. OrthoDB in 2020: evolutionary and functional annotations of orthologs. Nucleic Acids Res 2021; 49:D389-D393. [PMID: 33196836 PMCID: PMC7779051 DOI: 10.1093/nar/gkaa1009] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 09/15/2020] [Revised: 10/12/2020] [Accepted: 10/29/2020] [Indexed: 12/22/2022] Open
Abstract
OrthoDB provides evolutionary and functional annotations of orthologs, inferred for a vast number of available organisms. OrthoDB is leading in the coverage and genomic diversity sampling of Eukaryotes, Prokaryotes and Viruses, and the sampling of Bacteria is further set to increase three-fold. The user interface has been enhanced in response to the massive growth in data. OrthoDB provides three views on the data: (i) a list of orthologous groups related to a user query, which are now arranged to visualize their hierarchical relations, (ii) a detailed view of an orthologous group, now featuring a Sankey diagram to facilitate navigation between the levels of orthology, from more finely-resolved to more general groups of orthologs, as well as an arrangement of orthologs into an interactive organism taxonomy structure, and (iii) a newly added gene-centric view, showing the gene functional annotations and the pair-wise orthologs in example species. The OrthoDB standalone software for delineation of orthologs, Orthologer, is freely available. Online BUSCO assessments and mapping to OrthoDB of user-uploaded data enable interactive exploration of related annotations and generation of comparative charts. OrthoDB strives to predict orthologs from the broadest coverage of species, as well as to extensively collate available functional annotations, and to compute evolutionary annotations such as evolutionary rate and phyletic profile. OrthoDB data can be accessed via SPARQL RDF or REST API, downloaded, or browsed online at https://orthodb.org.
Affiliation(s)
- Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Dmitry Kuznetsov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Fredrik Tegenfeldt
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Mosè Manni
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Matthew Berkeley
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Evgenia V Kriventseva
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
|
23
|
de Melo ES, Wallau GL. Mosquito genomes are frequently invaded by transposable elements through horizontal transfer. PLoS Genet 2020; 16:e1008946. [PMID: 33253164 PMCID: PMC7728395 DOI: 10.1371/journal.pgen.1008946] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/17/2020] [Revised: 12/10/2020] [Accepted: 10/19/2020] [Indexed: 12/28/2022] Open
Abstract
Transposable elements (TEs) are mobile genetic elements that parasitize the genomes of essentially all eukaryotic species. Due to their complexity, an in-depth TE characterization is only available for a handful of model organisms. In the present study, we performed a de novo and homology-based characterization of TEs in the genomes of 24 mosquito species and investigated their mode of inheritance. More than 40% of the genome of Aedes aegypti, Aedes albopictus, and Culex quinquefasciatus is composed of TEs, while the proportion varied substantially among Anopheles species (0.13%-19.55%). Class I TEs are the most abundant among mosquitoes and at least 24 TE superfamilies were found. Interestingly, TEs have been extensively exchanged by horizontal transfer (172 TE families of 16 different superfamilies) among mosquitoes in the last 30 million years. Horizontally transferred TEs represent around 7% of the genome in Aedes species and a small fraction in Anopheles genomes. Most of these horizontally transferred TEs are from the three ubiquitous LTR superfamilies: Gypsy, Bel-Pao and Copia. Searching more than 32,000 genomes, we also uncovered transfers between mosquitoes and two different phyla (Cnidaria and Nematoda) and two subphyla (Chelicerata and Crustacea), identifying a vector, the worm Wuchereria bancrofti, that enabled the horizontal spread of a Tc1-mariner element among various Anopheles species. These data also allowed us to reconstruct the horizontal transfer network of this TE involving more than 40 species. In summary, our results suggest that TEs are frequently exchanged by horizontal transfers among mosquitoes, influencing mosquito genome size and variability.
Affiliation(s)
- Elverson Soares de Melo
- Department of Entomology, Aggeu Magalhães Institute–Oswaldo Cruz Foundation (Fiocruz), Recife, Pernambuco, Brazil
- Gabriel Luz Wallau
- Department of Entomology, Aggeu Magalhães Institute–Oswaldo Cruz Foundation (Fiocruz), Recife, Pernambuco, Brazil
|
24
|
Lebedev R, Trabelcy B, Langier Goncalves I, Gerchman Y, Sapir A. Metabolic Reconfiguration in C. elegans Suggests a Pathway for Widespread Sterol Auxotrophy in the Animal Kingdom. Curr Biol 2020; 30:3031-3038.e7. [PMID: 32559444 DOI: 10.1016/j.cub.2020.05.070] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/05/2020] [Revised: 04/24/2020] [Accepted: 05/20/2020] [Indexed: 12/23/2022]
Abstract
Cholesterol is one of the hallmarks of animals. In vertebrates, the cholesterol synthesis pathway (CSP) is the primary source of cholesterol, which has numerous structural and regulatory roles [1]. Nevertheless, the few invertebrates tested for cholesterol synthesis show complete sterol auxotrophy [2-6], raising questions about how animals thrive without cholesterol synthesis and about the prevalence of sterol auxotrophy in animals. In the nematode Caenorhabditis elegans (C. elegans), sterols are the precursors of the steroid hormone dafachronic acid that coordinates development to adulthood [7, 8]; thus, sterol-deprived C. elegans arrest at the diapause "dauer" larval stage [9]. Using this system, we have identified a pathway that converts plant and fungal sterols into cholesterol through the activity of enzymes with sequence similarity to specific human CSP enzymes. Based on this finding, we propose that two critical steps shaped the evolution of animal sterol auxotrophy: (1) the loss of the orthologs of the first three enzymes of the CSP and (2) the co-opting of other downstream enzymes of the CSP for the utilization of dietary sterols. Using this mechanistic signature, we studied the evolution of cholesterol auxotrophy across the animal kingdom. Complete sets of CSP enzymes in basal animals suggest that the loss of cholesterol synthesis occurred during animal evolution. A sterol auxotrophy signature in the genomes of many invertebrates, including nematodes and most arthropods, suggests widespread cholesterol auxotrophy in animals. Thus, we propose that this co-opted pathway supports widespread cholesterol auxotrophy by interkingdom interactions between cholesterol-auxotrophic animals and sterol-producing fungi and plants.
Affiliation(s)
- Ron Lebedev
- Department of Biology and the Environment, Faculty of Natural Sciences, University of Haifa, Oranim, Tivon 36006, Israel
- Benjamin Trabelcy
- Department of Biology and the Environment, Faculty of Natural Sciences, University of Haifa, Oranim, Tivon 36006, Israel
- Irina Langier Goncalves
- Department of Biology and the Environment, Faculty of Natural Sciences, University of Haifa, Oranim, Tivon 36006, Israel
- Yoram Gerchman
- Department of Biology and the Environment, Faculty of Natural Sciences, University of Haifa, Oranim, Tivon 36006, Israel
- Amir Sapir
- Department of Biology and the Environment, Faculty of Natural Sciences, University of Haifa, Oranim, Tivon 36006, Israel
|
25
|
Uchiyama I, Mihara M, Nishide H, Chiba H, Kato M. MBGD update 2018: microbial genome database based on hierarchical orthology relations covering closely related and distantly related comparisons. Nucleic Acids Res 2020; 47:D382-D389. [PMID: 30462302 PMCID: PMC6324027 DOI: 10.1093/nar/gky1054] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 09/17/2018] [Accepted: 11/03/2018] [Indexed: 01/20/2023] Open
Abstract
The Microbial Genome Database for Comparative Analysis (MBGD) is a database for comparative genomics based on comprehensive orthology analysis of bacteria, archaea and unicellular eukaryotes. MBGD now contains 6318 genomes. To utilize the database for both closely related and distantly related genomes, MBGD previously provided two types of ortholog tables: the standard ortholog table containing one representative genome from each genus covering the entire taxonomic range and the taxon specific ortholog tables for each taxon. However, this approach has a drawback in that the standard ortholog table contains only genes that are conserved in the representative genomes. To address this problem, we developed a stepwise procedure to construct ortholog tables hierarchically in a bottom-up manner. By using this approach, the new standard ortholog table now covers the entire gene repertoire stored in MBGD. In addition, we have enhanced several functionalities, including rapid and flexible keyword searching, profile-based sequence searching for orthology assignment to a user query sequence, and displaying a phylogenetic tree of each taxon based on the concatenated core gene sequences. For integrative database searching, the core data in MBGD are represented in Resource Description Framework (RDF) and a SPARQL interface is provided to search them. MBGD is available at http://mbgd.genome.ad.jp/.
Affiliation(s)
- Ikuo Uchiyama
- Laboratory of Genome Informatics, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan; Data Integration and Analysis Facility, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan
- Motohiro Mihara
- Dynacom Co., Ltd. 5-1-27, Onoedori, Chuo-ku, Kobe, Hyogo 651-0088, Japan
- Hiroyo Nishide
- Data Integration and Analysis Facility, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan
- Hirokazu Chiba
- Database Center for Life Science, Research Organization of Information and Systems 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Masaki Kato
- Laboratory of Genome Informatics, National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan
|
26
|
Kriventseva EV, Kuznetsov D, Tegenfeldt F, Manni M, Dias R, Simão FA, Zdobnov EM. OrthoDB v10: sampling the diversity of animal, plant, fungal, protist, bacterial and viral genomes for evolutionary and functional annotations of orthologs. Nucleic Acids Res 2020; 47:D807-D811. [PMID: 30395283 PMCID: PMC6323947 DOI: 10.1093/nar/gky1053] [Citation(s) in RCA: 487] [Impact Index Per Article: 121.8] [Received: 09/15/2018] [Accepted: 10/29/2018] [Indexed: 11/13/2022] Open
Abstract
OrthoDB (https://www.orthodb.org) provides evolutionary and functional annotations of orthologs. This update features a major scaling up of the resource coverage, sampling the genomic diversity of 1271 eukaryotes, 6013 prokaryotes and 6488 viruses. These include putative orthologs among 448 metazoan, 117 plant, 549 fungal, 148 protist, 5609 bacterial, and 404 archaeal genomes, picking up the best sequenced and annotated representatives for each species or operational taxonomic unit. OrthoDB relies on a concept of hierarchy of levels-of-orthology to enable more finely resolved gene orthologies for more closely related species. Since orthologs are the most likely candidates to retain functions of their ancestor gene, OrthoDB is aimed at narrowing down hypotheses about gene functions and enabling comparative evolutionary studies. Optional registered-user sessions allow on-line BUSCO assessments of gene set completeness and mapping of the uploaded data to OrthoDB to enable further interactive exploration of related annotations and generation of comparative charts. The accelerating expansion of genomics data continues to add valuable information, and OrthoDB strives to provide orthologs from the broadest coverage of species, as well as to extensively collate available functional annotations and to compute evolutionary annotations. The data can be browsed online, downloaded, or accessed via REST API or SPARQL RDF compatible with both UniProt and Ensembl.
Affiliation(s)
- Evgenia V Kriventseva
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Dmitry Kuznetsov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Fredrik Tegenfeldt
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Mosè Manni
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Renata Dias
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Felipe A Simão
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
- Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
|
27
|
Vakirlis N, Carvunis AR, McLysaght A. Synteny-based analyses indicate that sequence divergence is not the main source of orphan genes. eLife 2020; 9:e53500. [PMID: 32066524 PMCID: PMC7028367 DOI: 10.7554/elife.53500] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 11/11/2019] [Accepted: 01/07/2020] [Indexed: 12/20/2022] Open
Abstract
The origin of 'orphan' genes, species-specific sequences that lack detectable homologues, has remained mysterious since the dawn of the genomic era. There are two dominant explanations for orphan genes: complete sequence divergence from ancestral genes, such that homologues are not readily detectable; and de novo emergence from ancestral non-genic sequences, such that homologues genuinely do not exist. The relative contribution of the two processes remains unknown. Here, we harness the special circumstance of conserved synteny to estimate the contribution of complete divergence to the pool of orphan genes. By separately comparing yeast, fly and human genes to related taxa using conservative criteria, we find that complete divergence accounts, on average, for at most a third of eukaryotic orphan and taxonomically restricted genes. We observe that complete divergence occurs at a stable rate within a phylum but at different rates between phyla, and is frequently associated with gene shortening akin to pseudogenization.
Affiliation(s)
- Nikolaos Vakirlis
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
- Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, United States
- Aoife McLysaght
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
|
28
|
de Moya RS, Allen JM, Sweet AD, Walden KKO, Palma RL, Smith VS, Cameron SL, Valim MP, Galloway TD, Weckstein JD, Johnson KP. Extensive host-switching of avian feather lice following the Cretaceous-Paleogene mass extinction event. Commun Biol 2019; 2:445. [PMID: 31815200 PMCID: PMC6884534 DOI: 10.1038/s42003-019-0689-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/11/2019] [Accepted: 11/08/2019] [Indexed: 01/08/2023] Open
Abstract
Nearly all lineages of birds host parasitic feather lice. Based on recent phylogenomic studies, the three major lineages of modern birds diverged from each other before the Cretaceous-Paleogene (K-Pg) mass extinction event. In contrast, studies of the phylogeny of feather lice on birds indicate that these parasites diversified largely after this event. However, these studies were unable to reconstruct the ancestral avian host lineage for feather lice. Here we use genome sequences of a broad diversity of lice to reconstruct a phylogeny based on 1,075 genes. By comparing this louse evolutionary tree to the avian host tree, we show that feather lice began diversifying on the common ancestor of waterfowl and landfowl, then radiated onto other avian lineages by extensive host-switching. Dating analyses and cophylogenetic comparisons revealed that two of three lineages of birds that diverged before the K-Pg boundary acquired their feather lice after this event via host-switching.
Affiliation(s)
- Robert S. de Moya
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL USA
- Department of Entomology, University of Illinois, Urbana, IL USA
- Julie M. Allen
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL USA
- Department of Biology, University of Nevada, Reno, NV USA
- Andrew D. Sweet
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL USA
- Department of Entomology, Purdue University, West Lafayette, IN USA
- Ricardo L. Palma
- Museum of New Zealand Te Papa Tongarewa, Wellington, New Zealand
- Vincent S. Smith
- Department of Life Sciences, The Natural History Museum, London, UK
- Terry D. Galloway
- Department of Entomology, University of Manitoba, Winnipeg, Manitoba Canada
- Jason D. Weckstein
- Department of Ornithology, Academy of Natural Sciences of Drexel University, Philadelphia, PA USA
- Kevin P. Johnson
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL USA
|
29
|
Rivetti C, Allen TEH, Brown JB, Butler E, Carmichael PL, Colbourne JK, Dent M, Falciani F, Gunnarsson L, Gutsell S, Harrill JA, Hodges G, Jennings P, Judson R, Kienzler A, Margiotta-Casaluci L, Muller I, Owen SF, Rendal C, Russell PJ, Scott S, Sewell F, Shah I, Sorrel I, Viant MR, Westmoreland C, White A, Campos B. Vision of a near future: Bridging the human health-environment divide. Toward an integrated strategy to understand mechanisms across species for chemical safety assessment. Toxicol In Vitro 2019; 62:104692. [PMID: 31669395 DOI: 10.1016/j.tiv.2019.104692] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 06/21/2019] [Revised: 09/25/2019] [Accepted: 10/14/2019] [Indexed: 12/31/2022]
Abstract
There is a growing recognition that application of mechanistic approaches to understand cross-species shared molecular targets and pathway conservation in the context of hazard characterization provides significant opportunities in risk assessment (RA) for both human health and environmental safety. Specifically, it has been recognized that a more comprehensive and reliable understanding of similarities and differences in biological pathways across a variety of species will better enable cross-species extrapolation of potential adverse toxicological effects. Ultimately, this would also advance the generation and use of mechanistic data for both human health and environmental RA. A workshop brought together representatives from industry, academia and government to discuss how to improve the use of existing data, and to generate new NAMs data to derive better mechanistic understanding between humans and environmentally-relevant species, ultimately resulting in holistic chemical safety decisions. Thanks to a thorough dialogue among all participants, key challenges, current gaps and research needs were identified, and potential solutions proposed. This discussion highlighted the common objective to progress toward more predictive, mechanistically based, data-driven and animal-free chemical safety assessments. Overall, the participants recognized that there is no single approach which would provide all the answers for bridging the gap between mechanism-based human health and environmental RA, but acknowledged we now have the incentive, tools and data availability to address this concept, maximizing the potential for improvements in both human health and environmental RA.
Affiliation(s)
- Claudia Rivetti
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Timothy E H Allen
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- James B Brown
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory, University of California Berkeley, Berkeley, California 94720, USA
- Emma Butler
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Paul L Carmichael
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- John K Colbourne
- School of Biosciences, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Matthew Dent
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Francesco Falciani
- Institute for Integrative Biology, University of Liverpool, L69 7ZB Liverpool, United Kingdom
- Lina Gunnarsson
- Biosciences, College of Life and Environmental Sciences, University of Exeter, Geoffrey Pope, Stocker Road, Exeter, Devon EX4 4QD, United Kingdom
- Steve Gutsell
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Joshua A Harrill
- National Center for Computational Toxicology, Office of Research & Development, U.S. Environmental Protection Agency, Mail Code B205-01, Research Triangle Park, Durham, North Carolina 27711, USA
- Geoff Hodges
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Paul Jennings
- Division of Molecular and Computational Toxicology, Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Richard Judson
- National Center for Computational Toxicology, Office of Research & Development, U.S. Environmental Protection Agency, Mail Code B205-01, Research Triangle Park, Durham, North Carolina 27711, USA
- Aude Kienzler
- European Commission, Joint Research Centre (JRC), Ispra, VA, Italy
- Iris Muller
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Stewart F Owen
- AstraZeneca, Alderley Park, Macclesfield, Cheshire SK10 4TF, United Kingdom
- Cecilie Rendal
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Paul J Russell
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Sharon Scott
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Fiona Sewell
- NC3Rs, Gibbs Building, 215 Euston Road, London NW1 2BE, United Kingdom
- Imran Shah
- National Center for Computational Toxicology, Office of Research & Development, U.S. Environmental Protection Agency, Mail Code B205-01, Research Triangle Park, Durham, North Carolina 27711, USA
- Ian Sorrel
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Mark R Viant
- School of Biosciences, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Carl Westmoreland
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Andrew White
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
- Bruno Campos
- Unilever, Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, United Kingdom
|
30
|
Nuclear Orthologs Derived from Whole Genome Sequencing Indicate Cryptic Diversity in the Bemisia tabaci (Insecta: Aleyrodidae) Complex of Whiteflies. DIVERSITY-BASEL 2019. [DOI: 10.3390/d11090151] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 12/21/2022]
Abstract
The Bemisia tabaci complex of whiteflies contains globally important pests thought to contain cryptic species corresponding to geographically structured phylogenetic clades. Although mostly morphologically indistinguishable, differences have been shown to exist among populations in behavior, plant virus vector capacity, ability to hybridize, and DNA sequence divergence. These differences allow for certain populations to become invasive and cause great economic damage in a monoculture setting. Although high mitochondrial DNA divergences have been reported between putative conspecifics of the B. tabaci species complex, there is limited data that exists across the whole genome for this group. Using data from 2184 orthologs obtained from whole genome sequencing (Illumina), a phylogenetic analysis using maximum likelihood and coalescent methodologies was completed on ten individuals of the B. tabaci complex. In addition, automatic barcode gap discovery methods were employed, and results suggest the existence of five species. Although the divergences of the mitochondrial cytochrome oxidase I gene are high among members of this complex, nuclear divergences are much lower in comparison. Single-copy orthologs from whole genome sequencing demonstrate divergent population structures among members of the B. tabaci complex and the sequences provide an important resource to aid in future genomic studies of the group.
|
31
|
Imrie L, Le Bihan T, O'Toole Á, Hickner PV, Dunn WA, Weise B, Rund SSC. Genome annotation improvements from cross-phyla proteogenomics and time-of-day differences in malaria mosquito proteins using untargeted quantitative proteomics. PLoS One 2019; 14:e0220225. [PMID: 31356616 PMCID: PMC6663012 DOI: 10.1371/journal.pone.0220225] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/24/2018] [Accepted: 07/11/2019] [Indexed: 12/12/2022] Open
Abstract
The malaria mosquito, Anopheles stephensi, and other mosquitoes modulate their biology to match the time-of-day. In the present work, we used a non-hypothesis driven approach (untargeted proteomics) to identify proteins in mosquito tissue, and then quantified the relative abundance of the identified proteins from An. stephensi bodies. Using these quantified protein levels, we then analyzed the data for proteins that were only detectable at certain times-of-the day, highlighting the need to consider time-of-day in experimental design. Further, we extended our time-of-day analysis to look for proteins which cycle in a rhythmic 24-hour ("circadian") manner, identifying 31 rhythmic proteins. Finally, to maximize the utility of our data, we performed a proteogenomic analysis to improve the genome annotation of An. stephensi. We compare peptides that were detected using mass spectrometry but are 'missing' from the An. stephensi predicted proteome, to reference proteomes from 38 other primarily human disease vector species. We found 239 such peptide matches and reveal that genome annotation can be improved using proteogenomic analysis from taxonomically diverse reference proteomes. Examination of 'missing' peptides revealed reading frame errors, errors in gene-calling, overlapping gene models, and suspected gaps in the genome assembly.
Affiliation(s)
- Lisa Imrie
- SynthSys–Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
| | - Thierry Le Bihan
- SynthSys–Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, United Kingdom
- Rapid Novor, Kitchener, Ontario, Canada
| | - Áine O'Toole
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
| | - Paul V. Hickner
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - W. Augustine Dunn
- Boston Children's Hospital, Boston, Massachusetts, United States of America
| | - Benjamin Weise
- Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, United Kingdom
| | - Samuel S. C. Rund
- Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, United Kingdom
- Eck Institute for Global Health, University of Notre Dame, Notre Dame, Indiana, United States of America
|
32
|
Heller D, Szklarczyk D, Mering CV. Tree reconciliation combined with subsampling improves large scale inference of orthologous group hierarchies. BMC Bioinformatics 2019; 20:228. [PMID: 31060495 PMCID: PMC6501302 DOI: 10.1186/s12859-019-2828-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/19/2018] [Accepted: 04/17/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND An orthologous group (OG) comprises a set of orthologous and paralogous genes that share a last common ancestor (LCA). OGs are defined with respect to a chosen taxonomic level, which delimits the position of the LCA in time to a specified speciation event. A hierarchy of OGs expands on this notion, connecting more general OGs, distant in time, to more recent, fine-grained OGs, thereby spanning multiple levels of the tree of life. Large scale inference of OG hierarchies with independently computed taxonomic levels can suffer from inconsistencies between successive levels, such as the position in time of a duplication event. This can be due to confounding genetic signal or algorithmic limitations. Importantly, inconsistencies limit the potential use of OGs for functional annotation and third-party applications. RESULTS Here we present a new methodology to ensure hierarchical consistency of OGs across taxonomic levels. To resolve an inconsistency, we subsample the protein space of the OG members and perform gene tree-species tree reconciliation for each sampling. Unlike previous approaches, by subsampling the protein space, we avoid the notoriously difficult task of accurately building and reconciling very large phylogenies. We implement the method into a high-throughput pipeline and apply it to the eggNOG database. We use independent protein domain definitions to validate its performance. CONCLUSION The presented consistency pipeline shows that, contrary to previous limitations, tree reconciliation can be a useful instrument for the construction of OG hierarchies. The key lies in the combination of sampling smaller trees and aggregating their reconciliations for robustness. Results show comparable or greater performance to previous pipelines. The code is available on GitHub at: https://github.com/meringlab/og_consistency_pipeline.
Affiliation(s)
- Davide Heller
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Batiment Genopode, Lausanne, 1015 Switzerland
| | - Damian Szklarczyk
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Batiment Genopode, Lausanne, 1015 Switzerland
| | - Christian von Mering
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
- SIB Swiss Institute of Bioinformatics, Quartier Sorge, Batiment Genopode, Lausanne, 1015 Switzerland
|
33
|
Abstract
The distinction between orthologs and paralogs, genes that started diverging by speciation versus duplication, is relevant in a wide range of contexts, most notably phylogenetic tree inference and protein function annotation. In this chapter, we provide an overview of the methods used to infer orthology and paralogy. We survey both graph-based approaches (and their various grouping strategies) and tree-based approaches, which solve the more general problem of gene/species tree reconciliation. We discuss conceptual differences among the various orthology inference methods and databases and examine the difficult issue of verifying and benchmarking orthology predictions. Finally, we review typical applications of orthologous genes, groups, and reconciled trees and conclude with thoughts on future methodological developments.
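As a minimal sketch of the definition above, the following Python snippet labels gene pairs from a small reconciled gene tree: a pair whose last common ancestor node is a speciation event is treated as orthologous, and a pair whose last common ancestor is a duplication event as paralogous. The tree, the gene names, and the helper functions `leaves` and `classify_pairs` are hypothetical illustrations and are not taken from the chapter or from any orthology tool.

```python
# Hypothetical reconciled gene tree: internal nodes are (event, left, right),
# leaves are gene names. Events are assumed to be already inferred.
TREE = (
    "duplication",
    ("speciation", "human_A1", "mouse_A1"),
    ("speciation", "human_A2", "mouse_A2"),
)


def leaves(node):
    """Return the gene names found below a node."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def classify_pairs(node, out=None):
    """Label each gene pair by the event at its last common ancestor node."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    event, left, right = node
    relation = "ortholog" if event == "speciation" else "paralog"
    # Genes on opposite sides of this node have this node as their LCA.
    for a in leaves(left):
        for b in leaves(right):
            out[frozenset((a, b))] = relation
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


if __name__ == "__main__":
    for pair, relation in sorted(
        classify_pairs(TREE).items(), key=lambda kv: sorted(kv[0])
    ):
        print(sorted(pair), relation)
```

In this toy tree the duplication predates the human/mouse split, so human_A1 and mouse_A2 come out as paralogs even though they belong to different species.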
|
34
|
Johnson KP, Dietrich CH, Friedrich F, Beutel RG, Wipfler B, Peters RS, Allen JM, Petersen M, Donath A, Walden KKO, Kozlov AM, Podsiadlowski L, Mayer C, Meusemann K, Vasilikopoulos A, Waterhouse RM, Cameron SL, Weirauch C, Swanson DR, Percy DM, Hardy NB, Terry I, Liu S, Zhou X, Misof B, Robertson HM, Yoshizawa K. Phylogenomics and the evolution of hemipteroid insects. Proc Natl Acad Sci U S A 2018; 115:12775-12780. [PMID: 30478043 PMCID: PMC6294958 DOI: 10.1073/pnas.1815820115] [Citation(s) in RCA: 166] [Impact Index Per Article: 27.7] [Indexed: 01/09/2023] Open
Abstract
Hemipteroid insects (Paraneoptera), with over 10% of all known insect diversity, are a major component of terrestrial and aquatic ecosystems. Previous phylogenetic analyses have not consistently resolved the relationships among major hemipteroid lineages. We provide maximum likelihood-based phylogenomic analyses of a taxonomically comprehensive dataset comprising sequences of 2,395 single-copy, protein-coding genes for 193 samples of hemipteroid insects and outgroups. These analyses yield a well-supported phylogeny for hemipteroid insects. Monophyly of each of the three hemipteroid orders (Psocodea, Thysanoptera, and Hemiptera) is strongly supported, as are most relationships among suborders and families. Thysanoptera (thrips) is strongly supported as sister to Hemiptera. However, as in a recent large-scale analysis sampling all insect orders, trees from our data matrices support Psocodea (bark lice and parasitic lice) as the sister group to the holometabolous insects (those with complete metamorphosis). In contrast, four-cluster likelihood mapping of these data does not support this result. A molecular dating analysis using 23 fossil calibration points suggests hemipteroid insects began diversifying before the Carboniferous, over 365 million years ago. We also explore implications for understanding the timing of diversification, the evolution of morphological traits, and the evolution of mitochondrial genome organization. These results provide a phylogenetic framework for future studies of the group.
Affiliation(s)
- Kevin P Johnson
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL 61820
| | - Christopher H Dietrich
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL 61820
| | - Frank Friedrich
- Institut für Zoologie, Universität Hamburg, 20146 Hamburg, Germany
| | - Rolf G Beutel
- Institut für Zoologie und Evolutionsforschung, Friedrich-Schiller-Universität Jena, 07743 Jena, Germany
| | - Benjamin Wipfler
- Institut für Zoologie und Evolutionsforschung, Friedrich-Schiller-Universität Jena, 07743 Jena, Germany
- Center of Taxonomy and Evolutionary Research, Arthropoda Department, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
| | - Ralph S Peters
- Center of Taxonomy and Evolutionary Research, Arthropoda Department, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
| | - Julie M Allen
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL 61820
- Department of Biology, University of Nevada, Reno, NV 89557
| | - Malte Petersen
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
| | - Alexander Donath
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
| | - Kimberly K O Walden
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| | - Alexey M Kozlov
- Scientific Computing Group, Heidelberg Institute for Theoretical Studies, 69118 Heidelberg, Germany
| | - Lars Podsiadlowski
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
- Institute of Evolutionary Biology and Ecology, University of Bonn, 53121 Bonn, Germany
| | - Christoph Mayer
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
| | - Karen Meusemann
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
- Evolutionary Biology and Ecology, Institute for Biology I (Zoology), University of Freiburg, 79104 Freiburg, Germany
- Australian National Insect Collection, Commonwealth Scientific and Industrial Research Organisation National Research Collections Australia, Acton, ACT 2601 Canberra, Australia
| | - Alexandros Vasilikopoulos
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
| | - Robert M Waterhouse
- Department of Ecology and Evolution, University of Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Stephen L Cameron
- Department of Entomology, Purdue University, West Lafayette, IN 47907
| | | | - Daniel R Swanson
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL 61820
| | - Diana M Percy
- Department of Life Sciences, Natural History Museum, London, SW7 5BD United Kingdom
- Department of Botany, University of British Columbia, Vancouver V6T 1Z4, Canada
| | - Nate B Hardy
- Department of Entomology and Plant Pathology, Auburn University, Auburn, AL 36849
| | - Irene Terry
- School of Biological Sciences, University of Utah, Salt Lake City, UT 84112
| | - Shanlin Liu
- BGI-Shenzhen, Shenzhen, 518083 Guangdong Province, People's Republic of China
| | - Xin Zhou
- Department of Entomology, China Agricultural University, 100193 Beijing, People's Republic of China
| | - Bernhard Misof
- Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, 53113 Bonn, Germany
| | - Hugh M Robertson
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
|
35
|
Train CM, Glover NM, Gonnet GH, Altenhoff AM, Dessimoz C. Orthologous Matrix (OMA) algorithm 2.0: more robust to asymmetric evolutionary rates and more scalable hierarchical orthologous group inference. Bioinformatics 2018; 33:i75-i82. [PMID: 28881964 PMCID: PMC5870696 DOI: 10.1093/bioinformatics/btx229] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Indexed: 01/28/2023] Open
Abstract
Motivation: Accurate orthology inference is a fundamental step in many phylogenetic and comparative analyses. Many methods have been proposed, including OMA (Orthologous MAtrix). Yet substantial challenges remain, in particular in coping with fragmented genes or genes evolving at different rates after duplication, and in scaling to large datasets. With more and more genomes available, it is necessary to improve the scalability and robustness of orthology inference methods. Results: We present improvements in the OMA algorithm: (i) refining the pairwise orthology inference step to account for same-species paralogs evolving at different rates, and (ii) minimizing errors in the pairwise orthology verification step by testing the consistency of pairwise distance estimates, which can be problematic in the presence of fragmentary sequences. In addition, we introduce a more scalable procedure for hierarchical orthologous group (HOG) clustering, which is several orders of magnitude faster on large datasets. Using the Quest for Orthologs consortium orthology benchmark service, we show that these changes translate into substantial improvement on multiple empirical datasets. Availability and Implementation: This new OMA 2.0 algorithm is used in the OMA database (http://omabrowser.org) from the March 2017 release onwards, and can be run on custom genomes using OMA standalone version 2.0 and above (http://omabrowser.org/standalone).
Affiliation(s)
- Clément-Marie Train
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center of Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - Natasha M Glover
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center of Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - Gaston H Gonnet
- Department of Computer Science, ETH Zurich, Zurich, Switzerland
| | - Adrian M Altenhoff
- Swiss Institute of Bioinformatics, Lausanne, Switzerland; Department of Computer Science, ETH Zurich, Zurich, Switzerland
| | - Christophe Dessimoz
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center of Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Genetics, Evolution and Environment, University College London, London, UK; Department of Computer Science, University College London, London, UK
|
36
|
Gerdol M, Fujii Y, Hasan I, Koike T, Shimojo S, Spazzali F, Yamamoto K, Ozeki Y, Pallavicini A, Fujita H. The purplish bifurcate mussel Mytilisepta virgata gene expression atlas reveals a remarkable tissue functional specialization. BMC Genomics 2017; 18:590. [PMID: 28789640 PMCID: PMC5549309 DOI: 10.1186/s12864-017-4012-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 04/06/2017] [Accepted: 08/02/2017] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND Mytilisepta virgata is a marine mussel commonly found along the coasts of Japan. Although this species has been the subject of occasional studies concerning its ecological role, growth and reproduction, it has been so far almost completely neglected from a genetic and molecular point of view. In the present study we present a high quality de novo assembled transcriptome of the Japanese purplish mussel, which represents the first publicly available collection of expressed sequences for this species. RESULTS The assembled transcriptome comprises almost 50,000 contigs, with an N50 statistic of ~1 kilobase and a high estimated completeness based on the rate of BUSCOs identified, standing as one of the most exhaustive sequence resources available for mytiloid bivalves to date. Overall, these data, accompanied by gene expression profiles from gills, digestive gland, mantle rim, foot and posterior adductor muscle, present an accurate snapshot of the great functional specialization of these five tissues in adult mussels. CONCLUSIONS We highlight that one of the most striking features of the M. virgata transcriptome is the high abundance and diversification of lectin-like transcripts, which pertain to different gene families and appear to be expressed in particular in the digestive gland and in the gills. Therefore, these two tissues might be selected as preferential targets for the isolation of molecules with interesting carbohydrate-binding properties. In addition, by molecular phylogenomics, we provide solid evidence in support of the classification of M. virgata within the Brachidontinae subfamily. This result is in agreement with the previously proposed hypothesis that the morphological features traditionally used to group Mytilisepta spp. and Septifer spp. within the same clade are inappropriate due to homoplasy.
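The N50 statistic quoted in the abstract above is the contig length at which contigs of that length or longer account for at least half of the total assembled length. A minimal Python sketch of that calculation follows; the function name `n50` and the contig lengths are hypothetical and are not taken from the M. virgata assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in base pairs (total 6000).
    contig_lengths = [2000, 1500, 1000, 800, 400, 300]
    print(n50(contig_lengths))  # prints 1500
```

Because N50 is weighted by length, a few long contigs can raise it considerably, which is one reason it is usually reported alongside a completeness estimate.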
Affiliation(s)
- Marco Gerdol
- Department of Life Sciences, University of Trieste, Via Giorgieri 5, 34126 Trieste, Italy
| | - Yuki Fujii
- Department of Pharmacy, Faculty of Pharmaceutical Science, Nagasaki International University, 2825-7 Huis Ten Bosch, Sasebo, Nagasaki, 859-3298 Japan
| | - Imtiaj Hasan
- Department of Life and Environmental System Science, Graduate School of NanoBio Sciences, Yokohama City University, 22-2 Seto, Kanazawa-ku, Yokohama, 236-0027 Japan
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Rajshahi, Rajshahi, 6205 Bangladesh
| | - Toru Koike
- Department of Pharmacy, Faculty of Pharmaceutical Science, Nagasaki International University, 2825-7 Huis Ten Bosch, Sasebo, Nagasaki, 859-3298 Japan
| | - Shunsuke Shimojo
- Department of Pharmacy, Faculty of Pharmaceutical Science, Nagasaki International University, 2825-7 Huis Ten Bosch, Sasebo, Nagasaki, 859-3298 Japan
| | - Francesca Spazzali
- Department of Life Sciences, University of Trieste, Via Giorgieri 5, 34126 Trieste, Italy
| | - Kaname Yamamoto
- Department of Pharmacy, Faculty of Pharmaceutical Science, Nagasaki International University, 2825-7 Huis Ten Bosch, Sasebo, Nagasaki, 859-3298 Japan
| | - Yasuhiro Ozeki
- Department of Life and Environmental System Science, Graduate School of NanoBio Sciences, Yokohama City University, 22-2 Seto, Kanazawa-ku, Yokohama, 236-0027 Japan
| | - Alberto Pallavicini
- Department of Life Sciences, University of Trieste, Via Giorgieri 5, 34126 Trieste, Italy
| | - Hideaki Fujita
- Department of Pharmacy, Faculty of Pharmaceutical Science, Nagasaki International University, 2825-7 Huis Ten Bosch, Sasebo, Nagasaki, 859-3298 Japan
|
37
|
Zdobnov EM, Tegenfeldt F, Kuznetsov D, Waterhouse RM, Simão FA, Ioannidis P, Seppey M, Loetscher A, Kriventseva EV. OrthoDB v9.1: cataloging evolutionary and functional annotations for animal, fungal, plant, archaeal, bacterial and viral orthologs. Nucleic Acids Res 2016; 45:D744-D749. [PMID: 27899580 PMCID: PMC5210582 DOI: 10.1093/nar/gkw1119] [Citation(s) in RCA: 295] [Impact Index Per Article: 36.9] [Received: 09/30/2016] [Revised: 10/26/2016] [Accepted: 11/08/2016] [Indexed: 11/25/2022] Open
Abstract
OrthoDB is a comprehensive catalog of orthologs, genes inherited by extant species from a single gene in their last common ancestor. In 2016 OrthoDB reached its 9th release, growing to over 22 million genes from over 5000 species, now adding plants, archaea and viruses. In this update we focused on usability of this fast-growing wealth of data: updating the user and programmatic interfaces to browse and query the data, and further enhancing the already extensive integration of available gene functional annotations. Collating functional annotations from over 100 resources enabled us to propose descriptive titles for 87% of ortholog groups. Additionally, OrthoDB continues to provide computed evolutionary annotations and to allow user queries by sequence homology. The OrthoDB resource now enables users to generate publication-quality comparative genomics charts, as well as to upload, analyze and interactively explore their own private data. OrthoDB is available from http://orthodb.org.
Affiliation(s)
- Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Fredrik Tegenfeldt
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Dmitry Kuznetsov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Robert M Waterhouse
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Felipe A Simão
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Panagiotis Ioannidis
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Mathieu Seppey
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Alexis Loetscher
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Evgenia V Kriventseva
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland, and Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
|
38
|
Nyberg KG, Machado CA. Comparative Expression Dynamics of Intergenic Long Noncoding RNAs in the Genus Drosophila. Genome Biol Evol 2016; 8:1839-58. [PMID: 27189981 PMCID: PMC4943187 DOI: 10.1093/gbe/evw116] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 12/16/2022] Open
Abstract
Thousands of long noncoding RNAs (lncRNAs) have been annotated in eukaryotic genomes, but comparative transcriptomic approaches are necessary to understand their biological impact and evolution. To facilitate such comparative studies in Drosophila, we identified and characterized lncRNAs in a second Drosophilid—the evolutionary model Drosophila pseudoobscura. Using RNA-Seq and computational filtering of protein-coding potential, we identified 1,589 intergenic lncRNA loci in D. pseudoobscura. We surveyed multiple sex-specific developmental stages and found, like in Drosophila melanogaster, increasingly prolific lncRNA expression through male development and an overrepresentation of lncRNAs in the testes. Other trends seen in D. melanogaster, like reduced pupal expression, were not observed. Nonrandom distributions of female-biased and non-testis-specific male-biased lncRNAs between the X chromosome and autosomes are consistent with selection-based models of gene trafficking to optimize genomic location of sex-biased genes. The numerous testis-specific lncRNAs, however, are randomly distributed between the X and autosomes, and we cannot reject the hypothesis that many of these are likely to be spurious transcripts. Finally, using annotated lncRNAs in both species, we identified 134 putative lncRNA homologs between D. pseudoobscura and D. melanogaster and find that many have conserved developmental expression dynamics, making them ideal candidates for future functional analyses.
Affiliation(s)
- Kevin G Nyberg
- Department of Biology, University of Maryland, College Park
|
39
|
Stanley EC, Azzinaro PA, Vierra DA, Howlett NG, Irvine SQ. The Simple Chordate Ciona intestinalis Has a Reduced Complement of Genes Associated with Fanconi Anemia. Evol Bioinform Online 2016; 12:133-48. [PMID: 27279728 PMCID: PMC4898443 DOI: 10.4137/ebo.s37920] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/16/2015] [Revised: 02/10/2016] [Accepted: 02/16/2016] [Indexed: 12/26/2022] Open
Abstract
Fanconi anemia (FA) is a human genetic disease characterized by congenital defects, bone marrow failure, and increased cancer risk. FA is associated with mutation in one of 24 genes. The protein products of these genes function cooperatively in the FA pathway to orchestrate the repair of DNA interstrand cross-links. Few model organisms exist for the study of FA. Seeking a model organism with a simpler version of the FA pathway, we searched the genome of the simple chordate Ciona intestinalis for homologs of the human FA-associated proteins. BLAST searches, sequence alignments, hydropathy comparisons, maximum likelihood phylogenetic analysis, and structural modeling were used to infer the likelihood of homology between C. intestinalis and human FA proteins. Our analysis indicates that C. intestinalis indeed has a simpler and potentially functional FA pathway. The C. intestinalis genome was searched for candidates for homology to 24 human FA and FA-associated proteins. Support was found for the existence of homologs for 13 of these 24 human genes in C. intestinalis. Members of each of the three commonly recognized FA gene functional groups were found. In group I, we identified homologs of FANCE, FANCL, FANCM, and UBE2T/FANCT. Both members of group II, FANCD2 and FANCI, have homologs in C. intestinalis. In group III, we found evidence for homologs of FANCJ, FANCO, FANCQ/ERCC4, FANCR/RAD51, and FANCS/BRCA1, as well as the FA-associated proteins ERCC1 and FAN1. Evidence was very weak for the existence of homologs in C. intestinalis for any other recognized FA genes. This work supports the notion that C. intestinalis, as a close relative of vertebrates, but having a much reduced complement of FA genes, offers a means of studying the function of certain FA proteins in a simpler pathway than that of vertebrate cells.
Affiliation(s)
- Edward C Stanley
- Integrative and Evolutionary Biology Graduate Specialization, University of Rhode Island, Kingston, RI, USA
| | - Paul A Azzinaro
- Cell and Molecular Biology Graduate Specialization, University of Rhode Island, Kingston, RI, USA
| | - David A Vierra
- Cell and Molecular Biology Graduate Specialization, University of Rhode Island, Kingston, RI, USA
| | - Niall G Howlett
- Cell and Molecular Biology Graduate Specialization, University of Rhode Island, Kingston, RI, USA; Department of Cell and Molecular Biology, University of Rhode Island, Kingston, RI, USA
| | - Steven Q Irvine
- Integrative and Evolutionary Biology Graduate Specialization, University of Rhode Island, Kingston, RI, USA; Department of Biological Sciences, University of Rhode Island, Kingston, RI, USA
|
40
|
Horiike T, Minai R, Miyata D, Nakamura Y, Tateno Y. Ortholog-Finder: A Tool for Constructing an Ortholog Data Set. Genome Biol Evol 2016; 8:446-57. [PMID: 26782935 PMCID: PMC4779612 DOI: 10.1093/gbe/evw005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 12/11/2022] Open
Abstract
Orthologs are widely used for phylogenetic analysis of species; however, identifying genuine orthologs among distantly related species is challenging, because genes obtained through horizontal gene transfer (HGT) and out-paralogs derived from gene duplication before speciation are often present among the predicted orthologs. We developed a program, “Ortholog-Finder,” to obtain ortholog data sets for performing phylogenetic analysis by using all open-reading frame data of species. The program includes five processes for minimizing the effects of HGT and out-paralogs in phylogeny construction: 1) HGT filtering: Genes derived from HGT could be detected and deleted from the initial sequence data set by examining their base compositions. 2) Out-paralog filtering: Out-paralogs are detected and deleted from the data set based on sequence similarity. 3) Classification of phylogenetic trees: Phylogenetic trees generated for ortholog candidates are classified as monophyletic or polyphyletic trees. 4) Tree splitting: Polyphyletic trees are bisected to obtain monophyletic trees and remove HGT genes and out-paralogs. 5) Threshold changing: Out-paralogs are further excluded from the data set based on the difference in the similarity scores of genuine orthologs and out-paralogs. We examined how out-paralogs and HGTs affected phylogenetic trees constructed for species based on ortholog data sets obtained by Ortholog-Finder with the use of simulation data, and we determined the effects of confounding factors. We then used Ortholog-Finder in phylogeny construction for 12 Gram-positive bacteria from two phyla and validated each node of the constructed tree by comparison with individually constructed ortholog trees.
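The abstract above states only that HGT candidates are detected "by examining their base compositions". As a rough illustration of that idea, and not a reproduction of Ortholog-Finder's actual filter, the Python sketch below flags genes whose GC content deviates strongly from the mean of the gene set. The use of GC content, the z-score cutoff of 2.0, the function names, and the sequences are all assumptions made for this example.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def flag_atypical_genes(genes, z_cutoff=2.0):
    """Return names of genes whose GC content is an outlier in the set.

    genes: dict mapping gene name to nucleotide sequence.
    z_cutoff: assumed threshold, in standard deviations from the mean.
    """
    values = {name: gc_content(seq) for name, seq in genes.items()}
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return []  # all genes have identical composition
    return sorted(
        name for name, gc in values.items() if abs(gc - mu) / sigma > z_cutoff
    )


if __name__ == "__main__":
    # Hypothetical gene set: nine genes at 50% GC and one at 100% GC.
    genes = {f"gene{i}": "ATGC" * 5 for i in range(9)}
    genes["geneX"] = "GGCC" * 5
    print(flag_atypical_genes(genes))  # prints ['geneX']
```

A composition-only screen of this kind can miss transfers between genomes with similar base composition, which is consistent with the abstract describing further tree-based steps after the initial filter.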
Affiliation(s)
- Tokumasa Horiike
- Department of Biological and Environmental Science, Shizuoka University, Japan
| | - Ryoichi Minai
- The Genome Institute, Japanese Foundation of Cancer Research, Tokyo, Japan
| | - Daisuke Miyata
- Department of Economics, Chiba University of Commerce, Ichikawa, Japan
| | - Yoji Nakamura
- Research Center for Aquatic Genomics, National Research Institute of Fisheries Science, Fisheries Research Agency, Kanagawa, Japan
| | - Yoshio Tateno
- School of New Sciences, Daegu Gyoungbook Institute of Science and Technology, Daegu, Republic of Korea
|
41
|
Oh Brother, Where Art Thou? Finding Orthologs in the Twilight and Midnight Zones of Sequence Similarity. Evol Biol 2016. [DOI: 10.1007/978-3-319-41324-2_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
42
|
Coleman BD, Marivin A, Parag-Sharma K, DiGiacomo V, Kim S, Pepper JS, Casler J, Nguyen LT, Koelle MR, Garcia-Marcos M. Evolutionary Conservation of a GPCR-Independent Mechanism of Trimeric G Protein Activation. Mol Biol Evol 2015; 33:820-37. [PMID: 26659249 PMCID: PMC4760084 DOI: 10.1093/molbev/msv336] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 12/15/2022] Open
Abstract
Trimeric G protein signaling is a fundamental mechanism of cellular communication in eukaryotes. The core of this mechanism consists of activation of G proteins by the guanine-nucleotide exchange factor (GEF) activity of G protein coupled receptors. However, the duration and amplitude of G protein-mediated signaling are controlled by a complex network of accessory proteins that appeared and diversified during evolution. Among them, nonreceptor proteins with GEF activity are the least characterized. We recently found that proteins of the ccdc88 family possess a Gα-binding and activating (GBA) motif that confers GEF activity and regulates mammalian cell behavior. A sequence similarity-based search revealed that ccdc88 genes are highly conserved across metazoa but the GBA motif is absent in most invertebrates. This prompted us to investigate whether the GBA motif is present in other nonreceptor proteins in invertebrates. An unbiased bioinformatics search in Caenorhabditis elegans identified GBAS-1 (GBA and SPK domain containing-1) as a GBA motif-containing protein with homologs only in closely related worm species. We demonstrate that GBAS-1 has GEF activity for the nematode G protein GOA-1 and that the two proteins are coexpressed in many cells of living worms. Furthermore, we show that GBAS-1 can activate mammalian Gα-subunits and provide structural insights into the evolutionarily conserved determinants of the GBA–G protein interface. These results demonstrate that the GBA motif is a functional GEF module conserved among highly divergent proteins across evolution, indicating that the GBA-Gα binding mode is strongly constrained under selective pressure to mediate receptor-independent G protein activation in metazoans.
Affiliation(s)
| | - Arthur Marivin
- Department of Biochemistry, Boston University School of Medicine
| | | | | | - Seongseop Kim
- Department of Molecular Biophysics and Biochemistry, Yale University School of Medicine
| | - Judy S Pepper
- Department of Molecular Biophysics and Biochemistry, Yale University School of Medicine
| | - Jason Casler
- Department of Biochemistry, Boston University School of Medicine
| | - Lien T Nguyen
- Department of Biochemistry, Boston University School of Medicine
| | - Michael R Koelle
- Department of Molecular Biophysics and Biochemistry, Yale University School of Medicine
|
43
|
Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol 2015. [PMID: 26243257 DOI: 10.1186/s13059-015-0721-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/16/2023] Open
|
44
|
Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol 2015; 16:157. [PMID: 26243257 PMCID: PMC4531804 DOI: 10.1186/s13059-015-0721-2] [Citation(s) in RCA: 2002] [Impact Index Per Article: 222.4] [Received: 12/23/2014] [Accepted: 07/08/2015] [Indexed: 01/12/2023] Open
Abstract
Identifying homology relationships between sequences is fundamental to biological research. Here we provide a novel orthogroup inference algorithm called OrthoFinder that solves a previously undetected gene length bias in orthogroup inference, resulting in significant improvements in accuracy. Using real benchmark datasets we demonstrate that OrthoFinder is more accurate than other orthogroup inference methods by between 8 % and 33 %. Furthermore, we demonstrate the utility of OrthoFinder by providing a complete classification of transcription factor gene families in plants revealing 6.9 million previously unobserved relationships.
Affiliation(s)
- David M Emms
- Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK.
| | - Steven Kelly
- Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK.
|
45
|
Fraïsse C, Belkhir K, Welch JJ, Bierne N. Local interspecies introgression is the main cause of extreme levels of intraspecific differentiation in mussels. Mol Ecol 2015; 25:269-86. [DOI: 10.1111/mec.13299] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Received: 03/31/2015] [Revised: 06/19/2015] [Accepted: 06/19/2015] [Indexed: 12/15/2022]
Affiliation(s)
- Christelle Fraïsse
- Institut des Sciences de l'Evolution (UMR 5554); CNRS - Université Montpellier; Place Eugène Bataillon 34095 Montpellier France
- Station Marine; Université Montpellier; 2 rue des Chantiers 34200 Sète France
- Department of Genetics; University of Cambridge; Downing Street CB2 3EH Cambridge UK
| | - Khalid Belkhir
- Institut des Sciences de l'Evolution (UMR 5554); CNRS - Université Montpellier; Place Eugène Bataillon 34095 Montpellier France
| | - John J. Welch
- Department of Genetics; University of Cambridge; Downing Street CB2 3EH Cambridge UK
| | - Nicolas Bierne
- Institut des Sciences de l'Evolution (UMR 5554); CNRS - Université Montpellier; Place Eugène Bataillon 34095 Montpellier France
- Station Marine; Université Montpellier; 2 rue des Chantiers 34200 Sète France
|
46
|
Kapheim KM, Pan H, Li C, Salzberg SL, Puiu D, Magoc T, Robertson HM, Hudson ME, Venkat A, Fischman BJ, Hernandez A, Yandell M, Ence D, Holt C, Yocum GD, Kemp WP, Bosch J, Waterhouse RM, Zdobnov EM, Stolle E, Kraus FB, Helbing S, Moritz RFA, Glastad KM, Hunt BG, Goodisman MAD, Hauser F, Grimmelikhuijzen CJP, Pinheiro DG, Nunes FMF, Soares MPM, Tanaka ÉD, Simões ZLP, Hartfelder K, Evans JD, Barribeau SM, Johnson RM, Massey JH, Southey BR, Hasselmann M, Hamacher D, Biewer M, Kent CF, Zayed A, Blatti C, Sinha S, Johnston JS, Hanrahan SJ, Kocher SD, Wang J, Robinson GE, Zhang G. Social evolution. Genomic signatures of evolutionary transitions from solitary to group living. Science 2015; 348:1139-43. [PMID: 25977371 PMCID: PMC5471836 DOI: 10.1126/science.aaa4788] [Citation(s) in RCA: 239] [Impact Index Per Article: 26.6] [Received: 12/19/2014] [Accepted: 05/06/2015] [Indexed: 12/14/2022]
Abstract
The evolution of eusociality is one of the major transitions in evolution, but the underlying genomic changes are unknown. We compared the genomes of 10 bee species that vary in social complexity, representing multiple independent transitions in social evolution, and report three major findings. First, many important genes show evidence of neutral evolution as a consequence of relaxed selection with increasing social complexity. Second, there is no single road map to eusociality; independent evolutionary transitions in sociality have independent genetic underpinnings. Third, though clearly independent in detail, these transitions do have similar general features, including an increase in constrained protein evolution accompanied by increases in the potential for gene regulation and decreases in diversity and abundance of transposable elements. Eusociality may arise through different mechanisms each time, but would likely always involve an increase in the complexity of gene networks.
Affiliation(s)
- Karen M Kapheim
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Biology, Utah State University, Logan, UT 84322, USA.
| | - Hailin Pan
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083, China
| | - Cai Li
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083, China. Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, 1350, Denmark
| | - Steven L Salzberg
- Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD 21218, USA. Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Daniela Puiu
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Tanja Magoc
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Hugh M Robertson
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Matthew E Hudson
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Aarti Venkat
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Brielle J Fischman
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Program in Ecology and Evolutionary Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Biology, Hobart and William Smith Colleges, Geneva, NY 14456, USA
| | - Alvaro Hernandez
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Mark Yandell
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA. USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT 84112, USA
| | - Daniel Ence
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
| | - Carson Holt
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA. USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT 84112, USA
| | - George D Yocum
- U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS) Red River Valley Agricultural Research Center, Biosciences Research Laboratory, Fargo, ND 58102, USA
| | - William P Kemp
- U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS) Red River Valley Agricultural Research Center, Biosciences Research Laboratory, Fargo, ND 58102, USA
| | - Jordi Bosch
- Center for Ecological Research and Forestry Applications (CREAF), Universitat Autonoma de Barcelona, 08193 Bellaterra, Spain
| | - Robert M Waterhouse
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland. Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139, USA. The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland. Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland
| | - Eckart Stolle
- Institute of Biology, Department Zoology, Martin-Luther-University Halle-Wittenberg, Hoher Weg 4, D-06099 Halle (Saale), Germany. Queen Mary University of London, School of Biological and Chemical Sciences Organismal Biology Research Group, London E1 4NS, UK
| | - F Bernhard Kraus
- Institute of Biology, Department Zoology, Martin-Luther-University Halle-Wittenberg, Hoher Weg 4, D-06099 Halle (Saale), Germany. Department of Laboratory Medicine, University Hospital Halle, Ernst Grube Strasse 40, D-06120 Halle (Saale), Germany
| | - Sophie Helbing
- Institute of Biology, Department Zoology, Martin-Luther-University Halle-Wittenberg, Hoher Weg 4, D-06099 Halle (Saale), Germany
| | - Robin F A Moritz
- Institute of Biology, Department Zoology, Martin-Luther-University Halle-Wittenberg, Hoher Weg 4, D-06099 Halle (Saale), Germany. German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, 04103 Leipzig, Germany
| | - Karl M Glastad
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA
| | - Brendan G Hunt
- Department of Entomology, University of Georgia, Griffin, GA 30223, USA
| | | | - Frank Hauser
- Center for Functional and Comparative Insect Genomics, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Cornelis J P Grimmelikhuijzen
- Center for Functional and Comparative Insect Genomics, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Daniel Guariz Pinheiro
- Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, 14040-901 Ribeirão Preto, SP, Brazil. Departamento de Tecnologia, Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista (UNESP), 14884-900 Jaboticabal, SP, Brazil
| | - Francis Morais Franco Nunes
- Departamento de Genética e Evolução, Centro de Ciências Biológicas e da Saúde, Universidade Federal de São Carlos, 13565-905 São Carlos, SP, Brazil
| | - Michelle Prioli Miranda Soares
- Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, 14040-901 Ribeirão Preto, SP, Brazil
| | - Érica Donato Tanaka
- Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, 14049-900 Ribeirão Preto, SP, Brazil
| | - Zilá Luz Paulino Simões
- Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, 14040-901 Ribeirão Preto, SP, Brazil
| | - Klaus Hartfelder
- Departamento de Biologia Celular e Molecular e Bioagentes Patogênicos, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, 14049-900 Ribeirão Preto, SP, Brazil
| | - Jay D Evans
- USDA-ARS Bee Research Lab, Beltsville, MD 20705 USA
| | - Seth M Barribeau
- Department of Biology, East Carolina University, Greenville, NC 27858, USA
| | - Reed M Johnson
- Department of Entomology, Ohio Agricultural Research and Development Center, Ohio State University, Wooster, OH 44691, USA
| | - Jonathan H Massey
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Bruce R Southey
- Department of Animal Sciences, University of Illinois, Urbana, IL 61801, USA
| | - Martin Hasselmann
- Department of Population Genomics, Institute of Animal Husbandry and Animal Breeding, University of Hohenheim, Germany
| | - Daniel Hamacher
- Department of Population Genomics, Institute of Animal Husbandry and Animal Breeding, University of Hohenheim, Germany
| | - Matthias Biewer
- Department of Population Genomics, Institute of Animal Husbandry and Animal Breeding, University of Hohenheim, Germany
| | - Clement F Kent
- Department of Biology, York University, Toronto, ON M3J 1P3, Canada. Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA
| | - Amro Zayed
- Department of Biology, York University, Toronto, ON M3J 1P3, Canada
| | - Charles Blatti
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Saurabh Sinha
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - J Spencer Johnston
- Department of Entomology, Texas A&M University, College Station, TX 77843, USA
| | - Shawn J Hanrahan
- Department of Entomology, Texas A&M University, College Station, TX 77843, USA
| | - Sarah D Kocher
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Jun Wang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083, China. Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark. Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah 21589, Saudi Arabia. Macau University of Science and Technology, Avenida Wai long, Taipa, Macau 999078, China. Department of Medicine, University of Hong Kong, Hong Kong.
| | - Gene E Robinson
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Center for Advanced Study Professor in Entomology and Neuroscience, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
| | - Guojie Zhang
- China National GeneBank, BGI-Shenzhen, Shenzhen, 518083, China. Centre for Social Evolution, Department of Biology, Universitetsparken 15, University of Copenhagen, DK-2100 Copenhagen, Denmark.
|
47
|
Under-detection of endospore-forming Firmicutes in metagenomic data. Comput Struct Biotechnol J 2015; 13:299-306. [PMID: 25973144 PMCID: PMC4427659 DOI: 10.1016/j.csbj.2015.04.002] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 12/26/2014] [Revised: 04/06/2015] [Accepted: 04/18/2015] [Indexed: 11/24/2022] Open
Abstract
Microbial diversity studies based on metagenomic sequencing have greatly enhanced our knowledge of the microbial world. However, one caveat is the fact that not all microorganisms are equally well detected, questioning the universality of this approach. Firmicutes are known to be a dominant bacterial group. Several Firmicutes species are endospore formers and this property makes them hardy in potentially harsh conditions, and thus likely to be present in a wide variety of environments, even as residents and not functional players. While metagenomic libraries can be expected to contain endospore formers, endospores are known to be resilient to many traditional methods of DNA isolation and thus potentially undetectable. In this study we evaluated the representation of endospore-forming Firmicutes in 73 published metagenomic datasets using two molecular markers unique to this bacterial group (spo0A and gpr). Both markers were notably absent in well-known habitats of Firmicutes such as soil, with spo0A found only in three mammalian gut microbiomes. A tailored DNA extraction method resulted in the detection of a large diversity of endospore-formers in amplicon sequencing of the 16S rRNA and spo0A genes. However, shotgun classification was still poor with only a minor fraction of the community assigned to Firmicutes. Thus, removing a specific bias in a molecular workflow improves detection in amplicon sequencing, but it was insufficient to overcome the limitations for detecting endospore-forming Firmicutes in whole-genome metagenomics. In conclusion, this study highlights the importance of understanding the specific methodological biases that can contribute to improve the universality of metagenomic approaches. Endospore formers were under-detected by profile analysis of sporulation genes in metagenomes. Endospore formers were absent even from those habitats known to harbor them. A tailored DNA extraction method improved detection in amplicon sequencing. 
Ameliorated DNA extraction did not improve shotgun classification. Endospore-formers represent an undetectable community fraction by metagenomic approaches.
|
48
|
Krasnov A, Wesmajervi Breiland MS, Hatlen B, Afanasyev S, Skugor S. Sexual maturation and administration of 17β-estradiol and testosterone induce complex gene expression changes in skin and increase resistance of Atlantic salmon to ectoparasite salmon louse. Gen Comp Endocrinol 2015; 212:34-43. [PMID: 25599658 DOI: 10.1016/j.ygcen.2015.01.002] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 08/19/2014] [Revised: 12/25/2014] [Accepted: 01/10/2015] [Indexed: 12/29/2022]
Abstract
The crustacean ectoparasitic salmon louse (Lepeophtheirus salmonis) is a major problem of Atlantic salmon aquaculture in the Northern hemisphere. Host-pathogen interactions in this system are highly complex. Resistance to the parasite involves variations in genetic background, nutrition, properties of skin, and status of the endocrine and immune systems. This study addressed the relationship between sex hormones and lice infection. Field observation revealed a sharp reduction of lice prevalence during sexual maturation with no difference between male and female fish. To determine if higher resistance against lice was related to sex hormones, post-smolt salmon were administered control feed and feeds containing 17β-estradiol (20 mg/kg) and testosterone (25 mg/kg) during a 3-week pre-challenge period. After challenge with lice, counts were reduced 2-fold and 1.5-fold in fish that received 17β-estradiol and testosterone, respectively. Gene expression analyses were performed from skin of salmon collected in the field trial and from the controlled lab experiment at three time points (end of feeding-before challenge, 3 days post challenge (dpc) and 16 dpc) using oligonucleotide microarray and qPCR. Differential expression was observed in genes associated with diverse biological processes. Both studies revealed similar changes of several antibacterial acute phase proteins; of note was induction of cathelicidin and down-regulation of a defensin gene. Treatment with hormones revealed their ability to modulate T helper cell (Th)-mediated immunity in skin. Enhanced protection achieved by 17β-estradiol administration might in part be due to the skewing of Th responses away from the prototypic anti-parasitic Th2 immunity and towards the more effective Th1 responses. Multiple genes involved in wound healing, differentiation and remodelling of skin tissue were stimulated during maturation but suppressed with sex hormones. 
Such opposite regulation suggested that these processes were not associated with resistance to the parasite under the studied conditions. Both studies revealed regulation of a suite of genes encoding putative large mucosal proteins found exclusively in fish. Marked decrease of erythrocyte markers indicated reduced circulation while down-regulation of multiple zymogen granule membrane proteins and transporters of cholesterol and other compounds suggested limited availability of nutrients for the parasites.
Affiliation(s)
| | | | | | - Sergey Afanasyev
- Nofima AS, PO Box 6122, NO-9291 Tromsø, Norway; Sechenov Institute of Evolutionary Physiology and Biochemistry, M. Toreza av. 44, Peterburg 194223, Russia.
| | - Stanko Skugor
- SLRC-Sea Lice Research Center, Faculty of Veterinary Medicine and Biosciences, Norwegian University of Life Sciences, Box 8146, NO-0033 Oslo, Norway.
|
49
|
Waterhouse RM. A maturing understanding of the composition of the insect gene repertoire. Curr Opin Insect Sci 2015; 7:15-23. [PMID: 32846661 DOI: 10.1016/j.cois.2015.01.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 11/04/2014] [Accepted: 01/07/2015] [Indexed: 06/11/2023]
Abstract
Recent insect genome sequencing initiatives have dramatically accelerated the accumulation of genomics data resources sampling species from different lineages to explore the incredible diversity of insect biology. These efforts have built a comprehensive catalogue of the insect gene repertoire, which is expanded with each newly-sequenced genome and continually refined using knowledge from cross-species comparisons and new sources of evidence. Since the sequencing of the very first insect genomes, comparative analyses have identified shared (homologous) and equivalent (orthologous) genes, as well as subsets of genes that appear to be unique. With the number of available insect genomes fast approaching one hundred, a maturing understanding of the composition of the insect gene repertoire broadly partitions it into an expected core of universally-present orthologues and a diverse array of lineage-specific and species-specific genes. While homology and orthology help to build evolutionarily-informed functional hypotheses for many genes from these newly-sequenced genomes, experimental interrogations are required to test such hypotheses and to probe the functions of genes for which homology offers no clues. Such taxonomically-restricted genes may represent the current contents of an evolutionary melting pot, out of which novel adaptations have emerged to make insects the most successful group of animals on Earth.
Affiliation(s)
- Robert M Waterhouse
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland; Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA; The Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA.
|
50
|
Ryu JY, Kim HU, Lee SY. Human genes with a greater number of transcript variants tend to show biological features of housekeeping and essential genes. Mol Biosyst 2015; 11:2798-807. [DOI: 10.1039/c5mb00322a] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 01/16/2023]
Abstract
Human genes with a greater number of transcript variants are more likely to play functionally important roles such as cellular maintenance and survival.
Affiliation(s)
- Jae Yong Ryu
- Metabolic and Biomolecular Engineering National Research Laboratory
- Department of Chemical and Biomolecular Engineering (BK21 Plus Program)
- Center for Systems and Synthetic Biotechnology
- Institute for the BioCentury
- Korea Advanced Institute of Science and Technology (KAIST)
| | - Hyun Uk Kim
- Metabolic and Biomolecular Engineering National Research Laboratory
- Department of Chemical and Biomolecular Engineering (BK21 Plus Program)
- Center for Systems and Synthetic Biotechnology
- Institute for the BioCentury
- Korea Advanced Institute of Science and Technology (KAIST)
| | - Sang Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory
- Department of Chemical and Biomolecular Engineering (BK21 Plus Program)
- Center for Systems and Synthetic Biotechnology
- Institute for the BioCentury
- Korea Advanced Institute of Science and Technology (KAIST)
|