Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Liu Q, Mackey AJ, Roos DS, Pereira FCN. Evigan: a hidden variable model for integrating gene evidence for eukaryotic gene prediction. Bioinformatics 2008;24:597-605. [PMID: 18187439 DOI: 10.1093/bioinformatics/btn004] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022]

Cited by Other Article(s)

Brůna T, Lomsadze A, Borodovsky M. GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes. Genome Res 2024;34:757-768. [PMID: 38866548 PMCID: PMC11216313 DOI: 10.1101/gr.278373.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 05/02/2024] [Indexed: 06/14/2024]

Brůna T, Lomsadze A, Borodovsky M. A new gene finding tool GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes. bioRxiv 2024:2023.01.13.524024. [PMID: 36711453 PMCID: PMC9882169 DOI: 10.1101/2023.01.13.524024] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 01/17/2023]

Gabriel L, Hoff KJ, Brůna T, Borodovsky M, Stanke M. TSEBRA: transcript selector for BRAKER. BMC Bioinformatics 2021;22:566. [PMID: 34823473 PMCID: PMC8620231 DOI: 10.1186/s12859-021-04482-0] [Citation(s) in RCA: 80] [Impact Index Per Article: 26.7] [Received: 06/04/2021] [Accepted: 11/15/2021] [Indexed: 11/10/2022] Open

Li F, Zhao X, Li M, He K, Huang C, Zhou Y, Li Z, Walters JR. Insect genomes: progress and challenges. Insect Mol Biol 2019;28:739-758. [PMID: 31120160 DOI: 10.1111/imb.12599] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 11/02/2017] [Revised: 03/22/2019] [Accepted: 05/14/2019] [Indexed: 05/24/2023]

GAAP: A Genome Assembly + Annotation Pipeline. Biomed Res Int 2019;2019:4767354. [PMID: 31346518 PMCID: PMC6617929 DOI: 10.1155/2019/4767354] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/24/2019] [Revised: 05/20/2019] [Accepted: 05/26/2019] [Indexed: 12/24/2022]

Making sense of genomes of parasitic worms: Tackling bioinformatic challenges. Biotechnol Adv 2016;34:663-686. [DOI: 10.1016/j.biotechadv.2016.03.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 11/05/2015] [Revised: 02/25/2016] [Accepted: 03/01/2016] [Indexed: 01/25/2023]

Pan J, Hu X, Li P, Li H, He W, Zhang Y, Lin Y. Domain adaptation via Multi-Layer Transfer Learning. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.12.097] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 10/22/2022]

Multi-bridge transfer learning. Knowl Based Syst 2016. [DOI: 10.1016/j.knosys.2016.01.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 11/22/2022]

Campbell MS, Yandell M. An Introduction to Genome Annotation. Curr Protoc Bioinformatics 2015;52:4.1.1-4.1.17. [PMID: 26678385 DOI: 10.1002/0471250953.bi0401s52] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]

McMullan M, Gardiner A, Bailey K, Kemen E, Ward BJ, Cevik V, Robert-Seilaniantz A, Schultz-Larsen T, Balmuth A, Holub E, van Oosterhout C, Jones JDG. Evidence for suppression of immunity as a driver for genomic introgressions and host range expansion in races of Albugo candida, a generalist parasite. eLife 2015;4. [PMID: 25723966 PMCID: PMC4384639 DOI: 10.7554/elife.04550] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 09/01/2014] [Accepted: 02/26/2015] [Indexed: 12/13/2022] Open

Sharma R, Mishra B, Runge F, Thines M. Gene loss rather than gene gain is associated with a host jump from monocots to dicots in the Smut Fungus Melanopsichium pennsylvanicum. Genome Biol Evol 2014;6:2034-49. [PMID: 25062916 PMCID: PMC4159001 DOI: 10.1093/gbe/evu148] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.0] [Indexed: 01/08/2023] Open

van der Burgt A, Severing E, Collemare J, de Wit PJGM. Automated alignment-based curation of gene models in filamentous fungi. BMC Bioinformatics 2014;15:19. [PMID: 24433567 PMCID: PMC3898260 DOI: 10.1186/1471-2105-15-19] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/15/2013] [Accepted: 01/11/2014] [Indexed: 11/16/2022] Open

Abstract

Background

Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control of gene models in order to obtain more accurate genome annotations.

Results

We provide a novel method named alignment-based fungal gene prediction (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It can assess gene models on a gene-by-gene basis making use of informant gene loci. Its performance was benchmarked on 6,965 gene models confirmed by full-length unigenes from ten different fungi. 79.4% of all gene models were correctly predicted by ABFGP. It improves the output of ab initio gene prediction software due to a higher sensitivity and precision for all gene model components. Applicability of the method was shown by revisiting the annotations of six different fungi, using gene loci from up to 29 fungal genomes as informants. Between 7,231 and 8,337 genes were assessed by ABFGP and for each genome between 1,724 and 3,505 gene model revisions were proposed. The reliability of the proposed gene models is assessed by an a posteriori introspection procedure of each intron and exon in the multiple gene model alignment. The total number and type of proposed gene model revisions in the six fungal genomes is correlated to the quality of the genome assembly, and to sequencing strategies used in the sequencing centre, highlighting different types of errors in different annotation pipelines. The ABFGP method is particularly successful in discovering sequence errors and/or disruptive mutations causing truncated and erroneous gene models.

Conclusions

The ABFGP method is an accurate and fully automated quality control method for fungal gene catalogues that can be easily implemented into existing annotation pipelines. With the exponential release of new genomes, the ABFGP method will help decreasing the number of gene models that require additional manual curation.

Alamancos GP, Agirre E, Eyras E. Methods to study splicing from high-throughput RNA sequencing data. Methods Mol Biol 2014;1126:357-97. [PMID: 24549677 DOI: 10.1007/978-1-62703-980-2_26] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Indexed: 12/02/2022]

Goodswen SJ, Kennedy PJ, Ellis JT. Evaluating high-throughput ab initio gene finders to discover proteins encoded in eukaryotic pathogen genomes missed by laboratory techniques. PLoS One 2012;7:e50609. [PMID: 23226328 PMCID: PMC3511556 DOI: 10.1371/journal.pone.0050609] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 06/17/2012] [Accepted: 10/24/2012] [Indexed: 11/25/2022] Open

Bernal A, Crammer K, Pereira F. Automated gene-model curation using global discriminative learning. Bioinformatics 2012;28:1571-8. [DOI: 10.1093/bioinformatics/bts176] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022] Open

Yandell M, Ence D. A beginner's guide to eukaryotic genome annotation. Nat Rev Genet 2012;13:329-42. [DOI: 10.1038/nrg3174] [Citation(s) in RCA: 366] [Impact Index Per Article: 30.5] [Indexed: 12/17/2022]

Sato A, Oshima K, Noguchi H, Ogawa M, Takahashi T, Oguma T, Koyama Y, Itoh T, Hattori M, Hanya Y. Draft genome sequencing and comparative analysis of Aspergillus sojae NBRC4239. DNA Res 2011;18:165-76. [PMID: 21659486 PMCID: PMC3111232 DOI: 10.1093/dnares/dsr009] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Indexed: 11/13/2022] Open

Kemen E, Gardiner A, Schultz-Larsen T, Kemen AC, Balmuth AL, Robert-Seilaniantz A, Bailey K, Holub E, Studholme DJ, MacLean D, Jones JDG. Gene gain and loss during evolution of obligate parasitism in the white rust pathogen of Arabidopsis thaliana. PLoS Biol 2011;9:e1001094. [PMID: 21750662 PMCID: PMC3130010 DOI: 10.1371/journal.pbio.1001094] [Citation(s) in RCA: 213] [Impact Index Per Article: 16.4] [Received: 10/27/2010] [Accepted: 05/10/2011] [Indexed: 01/21/2023] Open

Abstract

Biotrophic eukaryotic plant pathogens require a living host for their growth and form an intimate haustorial interface with parasitized cells. Evolution to biotrophy occurred independently in fungal rusts and powdery mildews, and in oomycete white rusts and downy mildews. Biotroph evolution and molecular mechanisms of biotrophy are poorly understood. It has been proposed, but not shown, that obligate biotrophy results from (i) reduced selection for maintenance of biosynthetic pathways and (ii) gain of mechanisms to evade host recognition or suppress host defence. Here we use Illumina sequencing to define the genome, transcriptome, and gene models for the obligate biotroph oomycete and Arabidopsis parasite, Albugo laibachii. A. laibachii is a member of the Chromalveolata, which incorporates Heterokonts (containing the oomycetes), Apicomplexa (which includes human parasites like Plasmodium falciparum and Toxoplasma gondii), and four other taxa. From comparisons with other oomycete plant pathogens and other chromalveolates, we reveal independent loss of molybdenum-cofactor-requiring enzymes in downy mildews, white rusts, and the malaria parasite P. falciparum. Biotrophy also requires “effectors” to suppress host defence; we reveal RXLR and Crinkler effectors shared with other oomycetes, and also discover and verify a novel class of effectors, the “CHXCs”, by showing effector delivery and effector functionality. Our findings suggest that evolution to progressively more intimate association between host and parasite results in reduced selection for retention of certain biosynthetic pathways, and particularly reduced selection for retention of molybdopterin-requiring biosynthetic pathways. These mechanisms are not only relevant to plant pathogenic oomycetes but also to human pathogens within the Chromalveolata.

Plant pathogens that cannot grow except on their hosts are called obligate biotrophs. How such biotrophy evolves is poorly understood. In this study, we sequenced the genome of the obligate biotroph white rust pathogen (Albugo laibachii, Oomycota) of Arabidopsis. From comparisons with other oomycete plant pathogens, diatoms, and the human pathogen Plasmodium falciparum, we reveal a loss of important metabolic enzymes. We also reveal the appearance of defence-suppressing “effectors”, some carrying motifs known from other oomycete effectors, and discover and experimentally verify a novel class of effectors that share a CHXC motif within 50 amino acids of the signal peptide cleavage site. Obligate biotrophy involves an intimate association within host cells at the haustorial interface (where the parasite penetrates the host cell's cell wall), where nutrients are acquired from the host and effectors are delivered to the host. We found that A. laibachii, like Hyaloperonospora arabidopsidis and Plasmodium falciparum, lacks molybdopterin-requiring biosynthetic pathways, suggesting relaxed selection for retention of, or even selection against, this pathway. We propose that when defence suppression becomes sufficiently effective, hosts become such a reliable source of nutrients that a free-living phase can be lost. These mechanisms leading to obligate biotrophy and host specificity are relevant not only to plant pathogenic oomycetes but also to human pathogens.

Sorber K, Dimon MT, DeRisi JL. RNA-Seq analysis of splicing in Plasmodium falciparum uncovers new splice junctions, alternative splicing and splicing of antisense transcripts. Nucleic Acids Res 2011;39:3820-35. [PMID: 21245033 PMCID: PMC3089446 DOI: 10.1093/nar/gkq1223] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.5] [Indexed: 12/24/2022] Open

Bahl A, Davis PH, Behnke M, Dzierszinski F, Jagalur M, Chen F, Shanmugam D, White MW, Kulp D, Roos DS. A novel multifunctional oligonucleotide microarray for Toxoplasma gondii. BMC Genomics 2010;11:603. [PMID: 20974003 PMCID: PMC3017859 DOI: 10.1186/1471-2164-11-603] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.1] [Received: 03/19/2010] [Accepted: 10/25/2010] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Microarrays are invaluable tools for genome interrogation, SNP detection, and expression analysis, among other applications. Such broad capabilities would be of value to many pathogen research communities, although the development and use of genome-scale microarrays is often a costly undertaking. Therefore, effective methods for reducing unnecessary probes while maintaining or expanding functionality would be relevant to many investigators.

RESULTS

Taking advantage of available genome sequences and annotation for Toxoplasma gondii (a pathogenic parasite responsible for illness in immunocompromised individuals) and Plasmodium falciparum (a related parasite responsible for severe human malaria), we designed a single oligonucleotide microarray capable of supporting a wide range of applications at relatively low cost, including genome-wide expression profiling for Toxoplasma, and single-nucleotide polymorphism (SNP)-based genotyping of both T. gondii and P. falciparum. Expression profiling of the three clonotypic lineages dominating T. gondii populations in North America and Europe provides a first comprehensive view of the parasite transcriptome, revealing that ~49% of all annotated genes are expressed in parasite tachyzoites (the acutely lytic stage responsible for pathogenesis) and 26% of genes are differentially expressed among strains. A novel design utilizing few probes provided high confidence genotyping, used here to resolve recombination points in the clonal progeny of sexual crosses. Recent sequencing of additional T. gondii isolates identifies >620 K new SNPs, including ~11 K that intersect with expression profiling probes, yielding additional markers for genotyping studies, and further validating the utility of a combined expression profiling/genotyping array design. Additional applications facilitating SNP and transcript discovery, alternative statistical methods for quantifying gene expression, etc. are also pursued at pilot scale to inform future array designs.

CONCLUSIONS

In addition to providing an initial global view of the T. gondii transcriptome across major lineages and permitting detailed resolution of recombination points in a historical sexual cross, the multifunctional nature of this array also allowed opportunities to exploit probes for purposes beyond their intended use, enhancing analyses. This array is in widespread use by the T. gondii research community, and several aspects of the design strategy are likely to be useful for other pathogens.

De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis. PLoS Genet 2010;6:e1000891. [PMID: 20386741 PMCID: PMC2851567 DOI: 10.1371/journal.pgen.1000891] [Citation(s) in RCA: 140] [Impact Index Per Article: 10.0] [Received: 12/24/2009] [Accepted: 03/02/2010] [Indexed: 01/09/2023] Open

Abstract

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.

Inskeep WP, Rusch DB, Jay ZJ, Herrgard MJ, Kozubal MA, Richardson TH, Macur RE, Hamamura N, Jennings RD, Fouke BW, Reysenbach AL, Roberto F, Young M, Schwartz A, Boyd ES, Badger JH, Mathur EJ, Ortmann AC, Bateson M, Geesey G, Frazier M. Metagenomes from high-temperature chemotrophic systems reveal geochemical controls on microbial community structure and function. PLoS One 2010;5:e9773. [PMID: 20333304 PMCID: PMC2841643 DOI: 10.1371/journal.pone.0009773] [Citation(s) in RCA: 134] [Impact Index Per Article: 9.6] [Received: 09/13/2009] [Accepted: 02/25/2010] [Indexed: 01/07/2023] Open

Abstract

The Yellowstone caldera contains the most numerous and diverse geothermal systems on Earth, yielding an extensive array of unique high-temperature environments that host a variety of deeply-rooted and understudied Archaea, Bacteria and Eukarya. The combination of extreme temperature and chemical conditions encountered in geothermal environments often results in considerably less microbial diversity than other terrestrial habitats and offers a tremendous opportunity for studying the structure and function of indigenous microbial communities and for establishing linkages between putative metabolisms and element cycling. Metagenome sequence (14–15,000 Sanger reads per site) was obtained for five high-temperature (>65°C) chemotrophic microbial communities sampled from geothermal springs (or pools) in Yellowstone National Park (YNP) that exhibit a wide range in geochemistry including pH, dissolved sulfide, dissolved oxygen and ferrous iron. Metagenome data revealed significant differences in the predominant phyla associated with each of these geochemical environments. Novel members of the Sulfolobales are dominant in low pH environments, while other Crenarchaeota including distantly-related Thermoproteales and Desulfurococcales populations dominate in suboxic sulfidic sediments. Several novel archaeal groups are well represented in an acidic (pH 3) Fe-oxyhydroxide mat, where a higher O₂ influx is accompanied with an increase in archaeal diversity. The presence or absence of genes and pathways important in S oxidation-reduction, H₂-oxidation, and aerobic respiration (terminal oxidation) provide insight regarding the metabolic strategies of indigenous organisms present in geothermal systems. Multiple-pathway and protein-specific functional analysis of metagenome sequence data corroborated results from phylogenetic analyses and clearly demonstrate major differences in metabolic potential across sites. 
The distribution of functional genes involved in electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, Fe, O₂) control microbial community structure and function in YNP geothermal springs.

Affiliation(s)

William P. Inskeep: Thermal Biology Institute and Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, Montana, United States of America
Douglas B. Rusch: J. Craig Venter Institute, Rockville, Maryland, United States of America
Zackary J. Jay: Thermal Biology Institute and Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, Montana, United States of America
Markus J. Herrgard: Synthetic Genomics Inc., La Jolla, California, United States of America
Mark A. Kozubal: Thermal Biology Institute and Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, Montana, United States of America
Toby H. Richardson: Synthetic Genomics Inc., La Jolla, California, United States of America
Richard E. Macur: Thermal Biology Institute and Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, Montana, United States of America
Natsuko Hamamura: Center for Marine Environmental Studies, Ehime University, Matsuyama, Japan
Ryan deM. Jennings: Thermal Biology Institute and Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, Montana, United States of America
Bruce W. Fouke: University of Illinois, Urbana, Illinois, United States of America
Anna-Louise Reysenbach: Center for Marine Environmental Studies, Ehime University, Matsuyama, Japan
Frank Roberto: Idaho National Laboratory, Idaho Falls, Idaho, United States of America
Mark Young: Thermal Biology Institute and Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, Montana, United States of America
Ariel Schwartz: Synthetic Genomics Inc., La Jolla, California, United States of America
Eric S. Boyd: Thermal Biology Institute and Department of Microbiology, Montana State University, Bozeman, Montana, United States of America
Jonathan H. Badger: J. Craig Venter Institute, Rockville, Maryland, United States of America
Eric J. Mathur: Synthetic Genomics Inc., La Jolla, California, United States of America
Alice C. Ortmann: Department of Marine Science, University of South Alabama, Mobile, Alabama, United States of America
Mary Bateson: Thermal Biology Institute and Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, Montana, United States of America
Gill Geesey: Thermal Biology Institute and Department of Microbiology, Montana State University, Bozeman, Montana, United States of America
Marvin Frazier: J. Craig Venter Institute, Rockville, Maryland, United States of America

Madupu R, Brinkac LM, Harrow J, Wilming LG, Böhme U, Lamesch P, Hannick LI. Meeting report: a workshop on Best Practices in Genome Annotation. Database (Oxford) 2010;2010:baq001. [PMID: 20428316 PMCID: PMC2860899 DOI: 10.1093/database/baq001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 10/12/2009] [Revised: 01/08/2010] [Accepted: 01/11/2010] [Indexed: 01/28/2023]

Romero-Zaliz R, Rubio-Escudero C, Zwir I, del Val C. Optimization of multi-classifiers for computational biology: application to gene finding and expression. Theor Chem Acc 2009. [DOI: 10.1007/s00214-009-0648-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022]

Coghlan A, Fiedler TJ, McKay SJ, Flicek P, Harris TW, Blasiar D, Stein LD. nGASP--the nematode genome annotation assessment project. BMC Bioinformatics 2008;9:549. [PMID: 19099578 PMCID: PMC2651883 DOI: 10.1186/1471-2105-9-549] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Received: 08/12/2008] [Accepted: 12/19/2008] [Indexed: 11/15/2022] Open

Abstract

Background

While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets across 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase.

Results

The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with unusually many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs posed the greatest difficulty for gene-finders.

Conclusion

This experiment establishes a baseline of gene prediction accuracy in Caenorhabditis genomes, and has guided the choice of gene-finders for the annotation of newly sequenced genomes of Caenorhabditis and other nematode species. We have created new gene sets for C. briggsae, C. remanei, C. brenneri, C. japonica, and Brugia malayi using some of the best-performing gene-finders.

Chen Z, Harb OS, Roos DS. In silico identification of specialized secretory-organelle proteins in apicomplexan parasites and in vivo validation in Toxoplasma gondii. PLoS One 2008;3:e3611. [PMID: 18974850 PMCID: PMC2575384 DOI: 10.1371/journal.pone.0003611] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 06/06/2008] [Accepted: 10/06/2008] [Indexed: 12/04/2022] Open

Liu Q, Crammer K, Pereira FCN, Roos DS. Reranking candidate gene models with cross-species comparison for improved gene prediction. BMC Bioinformatics 2008;9:433. [PMID: 18854050 PMCID: PMC2587481 DOI: 10.1186/1471-2105-9-433] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 06/18/2008] [Accepted: 10/14/2008] [Indexed: 11/10/2022] Open