Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Galperin MY, Kolker E. New metrics for comparative genomics. Curr Opin Biotechnol 2006;17:440-7. [PMID: 16978854 PMCID: PMC1764326 DOI: 10.1016/j.copbio.2006.08.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 07/14/2006] [Revised: 08/10/2006] [Accepted: 08/25/2006] [Indexed: 10/24/2022]

Cited by Other Article(s)

Galperin MY, Kristensen DM, Makarova KS, Wolf YI, Koonin EV. Microbial genome analysis: the COG approach. Brief Bioinform 2020;20:1063-1070. [PMID: 28968633 DOI: 10.1093/bib/bbx117] [Citation(s) in RCA: 144] [Impact Index Per Article: 36.0] [Received: 05/30/2017] [Revised: 08/01/2017] [Indexed: 11/15/2022] Open

Patnaik BB, Chung JM, Hwang HJ, Sang MK, Park JE, Min HR, Cho HC, Dewangan N, Baliarsingh S, Kang SW, Park SY, Jo YH, Park HS, Kim WJ, Han YS, Lee JS, Lee YS. Transcriptome analysis of air-breathing land slug, Incilaria fruhstorferi reveals functional insights into growth, immunity, and reproduction. BMC Genomics 2019;20:154. [PMID: 30808280 PMCID: PMC6390351 DOI: 10.1186/s12864-019-5526-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/07/2018] [Accepted: 02/11/2019] [Indexed: 01/27/2023] Open

Abstract

Background

Incilaria (= Meghimatium) fruhstorferi is an air-breathing land slug found in restricted habitats of Japan, Taiwan and selected provinces of South Korea (Jeju, Chuncheon, Busan, and Deokjeokdo). The species is on a decline due to depletion of forest cover, predation by natural enemies, and collection. To facilitate the conservation of the species, it is important to decide on a number of traits related to growth, immunity and reproduction addressing fitness advantage of the species.

Results

The visceral mass transcriptome of I. fruhstorferi was enabled using the Illumina HiSeq 4000 sequencing platform. According to BUSCO (Benchmarking Universal Single-Copy Orthologs) method, the transcriptome was considered complete with 91.8% of ortholog genes present (Single: 70.7%; Duplicated: 21.1%). A total of 96.79% of the raw read sequences were processed as clean reads. TransDecoder identified 197,271 contigs that contained candidate-coding regions. Of a total of 50,230 unigenes, 34,470 (68.62% of the total unigenes) annotated to homologous proteins in the Protostome database (PANM-DB). The GO term and KEGG pathway analysis indicated genes involved in metabolism, phosphatidylinositol signalling system, aminobenzoate degradation, and T-cell receptor signalling pathway. Many genes associated with molluscan innate immunity were categorized under pathogen recognition receptor, TLR signalling pathway, MyD88 dependent pathway, endogenous ligands, immune effectors, antimicrobial peptides, apoptosis, and adaptation-related. The reproduction-associated unigenes showed homology to protein fem-1, spermatogenesis-associated protein, sperm associated antigen, and testis expressed sequences, among others. In addition, we identified key growth-related genes categorized under somatotrophic axis, muscle growth, chitinases and collagens. A total of 4822 Simple Sequence Repeats (SSRs) were also identified from the unigene sequences of I. fruhstorferi.

Conclusions

This is the first available genomic information for non-model land slug, I. fruhstorferi focusing on genes related to growth, immunity, and reproduction, with additional focus on microsatellites and repeating elements. The transcriptome provides access to greater number of traits of unknown relevance in the species that could be exploited for in-depth analyses of evolutionary plasticity and making informed choices during conservation planning. This would be appropriate for understanding the dynamics of the species on a priority basis considering the ecological, health, and social benefits.

Electronic supplementary material

The online version of this article (10.1186/s12864-019-5526-3) contains supplementary material, which is available to authorized users.
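
The headline percentages in the Results above can be cross-checked from the counts quoted in the abstract. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative and not taken from the paper, and the inputs are simply the figures copied from the abstract.

```python
# Figures quoted in the abstract above (variable names are illustrative).
total_unigenes = 50_230
annotated_unigenes = 34_470
busco_single_pct = 70.7
busco_duplicated_pct = 21.1

# Share of unigenes annotated against PANM-DB, in percent.
annotated_pct = round(100 * annotated_unigenes / total_unigenes, 2)

# The reported BUSCO completeness is the sum of the single-copy
# and duplicated ortholog percentages.
busco_complete_pct = round(busco_single_pct + busco_duplicated_pct, 1)

print(annotated_pct, busco_complete_pct)
```

Both values agree with the percentages stated in the abstract (68.62% annotated unigenes and 91.8% BUSCO completeness).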

Affiliation(s)

Bharat Bhusan Patnaik School of Biotech Sciences, Trident Academy of Creative Technology (TACT), F2-B, Chandaka Industrial Estate, Chandrasekharpur, Bhubaneswar, Odisha, 751024, India
Jong Min Chung Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Hee Ju Hwang Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Min Kyu Sang Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Jie Eun Park Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Hye Rin Min Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Hang Chul Cho Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Neha Dewangan School of Biotech Sciences, Trident Academy of Creative Technology (TACT), F2-B, Chandaka Industrial Estate, Chandrasekharpur, Bhubaneswar, Odisha, 751024, India
Snigdha Baliarsingh School of Biotech Sciences, Trident Academy of Creative Technology (TACT), F2-B, Chandaka Industrial Estate, Chandrasekharpur, Bhubaneswar, Odisha, 751024, India
Se Won Kang Biological Resource Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), 181, Ipsin-gil, Jungeup-si, Jeollabuk-do, 56212, South Korea
So Young Park Nakdonggang National Institute of Biological Resources, Biodiversity Conservation and Change Research Division, 137, Donam-2-gil, Sangju-si, Gyeongsangbuk-do, 37242, South Korea
Yong Hun Jo College of Agriculture and Life Science, Chonnam National University, 77 Yongbong-ro, Buk-gu, Gwangju, 61186, South Korea
Hong Seog Park Research Institute, GnC BIO Co., LTD, 621-6 Banseok-dong, Yuseong-gu, Daejeon, 34069, Republic of Korea
Wan Jong Kim Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Yeon Soo Han College of Agriculture and Life Science, Chonnam National University, 77 Yongbong-ro, Buk-gu, Gwangju, 61186, South Korea
Jun Sang Lee Institute of Basic Science, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea
Yong Seok Lee Department of Life Science and Biotechnology, College of Natural Sciences, Soonchunhyang University, 22 Soonchunhyangro, Shinchang-myeon, Asan, Chungchungnam-do, 31538, South Korea.

Tatarinova TV, Lysnyansky I, Nikolsky YV, Bolshoy A. The mysterious orphans of Mycoplasmataceae. Biol Direct 2016;11:2. [PMID: 26747447 PMCID: PMC4706650 DOI: 10.1186/s13062-015-0104-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/23/2015] [Accepted: 12/30/2015] [Indexed: 01/08/2023] Open

Abstract

Background

The length of a protein sequence is largely determined by its function. In certain species, it may be also affected by additional factors, such as growth temperature or acidity. In 2002, it was shown that in the bacterium Escherichia coli and in the archaeon Archaeoglobus fulgidus, protein sequences with no homologs were, on average, shorter than those with homologs (BMC Evol Biol 2:20, 2002). It is now generally accepted that in bacterial and archaeal genomes the distributions of protein length are different between sequences with and without homologs. In this study, we examine this postulate by conducting a comprehensive analysis of all annotated prokaryotic genomes and by focusing on certain exceptions.

Results

We compared the distribution of lengths of “having homologs proteins” (HHPs) and “non-having homologs proteins” (orphans or ORFans) in all currently completely sequenced and COG-annotated prokaryotic genomes. As expected, the HHPs and ORFans have strikingly different length distributions in almost all genomes. As previously established, the HHPs, indeed are, on average, longer than the ORFans, and the length distributions for the ORFans have a relatively narrow peak, in contrast to the HHPs, whose lengths spread over a wider range of values. However, about thirty genomes do not obey these rules. Practically all genomes of Mycoplasma and Ureaplasma have atypical ORFans distributions, with the mean lengths of ORFan larger than the mean lengths of HHPs. These genera constitute over 80 % of atypical genomes.

Conclusions

We confirmed on a ubiquitous set of genomes that the previous observation of HHPs and ORFans have different gene length distributions. We also showed that Mycoplasmataceae genomes have very distinctive distributions of ORFans lengths. We offer several possible biological explanations of this phenomenon, such as an adaptation to Mycoplasmataceae’s ecological niche, specifically its “quiet” co-existence with host organisms, resulting in long ABC transporters.

Electronic supplementary material

The online version of this article (doi:10.1186/s13062-015-0104-3) contains supplementary material, which is available to authorized users.
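
The comparison described in the Results above (mean ORFan length versus mean HHP length per genome) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `is_atypical`, the genome labels, and all protein lengths below are invented for the example and are not data or code from the paper, which analysed full length distributions rather than means alone.

```python
from statistics import mean


def is_atypical(hhp_lengths, orfan_lengths):
    """Return True when ORFans are, on average, longer than HHPs."""
    return mean(orfan_lengths) > mean(hhp_lengths)


# Toy, invented protein lengths in amino acids; NOT data from the paper.
genomes = {
    "typical_genome": {"hhp": [310, 420, 275, 510], "orfan": [90, 120, 75, 140]},
    "atypical_genome": {"hhp": [300, 280, 350], "orfan": [410, 390, 520]},
}

# Collect the genomes that break the usual "HHPs are longer" pattern.
atypical = [
    name for name, g in genomes.items()
    if is_atypical(g["hhp"], g["orfan"])
]
print(atypical)
```

In this toy input only the second genome is flagged, mirroring the pattern the abstract reports for Mycoplasma and Ureaplasma genomes.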

Higdon R, Earl RK, Stanberry L, Hudac CM, Montague E, Stewart E, Janko I, Choiniere J, Broomall W, Kolker N, Bernier RA, Kolker E. The promise of multi-omics and clinical data integration to identify and target personalized healthcare approaches in autism spectrum disorders. OMICS 2016;19:197-208. [PMID: 25831060 DOI: 10.1089/omi.2015.0020] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Indexed: 12/19/2022]

Haft DH. Using comparative genomics to drive new discoveries in microbiology. Curr Opin Microbiol 2015;23:189-96. [PMID: 25617609 DOI: 10.1016/j.mib.2014.11.017] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/27/2014] [Revised: 11/19/2014] [Accepted: 11/20/2014] [Indexed: 01/17/2023]

Galperin MY, Makarova KS, Wolf YI, Koonin EV. Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res 2014;43:D261-9. [PMID: 25428365 DOI: 10.1093/nar/gku1223] [Citation(s) in RCA: 987] [Impact Index Per Article: 98.7] [Indexed: 02/03/2023] Open

Abstract

Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. The new version of the COGs is expected to become an important tool for microbial genomics.

Kucharova V, Wiker HG. Proteogenomics in microbiology: taking the right turn at the junction of genomics and proteomics. Proteomics 2014;14:2660-75. [PMID: 25263021 DOI: 10.1002/pmic.201400168] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 04/28/2014] [Revised: 08/18/2014] [Accepted: 09/23/2014] [Indexed: 12/14/2022]

Stanberry L, Rekepalli B, Liu Y, Giblock P, Higdon R, Montague E, Broomall W, Kolker N, Kolker E. Optimizing high performance computing workflow for protein functional annotation. Concurr Comput 2014;26:2112-2121. [PMID: 25313296 PMCID: PMC4194055 DOI: 10.1002/cpe.3264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/04/2023]

Römling U, Kjelleberg S, Normark S, Nyman L, Uhlin BE, Åkerlund B. Microbial biofilm formation: a need to act. J Intern Med 2014;276:98-110. [PMID: 24796496 DOI: 10.1111/joim.12242] [Citation(s) in RCA: 102] [Impact Index Per Article: 10.2] [Indexed: 02/06/2023]

Higdon R, Haynes W, Stanberry L, Stewart E, Yandl G, Howard C, Broomall W, Kolker N, Kolker E. Unraveling the Complexities of Life Sciences Data. Big Data 2013;1:42-50. [PMID: 27447037 DOI: 10.1089/big.2012.1505] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 06/06/2023]

Affiliation(s)

Roger Higdon 1 Bioinformatics and High-throughput Analysis Laboratory, Seattle Children's Research Institute, Seattle, Washington; 2 High-throughput Analysis Core, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington; 3 Predictive Analytics, Seattle Children's, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
Winston Haynes 1 Bioinformatics and High-throughput Analysis Laboratory, Seattle Children's Research Institute, Seattle, Washington; 2 High-throughput Analysis Core, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington; 3 Predictive Analytics, Seattle Children's, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
Larissa Stanberry 1 Bioinformatics and High-throughput Analysis Laboratory, Seattle Children's Research Institute, Seattle, Washington; 2 High-throughput Analysis Core, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington; 3 Predictive Analytics, Seattle Children's, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
Elizabeth Stewart 1 Bioinformatics and High-throughput Analysis Laboratory, Seattle Children's Research Institute, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
Gregory Yandl 1 Bioinformatics and High-throughput Analysis Laboratory, Seattle Children's Research Institute, Seattle, Washington; 2 High-throughput Analysis Core, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
Chris Howard 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington; 5 Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington
William Broomall 2 High-throughput Analysis Core, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington; 3 Predictive Analytics, Seattle Children's, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
Natali Kolker 2 High-throughput Analysis Core, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington; 3 Predictive Analytics, Seattle Children's, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington
Eugene Kolker 1 Bioinformatics and High-throughput Analysis Laboratory, Seattle Children's Research Institute, Seattle, Washington; 2 High-throughput Analysis Core, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, Washington; 3 Predictive Analytics, Seattle Children's, Seattle, Washington; 4 Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, Washington; 6 Departments of Biomedical Informatics & Medical Education and Pediatrics, University of Washington, Seattle, Washington

Ponomarenko E, Poverennaya E, Pyatnitskiy M, Lisitsa A, Moshkovskii S, Ilgisonis E, Chernobrovkin A, Archakov A. Comparative ranking of human chromosomes based on post-genomic data. OMICS 2012;16:604-11. [PMID: 22966780 DOI: 10.1089/omi.2012.0034] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/25/2022]

Thuillard M, Moulton V. Identifying and reconstructing lateral transfers from distance matrices by combining the minimum contradiction method and Neighbor-Net. J Bioinform Comput Biol 2011;9:453-70. [DOI: 10.1142/s0219720011005409] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 09/06/2010] [Revised: 02/01/2011] [Accepted: 02/13/2011] [Indexed: 11/18/2022]

The genetic organisation of prokaryotic two-component system signalling pathways. BMC Genomics 2010;11:720. [PMID: 21172000 PMCID: PMC3018481 DOI: 10.1186/1471-2164-11-720] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 07/14/2010] [Accepted: 12/20/2010] [Indexed: 11/16/2022] Open

Comparative genome biology of a serogroup B carriage and disease strain supports a polygenic nature of meningococcal virulence. J Bacteriol 2010;192:5363-77. [PMID: 20709895 DOI: 10.1128/jb.00883-10] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023] Open

Galperin MY, Koonin EV. From complete genome sequence to 'complete' understanding? Trends Biotechnol 2010;28:398-406. [PMID: 20647113 PMCID: PMC3065831 DOI: 10.1016/j.tibtech.2010.05.006] [Citation(s) in RCA: 119] [Impact Index Per Article: 8.5] [Received: 03/29/2010] [Revised: 05/18/2010] [Accepted: 05/28/2010] [Indexed: 12/29/2022]

Mahadevan P, Seto D. Rapid pair-wise synteny analysis of large bacterial genomes using web-based GeneOrder4.0. BMC Res Notes 2010;3:41. [PMID: 20178631 PMCID: PMC2844394 DOI: 10.1186/1756-0500-3-41] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 01/01/2010] [Accepted: 02/23/2010] [Indexed: 11/30/2022] Open

Galperin MY, Higdon R, Kolker E. Interplay of heritage and habitat in the distribution of bacterial signal transduction systems. Mol Biosyst 2010;6:721-8. [PMID: 20237650 DOI: 10.1039/b908047c] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Indexed: 11/21/2022]

Louie B, Higdon R, Kolker E. A statistical model of protein sequence similarity and function similarity reveals overly-specific function predictions. PLoS One 2009;4:e7546. [PMID: 19844580 PMCID: PMC2760442 DOI: 10.1371/journal.pone.0007546] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Received: 09/01/2009] [Accepted: 09/13/2009] [Indexed: 12/02/2022] Open

Abstract

Background

Predicting protein function from primary sequence is an important open problem in modern biology. Not only are there many thousands of proteins of unknown function, current approaches for predicting function must be improved upon. One problem in particular is overly-specific function predictions which we address here with a new statistical model of the relationship between protein sequence similarity and protein function similarity.

Methodology

Our statistical model is based on sets of proteins with experimentally validated functions and numeric measures of function specificity and function similarity derived from the Gene Ontology. The model predicts the similarity of function between two proteins given their amino acid sequence similarity measured by statistics from the BLAST sequence alignment algorithm. A novel aspect of our model is that it predicts the degree of function similarity shared between two proteins over a continuous range of sequence similarity, facilitating prediction of function with an appropriate level of specificity.

Significance

Our model shows nearly exact function similarity for proteins with high sequence similarity (bit score >244.7, e-value >1e⁻⁶², non-redundant NCBI protein database (NRDB)) and only small likelihood of specific function match for proteins with low sequence similarity (bit score <54.6, e-value <1e⁻⁰⁵, NRDB). For sequence similarity ranges in between our annotation model shows an increasing relationship between function similarity and sequence similarity, but with considerable variability. We applied the model to a large set of proteins of unknown function, and predicted functions for thousands of these proteins ranging from general to very specific. We also applied the model to a data set of proteins with previously assigned, specific functions that were electronically based. We show that, on average, these prior function predictions are more specific (quite possibly overly-specific) compared to predictions from our model that is based on proteins with experimentally determined function.
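
The two bit-score cut-offs quoted above split sequence similarity into three regimes. The sketch below shows only that partition and is not the authors' statistical model: the constant names, the function name `similarity_regime`, and the regime labels are assumptions made for illustration, while the two thresholds are the values stated in the abstract.

```python
# Bit-score thresholds as quoted in the abstract (NRDB searches).
HIGH_BIT_SCORE = 244.7  # above this: nearly exact function similarity
LOW_BIT_SCORE = 54.6    # below this: small likelihood of a specific match


def similarity_regime(bit_score: float) -> str:
    """Map a BLAST bit score to one of three illustrative regimes."""
    if bit_score > HIGH_BIT_SCORE:
        return "high"
    if bit_score < LOW_BIT_SCORE:
        return "low"
    # In between, the abstract reports an increasing but highly
    # variable relationship between sequence and function similarity.
    return "intermediate"


for score in (300.0, 120.0, 40.0):
    print(score, similarity_regime(score))
```

A hit falling in the intermediate regime would, following the abstract, call for a more general function prediction than a hit in the high regime.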

Meinicke P. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences. BMC Genomics 2009;10:409. [PMID: 19725959 PMCID: PMC2744726 DOI: 10.1186/1471-2164-10-409] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 04/27/2009] [Accepted: 09/02/2009] [Indexed: 11/10/2022] Open

Kaddi C, Quo CF, Wang MD. Quantitative metrics for bio-modeling algorithm selection. Annu Int Conf IEEE Eng Med Biol Soc 2009;2008:4613-6. [PMID: 19163744 DOI: 10.1109/iembs.2008.4650241] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]

Genomes and knowledge - a questionable relationship? Trends Microbiol 2008;16:512-9. [PMID: 18819801 DOI: 10.1016/j.tim.2008.08.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 07/04/2008] [Revised: 08/15/2008] [Accepted: 08/21/2008] [Indexed: 11/22/2022]

Thuillard M. Minimum contradiction matrices in whole genome phylogenies. Evol Bioinform Online 2008;4:237-47. [PMID: 19204821 PMCID: PMC2614196 DOI: 10.4137/ebo.s909] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022] Open

Jones JT, Moens M, Mota M, Li H, Kikuchi T. Bursaphelenchus xylophilus: opportunities in comparative genomics and molecular host-parasite interactions. Mol Plant Pathol 2008;9:357-68. [PMID: 18705876 PMCID: PMC6640334 DOI: 10.1111/j.1364-3703.2007.00461.x] [Citation(s) in RCA: 93] [Impact Index Per Article: 5.8] [Indexed: 05/02/2023]

Wilson GA, Feil EJ, Lilley AK, Field D. Large-scale comparative genomic ranking of taxonomically restricted genes (TRGs) in bacterial and archaeal genomes. PLoS One 2007;2:e324. [PMID: 17389915 PMCID: PMC1824705 DOI: 10.1371/journal.pone.0000324] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 01/26/2007] [Accepted: 02/18/2007] [Indexed: 11/19/2022] Open