Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Mulder N, Apweiler R. InterPro and InterProScan: tools for protein sequence classification and comparison. Methods Mol Biol 2007;396:59-70. [PMID: 18025686 DOI: 10.1007/978-1-59745-515-2_5] [Citation(s) in RCA: 292] [Impact Index Per Article: 17.2] [Indexed: 12/03/2022]

Number

Cited by Other Article(s)

251

Kislyuk AO, Katz LS, Agrawal S, Hagen MS, Conley AB, Jayaraman P, Nelakuditi V, Humphrey JC, Sammons SA, Govil D, Mair RD, Tatti KM, Tondella ML, Harcourt BH, Mayer LW, Jordan IK. A computational genomics pipeline for prokaryotic sequencing projects. Bioinformatics 2010;26:1819-26. [PMID: 20519285 PMCID: PMC2905547 DOI: 10.1093/bioinformatics/btq284] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 11/12/2022]

252

Hawkins T, Chitale M, Kihara D. Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP. BMC Bioinformatics 2010;11:265. [PMID: 20482861 PMCID: PMC2882935 DOI: 10.1186/1471-2105-11-265] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 10/15/2009] [Accepted: 05/19/2010] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

A new paradigm of biological investigation takes advantage of technologies that produce large high throughput datasets, including genome sequences, interactions of proteins, and gene expression. The ability of biologists to analyze and interpret such data relies on functional annotation of the included proteins, but even in highly characterized organisms many proteins can lack the functional evidence necessary to infer their biological relevance.

RESULTS

Here we have applied high confidence function predictions from our automated prediction system, PFP, to three genome sequences, Escherichia coli, Saccharomyces cerevisiae, and Plasmodium falciparum (malaria). The number of annotated genes is increased by PFP to over 90% for all of the genomes. Using the large coverage of the function annotation, we introduced functional similarity networks, which represent the functional space of the proteomes. Four different functional similarity networks are constructed for each proteome, one each by considering similarity in a single Gene Ontology (GO) category, i.e. Biological Process, Cellular Component, and Molecular Function, and another one by considering overall similarity with the funSim score. The functional similarity networks are shown to have higher modularity than the protein-protein interaction network. Moreover, the funSim score network is distinct from the single GO-score networks by showing a higher clustering degree exponent value and thus has a higher tendency to be hierarchical. In addition, examining function assignments to the protein-protein interaction network and local regions of genomes has identified numerous cases where subnetworks or local regions have functionally coherent proteins. These results will help in interpreting interactions of proteins and gene orders in a genome. Several examples of both analyses are highlighted.

CONCLUSION

The analyses demonstrate that applying high confidence predictions from PFP can have a significant impact on a researcher's ability to interpret the immense biological data that are being generated today. The newly introduced functional similarity networks of the three organisms show different network properties as compared with the protein-protein interaction networks.

253

Jung J, Yi G, Sukno SA, Thon MR. PoGO: Prediction of Gene Ontology terms for fungal proteins. BMC Bioinformatics 2010;11:215. [PMID: 20429880 PMCID: PMC2882390 DOI: 10.1186/1471-2105-11-215] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 01/20/2010] [Accepted: 04/29/2010] [Indexed: 11/10/2022] Open

254

Jiang SY, Ma Z, Ramachandran S. Evolutionary history and stress regulation of the lectin superfamily in higher plants. BMC Evol Biol 2010;10:79. [PMID: 20236552 PMCID: PMC2846932 DOI: 10.1186/1471-2148-10-79] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Received: 08/27/2009] [Accepted: 03/18/2010] [Indexed: 02/02/2023] Open

255

Li J, Hosseini Moghaddam SH, Chen X, Chen M, Zhong B. Shotgun strategy-based proteome profiling analysis on the head of silkworm Bombyx mori. Amino Acids 2010;39:751-61. [PMID: 20198493 DOI: 10.1007/s00726-010-0517-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 08/13/2009] [Accepted: 02/05/2010] [Indexed: 01/09/2023]

256

Lokanathan Y, Mohd-Adnan A, Wan KL, Nathan S. Transcriptome analysis of the Cryptocaryon irritans tomont stage identifies potential genes for the detection and control of cryptocaryonosis. BMC Genomics 2010;11:76. [PMID: 20113487 PMCID: PMC2828411 DOI: 10.1186/1471-2164-11-76] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 07/01/2009] [Accepted: 01/29/2010] [Indexed: 01/26/2023] Open

Abstract

Background

Cryptocaryon irritans is a parasitic ciliate that causes cryptocaryonosis (white spot disease) in marine fish. Diagnosis of cryptocaryonosis often depends on the appearance of white spots on the surface of the fish, which are usually visible only during later stages of the disease. Identifying suitable biomarkers of this parasite would aid the development of diagnostic tools and control strategies for C. irritans. The C. irritans genome is virtually unexplored; therefore, we generated and analyzed expressed sequence tags (ESTs) of the parasite to identify genes that encode surface proteins, excretory/secretory proteins and repeat-containing proteins.

Results

ESTs were generated from a cDNA library of C. irritans tomonts isolated from infected Asian sea bass, Lates calcarifer. Clustering of the 5356 ESTs produced 2659 unique transcripts (UTs) containing 1989 singletons and 670 consensi. BLAST analysis showed that 74% of the UTs had significant similarity (E-value < 10^-5) to sequences that are currently available in the GenBank database, with more than 15% of the significant hits showing unknown function. Forty percent of the UTs had significant similarity to ciliates from the genera Tetrahymena and Paramecium. Comparative gene family analysis with related taxa showed that many protein families are conserved among the protozoans. Based on gene ontology annotation, functional groups were successfully assigned to 790 UTs. Genes encoding excretory/secretory proteins and membrane and membrane-associated proteins were identified because these proteins often function as antigens and are good antibody targets. A total of 481 UTs were classified as encoding membrane proteins, 54 were classified as encoding for membrane-bound proteins, and 155 were found to contain excretory/secretory protein-coding sequences. Amino acid repeat-containing proteins and GPI-anchored proteins were also identified as potential candidates for the development of diagnostic and control strategies for C. irritans.

Conclusions

We successfully discovered and examined a large portion of the previously unexplored C. irritans transcriptome and identified potential genes for the development and validation of diagnostic and control strategies for cryptocaryonosis.

257

Gong J, Wei T, Zhang N, Jamitzky F, Heckl WM, Rössle SC, Stark RW. TollML: a database of toll-like receptor structural motifs. J Mol Model 2010;16:1283-9. [PMID: 20084417 DOI: 10.1007/s00894-009-0640-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 10/18/2009] [Accepted: 11/19/2009] [Indexed: 02/06/2023]

258

Rebhan M. Protein sequence databases. Methods Mol Biol 2010;609:45-57. [PMID: 20221912 DOI: 10.1007/978-1-60327-241-4_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/28/2023]

259

Fang Y, Xie K, Hou X, Hu H, Xiong L. Systematic analysis of GT factor family of rice reveals a novel subfamily involved in stress responses. Mol Genet Genomics 2009;283:157-69. [PMID: 20039179 DOI: 10.1007/s00438-009-0507-x] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Received: 06/08/2009] [Accepted: 12/11/2009] [Indexed: 01/25/2023]

260

Naamati G, Fromer M, Linial M. Expansion of tandem repeats in sea anemone Nematostella vectensis proteome: A source for gene novelty? BMC Genomics 2009;10:593. [PMID: 20003297 PMCID: PMC2805694 DOI: 10.1186/1471-2164-10-593] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/15/2009] [Accepted: 12/10/2009] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The complete proteome of the starlet sea anemone, Nematostella vectensis, provides insights into gene invention dating back to the Cnidarian-Bilaterian ancestor. With the addition of the complete proteomes of Hydra magnipapillata and Monosiga brevicollis, the investigation of proteins having unique features in early metazoan life has become practical. We focused on the properties and the evolutionary trends of tandem repeat (TR) sequences in Cnidaria proteomes.

RESULTS

We found that 11-16% of N. vectensis proteins contain tandem repeats. Most TRs cover 150 amino acid segments that are comprised of basic units of 5-20 amino acids. In total, the N. vectensis proteome has about 3300 unique TR-units, but only a small fraction of them are shared with H. magnipapillata, M. brevicollis, or mammalian proteomes. The overall abundance of these TRs stands out relative to that of 14 proteomes representing the diversity among eukaryotes and within the metazoan world. TR-units are characterized by a unique composition of amino acids, with cysteine and histidine being over-represented. Structurally, most TR-segments are associated with coiled and disordered regions. Interestingly, 80% of the TR-segments can be read in more than one open reading frame. For over 100 of them, translation of the alternative frames would result in long proteins. Most domain families that are characterized as repeats in eukaryotes are found in the TR-proteomes from Nematostella and Hydra.

CONCLUSIONS

While most TR-proteins have originated from prediction tools and are still awaiting experimental validations, supportive evidence exists for hundreds of TR-units in Nematostella. The existence of TR-proteins in early metazoan life may have served as a robust mode for novel genes with previously overlooked structural and functional characteristics.

261

Zeng J, Alhajj R, Demetrick DJ. Representative transcript sets for evaluating a translational initiation sites predictor. BMC Bioinformatics 2009;10:206. [PMID: 19573244 PMCID: PMC2712473 DOI: 10.1186/1471-2105-10-206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2009] [Accepted: 07/02/2009] [Indexed: 11/10/2022] Open

Abstract

Background

Translational initiation site (TIS) prediction is a very important and actively studied topic in bioinformatics. In order to complete a comparative analysis, it is desirable to have several benchmark data sets which can be used to test the effectiveness of different algorithms. An ideal benchmark data set should be reliable, representative and readily available. Preferably, proteins encoded by members of the data set should also be representative of the protein population actually expressed in cellular specimens.

Results

In this paper, we report a general algorithm for constructing a reliable sequence collection that only includes mRNA sequences whose corresponding protein products present an average profile of the general protein population of a given organism, with respect to three major structural parameters. Four representative transcript collections, each derived from a model organism, have been obtained following the algorithm we propose. Evaluation of these data sets shows that they are reasonable representations of the spectrum of proteins obtained from cellular proteomic studies. Six state-of-the-art predictors have been used to test the usefulness of the construction algorithm that we proposed. Comparative study which reports the predictors' performance on our data set as well as three other existing benchmark collections has demonstrated the actual merits of our data sets as benchmark testing collections.

Conclusion

The proposed data set construction algorithm has demonstrated its property of being a general and widely applicable scheme. Our comparison with published proteomic studies has shown that the expression of our data set of transcripts generates a polypeptide population that is representative of that obtained from evaluation of biological specimens. Our data set thus represents "real world" transcripts that will allow more accurate evaluation of algorithms dedicated to identification of TISs, as well as other translational regulatory motifs within mRNA sequences. The algorithm proposed by us aims at compiling a redundancy-free data set by removing redundant copies of homologous proteins. The existence of such data sets may be useful for conducting statistical analyses of protein sequence-structure relations. At the current stage, our approach's focus is to obtain an "average" protein data set for any particular organism without posing much selection bias. However, with the three major protein structural parameters deeply integrated into the scheme, it would be a trivial task to extend the current method for obtaining a more selective protein data set, which may facilitate the study of some particular protein structure.

262

Clifford M, Twigg J, Upton C. Evidence for a novel gene associated with human influenza A viruses. Virol J 2009;6:198. [PMID: 19917120 PMCID: PMC2780412 DOI: 10.1186/1743-422x-6-198] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 10/15/2009] [Accepted: 11/16/2009] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Influenza A virus genomes are comprised of 8 negative strand single-stranded RNA segments and are thought to encode 11 proteins, which are all translated from mRNAs complementary to the genomic strands. Although human, swine and avian influenza A viruses are very similar, cross-species infections are usually limited. However, antigenic differences are considerable, and when viruses become established in a different host or if novel viruses are created by re-assortment, devastating pandemics may arise.

RESULTS

Examination of influenza A virus genomes from the early 20th Century revealed the association of a 167 codon ORF encoded by the genomic strand of segment 8 with human isolates. Close to the timing of the 1948 pseudopandemic, a mutation occurred that resulted in the extension of this ORF to 216 codons. Since 1948, this ORF has been almost totally maintained in human influenza A viruses, suggesting a selectable biological function. The discovery of cytotoxic T cells responding to an epitope encoded by this ORF suggests that it is translated into protein. Evidence of several other non-traditionally translated polypeptides in influenza A virus supports the translation of this genomic strand ORF. The gene product is predicted to have a signal sequence and two transmembrane domains.

CONCLUSION

We hypothesize that the genomic strand of segment 8 encodes a novel influenza A virus protein. The persistence and conservation of this genomic strand ORF for almost a century in human influenza A viruses provides strong evidence that it is translated into a polypeptide that enhances viral fitness in the human host. This has important consequences for the interpretation of experiments that utilize mutations in the NS1 and NEP genes of segment 8 and also for the consideration of events that may alter the spread and/or pathogenesis of swine and avian influenza A viruses in the human population.

263

Li D, Su Z, Dong J, Wang T. An expression database for roots of the model legume Medicago truncatula under salt stress. BMC Genomics 2009;10:517. [PMID: 19906315 PMCID: PMC2779821 DOI: 10.1186/1471-2164-10-517] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.9] [Received: 07/05/2009] [Accepted: 11/11/2009] [Indexed: 12/29/2022] Open

264

Souza CS, Oliveira BM, Costa GGL, Schriefer A, Selbach-Schnadelbach A, Uetanabaro APT, Pirovani CP, Pereira GAG, Taranto AG, Cascardo JCDM, Góes-Neto A. Identification and characterization of a class III chitin synthase gene of Moniliophthora perniciosa, the fungus that causes witches' broom disease of cacao. J Microbiol 2009;47:431-40. [PMID: 19763417 DOI: 10.1007/s12275-008-0166-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/02/2008] [Accepted: 04/01/2009] [Indexed: 11/30/2022]

265

Savas S, Geraci J, Jurisica I, Liu G. A comprehensive catalogue of functional genetic variations in the EGFR pathway: protein-protein interaction analysis reveals novel genes and polymorphisms important for cancer research. Int J Cancer 2009;125:1257-65. [PMID: 19499547 DOI: 10.1002/ijc.24535] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022]

266

Jiang SY, Christoffels A, Ramamoorthy R, Ramachandran S. Expansion mechanisms and functional annotations of hypothetical genes in the rice genome. Plant Physiol 2009;150:1997-2008. [PMID: 19535473 PMCID: PMC2719134 DOI: 10.1104/pp.109.139402] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 04/02/2009] [Accepted: 06/15/2009] [Indexed: 05/18/2023]

Abstract

In each completely sequenced genome, 30% to 50% of genes are annotated as uncharacterized hypothetical genes. In the rice (Oryza sativa) genome, 10,918 hypothetical genes were annotated in the latest version (release 6) of the Michigan State University rice genome annotation. We have implemented an integrative approach to analyze their duplication/expansion and function. The analyses show that tandem/segmental duplication and transposition/retrotransposition have significantly contributed to the expansion of hypothetical genes despite their different contribution rates. A total of 3,769 hypothetical genes have been detected from retrogene, tandem, segmental, Pack-MULE, or long terminated direct repeat-related duplication/expansion. The nonsynonymous substitutions per site and synonymous substitutions per site analyses showed that 21.65% of them were still functional, accounting for 7.47% of total hypothetical genes. Global expression analyses have identified 1,672 expressed hypothetical genes. Among them, 415 genes might function in a developmental stage-specific manner. Antisense strand expression and small RNA analyses have demonstrated that a high percentage of these hypothetical genes might play important roles in negatively regulating gene expression. Homologous searches against Arabidopsis (Arabidopsis thaliana), maize (Zea mays), sorghum (Sorghum bicolor), and indica rice genomes suggest that most of the hypothetical genes could be annotated from recently evolved genomic sequences. These data advance the understanding of rice hypothetical genes as being involved in lineage-specific expansion and that they function in a specific developmental stage. Our analyses also provide a valuable means to facilitate the characterization and functional annotation of hypothetical genes in other organisms.
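The two percentages quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that "21.65% of them" refers to the 3,769 duplication-derived hypothetical genes and that "total hypothetical genes" means the 10,918 annotated in release 6; the variable names are illustrative and not taken from the paper.

```python
# Figures quoted in the abstract (the pairing of the percentages with these
# counts is an assumption made for this check, not stated by the authors).
TOTAL_HYPOTHETICAL = 10_918   # hypothetical genes in release 6
DUPLICATION_DERIVED = 3_769   # genes traced to duplication/expansion events
FUNCTIONAL_FRACTION = 21.65   # percent of duplication-derived genes still functional

# Number of duplication-derived genes inferred to be functional.
functional_genes = round(DUPLICATION_DERIVED * FUNCTIONAL_FRACTION / 100)

# Their share of all hypothetical genes, in percent.
share_of_total = functional_genes / TOTAL_HYPOTHETICAL * 100

print(f"functional genes: {functional_genes}")
print(f"share of all hypothetical genes: {share_of_total:.2f}%")
```

Under those assumptions the second figure reproduces the 7.47% stated in the abstract, which suggests the two percentages describe the same set of genes against different denominators.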

267

Almeida CR, Stoco PH, Wagner G, Sincero TC, Rotava G, Bayer-Santos E, Rodrigues JB, Sperandio MM, Maia AA, Ojopi EP, Zaha A, Ferreira HB, Tyler KM, Dávila AM, Grisard EC, Dias-Neto E. Transcriptome analysis of Taenia solium cysticerci using Open Reading Frame ESTs (ORESTES). Parasit Vectors 2009;2:35. [PMID: 19646239 PMCID: PMC2731055 DOI: 10.1186/1756-3305-2-35] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 05/05/2009] [Accepted: 07/31/2009] [Indexed: 12/31/2022] Open

268

Shen YQ, Lang BF, Burger G. Diversity and dispersal of a ubiquitous protein family: acyl-CoA dehydrogenases. Nucleic Acids Res 2009;37:5619-31. [PMID: 19625492 PMCID: PMC2761260 DOI: 10.1093/nar/gkp566] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/25/2022] Open

269

Van Auken K, Jaffery J, Chan J, Müller HM, Sternberg PW. Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation. BMC Bioinformatics 2009;10:228. [PMID: 19622167 PMCID: PMC2719631 DOI: 10.1186/1471-2105-10-228] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 01/29/2009] [Accepted: 07/21/2009] [Indexed: 11/28/2022] Open

Abstract

Background

Manual curation of experimental data from the biomedical literature is an expensive and time-consuming endeavor. Nevertheless, most biological knowledge bases still rely heavily on manual curation for data extraction and entry. Text mining software that can semi- or fully automate information retrieval from the literature would thus provide a significant boost to manual curation efforts.

Results

We employ the Textpresso category-based information retrieval and extraction system, developed by WormBase, to explore how Textpresso might improve the efficiency with which we manually curate C. elegans proteins to the Gene Ontology's Cellular Component Ontology. Using a training set of sentences that describe results of localization experiments in the published literature, we generated three new curation task-specific categories (Cellular Components, Assay Terms, and Verbs) containing words and phrases associated with reports of experimentally determined subcellular localization. We compared the results of manual curation to those of Textpresso queries that searched the full text of articles for sentences containing terms from each of the three new categories plus the name of a previously uncurated C. elegans protein, and found that Textpresso searches identified curatable papers with recall and precision rates of 79.1% and 61.8%, respectively (F-score of 69.5%), when compared to manual curation. Within those documents, Textpresso identified relevant sentences with recall and precision rates of 30.3% and 80.1% (F-score of 44.0%). From returned sentences, curators were able to make 66.2% of all possible experimentally supported GO Cellular Component annotations with 97.3% precision (F-score of 78.8%). Measuring the relative efficiencies of Textpresso-based versus manual curation, we find that Textpresso has the potential to increase curation efficiency by at least 8-fold, and perhaps as much as 15-fold, given differences in individual curatorial speed.
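The F-scores quoted above are consistent with the balanced F-measure, the harmonic mean of recall and precision. The abstract does not say which F-measure variant was used, so that choice is an assumption here, and the function name `f_score` is illustrative only. A minimal sketch:

```python
def f_score(recall: float, precision: float) -> float:
    """Balanced F-measure (harmonic mean of recall and precision).

    Both arguments are fractions in [0, 1]. Returns 0.0 when both are zero
    to avoid dividing by zero.
    """
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Recall/precision pairs quoted in the abstract, expressed as fractions.
reported = {
    "curatable papers": (0.791, 0.618),
    "relevant sentences": (0.303, 0.801),
    "GO annotations": (0.662, 0.973),
}

for task, (recall, precision) in reported.items():
    print(f"{task}: F = {f_score(recall, precision) * 100:.1f}%")
```

Recomputed from the rounded percentages, the sentence-level and annotation-level values match the reported 44.0% and 78.8%. The paper-level value comes out at about 69.4% against the reported 69.5%, a difference small enough to be explained by the authors computing from unrounded recall and precision.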

Conclusion

Textpresso is an effective tool for improving the efficiency of manual, experimentally based curation. Incorporating a Textpresso-based Cellular Component curation pipeline at WormBase has allowed us to transition from strictly manual curation of this data type to a more efficient pipeline of computer-assisted validation. Continued development of curation task-specific Textpresso categories will provide an invaluable resource for genomics databases that rely heavily on manual curation.

270

Molecular evolution and functional divergence of HAK potassium transporter gene family in rice (Oryza sativa L.). J Genet Genomics 2009;36:161-72. [PMID: 19302972 DOI: 10.1016/s1673-8527(08)60103-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 10/07/2008] [Revised: 12/02/2008] [Accepted: 12/10/2008] [Indexed: 11/22/2022]

271

Chitale M, Hawkins T, Park C, Kihara D. ESG: extended similarity group method for automated protein function prediction. Bioinformatics 2009;25:1739-45. [PMID: 19435743 DOI: 10.1093/bioinformatics/btp309] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.7] [Indexed: 11/14/2022]

272

Da Silva M, Upton C. Vaccinia virus G8R protein: a structural ortholog of proliferating cell nuclear antigen (PCNA). PLoS One 2009;4:e5479. [PMID: 19421403 PMCID: PMC2674943 DOI: 10.1371/journal.pone.0005479] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 01/28/2009] [Accepted: 04/15/2009] [Indexed: 11/30/2022] Open

273

Godin KS, Walbott H, Leulliot N, van Tilbeurgh H, Varani G. The box H/ACA snoRNP assembly factor Shq1p is a chaperone protein homologous to Hsp90 cochaperones that binds to the Cbf5p enzyme. J Mol Biol 2009;390:231-44. [PMID: 19426738 DOI: 10.1016/j.jmb.2009.04.076] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 12/18/2008] [Revised: 04/27/2009] [Accepted: 04/28/2009] [Indexed: 11/15/2022]

274

May P, Christian JO, Kempa S, Walther D. ChlamyCyc: an integrative systems biology database and web-portal for Chlamydomonas reinhardtii. BMC Genomics 2009;10:209. [PMID: 19409111 PMCID: PMC2688524 DOI: 10.1186/1471-2164-10-209] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Received: 01/12/2009] [Accepted: 05/04/2009] [Indexed: 01/10/2023] Open

275

Ooi HS, Kwo CY, Wildpaner M, Sirota FL, Eisenhaber B, Maurer-Stroh S, Wong WC, Schleiffer A, Eisenhaber F, Schneider G. ANNIE: integrated de novo protein sequence annotation. Nucleic Acids Res 2009;37:W435-40. [PMID: 19389726 PMCID: PMC2703921 DOI: 10.1093/nar/gkp254] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Indexed: 12/21/2022] Open

276

Price DRG, Bell HA, Hinchliffe G, Fitches E, Weaver R, Gatehouse JA. A venom metalloproteinase from the parasitic wasp Eulophus pennicornis is toxic towards its host, tomato moth (Lacanobia oleracae). Insect Mol Biol 2009;18:195-202. [PMID: 19320760 DOI: 10.1111/j.1365-2583.2009.00864.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 05/27/2023]

277

Hong Y, Chalkia D, Ko KD, Bhardwaj G, Chang GS, van Rossum DB, Patterson RL. Phylogenetic Profiles Reveal Structural and Functional Determinants of Lipid-binding. J Proteomics Bioinform 2009;2:139-149. [PMID: 19946567 DOI: 10.4172/jpb.1000071] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]

Abstract

One of the major challenges in the genomic era is annotating structure/function to the vast quantities of sequence information now available. Indeed, most of the protein sequence database lacks comprehensive annotation, even when experimental evidence exists. Further, within structurally resolved and functionally annotated protein domains, additional functionalities contained in these domains are not apparent. To add further complication, small changes in the amino-acid sequence can lead to profound changes in both structure and function, underscoring the need for rapid and reliable methods to analyze these types of data. Phylogenetic profiles provide a quantitative method that can relate the structural and functional properties of proteins, as well as their evolutionary relationships. Using all of the structurally resolved Src-Homology-2 (SH2) domains, we demonstrate that knowledge-bases can be used to create single-amino acid phylogenetic profiles which reliably annotate lipid-binding. Indeed, these measures isolate the known phosphotyrosine and hydrophobic pockets as integral to lipid-binding function. In addition, we determined that the SH2 domains of Tec family kinases bind to lipids with varying affinity and specificity. Simulating mutations in Bruton's tyrosine kinase (BTK) that cause X-Linked Agammaglobulinemia (XLA) predicts that these mutations alter lipid-binding, which we confirm experimentally. In light of these results, we propose that XLA-causing mutations in the SH3-SH2 domain of BTK alter lipid-binding, which could play a causative role in the XLA-phenotype. Overall, our study suggests that the number of lipid-binding proteins is drastically underestimated and, with further development, phylogenetic profiles can provide a method for rapidly increasing the functional annotation of protein sequences.

278

Bellgard MI, Wanchanthuek P, La T, Ryan K, Moolhuijzen P, Albertyn Z, Shaban B, Motro Y, Dunn DS, Schibeci D, Hunter A, Barrero R, Phillips ND, Hampson DJ. Genome sequence of the pathogenic intestinal spirochete Brachyspira hyodysenteriae reveals adaptations to its lifestyle in the porcine large intestine. PLoS One 2009;4:e4641. [PMID: 19262690 PMCID: PMC2650404 DOI: 10.1371/journal.pone.0004641] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.0] [Received: 12/22/2008] [Accepted: 01/06/2009] [Indexed: 11/30/2022] Open

279

Fontana P, Cestaro A, Velasco R, Formentin E, Toppo S. Rapid annotation of anonymous sequences from genome projects using semantic similarities and a weighting scheme in gene ontology. PLoS One 2009;4:e4619. [PMID: 19247487 PMCID: PMC2645684 DOI: 10.1371/journal.pone.0004619] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2008] [Accepted: 01/09/2009] [Indexed: 11/22/2022] Open

280

Derrien T, Thézé J, Vaysse A, André C, Ostrander EA, Galibert F, Hitte C. Revisiting the missing protein-coding gene catalog of the domestic dog. BMC Genomics 2009;10:62. [PMID: 19193219 PMCID: PMC2644713 DOI: 10.1186/1471-2164-10-62] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2008] [Accepted: 02/04/2009] [Indexed: 12/19/2022] Open

281

DOG 1.0: illustrator of protein domain structures. Cell Res 2009;19:271-3. [DOI: 10.1038/cr.2009.6] [Citation(s) in RCA: 405] [Impact Index Per Article: 27.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022] Open

282

The 2008 update of the Aspergillus nidulans genome annotation: a community effort. Fungal Genet Biol 2008;46 Suppl 1:S2-13. [PMID: 19146970 DOI: 10.1016/j.fgb.2008.12.003] [Citation(s) in RCA: 87] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2008] [Revised: 12/15/2008] [Accepted: 12/15/2008] [Indexed: 01/28/2023]

283

Jackson AP, Quail MA, Berriman M. Insights into the genome sequence of a free-living Kinetoplastid: Bodo saltans (Kinetoplastida: Euglenozoa). BMC Genomics 2008;9:594. [PMID: 19068121 PMCID: PMC2621209 DOI: 10.1186/1471-2164-9-594] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2008] [Accepted: 12/09/2008] [Indexed: 12/02/2022] Open

284

Reeves GA, Eilbeck K, Magrane M, O'Donovan C, Montecchi-Palazzi L, Harris MA, Orchard S, Jimenez RC, Prlic A, Hubbard TJP, Hermjakob H, Thornton JM. The Protein Feature Ontology: a tool for the unification of protein feature annotations. Bioinformatics 2008;24:2767-72. [PMID: 18936051 PMCID: PMC2912506 DOI: 10.1093/bioinformatics/btn528] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

285

Vizcaíno JA, Mueller M, Hermjakob H, Martens L. Charting online OMICS resources: A navigational chart for clinical researchers. Proteomics Clin Appl 2008;3:18-29. [PMID: 21136933 DOI: 10.1002/prca.200800082] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2008] [Indexed: 12/22/2022]

286

Wei T, Gong J, Jamitzky F, Heckl WM, Stark RW, Rössle SC. LRRML: a conformational database and an XML description of leucine-rich repeats (LRRs). BMC Struct Biol 2008;8:47. [PMID: 18986514 PMCID: PMC2645405 DOI: 10.1186/1472-6807-8-47] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/05/2008] [Accepted: 11/05/2008] [Indexed: 11/22/2022]

287

Loewenstein Y, Linial M. Connect the dots: exposing hidden protein family connections from the entire sequence tree. Bioinformatics 2008;24:i193-9. [PMID: 18689824 DOI: 10.1093/bioinformatics/btn301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

288

Uhl GR, Drgon T, Johnson C, Li CY, Contoreggi C, Hess J, Naiman D, Liu QR. Molecular genetics of addiction and related heritable phenotypes: genome-wide association approaches identify "connectivity constellation" and drug target genes with pleiotropic effects. Ann N Y Acad Sci 2008;1141:318-81. [PMID: 18991966 PMCID: PMC3922196 DOI: 10.1196/annals.1441.018] [Citation(s) in RCA: 131] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Abstract

Genome-wide association (GWA) can elucidate molecular genetic bases for human individual differences in complex phenotypes that include vulnerability to addiction. Here, we review (a) evidence that supports polygenic models with (at least) modest heterogeneity for the genetic architectures of addiction and several related phenotypes; (b) technical and ethical aspects of importance for understanding GWA data, including genotyping in individual samples versus DNA pools, analytic approaches, power estimation, and ethical issues in genotyping individuals with illegal behaviors; (c) the samples and the data that shape our current understanding of the molecular genetics of individual differences in vulnerability to substance dependence and related phenotypes; (d) overlaps between GWA data sets for dependence on different substances; and (e) overlaps between GWA data for addictions versus other heritable, brain-based phenotypes that include bipolar disorder, cognitive ability, frontal lobe brain volume, the ability to successfully quit smoking, neuroticism, and Alzheimer's disease. These convergent results identify potential targets for drugs that might modify addictions and play roles in these other phenotypes. They add to evidence that individual differences in the quality and quantity of brain connections make pleiotropic contributions to individual differences in vulnerability to addictions and to related brain disorders and phenotypes. A "connectivity constellation" of brain phenotypes and disorders appears to receive substantial pathogenic contributions from individual differences in a constellation of genes whose variants provide individual differences in the specification of brain connectivities during development and in adulthood. Heritable brain differences that underlie addiction vulnerability thus lie squarely in the midst of the repertoire of heritable brain differences that underlie vulnerability to other common brain disorders and phenotypes.

289

Lacroix V, Cottret L, Thébault P, Sagot MF. An introduction to metabolic networks and their structural analysis. IEEE/ACM Trans Comput Biol Bioinform 2008;5:594-617. [PMID: 18989046 DOI: 10.1109/tcbb.2008.79] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]

290

Li CY, Liu QR, Zhang PW, Li XM, Wei L, Uhl GR. OKCAM: an ontology-based, human-centered knowledgebase for cell adhesion molecules. Nucleic Acids Res 2008;37:D251-60. [PMID: 18790807 PMCID: PMC2686464 DOI: 10.1093/nar/gkn568] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

291

Aftab S, Semenec L, Chu JSC, Chen N. Identification and characterization of novel human tissue-specific RFX transcription factors. BMC Evol Biol 2008;8:226. [PMID: 18673564 PMCID: PMC2533330 DOI: 10.1186/1471-2148-8-226] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2008] [Accepted: 08/01/2008] [Indexed: 02/06/2023] Open

292

Espadaler J, Eswar N, Querol E, Avilés FX, Sali A, Marti-Renom MA, Oliva B. Prediction of enzyme function by combining sequence similarity and protein interactions. BMC Bioinformatics 2008;9:249. [PMID: 18505562 PMCID: PMC2430716 DOI: 10.1186/1471-2105-9-249] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2007] [Accepted: 05/27/2008] [Indexed: 11/18/2022] Open

293

The metagenome of a biogas-producing microbial community of a production-scale biogas plant fermenter analysed by the 454-pyrosequencing technology. J Biotechnol 2008;136:77-90. [PMID: 18597880 DOI: 10.1016/j.jbiotec.2008.05.008] [Citation(s) in RCA: 261] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2008] [Revised: 04/16/2008] [Accepted: 05/08/2008] [Indexed: 11/21/2022]

Abstract

Composition and gene content of a biogas-producing microbial community from a production-scale biogas plant fed with renewable primary products was analysed by means of a metagenomic approach applying the ultrafast 454-pyrosequencing technology. Sequencing of isolated total community DNA on a Genome Sequencer FLX System resulted in 616,072 reads with an average read length of 230 bases, accounting for 141,664,289 bases of sequence information. Assignment of obtained single reads to COG (Clusters of Orthologous Groups of proteins) categories revealed a genetic profile characteristic of an anaerobic microbial consortium conducting fermentative metabolic pathways. Assembly of single reads resulted in the formation of 8752 contigs larger than 500 bases in size. Contigs longer than 10 kb mainly encode house-keeping proteins, e.g. DNA polymerase, recombinase, DNA ligase, sigma factor RpoD and genes involved in sugar and amino acid metabolism. A significant portion of contigs was allocated to the genome sequence of the archaeal methanogen Methanoculleus marisnigri JR1. Mapping of single reads to the M. marisnigri JR1 genome revealed that approximately 64% of the reference genome including methanogenesis gene regions are deeply covered. These results suggest that species related to those of the genus Methanoculleus play a dominant role in methanogenesis in the analysed fermentation sample. Moreover, assignment of numerous contig sequences to clostridial genomes including gene regions for cellulolytic functions indicates that clostridia are important for hydrolysis of cellulosic plant biomass in the biogas fermenter under study. Metagenome sequence data from a biogas-producing microbial community residing in a fermenter of a biogas plant provide the basis for a rational approach to improve the biotechnological process of biogas production.

294

Ivanic J, Wallqvist A, Reifman J. Evidence of probabilistic behaviour in protein interaction networks. BMC Syst Biol 2008;2:11. [PMID: 18237403 PMCID: PMC2267158 DOI: 10.1186/1752-0509-2-11] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/13/2007] [Accepted: 01/31/2008] [Indexed: 11/25/2022]