Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For:	Henikoff JG, Henikoff S. Blocks database and its applications. Methods Enzymol 1996;266:88-105. [PMID: 8743679 DOI: 10.1016/s0076-6879(96)66008-x] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.4] [Indexed: 02/01/2023]


Cited by Other Article(s)

Ren FD, Liu YZ, Ding KW, Chang LL, Cao DL, Liu S. Finite temperature string by K-means clustering sampling with order parameters as collective variables for molecular crystals: application to polymorphic transformation between β-CL-20 and ε-CL-20. Phys Chem Chem Phys 2024;26:3500-3515. [PMID: 38206084 DOI: 10.1039/d3cp05389j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]

Thalén F, Köhne CG, Bleidorn C. Patchwork: Alignment-Based Retrieval and Concatenation of Phylogenetic Markers from Genomic Data. Genome Biol Evol 2023;15:evad227. [PMID: 38085033 PMCID: PMC10735302 DOI: 10.1093/gbe/evad227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2023] [Indexed: 12/23/2023] Open

Abdelmoteleb M, Zhang C, Furey B, Kozubal M, Griffiths H, Champeaud M, Goodman RE. Evaluating potential risks of food allergy of novel food sources based on comparison of proteins predicted from genomes and compared to www.AllergenOnline.org. Food Chem Toxicol 2021;147:111888. [DOI: 10.1016/j.fct.2020.111888] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/16/2020] [Revised: 11/23/2020] [Accepted: 11/25/2020] [Indexed: 12/15/2022]

Talyan S, Andrade-Navarro MA, Muro EM. Identification of transcribed protein coding sequence remnants within lincRNAs. Nucleic Acids Res 2019;46:8720-8729. [PMID: 29986053 PMCID: PMC6158594 DOI: 10.1093/nar/gky608] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/18/2018] [Accepted: 06/26/2018] [Indexed: 12/21/2022] Open

Das JK, Choudhury PP, Chaturvedi N, Tayyab M, Hassan SS. Ranking and clustering of Drosophila olfactory receptors using mathematical morphology. Genomics 2019;111:549-559. [DOI: 10.1016/j.ygeno.2018.03.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/26/2017] [Revised: 02/12/2018] [Accepted: 03/07/2018] [Indexed: 11/26/2022]

Mirabal P, Abreu J, Seco D. Assessing the best edit in perturbation-based iterative refinement algorithms to compute the median string. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2019.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]

Jin Y, Goodman RE, Tetteh AO, Lu M, Tripathi L. Bioinformatics analysis to assess potential risks of allergenicity and toxicity of HRAP and PFLP proteins in genetically modified bananas resistant to Xanthomonas wilt disease. Food Chem Toxicol 2017;109:81-89. [PMID: 28830835 DOI: 10.1016/j.fct.2017.08.024] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/05/2017] [Revised: 08/16/2017] [Accepted: 08/19/2017] [Indexed: 11/17/2022]

Large-Scale Sequence Comparison. Methods Mol Biol 2016;1525:191-224. [PMID: 27896723 DOI: 10.1007/978-1-4939-6622-6_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 03/10/2023]

Korostelev YD, Zharov IA, Mironov AA, Rakhmaininova AB, Gelfand MS. Identification of Position-Specific Correlations between DNA-Binding Domains and Their Binding Sites. Application to the MerR Family of Transcription Factors. PLoS One 2016;11:e0162681. [PMID: 27690309 PMCID: PMC5045206 DOI: 10.1371/journal.pone.0162681] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/11/2015] [Accepted: 08/26/2016] [Indexed: 11/25/2022] Open

Reaching optimized parameter set: protein secondary structure prediction using neural network. Neural Comput Appl 2016. [DOI: 10.1007/s00521-015-2150-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/26/2022]

Siruguri V, Bharatraj DK, Vankudavath RN, Rao Mendu VV, Gupta V, Goodman RE. Evaluation of Bar, Barnase, and Barstar recombinant proteins expressed in genetically engineered Brassica juncea (Indian mustard) for potential risks of food allergy using bioinformatics and literature searches. Food Chem Toxicol 2015;83:93-102. [DOI: 10.1016/j.fct.2015.06.003] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 02/09/2015] [Revised: 06/02/2015] [Accepted: 06/03/2015] [Indexed: 11/26/2022]

Combinations of long peptide sequence blocks can be used to describe toxin diversification in venomous animals. Toxicon 2015;95:84-92. [DOI: 10.1016/j.toxicon.2015.01.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/24/2014] [Revised: 01/07/2015] [Accepted: 01/13/2015] [Indexed: 11/19/2022]

Fast and sensitive protein alignment using DIAMOND. Nat Methods 2014;12:59-60. [PMID: 25402007 DOI: 10.1038/nmeth.3176] [Citation(s) in RCA: 6523] [Impact Index Per Article: 652.3] [Received: 04/29/2014] [Accepted: 10/20/2014] [Indexed: 01/28/2023]

Wong AKC, Lee ESA. Aligning and Clustering Patterns to Reveal the Protein Functionality of Sequences. IEEE/ACM Trans Comput Biol Bioinform 2014;11:548-560. [PMID: 26356022 DOI: 10.1109/tcbb.2014.2306840] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]

Shahbaaz M, Hassan MI, Ahmad F. Functional annotation of conserved hypothetical proteins from Haemophilus influenzae Rd KW20. PLoS One 2013;8:e84263. [PMID: 24391926 PMCID: PMC3877243 DOI: 10.1371/journal.pone.0084263] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Received: 10/03/2013] [Accepted: 11/21/2013] [Indexed: 11/18/2022] Open

Komáromi I, Bagoly Z, Muszbek L. Factor XIII: novel structural and functional aspects. J Thromb Haemost 2011;9:9-20. [PMID: 20880254 DOI: 10.1111/j.1538-7836.2010.04070.x] [Citation(s) in RCA: 131] [Impact Index Per Article: 10.1] [Indexed: 11/29/2022]

Brylinski M, Skolnick J. Comparison of structure-based and threading-based approaches to protein functional annotation. Proteins 2010;78:118-34. [PMID: 19731377 PMCID: PMC2804779 DOI: 10.1002/prot.22566] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 12/11/2022]

Edgar RC. Optimizing substitution matrix choice and gap parameters for sequence alignment. BMC Bioinformatics 2009;10:396. [PMID: 19954534 PMCID: PMC2791778 DOI: 10.1186/1471-2105-10-396] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 08/10/2009] [Accepted: 12/02/2009] [Indexed: 12/04/2022] Open

Reumers J, Maurer-Stroh S, Schymkowitz J, Rousseau F. Protein sequences encode safeguards against aggregation. Hum Mutat 2009;30:431-7. [PMID: 19156839 DOI: 10.1002/humu.20905] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Indexed: 11/06/2022]

Mizuno Y, Kurochkin IV, Herberth M, Okazaki Y, Schönbach C. Predicted mouse peroxisome-targeted proteins and their actual subcellular locations. BMC Bioinformatics 2008;9 Suppl 12:S16. [PMID: 19091015 PMCID: PMC2638156 DOI: 10.1186/1471-2105-9-s12-s16] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022] Open

Punta M, Ofran Y. The rough guide to in silico function prediction, or how to use sequence and structure information to predict protein function. PLoS Comput Biol 2008;4:e1000160. [PMID: 18974821 PMCID: PMC2518264 DOI: 10.1371/journal.pcbi.1000160] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.1] [Indexed: 12/22/2022] Open

Doolittle RF, Jiang Y, Nand J. Genomic evidence for a simpler clotting scheme in jawless vertebrates. J Mol Evol 2008;66:185-96. [PMID: 18283387 DOI: 10.1007/s00239-008-9074-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 10/23/2007] [Revised: 12/30/2007] [Accepted: 01/25/2008] [Indexed: 11/24/2022]

Abstract

Mammalian blood clotting involves numerous components, most of which are the result of gene duplications that occurred early in vertebrate evolution and after the divergence of protochordates. As such, the genomes of the jawless fish (hagfish and lamprey) offer the best possibility for finding systems that might have a reduced set of the many clotting factors observed in higher vertebrates. The most straightforward way of inventorying these factors may be through whole genome sequencing. In this regard, the NCBI Trace database ( http://www.ncbi.nlm.nih.gov/Traces/trace.cgi ) for the lamprey (Petromyzon marinus) contains more than 18 million raw DNA sequences determined by whole-genome shotgun methodology. The data are estimated to be about sixfold redundant, indicating that coverage is sufficiently complete to permit judgments about the presence or absence of particular genes. A search for 20 proteins whose sequences were determined prior to the trace database study found all 20. A subsequent search for specified coagulation factors revealed a lamprey system with a smaller number of components than is found in other vertebrates in that factors V and VIII seem to be represented by a single gene, and factor IX, which is ordinarily a cofactor of factor VIII, is not present. Fortuitously, after the completion of the survey of the Trace database, a draft assembly based on the same database was posted. The draft assembly allowed many of the identified Trace fragments to be linked into longer sequences that fully support the conclusion that lampreys have a simpler clotting scheme compared with other vertebrates. The data are also consistent with the hypothesis that a whole-genome duplication or other large scale block duplication occurred after the divergence of jawless fish from other vertebrates and allowed the simultaneous appearance of a second set of two functionally paired proteins in the vertebrate clotting scheme.


Chen K, Huang X. Structural analysis of SNARE motifs from sea perch, Lateolabrax japonicus by computerized approaches. Comput Biol Chem 2007;31:378-83. [PMID: 17890158 DOI: 10.1016/j.compbiolchem.2007.08.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/24/2006] [Accepted: 08/10/2007] [Indexed: 10/22/2022]

Goodman RE, Taylor SL, Yamamura J, Kobayashi T, Kawakami H, Kruger CL, Thompson GP. Assessment of the potential allergenicity of a Milk Basic Protein fraction. Food Chem Toxicol 2007;45:1787-94. [PMID: 17482742 DOI: 10.1016/j.fct.2007.03.014] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 01/06/2006] [Revised: 03/19/2007] [Accepted: 03/19/2007] [Indexed: 11/27/2022]

Abstract

BACKGROUND

A specific basic fraction of bovine milk, termed Milk Basic Protein (MBP), has the potential to provide nutritionally important benefits if used as a food ingredient. Although derived from milk, MBP is intended for use as an ingredient in other foods. Cows' milk is a well studied, commonly allergenic food. Although the proteins in MBP are not identified as milk allergens, food products containing MBP will be labelled as containing milk as a caution to milk allergic consumers under food labelling guidelines in the US and the European Union as MBP has not been demonstrated to be free of milk allergens. However, as part of an overall safety evaluation of MBP, the developers sought to evaluate the potential allergenicity of the primary protein components for characteristics of allergenic food proteins and to assess whether intake of these proteins at intended use levels could present a significant new allergenic risk for consumers.

OBJECTIVE

To evaluate the potential allergenicity of the five identified proteins in MBP. While extensive studies have not demonstrated allergenicity of lactoferrin, the four other proteins are less studied. The four were tested here by sequence identity comparison to known allergens, and for stability of these proteins in acidic pepsin as a characteristic common to many food allergens.

METHODS

Sequences of the proteins were compared to those listed in AllergenOnline.com, by methods recommended for the evaluation of proteins introduced in crops through genetic engineering. Pepsin stability was assessed by incubating the various proteins in simulated gastric fluid at pH 1.2 with porcine pepsin for up to 60 min at 37 degrees C, with samples withdrawn and analyzed at specific times.

RESULTS

No significant sequence similarities were identified for the MBP proteins compared to known allergens. All but one of the protein components of MBP were digested relatively quickly by pepsin. The more stable protein will be of low abundance as consumed in contrast to most pepsin-stable food allergens.

CONCLUSIONS

Based on molecular characteristics and expected exposure, the protein components in MBP are unlikely to present any increased risk of allergy for milk allergic subjects or of cross-reactivity for other allergic subjects. However, since the proteins are derived from milk, products containing MBP will need to be labelled as containing milk proteins to warn milk allergic subjects of the potential risk of allergic reactions.


Monderer-Rothkoff G, Amster-Choder O. Genetic dissection of the divergent activities of the multifunctional membrane sensor BglF. J Bacteriol 2007;189:8601-15. [PMID: 17905978 PMCID: PMC2168942 DOI: 10.1128/jb.01220-07] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022] Open

Sulakhe D, Rodriguez A, D'Souza M, Wilde M, Nefedova V, Foster I, Maltsev N. GNARE: automated system for high-throughput genome analysis with grid computational backend. J Clin Monit Comput 2006;19:361-9. [PMID: 16328950 DOI: 10.1007/s10877-005-3463-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 06/30/2005] [Accepted: 06/30/2005] [Indexed: 10/25/2022]

Abstract

Recent progress in genomics and experimental biology has brought exponential growth of the biological information available for computational analysis in public genomics databases. However, applying the potentially enormous scientific value of this information to the understanding of biological systems requires computing and data storage technology of an unprecedented scale. The Grid, with its aggregated and distributed computational and storage infrastructure, offers an ideal platform for high-throughput bioinformatics analysis. To leverage this we have developed the Genome Analysis Research Environment (GNARE), a scalable computational system for the high-throughput analysis of genomes, which provides an integrated database and computational backend for data-driven bioinformatics applications. GNARE efficiently automates the major steps of genome analysis including acquisition of data from multiple genomic databases; data analysis by a diverse set of bioinformatics tools; and storage of results and annotations. High-throughput computations in GNARE are performed using distributed heterogeneous Grid computing resources such as Grid2003, TeraGrid, and the DOE Science Grid. Multi-step genome analysis workflows involving massive data processing, the use of application-specific tools and algorithms and updating of an integrated database to provide interactive web access to results are all expressed and controlled by a "virtual data" model which transparently maps computational workflows to distributed Grid resources. This paper describes how Grid technologies such as Globus, Condor, and the GriPhyN Virtual Data System were applied in the development of GNARE. It focuses on our approach to Grid resource allocation and to the use of GNARE as a computational framework for the development of bioinformatics applications.


Araúzo-Bravo MJ, Ahmad S, Sarai A. Dimensionality of amino acid space and solvent accessibility prediction with neural networks. Comput Biol Chem 2006;30:160-8. [PMID: 16545617 DOI: 10.1016/j.compbiolchem.2005.12.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/27/2005] [Revised: 12/16/2005] [Accepted: 12/16/2005] [Indexed: 11/18/2022]

Swartz TH, Ikewada S, Ishikawa O, Ito M, Krulwich TA. The Mrp system: a giant among monovalent cation/proton antiporters? Extremophiles 2005;9:345-54. [PMID: 15980940 DOI: 10.1007/s00792-005-0451-6] [Citation(s) in RCA: 123] [Impact Index Per Article: 6.5] [Received: 02/22/2005] [Accepted: 04/08/2005] [Indexed: 10/25/2022]

Spence P, Bard J, Jones P, Betty M. The identification of G-protein coupled receptors in sequence databases. Expert Opin Ther Pat 2005. [DOI: 10.1517/13543776.8.3.235] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]

Qian B, Ortiz AR, Baker D. Improvement of comparative model accuracy by free-energy optimization along principal components of natural structural variation. Proc Natl Acad Sci U S A 2004;101:15346-51. [PMID: 15492216 PMCID: PMC524448 DOI: 10.1073/pnas.0404703101] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.8] [Indexed: 11/18/2022] Open

Man O, Gilad Y, Lancet D. Prediction of the odorant binding site of olfactory receptor proteins by human-mouse comparisons. Protein Sci 2004;13:240-54. [PMID: 14691239 PMCID: PMC2286516 DOI: 10.1110/ps.03296404] [Citation(s) in RCA: 117] [Impact Index Per Article: 5.9] [Indexed: 10/26/2022]

Roberts MD, Martin NL, Kropinski AM. The genome and proteome of coliphage T1. Virology 2004;318:245-66. [PMID: 14972552 DOI: 10.1016/j.virol.2003.09.020] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.6] [Received: 07/24/2003] [Revised: 09/18/2003] [Accepted: 09/22/2003] [Indexed: 11/19/2022]

Liu J, Rost B. CHOP proteins into structural domain-like fragments. Proteins 2004;55:678-88. [PMID: 15103630 DOI: 10.1002/prot.20095] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022]

Abstract

We developed a method, CHOP, for dissecting proteins into domain-like fragments. The basic idea was to cut proteins beginning from very reliable experimental information (PDB), proceeding to expert annotations of domain-like regions (Pfam-A), and completing through cuts based on termini of known proteins. In this way, CHOP dissected more than two thirds of all proteins from 62 proteomes. Analysis of our structural domain-like fragments revealed four surprising results. First, >70% of all dissected proteins contained more than one fragment. Second, most domains spanned on average over approximately 100 residues. This average was similar for eukaryotic and prokaryotic proteins, and it is also valid (although not previously described) for all proteins in the PDB. Third, single-domain proteins were significantly longer than most domains in multidomain proteins. Fourth, three fourths of all domains appeared shorter than 210 residues. We believe that our CHOP fragments constituted an important resource for functional and structural genomics. Nevertheless, our main motivation to develop CHOP was that the single-linkage clustering method failed to adequately group full-length proteins. In contrast, CLUP, the simple clustering scheme introduced here, largely succeeded in grouping the CHOP fragments from 62 proteomes such that all members of one cluster shared a basic structural core. CLUP found >63,000 multi- and >118,000 single-member clusters. Although most fragments were restricted to a particular cluster, approximately 24% of the fragments were duplicated in at least two clusters. Our thresholds for grouping two fragments into the same cluster were rather conservative. Nevertheless, our results suggested that structural genomics initiatives have to target >30,000 fragments to at least cover the multimember clusters in 62 proteomes.


Baxter SM, Rosenblum JS, Knutson S, Nelson MR, Montimurro JS, Di Gennaro JA, Speir JA, Burbaum JJ, Fetrow JS. Synergistic Computational and Experimental Proteomics Approaches for More Accurate Detection of Active Serine Hydrolases in Yeast. Mol Cell Proteomics 2004;3:209-25. [PMID: 14645503 DOI: 10.1074/mcp.m300082-mcp200] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.3] [Indexed: 11/06/2022] Open

Abstract

An analysis of the structurally and catalytically diverse serine hydrolase protein family in the Saccharomyces cerevisiae proteome was undertaken using two independent but complementary, large-scale approaches. The first approach is based on computational analysis of serine hydrolase active site structures; the second utilizes the chemical reactivity of the serine hydrolase active site in complex mixtures. These proteomics approaches share the ability to fractionate the complex proteome into functional subsets. Each method identified a significant number of sequences, but 15 proteins were identified by both methods. Eight of these were unannotated in the Saccharomyces Genome Database at the time of this study and are thus novel serine hydrolase identifications. Three of the previously uncharacterized proteins are members of a eukaryotic serine hydrolase family, designated as Fsh (family of serine hydrolase), identified here for the first time. OVCA2, a potential human tumor suppressor, and DYR-SCHPO, a dihydrofolate reductase from Schizosaccharomyces pombe, are members of this family. Comparing the combined results to results of other proteomic methods showed that only four of the 15 proteins were identified in a recent large-scale, "shotgun" proteomic analysis and eight were identified using a related, but similar, approach (neither identifies function). Only 10 of the 15 were annotated using alternate motif-based computational tools. The results demonstrate the precision derived from combining complementary, function-based approaches to extract biological information from complex proteomes. The chemical proteomics technology indicates that a functional protein is being expressed in the cell, while the computational proteomics technology adds details about the specific type of function and residue that is likely being labeled. The combination of synergistic methods facilitates analysis, enriches true positive results, and increases confidence in novel identifications. This work also highlights the risks inherent in annotation transfer and the use of scoring functions for determination of correct annotations.


Alberti-Segui C, Morales AJ, Xing H, Kessler MM, Willins DA, Weinstock KG, Cottarel G, Fechtel K, Rogers B. Identification of potential cell-surface proteins in Candida albicans and investigation of the role of a putative cell-surface glycosidase in adhesion and virulence. Yeast 2004;21:285-302. [PMID: 15042589 DOI: 10.1002/yea.1061] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022] Open

Boll M, Foltz M, Rubio-Aliaga I, Daniel H. A cluster of proton/amino acid transporter genes in the human and mouse genomes. Genomics 2003;82:47-56. [PMID: 12809675 DOI: 10.1016/s0888-7543(03)00099-5] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.2] [Indexed: 10/27/2022]

Chan JKL, Sun L, Yang XJ, Zhu G, Wu Z. Functional characterization of an amino-terminal region of HDAC4 that possesses MEF2 binding and transcriptional repressive activity. J Biol Chem 2003;278:23515-21. [PMID: 12709441 DOI: 10.1074/jbc.m301922200] [Citation(s) in RCA: 59] [Impact Index Per Article: 2.8] [Indexed: 11/06/2022] Open

Muggleton SH, Bryant CH, Srinivasan A, Whittaker A, Topp S, Rawlings C. Are grammatical representations useful for learning from biological sequence data?--a case study. J Comput Biol 2002;8:493-521. [PMID: 11694180 DOI: 10.1089/106652701753216512] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open

Abstract

This paper investigates whether Chomsky-like grammar representations are useful for learning cost-effective, comprehensible predictors of members of biological sequence families. The Inductive Logic Programming (ILP) Bayesian approach to learning from positive examples is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Collectively, five of the co-authors of this paper have extensive expertise on NPPs and general bioinformatics methods. Their motivation for generating an NPP grammar was that none of the existing bioinformatics methods could provide sufficient cost-savings during the search for new NPPs. Prior to this project, experienced specialists at SmithKline Beecham had tried for many months to hand-code such a grammar but without success. Our best predictor makes the search for novel NPPs more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. As far as these authors are aware, this is both the first biological grammar learnt using ILP and the first real-world scientific application of the ILP Bayesian approach to learning from positive examples. A group of features is derived from this grammar. Other groups of features of NPPs are derived using other learning strategies. Amalgams of these groups are formed. A recognition model is generated for each amalgam using C4.5 and C4.5rules and its performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The highest RA was achieved by a model which includes grammar-derived features. This RA is significantly higher than the best RA achieved without the use of the grammar-derived features. Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.


Fogolari F, Tessari S, Molinari H. Singular value decomposition analysis of protein sequence alignment score data. Proteins 2002;46:161-70. [PMID: 11807944 DOI: 10.1002/prot.10032] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]

Lacy DB, Mourez M, Fouassier A, Collier RJ. Mapping the anthrax protective antigen binding site on the lethal and edema factors. J Biol Chem 2002;277:3006-10. [PMID: 11714723 DOI: 10.1074/jbc.m109997200] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.0] [Indexed: 11/06/2022] Open

Altmann CR, Bell E, Sczyrba A, Pun J, Bekiranov S, Gaasterland T, Brivanlou AH. Microarray-based analysis of early development in Xenopus laevis. Dev Biol 2001;236:64-75. [PMID: 11456444 DOI: 10.1006/dbio.2001.0298] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.5] [Indexed: 01/20/2023]

Toth J, Cutforth T, Gelinas AD, Bethoney KA, Bard J, Harrison CJ. Crystal structure of an ephrin ectodomain. Dev Cell 2001;1:83-92. [PMID: 11703926 DOI: 10.1016/s1534-5807(01)00002-8] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.6] [Indexed: 10/26/2022]

Chasman D, Adams RM. Predicting the functional consequences of non-synonymous single nucleotide polymorphisms: structure-based assessment of amino acid variation. J Mol Biol 2001;307:683-706. [PMID: 11254390 DOI: 10.1006/jmbi.2001.4510] [Citation(s) in RCA: 298] [Impact Index Per Article: 13.0] [Indexed: 01/04/2023]

Skolnick J, Kihara D. Defrosting the frozen approximation: PROSPECTOR? A new approach to threading. Proteins 2001. [DOI: 10.1002/1097-0134(20010215)42:3<319::aid-prot30>3.0.co;2-a] [Citation(s) in RCA: 109] [Impact Index Per Article: 4.7] [Indexed: 11/09/2022]

Ober D, Hartmann T. Homospermidine synthase, the first pathway-specific enzyme of pyrrolizidine alkaloid biosynthesis, evolved from deoxyhypusine synthase. Proc Natl Acad Sci U S A 1999;96:14777-82. [PMID: 10611289 PMCID: PMC24724 DOI: 10.1073/pnas.96.26.14777] [Citation(s) in RCA: 113] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022] Open

Ober D, Hartmann T. Deoxyhypusine synthase from tobacco. cDNA isolation, characterization, and bacterial expression of an enzyme with extended substrate specificity. J Biol Chem 1999;274:32040-7. [PMID: 10542236 DOI: 10.1074/jbc.274.45.32040] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.7] [Indexed: 11/06/2022] Open

Rigoutsos I, Floratos A, Ouzounis C, Gao Y, Parida L. Dictionary building via unsupervised hierarchical motif discovery in the sequence space of natural proteins. Proteins 1999;37:264-77. [PMID: 10584071 DOI: 10.1002/(sici)1097-0134(19991101)37:2<264::aid-prot11>3.0.co;2-c] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.4] [Indexed: 11/07/2022]

Maga JA, LeBowitz JH. Unravelling the kinetoplastid paraflagellar rod. Trends Cell Biol 1999;9:409-13. [PMID: 10481179 DOI: 10.1016/s0962-8924(99)01635-9] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.8] [Indexed: 11/23/2022]

Lacy DB, Stevens RC. Sequence homology and structural analysis of the clostridial neurotoxins. J Mol Biol 1999;291:1091-104. [PMID: 10518945 DOI: 10.1006/jmbi.1999.2945] [Citation(s) in RCA: 256] [Impact Index Per Article: 10.2] [Indexed: 11/22/2022]

Huang GM, Ng WL, Farkas J, He L, Liang HA, Gordon D, Yu J, Hood L. Prostate cancer expression profiling by cDNA sequencing analysis. Genomics 1999;59:178-86. [PMID: 10409429 DOI: 10.1006/geno.1999.5822] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]