Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Mathé C, Sagot MF, Schiex T, Rouzé P. Current methods of gene prediction, their strengths and weaknesses. Nucleic Acids Res 2002;30:4103-17. [PMID: 12364589 PMCID: PMC140543 DOI: 10.1093/nar/gkf543] [Citation(s) in RCA: 209] [Impact Index Per Article: 9.5] [Received: 05/28/2002] [Revised: 08/07/2002] [Accepted: 08/07/2002] [Indexed: 11/14/2022]

Number

Cited by Other Article(s)

101

Wei C, Peng J, Xiong Z, Yang J, Wang J, Jin Q. Subproteomic tools to increase genome annotation complexity. Proteomics 2008;8:4209-13. [DOI: 10.1002/pmic.200800226] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]

102

Sparks ME, Brendel V. MetWAMer: eukaryotic translation initiation site prediction. BMC Bioinformatics 2008;9:381. [PMID: 18801175 PMCID: PMC2603428 DOI: 10.1186/1471-2105-9-381] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 03/06/2008] [Accepted: 09/18/2008] [Indexed: 11/20/2022]

103

Armañanzas R, Inza I, Santana R, Saeys Y, Flores JL, Lozano JA, Peer YVD, Blanco R, Robles V, Bielza C, Larrañaga P. A review of estimation of distribution algorithms in bioinformatics. BioData Min 2008;1:6. [PMID: 18822112 PMCID: PMC2576251 DOI: 10.1186/1756-0381-1-6] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Received: 01/18/2008] [Accepted: 09/11/2008] [Indexed: 11/10/2022]

104

Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky M. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res 2008;18:1979-90. [PMID: 18757608 DOI: 10.1101/gr.081612.108] [Citation(s) in RCA: 654] [Impact Index Per Article: 40.9] [Indexed: 11/25/2022]

105

Graf A, Gasser B, Dragosits M, Sauer M, Leparc GG, Tüchler T, Kreil DP, Mattanovich D. Novel insights into the unfolded protein response using Pichia pastoris specific DNA microarrays. BMC Genomics 2008;9:390. [PMID: 18713468 PMCID: PMC2533675 DOI: 10.1186/1471-2164-9-390] [Citation(s) in RCA: 93] [Impact Index Per Article: 5.8] [Received: 02/08/2008] [Accepted: 08/19/2008] [Indexed: 11/24/2022]

Abstract

Background

DNA microarrays are regarded as a valuable tool for basic and applied research in microbiology. However, for many industrially important microorganisms the lack of commercially available microarrays still hampers physiological research. For example, our understanding of protein folding and secretion in the yeast Pichia pastoris currently depends largely on conclusions drawn by analogy to Saccharomyces cerevisiae. To close this gap for a yeast species employed for its high capacity to produce heterologous proteins, we developed full genome DNA microarrays for P. pastoris and analyzed the unfolded protein response (UPR) in this yeast species, as compared to S. cerevisiae.

Results

By combining the partially annotated gene list of P. pastoris with de novo gene finding, a list of putative open reading frames was generated, for which an oligonucleotide probe set was designed using the probe design tool TherMODO (a thermodynamic model-based oligoset design optimizer). To evaluate the performance of the novel array design, microarrays carrying the oligo set were hybridized with samples from treatments with dithiothreitol (DTT) or from a strain overexpressing the UPR transcription factor HAC1, both compared with a wild type strain in normal medium as untreated control. DTT treatment was compared with literature data for S. cerevisiae, and revealed similarities, but also important differences between the two yeast species. Overexpression of HAC1, the most direct control for UPR genes, resulted in significant new understanding of this important regulatory pathway in P. pastoris, and generally in yeasts.

Conclusion

The differences observed between P. pastoris and S. cerevisiae underline the importance of DNA microarrays for industrial production strains. P. pastoris reacts to DTT treatment mainly by the regulation of genes related to chemical stimulus, electron transport and respiration, while the overexpression of HAC1 induced many genes involved in translation, ribosome biogenesis, and organelle biosynthesis, indicating that the regulatory events triggered by DTT treatment only partially overlap with the reactions to overexpression of HAC1. The high reproducibility of the results achieved with two different oligo sets is a good indication for their robustness, and underlines the importance of less stringent selection of regulated features, in order to avoid a large number of false negative results.

106

Singhal P, Jayaram B, Dixit SB, Beveridge DL. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J 2008;94:4173-83. [PMID: 18326660 PMCID: PMC2480686 DOI: 10.1529/biophysj.107.116392] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 06/29/2007] [Accepted: 11/29/2007] [Indexed: 01/27/2023]

107

Levasseur A, Pontarotti P, Poch O, Thompson JD. Strategies for reliable exploitation of evolutionary concepts in high throughput biology. Evol Bioinform Online 2008;4:121-37. [PMID: 19204813 PMCID: PMC2614184 DOI: 10.4137/ebo.s597] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/27/2022]

108

Jones SJM. Prediction of genomic functional elements. Annu Rev Genomics Hum Genet 2008;7:315-38. [PMID: 16824019 DOI: 10.1146/annurev.genom.7.080505.115745] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]

109

Mena-Chalco JP, Carrer H, Zana Y, Cesar RM. Identification of protein coding regions using the modified Gabor-wavelet transform. IEEE/ACM Trans Comput Biol Bioinform 2008;5:198-207. [PMID: 18451429 DOI: 10.1109/tcbb.2007.70259] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 05/26/2023]

110

Ryabov Y, Gribskov M. Spontaneous symmetry breaking in genome evolution. Nucleic Acids Res 2008;36:2756-63. [PMID: 18367477 PMCID: PMC2377439 DOI: 10.1093/nar/gkn086] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]

111

Ferro M, Tardif M, Reguer E, Cahuzac R, Bruley C, Vermat T, Nugues E, Vigouroux M, Vandenbrouck Y, Garin J, Viari A. PepLine: a software pipeline for high-throughput direct mapping of tandem mass spectrometry data on genomic sequences. J Proteome Res 2008;7:1873-83. [PMID: 18348511 DOI: 10.1021/pr070415k] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]

112

Abeel T, Saeys Y, Bonnet E, Rouzé P, Van de Peer Y. Generic eukaryotic core promoter prediction using structural features of DNA. Genome Res 2008;18:310-23. [PMID: 18096745 PMCID: PMC2203629 DOI: 10.1101/gr.6991408] [Citation(s) in RCA: 133] [Impact Index Per Article: 8.3] [Received: 08/03/2007] [Accepted: 11/14/2007] [Indexed: 11/24/2022]

113

Harbers M. The current status of cDNA cloning. Genomics 2008;91:232-42. [PMID: 18222633 DOI: 10.1016/j.ygeno.2007.11.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 08/17/2007] [Revised: 11/10/2007] [Accepted: 11/17/2007] [Indexed: 11/19/2022]

114

Gene Identification: Classical and Computational Intelligence Approaches. IEEE Trans Syst Man Cybern C Appl Rev 2008. [DOI: 10.1109/tsmcc.2007.906066] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022]

115

An artificial neural network method for combining gene prediction based on equitable weights. Neurocomputing 2008. [DOI: 10.1016/j.neucom.2007.07.019] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]

116

Markov G, Lecointre G, Demeneix B, Laudet V. The “street light syndrome”, or how protein taxonomy can bias experimental manipulations. Bioessays 2008;30:349-57. [DOI: 10.1002/bies.20730] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]

117

Dogan RI, Getoor L, Wilbur WJ, Mount SM. Features generated for computational splice-site prediction correspond to functional elements. BMC Bioinformatics 2007;8:410. [PMID: 17958908 PMCID: PMC2241647 DOI: 10.1186/1471-2105-8-410] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 03/19/2007] [Accepted: 10/24/2007] [Indexed: 11/16/2022]

118

Roos FF, Jacob R, Grossmann J, Fischer B, Buhmann JM, Gruissem W, Baginsky S, Widmayer P. PepSplice: cache-efficient search algorithms for comprehensive identification of tandem mass spectra. Bioinformatics 2007;23:3016-23. [PMID: 17768164 DOI: 10.1093/bioinformatics/btm417] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022]

119

Yin C, Yau SST. Prediction of protein coding regions by the 3-base periodicity analysis of a DNA sequence. J Theor Biol 2007;247:687-94. [PMID: 17509616 DOI: 10.1016/j.jtbi.2007.03.038] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.0] [Received: 10/22/2006] [Revised: 03/24/2007] [Accepted: 03/26/2007] [Indexed: 11/30/2022]

120

Ma BG. How to describe genes: Enlightenment from the quaternary number system. Biosystems 2007;90:20-7. [PMID: 16945479 DOI: 10.1016/j.biosystems.2006.06.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 05/13/2005] [Revised: 06/15/2006] [Accepted: 06/19/2006] [Indexed: 11/17/2022]

121

Andreini C, Banci L, Bertini I, Elmi S, Rosato A. Non-heme iron through the three domains of life. Proteins 2007;67:317-24. [PMID: 17286284 DOI: 10.1002/prot.21324] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Indexed: 11/08/2022]

122

Identification and characterization of insect-specific proteins by genome data analysis. BMC Genomics 2007;8:93. [PMID: 17407609 PMCID: PMC1852559 DOI: 10.1186/1471-2164-8-93] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Received: 10/18/2006] [Accepted: 04/04/2007] [Indexed: 11/10/2022]

Abstract

Background

Insects constitute the vast majority of known species and are important for biodiversity, agriculture, and human health. It is likely that the successful adaptation of the Insecta clade depends on specific components in its proteome that give rise to specialized features. However, proteome determination is an intensive undertaking. Here we present results from a computational method that uses genome analysis to characterize insect and eukaryote proteomes as an approximation complementary to experimental approaches.

Results

Homologs common to Drosophila melanogaster, Anopheles gambiae, Bombyx mori, Tribolium castaneum, and Apis mellifera were compared to the complete genomes of three non-insect eukaryotes (opisthokonts): Homo sapiens, Caenorhabditis elegans and Saccharomyces cerevisiae. This operation identified 154 groups of orthologous proteins in Drosophila as insect-specific homologs; 466 groups were determined to be common to eukaryotes (represented by three opisthokonts). ESTs from the hemimetabolous insect Locusta migratoria were also considered in order to approximate their corresponding genes in the insect-specific homologs. Stress and stimulus response proteins were found to constitute a higher fraction in the insect-specific homologs than in the homologs common to eukaryotes.

Conclusion

The significant representation of stress response and stimulus response proteins among the proteins determined to be insect-specific, along with specific cuticle and pheromone/odorant binding proteins, suggests that communication and adaptation to environments may distinguish insect evolution relative to other eukaryotes. The tendency for low Ka/Ks ratios in the insect-specific protein set suggests purifying selection pressure. The generally larger number of paralogs in the insect-specific proteins may indicate adaptation to environment changes. Some proteins in our insect-specific set have also been identified through experiments reported in the literature, which supports the accuracy of our approach.

123

Zhu W, Buell CR. Improvement of whole-genome annotation of cereals through comparative analyses. Genome Res 2007;17:299-310. [PMID: 17284677 PMCID: PMC1800921 DOI: 10.1101/gr.5881807] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/24/2022]

124

Bernal A, Crammer K, Hatzigeorgiou A, Pereira F. Global discriminative learning for higher-accuracy computational gene prediction. PLoS Comput Biol 2007;3:e54. [PMID: 17367206 PMCID: PMC1828702 DOI: 10.1371/journal.pcbi.0030054] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.5] [Received: 08/16/2006] [Accepted: 02/01/2007] [Indexed: 11/18/2022]

Abstract

Most ab initio gene predictors use a probabilistic sequence model, typically a hidden Markov model, to combine separately trained models of genomic signals and content. By combining separate models of relevant genomic features, such gene predictors can exploit small training sets and incomplete annotations, and can be trained fairly efficiently. However, that type of piecewise training does not optimize prediction accuracy and has difficulty in accounting for statistical dependencies among different parts of the gene model. With genomic information being created at an ever-increasing rate, it is worth investigating alternative approaches in which many different types of genomic evidence, with complex statistical dependencies, can be integrated by discriminative learning to maximize annotation accuracy. Among discriminative learning methods, large-margin classifiers have become prominent because of the success of support vector machines (SVM) in many classification tasks. We describe CRAIG, a new program for ab initio gene prediction based on a conditional random field model with semi-Markov structure that is trained with an online large-margin algorithm related to multiclass SVMs. Our experiments on benchmark vertebrate datasets and on regions from the ENCODE project show significant improvements in prediction accuracy over published gene predictors that use intrinsic features only, particularly at the gene level and on genes with long introns.

We describe a new approach to statistical learning for sequence data that is broadly applicable to computational biology problems and that has experimentally demonstrated advantages over current hidden Markov model (HMM)-based methods for sequence analysis. The methods we describe in this paper, implemented in the CRAIG program, allow researchers to modularly specify and train sequence analysis models that combine a wide range of weakly informative features into globally optimal predictions. Our results for the gene prediction problem show significant improvements over existing ab initio gene predictors on a variety of tests, including the especially challenging ENCODE regions. Such improved predictions, particularly on initial and single exons, could benefit researchers who are seeking more accurate means of recognizing such important features as signal peptides and regulatory regions. More generally, we believe that our method, by combining the structure-describing capabilities of HMMs with the accuracy of margin-based classification methods, provides a general tool for statistical learning in biological sequences that will replace HMMs in any sequence modeling task for which there is annotated training data.
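
As an illustration of the hidden Markov model decoding that this abstract contrasts with CRAIG, the following is a minimal sketch of Viterbi decoding over a two-state toy model. It is not CRAIG and not any published gene finder: the state names ("C" for coding-like, "N" for noncoding-like), all probabilities, and the function name `viterbi` are hypothetical choices made only for this example, and the input sequence is assumed to be non-empty.

```python
import math

def viterbi(seq, states, log_start, log_trans, log_emit):
    """Return the most probable state path for seq under an HMM (log-space)."""
    # score[s] = best log-probability of any path that ends in state s
    score = {s: log_start[s] + log_emit[s][seq[0]] for s in states}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for symbol in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + log_trans[p][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][symbol]
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical toy parameters: "C" favours G/C, "N" favours A/T.
states = ["C", "N"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {
    "C": {"C": math.log(0.9), "N": math.log(0.1)},
    "N": {"C": math.log(0.1), "N": math.log(0.9)},
}
log_emit = {
    "C": {b: math.log(0.4 if b in "GC" else 0.1) for b in "ACGT"},
    "N": {b: math.log(0.4 if b in "AT" else 0.1) for b in "ACGT"},
}

path = viterbi("ATATGCGCGC", states, log_start, log_trans, log_emit)
print("".join(path))
```

Real HMM gene finders use many more states (exons by phase, introns, splice signals) and trained parameters; the sketch only shows how separately specified signal and content scores are combined into one labeling.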

125

Danchin EGJ, Levasseur A, Rascol VL, Gouret P, Pontarotti P. The use of evolutionary biology concepts for genome annotation. J Exp Zool B Mol Dev Evol 2007;308:26-36. [PMID: 17016828 DOI: 10.1002/jez.b.21131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]

126

Savidor A, Donahoo RS, Hurtado-Gonzales O, Verberkmoes NC, Shah MB, Lamour KH, McDonald WH. Expressed peptide tags: an additional layer of data for genome annotation. J Proteome Res 2007;5:3048-58. [PMID: 17081056 DOI: 10.1021/pr060134x] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]

Abstract

While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identification of the coding sequences within the genomic milieu presents a tremendous challenge, especially for eukaryotes with their complex gene architectures. Here, we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six-frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well-annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed a high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well-annotated, were also subjected to this method. A series of algorithmic steps was taken to increase the confidence of EPT identification for these organisms, including generation of smaller subdatabases to be searched against, and definition of EPT criteria that accommodate the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While approximately 76% of Phytophthora EPTs supported the current annotation, a portion of them (7.7% and 12.9% for P. ramorum and P. sojae, respectively) suggested modification to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.
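
The six-frame translation mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses the standard genetic code only, the function names (`reverse_complement`, `translate`, `six_frame_translate`) and the frame labels `+1` to `-3` are hypothetical, and codons containing ambiguous bases are rendered as `X`.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for codons with ambiguous bases
        for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translate(dna):
    """Translate a DNA string in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame_translate("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

In a search database built this way, each frame would typically be split at stop codons into candidate peptides before matching against mass spectra; that step is omitted here.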

127

Saeys Y, Rouzé P, Van de Peer Y. In search of the small ones: improved prediction of short exons in vertebrates, plants, fungi and protists. Bioinformatics 2007;23:414-20. [PMID: 17204465 DOI: 10.1093/bioinformatics/btl639] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022]

128

Suen G, Arshinoff BI, Taylor RG, Welch RD. Practical Applications of Bacterial Functional Genomics. Biotechnol Genet Eng Rev 2007;24:213-42. [DOI: 10.1080/02648725.2007.10648101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/27/2022]

129

Gorantla M, Babu PR, Lachagari VBR, Reddy AMM, Wusirika R, Bennetzen JL, Reddy AR. Identification of stress-responsive genes in an indica rice (Oryza sativa L.) using ESTs generated from drought-stressed seedlings. J Exp Bot 2007;58:253-65. [PMID: 17132712 DOI: 10.1093/jxb/erl213] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.5] [Indexed: 05/12/2023]

130

Malousi A, Kouidou S, Maglaveras N. Detecting over-represented motifs in alternatively spliced exons using Gibbs sampling. Annu Int Conf IEEE Eng Med Biol Soc 2007;2007:139-142. [PMID: 18001908 DOI: 10.1109/iembs.2007.4352242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]

131

Shimizu K, Adachi J, Muraoka Y. ANGLE: a sequencing errors resistant program for predicting protein coding regions in unfinished cDNA. J Bioinform Comput Biol 2006;4:649-64. [PMID: 16960968 DOI: 10.1142/s0219720006002260] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.7] [Received: 05/23/2005] [Revised: 09/19/2005] [Accepted: 11/20/2005] [Indexed: 11/18/2022]

132

Knapp K, Chen YPP. An evaluation of contemporary hidden Markov model genefinders with a predicted exon taxonomy. Nucleic Acids Res 2006;35:317-24. [PMID: 17170005 PMCID: PMC1802560 DOI: 10.1093/nar/gkl1026] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 09/21/2006] [Revised: 11/13/2006] [Accepted: 11/13/2006] [Indexed: 11/15/2022]

133

Segovia-Juarez JL, Colombano S, Kirschner D. Identifying DNA splice sites using hypernetworks with artificial molecular evolution. Biosystems 2006;87:117-24. [PMID: 17116361 DOI: 10.1016/j.biosystems.2006.09.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 02/28/2005] [Revised: 07/08/2006] [Accepted: 07/15/2006] [Indexed: 11/28/2022]

134

Pareek A, Singh A, Kumar M, Kushwaha HR, Lynn AM, Singla-Pareek SL. Whole-genome analysis of Oryza sativa reveals similar architecture of two-component signaling machinery with Arabidopsis. Plant Physiol 2006;142:380-97. [PMID: 16891544 PMCID: PMC1586034 DOI: 10.1104/pp.106.086371] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.1] [Indexed: 05/11/2023]

135

Pighetti GM, Rambeaud M. Genome conservation between the bovine and human interleukin-8 receptor complex: improper annotation of bovine interleukin-8 receptor b identified. Vet Immunol Immunopathol 2006;114:335-40. [PMID: 16982101 DOI: 10.1016/j.vetimm.2006.08.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 05/01/2006] [Revised: 07/28/2006] [Accepted: 08/14/2006] [Indexed: 11/29/2022]

136

El-Mogharbel N, Wakefield M, Deakin JE, Tsend-Ayush E, Grützner F, Alsop A, Ezaz T, Marshall Graves JA. DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions. Genomics 2006;89:10-21. [PMID: 16962738 DOI: 10.1016/j.ygeno.2006.07.017] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 05/25/2006] [Revised: 07/31/2006] [Accepted: 07/31/2006] [Indexed: 10/24/2022]

137

Hsieh SJ, Lin CY, Liu NH, Chow WY, Tang CY. GeneAlign: a coding exon prediction tool based on phylogenetical comparisons. Nucleic Acids Res 2006;34:W280-4. [PMID: 16845010 PMCID: PMC1538901 DOI: 10.1093/nar/gkl307] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/26/2022]

138

Marashi SA, Eslahchi C, Pezeshk H, Sadeghi M. Impact of RNA structure on the prediction of donor and acceptor splice sites. BMC Bioinformatics 2006;7:297. [PMID: 16772025 PMCID: PMC1526458 DOI: 10.1186/1471-2105-7-297] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 01/10/2006] [Accepted: 06/13/2006] [Indexed: 11/10/2022]

139

Alonso JM, Ecker JR. Moving forward in reverse: genetic technologies to enable genome-wide phenomic screens in Arabidopsis. Nat Rev Genet 2006;7:524-36. [PMID: 16755288 DOI: 10.1038/nrg1893] [Citation(s) in RCA: 186] [Impact Index Per Article: 10.3] [Indexed: 11/08/2022]

140

Mazzarelli JM, White P, Gorski R, Brestelli J, Pinney DF, Arsenlis A, Katokhin A, Belova O, Bogdanova V, Elisafenko E, Gubina M, Nizolenko L, Perelman P, Puzakov M, Shilov A, Trifonoff V, Vorobjeva N, Kolchanov N, Kaestner KH, Stoeckert CJ. Novel genes identified by manual annotation and microarray expression analysis in the pancreas. Genomics 2006;88:752-761. [PMID: 16725306 DOI: 10.1016/j.ygeno.2006.04.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 04/12/2006] [Accepted: 04/14/2006] [Indexed: 10/24/2022]

Affiliation(s)

Joan M Mazzarelli: Center for Bioinformatics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Peter White: Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Regina Gorski: Center for Bioinformatics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
John Brestelli: Center for Bioinformatics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Deborah F Pinney: Center for Bioinformatics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Athanasios Arsenlis: Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Alexey Katokhin: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Olga Belova: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Vera Bogdanova: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Eugenij Elisafenko: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Marina Gubina: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Lilia Nizolenko: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Polina Perelman: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Mikhail Puzakov: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Alexandre Shilov: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Vladimir Trifonoff: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Nadezhda Vorobjeva: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Nikolay Kolchanov: Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Klaus H Kaestner: Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Christian J Stoeckert: Center for Bioinformatics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Genetics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA

141

Dana AN, Hillenmeyer ME, Lobo NF, Kern MK, Romans PA, Collins FH. Differential gene expression in abdomens of the malaria vector mosquito, Anopheles gambiae, after sugar feeding, blood feeding and Plasmodium berghei infection. BMC Genomics 2006;7:119. [PMID: 16712725 PMCID: PMC1508153 DOI: 10.1186/1471-2164-7-119] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Received: 10/17/2005] [Accepted: 05/19/2006] [Indexed: 11/10/2022]

142

Ge B, Gurd S, Gaudin T, Dore C, Lepage P, Harmsen E, Hudson TJ, Pastinen T. Survey of allelic expression using EST mining. Genome Res 2006;15:1584-91. [PMID: 16251468 PMCID: PMC1310646 DOI: 10.1101/gr.4023805] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.4] [Indexed: 11/25/2022]

143

Horellou MH, Chevreaud C, Mathieux V, Conard J, de Mazancourt P. Fibrinogen Paris IX: a case of symptomatic hypofibrinogenemia with Bbeta Y236C and Bbeta IVS7-1G-->C mutations. J Thromb Haemost 2006;4:1134-6. [PMID: 16689768 DOI: 10.1111/j.1538-7836.2006.01881.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]

144

Dutta S, Singhal P, Agrawal P, Tomer R, Kritee K, Khurana E, Jayaram B. A physicochemical model for analyzing DNA sequences. J Chem Inf Model 2006;46:78-85. [PMID: 16426042 DOI: 10.1021/ci050119x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022]

145

Agrawal R, Stormo GD. Using mRNAs lengths to accurately predict the alternatively spliced gene products in Caenorhabditis elegans. Bioinformatics 2006;22:1239-44. [PMID: 16595562 DOI: 10.1093/bioinformatics/btl076] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open

146

Andreini C, Banci L, Bertini I, Rosato A. Counting the zinc-proteins encoded in the human genome. J Proteome Res 2006;5:196-201. [PMID: 16396512 DOI: 10.1021/pr050361j] [Citation(s) in RCA: 702] [Impact Index Per Article: 39.0] [Indexed: 01/20/2023]

Abstract

Metalloproteins are proteins capable of binding one or more metal ions, which may be required for their biological function, for regulation of their activities, or for structural purposes. Genome sequencing projects have provided a huge number of protein primary sequences, but, even though several different elaborate analyses and annotations have been enabled by a rich and ever-increasing portfolio of bioinformatic tools, metal-binding properties remain difficult to predict as well as to investigate experimentally. Consequently, the present knowledge about metalloproteins is only partial. The present bioinformatic research proposes a strategy to answer the question of how many and which proteins encoded in the human genome may require zinc for their physiological function. This is achieved by a combination of approaches, which include: (i) searching in the proteome for the zinc-binding patterns that, in turn, are obtained from all available X-ray data; (ii) using libraries of metal-binding protein domains based on multiple sequence alignments of known metalloproteins obtained from the Pfam database; and (iii) mining the annotations of human gene sequences, which are based on any type of information available. It is found that 1684 proteins in the human proteome are independently identified by all three approaches as zinc-proteins, 746 are identified by two, and 777 are identified by only one method. By assuming that all proteins identified by at least two approaches are truly zinc-binding and inspecting the proteins identified by a single method, it can be proposed that ca. 2800 human proteins are potentially zinc-binding in vivo, corresponding to 10% of the human proteome, with an uncertainty of 400 sequences. Available functional information suggests that the large majority of human zinc-binding proteins are involved in the regulation of gene expression. The most abundant class of zinc-binding proteins in humans is that of zinc-fingers, with Cys4 and Cys2His2 being the most common types of coordination environment.

147

Shafer P, Lin DM, Yona G. EST2Prot: mapping EST sequences to proteins. BMC Genomics 2006;7:41. [PMID: 16515706 PMCID: PMC1456965 DOI: 10.1186/1471-2164-7-41] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/26/2005] [Accepted: 03/04/2006] [Indexed: 11/12/2022] Open

148

Larrañaga P, Calvo B, Santana R, Bielza C, Galdiano J, Inza I, Lozano JA, Armañanzas R, Santafé G, Pérez A, Robles V. Machine learning in bioinformatics. Brief Bioinform 2006;7:86-112. [PMID: 16761367 DOI: 10.1093/bib/bbk007] [Citation(s) in RCA: 360] [Impact Index Per Article: 20.0] [Indexed: 11/13/2022] Open

149

Ko P, Narayanan M, Kalyanaraman A, Aluru S. Space-conserving optimal DNA-protein alignment. PROCEEDINGS. IEEE COMPUTATIONAL SYSTEMS BIOINFORMATICS CONFERENCE 2006:80-8. [PMID: 16448002 DOI: 10.1109/csb.2004.1332420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]

150

Marashi SA, Goodarzi H, Sadeghi M, Eslahchi C, Pezeshk H. Importance of RNA secondary structure information for yeast donor and acceptor splice site predictions by neural networks. Comput Biol Chem 2005;30:50-7. [PMID: 16386465 DOI: 10.1016/j.compbiolchem.2005.10.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 09/05/2005] [Revised: 10/19/2005] [Accepted: 10/19/2005] [Indexed: 10/25/2022]