101
Stockinger H, Attwood T, Chohan SN, Côté R, Cudré-Mauroux P, Falquet L, Fernandes P, Finn RD, Hupponen T, Korpelainen E, Labarga A, Laugraud A, Lima T, Pafilis E, Pagni M, Pettifer S, Phan I, Rahman N. Experience using web services for biological sequence analysis. Brief Bioinform 2008; 9:493-505. [PMID: 18621748 DOI: 10.1093/bib/bbn029] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 01/17/2023] Open
Abstract
Programmatic access to data and tools through the web using so-called web services has an important role to play in bioinformatics. In this article, we discuss the most popular approaches based on SOAP/WS-I and REST and describe our experiences, as a cross section of the community, with providing and using web services in the context of biological sequence analysis. We briefly review the main technological approaches as well as best-practice hints that are useful for both users and developers. Finally, syntactic and semantic data integration issues with multiple web services are discussed.
Affiliation(s)
- Heinz Stockinger
- Swiss Institute of Bioinformatics, Vital-IT Group, Lausanne, Switzerland.
102
Chica C, Labarga A, Gould CM, López R, Gibson TJ. A tree-based conservation scoring method for short linear motifs in multiple alignments of protein sequences. BMC Bioinformatics 2008; 9:229. [PMID: 18460207 PMCID: PMC2396637 DOI: 10.1186/1471-2105-9-229] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Received: 11/14/2007] [Accepted: 05/06/2008] [Indexed: 12/16/2022] Open
Abstract
Background: The structure of many eukaryotic cell regulatory proteins is highly modular. They are assembled from globular domains, segments of natively disordered polypeptides and short linear motifs. The latter are involved in protein interactions and formation of regulatory complexes. The function of such proteins, which may be difficult to define, is the aggregate of the subfunctions of the modules. It is therefore desirable to efficiently predict linear motifs with some degree of accuracy, yet sequence database searches return results that are not significant.
Results: We have developed a method for scoring the conservation of linear motif instances. It requires only primary sequence-derived information (e.g. multiple alignment and sequence tree) and takes into account the degenerate nature of linear motif patterns. On our benchmarking, the method accurately scores 86% of the known positive instances, while distinguishing them from random matches in 78% of the cases. The conservation score is implemented as a real time application designed to be integrated into other tools. It is currently accessible via a Web Service or through a graphical interface.
Conclusion: The conservation score improves the prediction of linear motifs, by discarding those matches that are unlikely to be functional because they have not been conserved during the evolution of the protein sequences. It is especially useful for instances in non-structured regions of the proteins, where a domain masking filtering strategy is not applicable.
Affiliation(s)
- Claudia Chica
- EMBL Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.
103
Götz S, García-Gómez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ, Robles M, Talón M, Dopazo J, Conesa A. High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res 2008; 36:3420-35. [PMID: 18445632 PMCID: PMC2425479 DOI: 10.1093/nar/gkn176] [Citation(s) in RCA: 2908] [Impact Index Per Article: 181.8] [Indexed: 12/14/2022] Open
Abstract
Functional genomics technologies have been widely adopted in the biological research of both model and non-model species. An efficient functional annotation of DNA or protein sequences is a major requirement for the successful application of these approaches as functional information on gene products is often the key to the interpretation of experimental results. Therefore, there is an increasing need for bioinformatics resources which are able to cope with large amounts of sequence data, produce valuable annotation results and are easily accessible to laboratories where functional genomics projects are being undertaken. We present the Blast2GO suite as an integrated and biologist-oriented solution for the high-throughput and automatic functional annotation of DNA or protein sequences based on the Gene Ontology vocabulary. The most outstanding Blast2GO features are: (i) the combination of various annotation strategies and tools controlling type and intensity of annotation, (ii) the numerous graphical features such as the interactive GO-graph visualization for gene-set function profiling or descriptive charts, (iii) the general sequence management features and (iv) high-throughput capabilities. We used the Blast2GO framework to carry out a detailed analysis of annotation behaviour through homology transfer and its impact on functional genomics research. Our aim is to offer biologists useful information to take into account when addressing the task of functionally characterizing their sequence data.
Affiliation(s)
- Stefan Götz
- Bioinformatics Department, Centro de Investigación Principe Felipe, Valencia, Spain
104
Patient S, Wieser D, Kleen M, Kretschmann E, Jesus Martin M, Apweiler R. UniProtJAPI: a remote API for accessing UniProt data. Bioinformatics 2008; 24:1321-2. [PMID: 18390879 DOI: 10.1093/bioinformatics/btn122] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 01/12/2023] Open
Abstract
Programmatic access to the UniProt Knowledgebase (UniProtKB) is essential for many bioinformatics applications dealing with protein data. We have created a Java library named UniProtJAPI, which facilitates the integration of UniProt data into Java-based software applications. The library supports queries and similarity searches that return UniProtKB entries in the form of Java objects. These objects contain functional annotations or sequence information associated with a UniProt entry. Here, we briefly describe the UniProtJAPI and demonstrate its usage.
Affiliation(s)
- Samuel Patient
- The European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.
105
Fine-tuning of galactoglucan biosynthesis in Sinorhizobium meliloti by differential WggR (ExpG)-, PhoB-, and MucR-dependent regulation of two promoters. J Bacteriol 2008; 190:3456-66. [PMID: 18344362 DOI: 10.1128/jb.00062-08] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 11/20/2022] Open
Abstract
Depending on the phosphate concentration encountered in the environment Sinorhizobium meliloti 2011 synthesizes two different exopolysaccharides (EPS). Galactoglucan (EPS II) is produced under phosphate starvation but also in the presence of extra copies of the transcriptional regulator WggR (ExpG) or as a consequence of a mutation in mucR. The galactoglucan biosynthesis gene cluster contains the operons wga (expA), wge (expE), wgd (expD), and wggR (expG). Two promoters, differentially controlled by WggR, PhoB, and MucR, were identified upstream of each of these operons. The proximal promoters of the wga, wge, and wgd transcription units were constitutively active when separated from the upstream regulatory sequences. Promoter activity studies and the positions of predicted PhoB and WggR binding sites suggested that the proximal promoters are cooperatively induced by PhoB and WggR. MucR was shown to strongly inhibit the distal promoters and bound to the DNA in the vicinity of the distal transcription start sites. An additional inhibitory effect on the distal promoter of the structural galactoglucan biosynthesis genes was identified as a new feature of WggR in a mucR mutant. A regulatory model of the fine-tuning of galactoglucan production is proposed.
106
Hammami R, Zouhir A, Naghmouchi K, Ben Hamida J, Fliss I. SciDBMaker: new software for computer-aided design of specialized biological databases. BMC Bioinformatics 2008; 9:121. [PMID: 18298861 PMCID: PMC2267701 DOI: 10.1186/1471-2105-9-121] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 12/06/2007] [Accepted: 02/25/2008] [Indexed: 12/22/2022] Open
Abstract
Background: The exponential growth of research in molecular biology has brought concomitant proliferation of databases for storing its findings. A variety of protein sequence databases exist. While all of these strive for completeness, the range of user interests is often beyond their scope. Large databases covering a broad range of domains tend to offer less detailed information than smaller, more specialized resources, often creating a need to combine data from many sources in order to obtain a complete picture. Scientific researchers are continually developing new specific databases to enhance their understanding of biological processes.
Description: In this article, we present the implementation of a new tool for protein data analysis. With its easy-to-use user interface, this software provides the opportunity to build more specialized protein databases from a universal protein sequence database such as Swiss-Prot. A family of proteins known as bacteriocins is analyzed as 'proof of concept'.
Conclusion: SciDBMaker is stand-alone software that allows the extraction of protein data from the Swiss-Prot database, sequence analysis comprising physicochemical profile calculations, homologous sequences search, multiple sequence alignments and the building of new and more specialized databases. It compiles information with relative ease, updates and compares various data relevant to a given protein family and could solve the problem of dispersed biological search results.
Affiliation(s)
- Riadh Hammami
- Unité de Protéomie Fonctionnelle & Biopréservation Alimentaire, Institut Supérieur des Sciences Biologiques Appliquées de Tunis, Université El Manar, Tunisie.
107
Chalmel F, Primig M. The Annotation, Mapping, Expression and Network (AMEN) suite of tools for molecular systems biology. BMC Bioinformatics 2008; 9:86. [PMID: 18254954 PMCID: PMC2375118 DOI: 10.1186/1471-2105-9-86] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.1] [Received: 10/12/2007] [Accepted: 02/06/2008] [Indexed: 11/10/2022] Open
Abstract
Background: High-throughput genome biological experiments yield large and multifaceted datasets that require flexible and user-friendly analysis tools to facilitate their interpretation by life scientists. Many solutions currently exist, but they are often limited to specific steps in the complex process of data management and analysis and some require extensive informatics skills to be installed and run efficiently.
Results: We developed the Annotation, Mapping, Expression and Network (AMEN) software as a stand-alone, unified suite of tools that enables biological and medical researchers with basic bioinformatics training to manage and explore genome annotation, chromosomal mapping, protein-protein interaction, expression profiling and proteomics data. The current version provides modules for (i) uploading and pre-processing data from microarray expression profiling experiments, (ii) detecting groups of significantly co-expressed genes, and (iii) searching for enrichment of functional annotations within those groups. Moreover, the user interface is designed to simultaneously visualize several types of data such as protein-protein interaction networks in conjunction with expression profiles and cellular co-localization patterns. We have successfully applied the program to interpret expression profiling data from budding yeast, rodents and human.
Conclusion: AMEN is an innovative solution for molecular systems biological data analysis freely available under the GNU license. The program is available via a website at the Sourceforge portal which includes a user guide with concrete examples, links to external databases and helpful comments to implement additional functionalities. We emphasize that AMEN will continue to be developed and maintained by our laboratory because it has proven to be extremely useful for our genome biological research program.
Affiliation(s)
- Frédéric Chalmel
- Institut National de la Santé et de la Recherche Médicale Unité 625, Groupe d'Etude de la Reproduction chez l'Homme et les Mammifères, Institut Fédératif de Recherche 140, F-35042 Rennes, France.
108
Conesa A, Götz S. Blast2GO: A Comprehensive Suite for Functional Analysis in Plant Genomics. International Journal of Plant Genomics 2008; 2008:619832. [PMID: 18483572 PMCID: PMC2375974 DOI: 10.1155/2008/619832] [Citation(s) in RCA: 1344] [Impact Index Per Article: 84.0] [Received: 10/05/2007] [Accepted: 11/26/2007] [Indexed: 05/09/2023]
Abstract
Functional annotation of novel sequence data is a primary requirement for the utilization of functional genomics approaches in plant research. In this paper, we describe the Blast2GO suite as a comprehensive bioinformatics tool for functional annotation of sequences and data mining on the resulting annotations, primarily based on the gene ontology (GO) vocabulary. Blast2GO optimizes function transfer from homologous sequences through an elaborate algorithm that considers similarity, the extension of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. The tool includes numerous functions for the visualization, management, and statistical analysis of annotation results, including gene set enrichment analysis. The application supports InterPro, enzyme codes, KEGG pathways, GO directed acyclic graphs (DAGs), and GOSlim. Blast2GO is a suitable tool for plant genomics research because of its versatility, easy installation, and friendly use.
Affiliation(s)
- Ana Conesa
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, 4012 Valencia, Spain
- *Ana Conesa:
- Stefan Götz
- Bioinformatics Department, Centro de Investigación Príncipe Felipe, 4012 Valencia, Spain
109
Schriek S, Rückert C, Staiger D, Pistorius EK, Michel KP. Bioinformatic evaluation of L-arginine catabolic pathways in 24 cyanobacteria and transcriptional analysis of genes encoding enzymes of L-arginine catabolism in the cyanobacterium Synechocystis sp. PCC 6803. BMC Genomics 2007; 8:437. [PMID: 18045455 PMCID: PMC2242806 DOI: 10.1186/1471-2164-8-437] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 02/06/2007] [Accepted: 11/28/2007] [Indexed: 11/10/2022] Open
Abstract
Background: So far very limited knowledge exists on L-arginine catabolism in cyanobacteria, although six major L-arginine-degrading pathways have been described for prokaryotes. Thus, we have performed a bioinformatic analysis of possible L-arginine-degrading pathways in cyanobacteria. Further, we chose Synechocystis sp. PCC 6803 for a more detailed bioinformatic analysis and for validation of the bioinformatic predictions on L-arginine catabolism with a transcript analysis.
Results: We have evaluated 24 cyanobacterial genomes of freshwater or marine strains for the presence of putative L-arginine-degrading enzymes. We identified an L-arginine decarboxylase pathway in all 24 strains. In addition, cyanobacteria have one or two further pathways representing either an arginase pathway or L-arginine deiminase pathway or an L-arginine oxidase/dehydrogenase pathway. An L-arginine amidinotransferase pathway as a major L-arginine-degrading pathway is not likely but cannot be entirely excluded. A rather unusual finding was that the cyanobacterial L-arginine deiminases are substantially larger than the enzymes in non-photosynthetic bacteria and that they are membrane-bound. A more detailed bioinformatic analysis of Synechocystis sp. PCC 6803 revealed that three different L-arginine-degrading pathways may in principle be functional in this cyanobacterium. These are (i) an L-arginine decarboxylase pathway, (ii) an L-arginine deiminase pathway, and (iii) an L-arginine oxidase/dehydrogenase pathway. A transcript analysis of cells grown either with nitrate or L-arginine as sole N-source and with an illumination of 50 µmol photons m⁻² s⁻¹ showed that the transcripts for the first enzyme(s) of all three pathways were present, but that the transcript levels for the L-arginine deiminase and the L-arginine oxidase/dehydrogenase were substantially higher than those of the three isoenzymes of L-arginine decarboxylase.
Conclusion: The evaluation of 24 cyanobacterial genomes revealed that five different L-arginine-degrading pathways are present in the investigated cyanobacterial species. In Synechocystis sp. PCC 6803 an L-arginine deiminase pathway and an L-arginine oxidase/dehydrogenase pathway represent the major pathways, while the L-arginine decarboxylase pathway most likely only functions in polyamine biosynthesis. The transcripts encoding the enzymes of the two major pathways were constitutively expressed with the exception of the transcript for the carbamate kinase, which was substantially up-regulated in cells grown with L-arginine.
Affiliation(s)
- Sarah Schriek
- Lehrstuhl für Molekulare Zellphysiologie, Universität Bielefeld, Universitätsstr. 25, D-33615 Bielefeld, Germany.
110
Bryne JC, Valen E, Tang MHE, Marstrand T, Winther O, da Piedade I, Krogh A, Lenhard B, Sandelin A. JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Res 2007; 36:D102-6. [PMID: 18006571 PMCID: PMC2238834 DOI: 10.1093/nar/gkm955] [Citation(s) in RCA: 517] [Impact Index Per Article: 30.4] [Indexed: 12/22/2022] Open
Abstract
JASPAR is a popular open-access database for matrix models describing DNA-binding preferences for transcription factors and other DNA patterns. With its third major release, JASPAR has been expanded and equipped with additional functions aimed at both casual and power users. The heart of the JASPAR database, the JASPAR CORE sub-database, has increased by 12% in size, and three new specialized sub-databases have been added. New functions include clustering of matrix models by similarity, generation of random matrices by sampling from selected sets of existing models and a language-independent Web Service applications programming interface for matrix retrieval. JASPAR is available at http://jaspar.genereg.net.
Affiliation(s)
- Jan Christian Bryne
- Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, Thormøhlensgate 55, N-5008 Bergen, Norway
111
Stary A, Suwattanasophon C, Wolschann P, Buchbauer G. Differences in (-)citronellal binding to various odorant receptors. Biochem Biophys Res Commun 2007; 361:941-5. [PMID: 17681278 DOI: 10.1016/j.bbrc.2007.07.137] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 07/17/2007] [Accepted: 07/19/2007] [Indexed: 10/23/2022]
Abstract
To test the hypothesis that olfactory receptors (ORs) recognize different molecular features of odor molecules termed "odotypes", we studied receptor-ligand interactions of two human and two mouse ORs, recognizing (-)citronellal. Structurally similar receptors provide identical binding pockets (OLFR43, OR1A1, and OR1A2), and have comparable EC50 values. Other ORs with lower sequence identity bind (-)citronellal in a different way, leading to different EC50 values.
Affiliation(s)
- Anna Stary
- Institute for Theoretical Chemistry, University of Vienna, Waehringer Strasse 17, A-1090 Vienna, Austria.
112
Fox JA, McMillan S, Ouellette BFF. Conducting research on the web: 2007 update for the bioinformatics links directory. Nucleic Acids Res 2007; 35:W3-5. [PMID: 17586821 PMCID: PMC1933129 DOI: 10.1093/nar/gkm459] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
The Bioinformatics Links Directory, http://bioinformatics.ca/links_directory, is an actively maintained compilation of servers published in this and previous issues of Nucleic Acids Research, together with many other useful tools, databases and resources for life sciences research. The 2007 update includes the 130 websites highlighted in the July 2007 Web Server issue of Nucleic Acids Research and brings the total number of servers listed in the Bioinformatics Links Directory to just under 1200 links. In addition to the updated content, the 2007 update of the Bioinformatics Links Directory includes new features for improved navigation, accessibility and open data exchange. A complete listing of all links in this Nucleic Acids Research 2007 Web Server issue can be accessed online at http://bioinformatics.ca/links_directory/narweb2007. The 2007 update of the Bioinformatics Links Directory, which includes the Web Server list and summaries, is also available online at the Nucleic Acids Research web site, http://nar.oupjournals.org.