601
Weniger M, Engelmann JC, Schultz J. Genome Expression Pathway Analysis Tool--analysis and visualization of microarray gene expression data under genomic, proteomic and metabolic context. BMC Bioinformatics 2007; 8:179. [PMID: 17543125 PMCID: PMC1896182 DOI: 10.1186/1471-2105-8-179] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 02/20/2007] [Accepted: 06/02/2007] [Indexed: 12/31/2022] Open
Abstract
Background Regulation of gene expression is relevant to many areas of biology and medicine, in the study of treatments, diseases, and developmental stages. Microarrays can be used to measure the expression level of thousands of mRNAs at the same time, allowing insight into or comparison of different cellular conditions. The data derived from microarray experiments are high-dimensional and often noisy, and interpreting the results can be intricate. Although programs for the statistical analysis of microarray data exist, most of them lack an integration of analysis results and biological interpretation.
Results We have developed GEPAT, Genome Expression Pathway Analysis Tool, offering an analysis of gene expression data under genomic, proteomic and metabolic context. We provide an integration of statistical methods for data import and data analysis together with a biological interpretation for subsets of probes or single probes on the chip. GEPAT imports various types of oligonucleotide and cDNA array data formats. Different normalization methods can be applied to the data; afterwards, data annotation is performed. After import, GEPAT offers various statistical data analysis methods, such as hierarchical, k-means and PCA clustering, a linear model based t-test, or chromosomal profile comparison. The results of the analysis can be interpreted by enrichment of biological terms, pathway analysis or interaction networks. Different biological databases are included to provide a range of information for each probe on the chip. GEPAT does not impose a linear workflow, but allows any subset of probes and samples to be used as the start of a new data analysis. GEPAT relies on established data analysis packages, offers a modular approach for easy extension, and can be run on a computer grid to serve a large number of users. It is freely available under the LGPL open source license for academic and commercial users at .
Conclusion GEPAT is a modular, scalable and professional-grade software integrating analysis and interpretation of microarray gene expression data. An installation available for academic users can be found at .
Affiliation(s)
- Markus Weniger
- Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
- Julia C Engelmann
- Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
- Jörg Schultz
- Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
602
Pagni M, Ioannidis V, Cerutti L, Zahn-Zabal M, Jongeneel CV, Hau J, Martin O, Kuznetsov D, Falquet L. MyHits: improvements to an interactive resource for analyzing protein sequences. Nucleic Acids Res 2007; 35:W433-7. [PMID: 17545200 PMCID: PMC1933190 DOI: 10.1093/nar/gkm352] [Citation(s) in RCA: 157] [Impact Index Per Article: 8.7] [Indexed: 11/13/2022] Open
Abstract
The MyHits web site (http://myhits.isb-sib.ch) is an integrated service dedicated to the analysis of protein sequences. Since its first description in 2004, both the user interface and the back end of the server were improved. A number of tools (e.g. MAFFT, Jacop, Dotlet, Jalview, ESTScan) were added or updated to improve the usability of the service. The MySQL schema and its associated API were revamped and the database engine (HitKeeper) was separated from the web interface. This paper summarizes the current status of the server, with an emphasis on the new services.
Affiliation(s)
- Marco Pagni
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- *To whom correspondence should be addressed. +41-21-692-40-38, +41-21-692-40-65
- Vassilios Ioannidis
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- Lorenzo Cerutti
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- Monique Zahn-Zabal
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- C. Victor Jongeneel
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- Jörg Hau
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- Olivier Martin
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- Dmitri Kuznetsov
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
- Laurent Falquet
- Swiss Institute of Bioinformatics (SIB), Vital-IT Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), EMBnet Group, UNIL-Génopode, CH-1015 Lausanne, Swiss Institute of Bioinformatics (SIB), Swiss-Prot Group, UNIGE-CMU, CH-1211 Genève 4, Ludwig Institute for Cancer Research, UNIL-Génopode, CH-1015 Lausanne and Nestlé Research Center, Department of BioAnalytical Science, PO Box 44, CH-1000 Lausanne 26, Switzerland
603
Khodiyar VK, Maltais LJ, Ruef BJ, Sneddon KMB, Smith JR, Shimoyama M, Cabral F, Dumontet C, Dutcher SK, Harvey RJ, Lafanechère L, Murray JM, Nogales E, Piquemal D, Stanchi F, Povey S, Lovering RC. A revised nomenclature for the human and rodent alpha-tubulin gene family. Genomics 2007; 90:285-9. [PMID: 17543498 DOI: 10.1016/j.ygeno.2007.04.008] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Received: 03/23/2007] [Revised: 04/20/2007] [Accepted: 04/24/2007] [Indexed: 01/21/2023]
Abstract
An essential component of microtubules, alpha-tubulin is also a multigene family in many species. An orthology-based nomenclature for this gene family has previously been difficult to assign due to incomplete genome builds and the high degree of sequence similarity between members of this family. Using the current genome builds, sequence analysis of human, mouse, and rat alpha-tubulin genes has enabled an updated nomenclature to be generated. This revised nomenclature provides a unified language for the discussion of these genes in mammalian species; it has been approved by the gene nomenclature committees of the three species and is supported by researchers in the field.
Affiliation(s)
- Varsha K Khodiyar
- HUGO Gene Nomenclature Committee, Department of Biology, University College London, Wolfson House, 4 Stephenson Way, London NW1 2HE, UK.
604
Cases I, Pisano DG, Andres E, Carro A, Fernández JM, Gómez-López G, Rodriguez JM, Vera JF, Valencia A, Rojas AM. CARGO: a web portal to integrate customized biological information. Nucleic Acids Res 2007; 35:W16-20. [PMID: 17483515 PMCID: PMC1933121 DOI: 10.1093/nar/gkm280] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
There is a huge quantity of information generated in Life Sciences, and it is dispersed in many databases and repositories. Despite the broad availability of the information, there is a great demand for methods that are able to look for, gather and display distributed data in a standardized and friendly way. CARGO (Cancer And Related Genes Online) is a configurable biological web portal designed as a tool to facilitate, integrate and visualize results from Internet resources, independently of their native format or access method. Through the use of small agents, called widgets, supported by a Rich Internet Application (RIA) paradigm based on AJAX, CARGO provides pieces of minimal, relevant and descriptive biological information. The tool is designed to be used by experimental biologists with no training in bioinformatics. In the current state, the system presents a list of human cancer genes. Available at http://cargo.bioinfo.cnio.es.
Affiliation(s)
- Ildefonso Cases
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- David G. Pisano
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- *To whom correspondence should be addressed: Tel.: +34-91-224-6900, +34-91-224-8006
- Eduardo Andres
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- Angel Carro
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- José M. Fernández
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- Gonzalo Gómez-López
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- Jose M. Rodriguez
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- Jaime F. Vera
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- Alfonso Valencia
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
- Ana M. Rojas
- SCOMPBio Group, Bioinformatics Unit (UBio), Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain and Spanish National Institute for Bioinformatics (INB), Spain
605
Gibbs RA, Rogers J, Katze MG, Bumgarner R, Weinstock GM, Mardis ER, Remington KA, Strausberg RL, Venter JC, Wilson RK, Batzer MA, Bustamante CD, Eichler EE, Hahn MW, Hardison RC, Makova KD, Miller W, Milosavljevic A, Palermo RE, Siepel A, Sikela JM, Attaway T, Bell S, Bernard KE, Buhay CJ, Chandrabose MN, Dao M, Davis C, Delehaunty KD, Ding Y, Dinh HH, Dugan-Rocha S, Fulton LA, Gabisi RA, Garner TT, Godfrey J, Hawes AC, Hernandez J, Hines S, Holder M, Hume J, Jhangiani SN, Joshi V, Khan ZM, Kirkness EF, Cree A, Fowler RG, Lee S, Lewis LR, Li Z, Liu YS, Moore SM, Muzny D, Nazareth LV, Ngo DN, Okwuonu GO, Pai G, Parker D, Paul HA, Pfannkoch C, Pohl CS, Rogers YH, Ruiz SJ, Sabo A, Santibanez J, Schneider BW, Smith SM, Sodergren E, Svatek AF, Utterback TR, Vattathil S, Warren W, White CS, Chinwalla AT, Feng Y, Halpern AL, Hillier LW, Huang X, Minx P, Nelson JO, Pepin KH, Qin X, Sutton GG, Venter E, Walenz BP, Wallis JW, Worley KC, Yang SP, Jones SM, Marra MA, Rocchi M, Schein JE, Baertsch R, Clarke L, Csürös M, Glasscock J, Harris RA, Havlak P, Jackson AR, Jiang H, Liu Y, Messina DN, Shen Y, Song HXZ, Wylie T, Zhang L, Birney E, Han K, Konkel MK, Lee J, Smit AFA, Ullmer B, Wang H, Xing J, Burhans R, Cheng Z, Karro JE, Ma J, Raney B, She X, Cox MJ, Demuth JP, Dumas LJ, Han SG, Hopkins J, Karimpour-Fard A, Kim YH, Pollack JR, Vinar T, Addo-Quaye C, Degenhardt J, Denby A, Hubisz MJ, Indap A, Kosiol C, Lahn BT, Lawson HA, Marklein A, Nielsen R, Vallender EJ, Clark AG, Ferguson B, Hernandez RD, Hirani K, Kehrer-Sawatzki H, Kolb J, Patil S, Pu LL, Ren Y, Smith DG, Wheeler DA, Schenck I, Ball EV, Chen R, Cooper DN, Giardine B, Hsu F, Kent WJ, Lesk A, Nelson DL, O'brien WE, Prüfer K, Stenson PD, Wallace JC, Ke H, Liu XM, Wang P, Xiang AP, Yang F, Barber GP, Haussler D, Karolchik D, Kern AD, Kuhn RM, Smith KE, Zwieg AS. Evolutionary and biomedical insights from the rhesus macaque genome. Science 2007; 316:222-34. [PMID: 17431167 DOI: 10.1126/science.1139247] [Citation(s) in RCA: 1023] [Impact Index Per Article: 56.8] [Indexed: 12/12/2022]
Abstract
The rhesus macaque (Macaca mulatta) is an abundant primate species that diverged from the ancestors of Homo sapiens about 25 million years ago. Because they are genetically and physiologically similar to humans, rhesus monkeys are the most widely used nonhuman primate in basic and applied biomedical research. We determined the genome sequence of an Indian-origin Macaca mulatta female and compared the data with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families. A comparison of sequences from individual animals was used to investigate their underlying genetic diversity. The complete description of the macaque genome blueprint enhances the utility of this animal model for biomedical research and improves our understanding of the basic biology of the species.
606
van Deursen D, Botma GJ, Jansen H, Verhoeven AJM. Comparative genomics and experimental promoter analysis reveal functional liver-specific elements in mammalian hepatic lipase genes. BMC Genomics 2007; 8:99. [PMID: 17428321 PMCID: PMC1853088 DOI: 10.1186/1471-2164-8-99] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 01/23/2007] [Accepted: 04/11/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Mammalian hepatic lipase (HL) genes are transcribed almost exclusively in hepatocytes. The basis for this liver-restricted expression is not completely understood. We hypothesized that the responsible cis-acting elements are conserved among mammalian HL genes. To identify these elements, we made a genomic comparison of 30 kb of 5'-flanking region of the rat, mouse, rhesus monkey, and human HL genes. The in silico data were verified by promoter-reporter assays in transfected hepatoma HepG2 and non-hepatoma HeLa cells using serial 5'-deletions of the rat HL (-2287/+9) and human HL (-685/+13) promoter region. RESULTS Highly conserved elements were present at the proximal promoter region, and at 14 and 22 kb upstream of the transcriptional start site. Both of these upstream elements increased transcriptional activity of the human HL (-685/+13) promoter region 2-3 fold. Within the proximal HL promoter region, conserved clusters of transcription factor binding sites (TFBS) were identified at -240/-200 (module A), -80/-40 (module B), and -25/+5 (module C) by the rVista software. In HepG2 cells, modules B and C, but not module A, were important for basal transcription. Module B contains putative binding sites for hepatocyte nuclear factors HNF1alpha. In the presence of module B, transcription from the minimal HL promoter was increased 1.5-2 fold in HepG2 cells, but inhibited 2-4 fold in HeLa cells. CONCLUSION Our data demonstrate that searching for conserved non-coding sequences by comparative genomics is a valuable tool in identifying candidate enhancer elements. With this approach, we found two putative enhancer elements in the far upstream region of the HL gene. In addition, we obtained evidence that the -80/-40 region of the HL gene is responsible for enhanced HL promoter activity in hepatoma cells, and for silencing HL promoter activity in non-liver cells.
Affiliation(s)
- Diederik van Deursen
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
- Gert-Jan Botma
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
- Hans Jansen
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
- Department of Clinical Chemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
- Adrie JM Verhoeven
- Department of Biochemistry, Cardiovascular Research School COEUR, Erasmus MC, PO Box 1738, 3000 DR Rotterdam, The Netherlands
607
Samarghitean C, Väliaho J, Vihinen M. IDR knowledge base for primary immunodeficiencies. Immunome Res 2007; 3:6. [PMID: 17394641 PMCID: PMC1854887 DOI: 10.1186/1745-7580-3-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 02/09/2007] [Accepted: 03/29/2007] [Indexed: 11/29/2022] Open
Abstract
Background The ImmunoDeficiency Resource (IDR) is a knowledge base for the integration of the clinical, biochemical, genetic, genomic, proteomic, structural, and computational data of primary immunodeficiencies. The need for the IDR arises from the lack of structured and systematic information about primary immunodeficiencies on the Internet, and from the lack of a common platform that enables doctors, researchers, students, nurses and patients to find validated information about these diseases. Description The IDR knowledge base, first released in 1999, has grown substantially. It contains information for 158 diseases, from both a clinical and a molecular point of view. The database and the user interface have been reformatted. This new IDR release has a richer and more complete breadth, depth and scope. The service provides the most complete and up-to-date dataset. The IDR has been integrated with several internal and external databases and services. The contents of the IDR are validated and selected for different types of users (doctors, nurses, researchers and students, as well as patients and their families). The search engine has been improved and allows either a detailed or a broad search from a simple user interface. Conclusion The IDR is the first knowledge base specifically designed to capture, in a systematic and validated way, both clinical and molecular information for primary immunodeficiencies. The service is freely available at http://bioinf.uta.fi/idr and is regularly updated. The IDR facilitates primary immunodeficiency informatics and helps to parameterise in silico modelling of these diseases. The IDR is also useful as an advanced education tool for medical students and physicians.
Affiliation(s)
- Crina Samarghitean
- Institute of Medical Technology, FI-33014 University of Tampere, Finland
- Jouni Väliaho
- Institute of Medical Technology, FI-33014 University of Tampere, Finland
- Mauno Vihinen
- Institute of Medical Technology, FI-33014 University of Tampere, Finland
- Research Unit, Tampere University Hospital, FI-33520 Tampere, Finland
608
Wicker N, Carles A, Mills IG, Wolf M, Veerakumarasivam A, Edgren H, Boileau F, Wasylyk B, Schalken JA, Neal DE, Kallioniemi O, Poch O. A new look towards BAC-based array CGH through a comprehensive comparison with oligo-based array CGH. BMC Genomics 2007; 8:84. [PMID: 17394638 PMCID: PMC1852311 DOI: 10.1186/1471-2164-8-84] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 07/20/2006] [Accepted: 03/29/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Currently, two main technologies are used for screening of DNA copy number: the BAC (Bacterial Artificial Chromosome) and the recently developed oligonucleotide-based CGH (Chromosomal Comparative Genomic Hybridization) arrays, which are capable of detecting small genomic regions with amplification or deletion. The correlation as well as the discriminative power of these platforms has never been compared statistically on a significant set of human patient samples. RESULTS In this paper, we present an exhaustive comparison between the two CGH platforms, undertaken at two independent sites using the same batch of DNA from 19 advanced prostate cancers. The comparison was performed directly on the raw data and a significant correlation was found between the two platforms. The correlation was greatly improved when the data were averaged over large chromosomic regions using a segmentation algorithm. In addition, this analysis has enabled the development of a statistical model to discriminate BAC outliers that might indicate microevents. These microevents were validated by the oligo platform results. CONCLUSION This article presents a genome-wide statistical validation of the oligo array platform on a large set of patient samples and demonstrates statistically its superiority over the BAC platform for the identification of chromosomic events. Taking advantage of a large set of human samples treated by the two technologies, a statistical model has been developed to show that the BAC platform could also detect microevents.
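The effect reported in the abstract above, where cross-platform correlation improves once probe values are averaged over chromosomal segments, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are synthetic, the segment boundaries are given rather than found by a segmentation algorithm (the abstract does not name the one used), and the helper names `pearson` and `segment_average` are hypothetical.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def segment_average(values, boundaries):
    """Replace each value by the mean of its segment.

    boundaries is a list of (start, end) index pairs, end exclusive.
    """
    out = list(values)
    for start, end in boundaries:
        mean = sum(values[start:end]) / (end - start)
        out[start:end] = [mean] * (end - start)
    return out


# Synthetic example (not data from the paper): one shared piecewise-constant
# copy-number profile measured by two noisy platforms.
rng = random.Random(0)
levels = [0.0, 1.0, -1.0, 0.5]   # assumed log-ratio level of each segment
seg_len = 50                     # assumed number of probes per segment
truth = [lvl for lvl in levels for _ in range(seg_len)]
boundaries = [(i * seg_len, (i + 1) * seg_len) for i in range(len(levels))]

platform_a = [t + rng.gauss(0, 1.0) for t in truth]
platform_b = [t + rng.gauss(0, 1.0) for t in truth]

raw_r = pearson(platform_a, platform_b)
seg_r = pearson(segment_average(platform_a, boundaries),
                segment_average(platform_b, boundaries))
print(f"raw correlation:       {raw_r:.2f}")
print(f"segmented correlation: {seg_r:.2f}")
```

Averaging within a segment suppresses independent probe-level noise while keeping the shared copy-number level, which is why the segmented correlation comes out higher in this toy setting.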
Affiliation(s)
- Nicolas Wicker
- Laboratoire de Bioinformatique et de Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 1, rue Laurent Fries, BP 10142, 67404 Illkirch CEDEX, France
- Annaïck Carles
- Laboratoire de Bioinformatique et de Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 1, rue Laurent Fries, BP 10142, 67404 Illkirch CEDEX, France
- Human Pathology, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 1, rue Laurent Fries, BP 10142, 67404 Illkirch CEDEX, France
- Ian G Mills
- Uro-Oncology Research Group, Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
- Maija Wolf
- Medical Biotechnology, VTT Technical Research Centre of Finland and University of Turku, FIN-20520 Turku, Finland
- Abhi Veerakumarasivam
- Cancer Research UK Uro-Oncology Research Group, Department of Oncology, University of Cambridge, Hutchison/Medical Research Council Cancer Research Centre, Cambridge CB2 2XZ, England, UK
- Henrik Edgren
- Medical Biotechnology, VTT Technical Research Centre of Finland and University of Turku, FIN-20520 Turku, Finland
- Fabrice Boileau
- Laboratoire de Bioinformatique et de Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 1, rue Laurent Fries, BP 10142, 67404 Illkirch CEDEX, France
- Bohdan Wasylyk
- Department of Urology (G4-105.1), Academic Medical Centre, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands
- Jack A Schalken
- Department of Urology (G4-105.1), Academic Medical Centre, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands
- David E Neal
- Uro-Oncology Research Group, Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
- Olli Kallioniemi
- Medical Biotechnology, VTT Technical Research Centre of Finland and University of Turku, FIN-20520 Turku, Finland
- Olivier Poch
- Laboratoire de Bioinformatique et de Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 1, rue Laurent Fries, BP 10142, 67404 Illkirch CEDEX, France
609
Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH. UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics 2007; 23:1282-8. [PMID: 17379688 DOI: 10.1093/bioinformatics/btm098] [Citation(s) in RCA: 965] [Impact Index Per Article: 53.6] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Redundant protein sequences in biological databases hinder sequence similarity searches and make interpretation of search results difficult. Clustering of protein sequence space based on sequence similarity helps organize all sequences into manageable datasets and reduces sampling bias and overrepresentation of sequences. RESULTS The UniRef (UniProt Reference Clusters) provide clustered sets of sequences from the UniProt Knowledgebase (UniProtKB) and selected UniProt Archive records to obtain complete coverage of sequence space at several resolutions while hiding redundant sequences. Currently covering >4 million source sequences, the UniRef100 database combines identical sequences and subfragments from any source organism into a single UniRef entry. UniRef90 and UniRef50 are built by clustering UniRef100 sequences at the 90 or 50% sequence identity levels. UniRef100, UniRef90 and UniRef50 yield a database size reduction of approximately 10, 40 and 70%, respectively, from the source sequence set. The reduced redundancy increases the speed of similarity searches and improves detection of distant relationships. UniRef entries contain summary cluster and membership information, including the sequence of a representative protein, member count and common taxonomy of the cluster, the accession numbers of all the merged entries and links to rich functional annotation in UniProtKB to facilitate biological discovery. UniRef has already been applied to broad research areas ranging from genome annotation to proteomics data analysis. AVAILABILITY UniRef is updated biweekly and is available for online search and retrieval at http://www.uniprot.org, as well as for download at ftp://ftp.uniprot.org/pub/databases/uniprot/uniref. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
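The UniRef100 step described in the abstract above, combining identical sequences and subfragments into a single entry, can be sketched as follows. This is a minimal illustration under stated assumptions, not the UniProt pipeline: the function name `collapse_identical_and_subfragments`, the record identifiers and the toy sequences are all hypothetical, subfragments are detected by plain substring matching, and the quadratic scan would not scale to millions of sequences.

```python
def collapse_identical_and_subfragments(records):
    """Group sequences that are identical to, or substrings of, a longer one.

    records maps a sequence id to its amino-acid string. The result maps each
    representative id to the list of member ids merged into it.
    """
    # Visit longest sequences first, so a sequence can only ever be a
    # subfragment of a representative that has already been chosen.
    ordered = sorted(records.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    clusters = {}   # representative id -> member ids
    rep_seqs = {}   # representative id -> representative sequence
    for seq_id, seq in ordered:
        for rep_id, rep_seq in rep_seqs.items():
            if seq in rep_seq:          # identical or a subfragment
                clusters[rep_id].append(seq_id)
                break
        else:
            rep_seqs[seq_id] = seq
            clusters[seq_id] = [seq_id]
    return clusters


# Toy input (hypothetical ids and sequences, not UniProt data).
records = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQR",   # identical to A
    "C": "TAYIA",        # subfragment of A
    "D": "GGGG",         # unrelated
}
clusters = collapse_identical_and_subfragments(records)
reduction = 1 - len(clusters) / len(records)
print(clusters)    # {'A': ['A', 'B', 'C'], 'D': ['D']}
print(reduction)   # 0.5
```

Clustering at 90% or 50% identity, as for UniRef90 and UniRef50, would need an alignment-based identity measure in place of the substring test and is left out of this sketch.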
Affiliation(s)
- Baris E Suzek
- Protein Information Resource, Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC 20007, USA.
610
Non-coding sequence retrieval system for comparative genomic analysis of gene regulatory elements. BMC Bioinformatics 2007; 8:94. [PMID: 17362514 PMCID: PMC1838437 DOI: 10.1186/1471-2105-8-94] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 09/14/2006] [Accepted: 03/15/2007] [Indexed: 11/30/2022] Open
Abstract
Background Completion of the human genome sequence, along with those of other species, allows for greater understanding of the biochemical mechanisms and processes that govern healthy as well as diseased states. The large size of the genome sequences has made them difficult to study using traditional methods. Many studies focus on the protein-coding sequences; however, not much is known about the function of non-coding regions of the genome. It has been demonstrated that parts of the non-coding region play a critical role as gene regulatory elements. Enhancers that regulate transcription processes have been found in intergenic regions. Furthermore, it is observed that regulatory elements found in non-coding regions are highly conserved across different species. However, the analysis of these regulatory elements is not as straightforward as it may first seem. The development of a centralized resource that allows for the quick and easy retrieval of non-coding sequences from multiple species and is capable of handling multi-gene queries is critical for the analysis of non-coding sequences. Here we describe the development of a web-based non-coding sequence retrieval system. Results This paper presents a Non-Coding Sequences Retrieval System (NCSRS). The NCSRS is a web-based bioinformatics tool that performs fast and convenient retrieval of non-coding and coding sequences from multiple species related to a specific gene or set of genes. This tool has compiled resources from multiple sources into one easy-to-use and convenient web-based interface. With no software installation necessary, the user needs only internet access to use this tool. Conclusion The unique features of this tool will be very helpful for those studying gene regulatory elements that exist in non-coding regions. The web-based application can be accessed on the internet at: .
|
611
|
Wyder S, Kriventseva EV, Schröder R, Kadowaki T, Zdobnov EM. Quantification of ortholog losses in insects and vertebrates. Genome Biol 2007; 8:R242. [PMID: 18021399 PMCID: PMC2258195 DOI: 10.1186/gb-2007-8-11-r242] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Received: 06/30/2007] [Revised: 10/04/2007] [Accepted: 11/16/2007] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND The increasing number of sequenced insect and vertebrate genomes of variable divergence enables refined comparative analyses to quantify the major modes of animal genome evolution and allows tracing of gene genealogy (orthology) and pinpointing of gene extinctions (losses), which can reveal lineage-specific traits. RESULTS To consistently quantify losses of orthologous groups of genes, we compared the gene repertoires of five vertebrates and five insects, including honeybee and Tribolium beetle, that represent insect orders outside the previously sequenced Diptera. We found hundreds of lost Urbilateria genes in each of the lineages and assessed their phylogenetic origin. The rate of losses correlates well with the species' rates of molecular evolution and radiation times, without distinction between insects and vertebrates, indicating their stochastic nature. Remarkably, this extends to the universal single-copy orthologs, losses of dozens of which have been tolerated in each species. Nevertheless, the propensity for loss differs substantially among genes, where roughly 20% of the orthologs have an 8-fold higher chance of becoming extinct. Extrapolation of our data also suggests that the Urbilateria genome contained more than 7,000 genes. CONCLUSION Our results indicate that the seemingly higher number of observed gene losses in insects can be explained by their two- to three-fold higher evolutionary rate. Despite the profound effect of many losses on cellular machinery, overall, they seem to be guided by neutral evolution.
Affiliation(s)
- Stefan Wyder
  - Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
  - Swiss Institute of Bioinformatics, rue Michel-Servet, 1211 Geneva, Switzerland
- Evgenia V Kriventseva
  - Swiss Institute of Bioinformatics, rue Michel-Servet, 1211 Geneva, Switzerland
  - Department of Structural Biology and Bioinformatics, University of Geneva Medical School, rue Michel-Servet, 1211 Geneva, Switzerland
- Reinhard Schröder
  - Interf. Institut für Zellbiologie, Abt. Genetik der Tiere, Universität Tübingen, 72076 Tübingen, Germany
- Tatsuhiko Kadowaki
  - Graduate School of Bioagricultural Sciences, Nagoya University, Chikusa, Nagoya 464-8601, Japan
- Evgeny M Zdobnov
  - Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
  - Swiss Institute of Bioinformatics, rue Michel-Servet, 1211 Geneva, Switzerland
  - Imperial College London, South Kensington Campus, London SW7 2AZ, UK
|
612
|
Marques-Bonet T, Sànchez-Ruiz J, Armengol L, Khaja R, Bertranpetit J, Lopez-Bigas N, Rocchi M, Gazave E, Navarro A. On the association between chromosomal rearrangements and genic evolution in humans and chimpanzees. Genome Biol 2007; 8:R230. [PMID: 17971225 PMCID: PMC2246304 DOI: 10.1186/gb-2007-8-10-r230] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 08/12/2006] [Revised: 10/12/2007] [Accepted: 10/30/2007] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND The role that chromosomal rearrangements might have played in the speciation processes that have separated the lineages of humans and chimpanzees has recently come into the spotlight. To date, however, results are contradictory. Here we revisit this issue by making use of the available human and chimpanzee genome sequence to study the relationship between chromosomal rearrangements and rates of DNA sequence evolution. RESULTS Contrary to previous findings for this pair of species, we show that genes located in the rearranged chromosomes that differentiate the genomes of humans and chimpanzees, especially genes within rearrangements themselves, present lower divergence than genes elsewhere in the genome. Still, there are considerable differences between individual chromosomes. Chromosome 4, in particular, presents higher divergence in genes located within its rearrangement. CONCLUSION A first conclusion of our analysis is that divergence is lower for genes located in rearranged chromosomes than for those in colinear chromosomes. We also report that non-coding regions within rearranged regions tend to have lower divergence than non-coding regions outside them. These results suggest an association between chromosomal rearrangements and lower non-coding divergence that has not been reported before, even if some chromosomes do not follow this trend and could be potentially associated with a speciation episode. In summary, without excluding it, our results suggest that chromosomal speciation has not been common along the human and chimpanzee lineage.
Affiliation(s)
- Tomàs Marques-Bonet
  - Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra. Parc de Recerca Biomèdica de Barcelona. Dr. Aiguader 88. 08003 Barcelona. Catalonia, Spain
- Jesús Sànchez-Ruiz
  - Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra. Parc de Recerca Biomèdica de Barcelona. Dr. Aiguader 88. 08003 Barcelona. Catalonia, Spain
- Lluís Armengol
  - Genes and Disease Program, Center for Genomic Regulation. Parc de Recerca Biomèdica de Barcelona. Dr. Aiguader 88, 1. 08003 Barcelona. Catalonia, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Razi Khaja
  - The Center for Applied Genomics. The Hospital for Sick Children. MaRS Centre - East Tower. 101 College Street, Room 14-706. Toronto, Ontario. Canada
- Jaume Bertranpetit
  - Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra. Parc de Recerca Biomèdica de Barcelona. Dr. Aiguader 88. 08003 Barcelona. Catalonia, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Spain
- Núria Lopez-Bigas
  - Research Unit on Biomedical Informatics of IMIM/UPF. Parc de Recerca Biomèdica de Barcelona. Dr. Aiguader 88. 08003 Barcelona. Catalonia, Spain
- Mariano Rocchi
  - Dipartimento di Genetica e Microbiologia. Universita di Bari, Bari, Italy
- Elodie Gazave
  - Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra. Parc de Recerca Biomèdica de Barcelona. Dr. Aiguader 88. 08003 Barcelona. Catalonia, Spain
- Arcadi Navarro
  - Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra. Parc de Recerca Biomèdica de Barcelona. Dr. Aiguader 88. 08003 Barcelona. Catalonia, Spain
  - Institucio Catalana de Recerca i Estudis Avancats (ICREA) and Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra. Parc de Recerca Biomèdica de Barcelona. Plaça Dr. Aiguader 88. 08003 Barcelona. Catalonia, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Spain
  - Population Genomics Node (GNV8) National Institute for Bioinformatics (INB), Spain
|