301
Kalathur RKR, Pinto JP, Hernández-Prieto MA, Machado RSR, Almeida D, Chaurasia G, Futschik ME. UniHI 7: an enhanced database for retrieval and interactive analysis of human molecular interaction networks. Nucleic Acids Res 2013; 42:D408-14. [PMID: 24214987 PMCID: PMC3965034 DOI: 10.1093/nar/gkt1100] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Indexed: 01/18/2023]
Abstract
Unified Human Interactome (UniHI) (http://www.unihi.org) is a database for retrieval, analysis and visualization of human molecular interaction networks. Its primary aim is to provide a comprehensive and easy-to-use platform for network-based investigations to a wide community of researchers in biology and medicine. Here, we describe a major update (version 7) of the database previously featured in NAR Database Issue. UniHI 7 currently includes almost 350 000 molecular interactions between genes, proteins and drugs, as well as numerous other types of data such as gene expression and functional annotation. Multiple options for interactive filtering and highlighting of proteins can be employed to obtain more reliable and specific network structures. Expression and other genomic data can be uploaded by the user to examine local network structures. Additional built-in tools enable ready identification of known drug targets, as well as of biological processes, phenotypes and pathways enriched with network proteins. A distinctive feature of UniHI 7 is its user-friendly interface designed to be utilized in an intuitive manner, enabling researchers less acquainted with network analysis to perform state-of-the-art network-based investigations.
Affiliation(s)
- Ravi Kiran Reddy Kalathur
- Centre for Molecular and Structural Biomedicine, University of Algarve, Faro, Portugal and Institute for Theoretical Biology, Charité, Humboldt-University, Berlin, Germany
302
Rebholz-Schuhmann D, Grabmüller C, Kavaliauskas S, Croset S, Woollard P, Backofen R, Filsell W, Clark D. A case study: semantic integration of gene-disease associations for type 2 diabetes mellitus from literature and biomedical data resources. Drug Discov Today 2013; 19:882-9. [PMID: 24201223 DOI: 10.1016/j.drudis.2013.10.024] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/17/2012] [Revised: 09/24/2013] [Accepted: 10/28/2013] [Indexed: 10/26/2022]
Abstract
In the Semantic Enrichment of the Scientific Literature (SESL) project, researchers from academia and from life science and publishing companies collaborated in a pre-competitive way to integrate and share information for type 2 diabetes mellitus (T2DM) in adults. This case study exposes benefits from semantic interoperability after integrating the scientific literature with biomedical data resources, such as UniProt Knowledgebase (UniProtKB) and the Gene Expression Atlas (GXA). We annotated scientific documents in a standardized way, by applying public terminological resources for diseases and proteins, and other text-mining approaches. Eventually, we compared the genetic causes of T2DM across the data resources to demonstrate the benefits from the SESL triple store. Our solution enables publishers to distribute their content with little overhead into remote data infrastructures, such as into any Virtual Knowledge Broker.
Affiliation(s)
- Dietrich Rebholz-Schuhmann
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Computerlinguistik, Universität Zürich, Binzmühlestrasse 14, 8050 Zürich, Switzerland.
- Christoph Grabmüller
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Silvestras Kavaliauskas
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Samuel Croset
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peter Woollard
- GlaxoSmithKline, GlaxoSmithKline Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, UK
- Rolf Backofen
- Albert-Ludwigs-University Freiburg, Fahnenbergplatz, D-79085 Freiburg, Germany
- Wendy Filsell
- Unilever R&D, Colworth Science Park, Sharnbrook MK44 1LQ, UK
- Dominic Clark
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
303
Li P, Hua X, Zhang Z, Li J, Wang J. Characterization of regulatory features of housekeeping and tissue-specific regulators within tissue regulatory networks. BMC Systems Biology 2013; 7:112. [PMID: 24172660 PMCID: PMC3843562 DOI: 10.1186/1752-0509-7-112] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 07/29/2013] [Accepted: 10/28/2013] [Indexed: 01/10/2023]
Abstract
Background Transcription factors (TFs) and miRNAs are essential for the regulation of gene expression; however, the global view of human gene regulatory networks remains poorly understood. For example, how is the expression of so many genes regulated by limited cohorts of regulators and how are genes differentially expressed in different tissues despite the genetic code being the same in all tissues? Results We analyzed the network properties of housekeeping and tissue-specific genes in gene regulatory networks from seven human tissues. Our results show that different classes of genes behave quite differently in these networks. Tissue-specific miRNAs show a higher average target number compared with non-tissue specific miRNAs, which indicates that tissue-specific miRNAs tend to regulate different sets of targets. Tissue-specific TFs exhibit higher in-degree, out-degree, cluster coefficient and betweenness values, indicating that they occupy central positions in the regulatory network and that they transfer genetic information from upstream genes to downstream genes more quickly than other TFs. Housekeeping TFs tend to have higher cluster coefficients compared with other genes that are neither housekeeping nor tissue specific, indicating that housekeeping TFs tend to regulate their targets synergistically. Several topological properties of disease-associated miRNAs and genes were found to be significantly different from those of non-disease-associated miRNAs and genes. Conclusions Tissue-specific miRNAs, TFs and disease genes have particular topological properties within the transcriptional regulatory networks of the seven human tissues examined. The tendency of tissue-specific miRNAs to regulate different sets of genes shows that a particular tissue-specific miRNA and its target gene set may form a regulatory module to execute particular functions in the process of tissue differentiation. 
The regulatory patterns of tissue-specific TFs reflect their vital role in regulatory networks and their importance to biological functions in their respective tissues. The topological differences between disease and non-disease genes may aid the discovery of new disease genes or drug targets. Determining the network properties of these regulatory factors will help define the basic principles of human gene regulation and the molecular mechanisms of disease.
Affiliation(s)
- Jie Li
- The State Key Laboratory of Pharmaceutical Biotechnology, Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, School of Life Science, Nanjing University, Nanjing, China.
304
Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive. PLoS One 2013; 8:e77910. [PMID: 24167589 PMCID: PMC3805581 DOI: 10.1371/journal.pone.0077910] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 02/20/2013] [Accepted: 09/05/2013] [Indexed: 01/23/2023]
Abstract
High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called “Gendoo”. We generated hyperlinks between the diseases extracted from SRA and their feature profiles. The developed project, publication and disease lists resulting from this study are available at our web service, called “DBCLS SRA” (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA.
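The ID-pairing step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the authors matched identifiers, so everything here is an assumption for illustration: the two regular expressions, the `extract_pairs` helper, and the sample sentence are all hypothetical, and the accession pattern covers only the common SRA/ENA/DDBJ prefixes followed by six or more digits.

```python
import re

# Assumed pattern: S/E/D + "R" + record type (A, P, R, S, X) + 6 or more digits.
SRA_ID = re.compile(r"\b[SED]R[APRSX]\d{6,}\b")
# Assumed pattern: the literal label "PMID", optional colon, then up to 8 digits.
PMID = re.compile(r"\bPMID:?\s*(\d{1,8})\b")


def extract_pairs(text):
    """Return every (SRA ID, PMID) combination found in one block of text."""
    sra_ids = sorted(set(SRA_ID.findall(text)))
    pmids = sorted(set(PMID.findall(text)))
    return [(sra_id, pmid) for sra_id in sra_ids for pmid in pmids]


# Hypothetical sentence, not taken from any real article.
sample = "Reads were deposited under SRP012345 and SRR0123456 (PMID: 24167589)."
pairs = extract_pairs(sample)
print(pairs)
```

Pairing every accession with every PMID in the same text is the simplest possible rule; a real pipeline would likely need per-article metadata to avoid spurious pairs.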
305
Worthey EA. Analysis and annotation of whole-genome or whole-exome sequencing-derived variants for clinical diagnosis. Current Protocols in Human Genetics 2013; 79:9.24.1-9.24.24. [PMID: 24510652 DOI: 10.1002/0471142905.hg0924s79] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 01/07/2023]
Abstract
Over the last several years, next-generation sequencing (NGS) has transformed genomic research through substantial advances in technology and reduction in the cost of sequencing, and also in the systems required for analysis of these large volumes of data. This technology is now being used as a standard molecular diagnostic test under particular circumstances in some clinical settings. The advances in sequencing have come so rapidly that the major bottleneck in identification of causal variants is no longer the sequencing but rather the analysis and interpretation. Interpretation of genetic findings in a clinical setting is scarcely a new challenge, but the task is increasingly complex in clinical genome-wide sequencing given the dramatic increase in dataset size and complexity. This increase requires the development of novel or repositioned analysis tools, methodologies, and processes. This unit provides an overview of these items. Specific challenges related to implementation in a clinical setting are discussed.
Affiliation(s)
- Elizabeth A Worthey
- Department of Pediatrics, Medical College of Wisconsin, Milwaukee, Wisconsin; The Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, Wisconsin; Department of Computer Science, University of Wisconsin, Milwaukee, Wisconsin
306
Dorn C, Grunert M, Sperling SR. Application of high-throughput sequencing for studying genomic variations in congenital heart disease. Brief Funct Genomics 2013; 13:51-65. [PMID: 24095982 DOI: 10.1093/bfgp/elt040] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/30/2022]
Abstract
Congenital heart diseases (CHD) represent the most common birth defect in humans. The majority of cases are caused by a combination of complex genetic alterations and environmental influences. In the past, many disease-causing mutations have been identified; however, there is still a large proportion of cardiac malformations whose precise origin is unknown. High-throughput sequencing technologies established in recent years offer novel opportunities to further study the genetic background underlying the disease. In this review, we provide a roadmap for designing and analyzing high-throughput sequencing studies focused on CHD, but also with general applicability to other complex diseases. The three main next-generation sequencing (NGS) platforms, including their particular advantages and disadvantages, are presented. To identify potentially disease-related genomic variations and genes, different filtering steps and gene prioritization strategies are discussed. In addition, available control datasets based on NGS are summarized. Finally, we provide an overview of current studies already using NGS technologies and showing that these techniques will help to further unravel the complex genetics underlying CHD.
Affiliation(s)
- Cornelia Dorn
- Department of Cardiovascular Genetics, Experimental and Clinical Research Center (ECRC), Charité-University Medicine Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Lindenberger Weg 80, 13125 Berlin, Germany. Department of Biochemistry, Free University Berlin, Berlin, Germany. Tel.: +49-(0)30-450540123; Fax: +49-(0)30-84131699.
307
Buseh AG, Stevens PE, Millon-Underwood S, Townsend L, Kelber ST. Community leaders' perspectives on engaging African Americans in biobanks and other human genetics initiatives. J Community Genet 2013; 4:483-94. [PMID: 23813337 PMCID: PMC3773318 DOI: 10.1007/s12687-013-0155-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 02/26/2013] [Accepted: 06/17/2013] [Indexed: 01/26/2023]
Abstract
There is limited information about what African Americans think about biobanks and the ethical questions surrounding them. Likewise, there is a gap in capacity to successfully enroll African Americans as biobank donors. The purposes of this community-based participatory study were to: (a) explore African Americans' perspectives on genetics/genomic research, (b) understand facilitators and barriers to participation in such studies, and (c) enlist their ideas about how to attract and sustain engagement of African Americans in genetics initiatives. As the first phase in a mixed methods study, we conducted four focus groups with 21 African American community leaders in one US Midwest city. The sample consisted of executive directors of community organizations and prominent community activists. Data were analyzed thematically. Skepticism about biomedical research and lack of trust characterized discussions about biomedical research and biobanks. The Tuskegee Untreated Syphilis Study and the Henrietta Lacks case influenced their desire to protect their community from harm and exploitation. Connections between genetics and family history made genetics/genomics research personal, pitting intrusion into private affairs against solutions. Participants also expressed concerns about ethical issues involved in genomics research, calling attention to how research had previously been conducted in their community. Participants hoped personalized medicine might bring health benefits to their people and proposed African American communities have a "seat at the table." They called for basic respect, authentic collaboration, bidirectional education, transparency and prerogative, and meaningful benefits and remuneration. 
Key to building trust and overcoming African Americans' trepidation and resistance to participation in biobanks are early and persistent engagement with the community, partnerships with community stakeholders to map research priorities, ethical conduct of research, and a guarantee of equitable distribution of benefits from genomics discoveries.
Affiliation(s)
- Aaron G. Buseh
- College of Nursing, University of Wisconsin—Milwaukee, 1921 East Hartford Avenue, Cunningham Hall, Room 569, P.O. Box 413, Milwaukee, WI 53201 USA
- Patricia E. Stevens
- College of Nursing, University of Wisconsin—Milwaukee, P. O. Box 413, Cunningham Hall, Room 566, Milwaukee, WI 53201 USA
- Sandra Millon-Underwood
- College of Nursing, University of Wisconsin—Milwaukee, 1921 E. Hartford Avenue, Cunningham Hall, Room 422/423, P. O. Box 413, Milwaukee, WI 53201 USA
- Leolia Townsend
- College of Nursing, University of Wisconsin—Milwaukee, P. O. Box 413, Cunningham Hall, Room 527, Milwaukee, WI 53201 USA
- Sheryl T. Kelber
- College of Nursing Center for Nursing Research and Evaluation, University of Wisconsin—Milwaukee, P. O. Box 413, Milwaukee, WI 53201 USA
308
Bagga JS, D’Antonio LA. Role of conserved cis-regulatory elements in the post-transcriptional regulation of the human MECP2 gene involved in autism. Hum Genomics 2013; 7:19. [PMID: 24040966 PMCID: PMC3844687 DOI: 10.1186/1479-7364-7-19] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 07/31/2013] [Accepted: 09/04/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND The MECP2 gene codes for methyl CpG binding protein 2 which regulates activities of other genes in the early development of the brain. Mutations in this gene have been associated with Rett syndrome, a form of autism. The purpose of this study was to investigate the role of evolutionarily conserved cis-elements in regulating the post-transcriptional expression of the MECP2 gene and to explore their possible correlations with a mutation that is known to cause mental retardation. RESULTS A bioinformatics approach was used to map evolutionarily conserved cis-regulatory elements in the transcribed regions of the human MECP2 gene and its mammalian orthologs. Cis-regulatory motifs including G-quadruplexes, microRNA target sites, and AU-rich elements have gained significant importance because of their role in key biological processes and as therapeutic targets. We discovered in the 5'-UTR (untranslated region) of MECP2 mRNA a highly conserved G-quadruplex which overlapped a known deletion in Rett syndrome patients with decreased levels of MeCP2 protein. We believe that this 5'-UTR G-quadruplex could be involved in regulating MECP2 translation. We mapped additional evolutionarily conserved G-quadruplexes, microRNA target sites, and AU-rich elements in the key sections of both untranslated regions. Our studies suggest the regulation of translation, mRNA turnover, and development-related alternative MECP2 polyadenylation, putatively involving interactions of conserved cis-regulatory elements with their respective trans factors and complex interactions among the trans factors themselves. We discovered highly conserved G-quadruplex motifs that were more prevalent near alternative splice sites as compared to the constitutive sites of the MECP2 gene. We also identified a pair of overlapping G-quadruplexes at an alternative 5' splice site that could potentially regulate alternative splicing in a negative as well as a positive way in the MECP2 pre-mRNAs. 
CONCLUSIONS A Rett syndrome mutation with decreased protein expression was found to be associated with a conserved G-quadruplex. Our studies suggest that MECP2 post-transcriptional gene expression could be regulated by several evolutionarily conserved cis-elements like G-quadruplex motifs, microRNA target sites, and AU-rich elements. This phylogenetic analysis has provided some interesting and valuable insights into the regulation of the MECP2 gene involved in autism.
Affiliation(s)
- Joetsaroop S Bagga
- John P. Stevens High School, 855 Grove Ave., Edison, NJ 08820, USA
- Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, USA
309
Riera C, Lois S, de la Cruz X. Prediction of pathological mutations in proteins: the challenge of integrating sequence conservation and structure stability principles. Wiley Interdisciplinary Reviews: Computational Molecular Science 2013. [DOI: 10.1002/wcms.1170] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022]
Affiliation(s)
- Casandra Riera
- Laboratory of Translational Bioinformatics in Neuroscience; VHIR; Barcelona Spain
- Sergio Lois
- Laboratory of Translational Bioinformatics in Neuroscience; VHIR; Barcelona Spain
- Xavier de la Cruz
- Laboratory of Translational Bioinformatics in Neuroscience; VHIR; Barcelona Spain
- Institució Catalana per la Recerca i Estudis Avançats (ICREA); Barcelona Spain
310
Han HW, Ohn JH, Moon J, Kim JH. Yin and Yang of disease genes and death genes between reciprocally scale-free biological networks. Nucleic Acids Res 2013; 41:9209-17. [PMID: 23935122 PMCID: PMC3814386 DOI: 10.1093/nar/gkt683] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 01/05/2023]
Abstract
Biological networks often show a scale-free topology with node degree following a power-law distribution. Lethal genes tend to form functional hubs, whereas non-lethal disease genes are located at the periphery. Uni-dimensional analyses, however, are flawed. We created and investigated two distinct scale-free networks: a protein–protein interaction (PPI) network and a perturbation sensitivity network (PSN). The hubs of both networks exhibit a low molecular evolutionary rate (P < 8 × 10^−12, P < 2 × 10^−4) and a high codon adaptation index (P < 2 × 10^−16, P < 2 × 10^−8), indicating that both hubs have been shaped under high evolutionary selective pressure. Moreover, the topologies of PPI and PSN are inversely proportional: hubs of PPI tend to be located at the periphery of PSN and vice versa. PPI hubs are highly enriched with lethal genes but not with disease genes, whereas PSN hubs are highly enriched with disease genes and drug targets but not with lethal genes. PPI hub genes are enriched with essential cellular processes, but PSN hub genes are enriched with environmental interaction processes, having more TATA boxes and transcription factor binding sites. It is concluded that biological systems may balance internal growth signaling and external stress signaling by unifying the two opposite scale-free networks that are seemingly opposite to each other but work in concert between death and disease.
Affiliation(s)
- Hyun Wook Han
- Division of Biomedical Informatics, Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul 110799, Korea, College of Medicine, CHA General Hospital, CHA University, Seoul 135081, Korea and Systems Biomedical Informatics Research Center, Seoul National University, Seoul 110799, Korea
311
Bromberg Y. Building a genome analysis pipeline to predict disease risk and prevent disease. J Mol Biol 2013; 425:3993-4005. [PMID: 23928561 DOI: 10.1016/j.jmb.2013.07.038] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 06/03/2013] [Revised: 07/26/2013] [Accepted: 07/28/2013] [Indexed: 12/24/2022]
Abstract
Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, the pipelines taking advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice.
Affiliation(s)
- Y Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08873, USA.
312
Hecht M, Bromberg Y, Rost B. News from the protein mutability landscape. J Mol Biol 2013; 425:3937-48. [PMID: 23896297 DOI: 10.1016/j.jmb.2013.07.028] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 04/30/2013] [Revised: 07/08/2013] [Accepted: 07/19/2013] [Indexed: 12/16/2022]
Abstract
Some mutations of protein residues matter more than others, and these are often conserved evolutionarily. The explosion of deep sequencing and genotyping increasingly requires the distinction between effect and neutral variants. The simplest approach predicts all mutations of conserved residues to have an effect; however, this works poorly, at best. Many computational tools that are optimized to predict the impact of point mutations provide more detail. Here, we expand the perspective from the view of single variants to the level of sketching the entire mutability landscape. This landscape is defined by the impact of substituting every residue at each position in a protein by each of the 19 non-native amino acids. We review some of the powerful conclusions about protein function, stability and their robustness to mutation that can be drawn from such an analysis. Large-scale experimental and computational mutagenesis experiments are increasingly furthering our understanding of protein function and of the genotype-phenotype associations. We also discuss how these can be used to improve predictions of protein function and pathogenicity of missense variants.
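The size of such a mutability landscape follows directly from the definition in the abstract: a protein of n residues has 19 × n single-residue substitutions. The sketch below only enumerates those substitutions and does not score their impact; the function name `single_substitutions` and the three-residue toy peptide are hypothetical and chosen purely for illustration.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Yield (position, native, substitute) for every single-residue change.

    Positions are 1-based. Assumes the sequence uses only the 20 standard
    one-letter codes, so each position yields exactly 19 substitutions.
    """
    for index, native in enumerate(sequence):
        for substitute in AMINO_ACIDS:
            if substitute != native:
                yield (index + 1, native, substitute)


toy_sequence = "MKT"  # hypothetical three-residue peptide
variants = list(single_substitutions(toy_sequence))
print(len(variants))  # 19 substitutions per position, 3 positions
```

In a full analysis, each tuple would be passed to an experimental readout or a computational predictor to fill in the landscape.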
Affiliation(s)
- Maximilian Hecht
- Department of Bioinformatics and Computational Biology I12, Technische Universität München, Boltzmannstrasse 3, 85748 Garching, Germany.
313
Guo Y, Wei X, Das J, Grimson A, Lipkin S, Clark A, Yu H. Dissecting disease inheritance modes in a three-dimensional protein network challenges the "guilt-by-association" principle. Am J Hum Genet 2013; 93:78-89. [PMID: 23791107 DOI: 10.1016/j.ajhg.2013.05.022] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 12/05/2012] [Revised: 05/02/2013] [Accepted: 05/23/2013] [Indexed: 10/26/2022]
Abstract
To better understand different molecular mechanisms by which mutations lead to various human diseases, we classified 82,833 disease-associated mutations according to their inheritance modes (recessive versus dominant) and molecular types (in-frame [missense point mutations and in-frame indels] versus truncating [nonsense mutations and frameshift indels]) and systematically examined the effects of different classes of disease mutations in a three-dimensional protein interactome network with the atomic-resolution interface resolved for each interaction. We found that although recessive mutations affecting the interaction interface of two interacting proteins tend to cause the same disease, this widely accepted "guilt-by-association" principle does not apply to dominant mutations. Furthermore, recessive truncating mutations in regions encoding the same interface are much more likely to cause the same disease, even for interfaces close to the N terminus of the protein. Conversely, dominant truncating mutations tend to be enriched in regions encoding areas between interfaces. These results suggest that a significant fraction of truncating mutations can generate functional protein products. For example, TRIM27, a known cancer-associated protein, interacts with three proteins (MID2, TRIM42, and SIRPA) through two different interfaces. A dominant truncating mutation (c.1024delT [p.Tyr342Thrfs*30]) associated with ovarian carcinoma is located between the regions encoding the two interfaces; the altered protein retains its interaction with MID2 and TRIM42 through the first interface but loses its interaction with SIRPA through the second interface. Our findings will help clarify the molecular mechanisms of thousands of disease-associated genes and their tens of thousands of mutations, especially for those carrying truncating mutations, often erroneously considered "knockout" alleles.
314
Wang X, Thijssen B, Yu H. Target essentiality and centrality characterize drug side effects. PLoS Comput Biol 2013; 9:e1003119. [PMID: 23874169 PMCID: PMC3708859 DOI: 10.1371/journal.pcbi.1003119] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 12/02/2012] [Accepted: 05/15/2013] [Indexed: 01/19/2023]
Abstract
To investigate factors contributing to drug side effects, we systematically examine relationships between 4,199 side effects associated with 996 drugs and their 647 human protein targets. We find that it is the number of essential targets, not the number of total targets, that determines the side effects of corresponding drugs. Furthermore, within the context of a three-dimensional interaction network with atomic-resolution interaction interfaces, we find that drugs causing more side effects are also characterized by high degree and betweenness of their targets and highly shared interaction interfaces on these targets. Our findings suggest that both essentiality and centrality of a drug target are key factors contributing to side effects and should be taken into consideration in rational drug design. The ultimate goal of medical research is to develop effective treatments for disease with minimal side effects. Currently, about 20% of drug candidates fail at clinical trial phases II and III due to safety issues. Therefore, understanding the determining factors of drug side effects is of paramount importance to human health and the pharmaceutical industry. Here, we present the first systematic study to uncover key factors leading to drug side effects within the framework of the human protein interactome network. Our results show that it is the number of essential targets, not the number of total targets, of a drug that determines the occurrence of its side effects. Furthermore, we find that the centrality, both degree and betweenness, of the drug targets is also an important determining factor of drug side effects. Our findings will shed light on new factors to be incorporated into the drug development pipeline.
Affiliation(s)
- Xiujuan Wang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
- Bram Thijssen
- Department of Bioinformatics, Maastricht University, Maastricht, The Netherlands
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
|
315
|
Mapping the functional yeast ABC transporter interactome. Nat Chem Biol 2013; 9:565-72. [PMID: 23831759 PMCID: PMC3835492 DOI: 10.1038/nchembio.1293] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Received: 12/18/2012] [Accepted: 06/11/2013] [Indexed: 12/17/2022]
Abstract
ABC transporters are a ubiquitous class of integral membrane proteins of immense clinical interest because of their strong association with human disease and pharmacology. To improve our understanding of these proteins, we used Membrane Yeast Two-Hybrid (MYTH) technology to map the protein interactome of all non-mitochondrial ABC transporters in the model organism Saccharomyces cerevisiae, and combined these data with previously reported yeast ABC transporter interactions in the BioGRID database to generate a comprehensive, integrated interactome. We show that ABC transporters physically associate with proteins involved in a surprisingly diverse range of functions. We specifically examine the importance of the physical interactions of ABC transporters in both the regulation of one another and in the modulation of proteins involved in zinc homeostasis. The interaction network presented here will be a powerful resource for increasing our fundamental understanding of the cellular role and regulation of ABC transporters.
|
316
|
Kamaraj B, Purohit R. Computational Screening of Disease-Associated Mutations in OCA2 Gene. Cell Biochem Biophys 2013; 68:97-109. [DOI: 10.1007/s12013-013-9697-2] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Indexed: 01/13/2023]
|
317
|
Alonso A, Marsal S, Tortosa R, Canela-Xandri O, Julià A. GStream: improving SNP and CNV coverage on genome-wide association studies. PLoS One 2013; 8:e68822. [PMID: 23844243 PMCID: PMC3700900 DOI: 10.1371/journal.pone.0068822] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/15/2013] [Accepted: 06/03/2013] [Indexed: 11/22/2022] Open
Abstract
We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method.
Affiliation(s)
- Arnald Alonso
- Rheumatology Research Group, Vall d'Hebron Hospital Research Institute, Barcelona, Spain
- Department of ESAII, Polytechnical University of Catalonia, Barcelona, Spain
- Sara Marsal
- Rheumatology Research Group, Vall d'Hebron Hospital Research Institute, Barcelona, Spain
- Raül Tortosa
- Rheumatology Research Group, Vall d'Hebron Hospital Research Institute, Barcelona, Spain
- Oriol Canela-Xandri
- Rheumatology Research Group, Vall d'Hebron Hospital Research Institute, Barcelona, Spain
- Antonio Julià
- Rheumatology Research Group, Vall d'Hebron Hospital Research Institute, Barcelona, Spain
|
318
|
In silico screening and molecular dynamics simulation of disease-associated nsSNP in TYRP1 gene and its structural consequences in OCA3. BioMed Research International 2013; 2013:697051. [PMID: 23862152 PMCID: PMC3703794 DOI: 10.1155/2013/697051] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 04/20/2013] [Revised: 05/23/2013] [Accepted: 05/23/2013] [Indexed: 11/17/2022]
Abstract
Oculocutaneous albinism type III (OCA3), caused by mutations of the TYRP1 gene, is an autosomal recessive disorder characterized by reduced biosynthesis of melanin pigment in the hair, skin, and eyes. The TYRP1 gene encodes a protein called tyrosinase-related protein-1 (Tyrp1). Tyrp1 is involved in maintaining the stability of tyrosinase protein and modulating its catalytic activity in eumelanin synthesis. Tyrp1 is also involved in maintenance of melanosome structure and affects melanocyte proliferation and cell death. In this work we implemented computational analysis to filter the most probable mutations that might be associated with OCA3. We found R326H and R356Q to be the most deleterious and disease associated by using PolyPhen 2.0, SIFT, PANTHER, I-Mutant 3.0, PhD-SNP, SNP&GO, Pmut, and Mutpred tools. To understand the atomic arrangement in 3D space, the native and mutant (R326H and R356Q) structures were modelled. Finally, the native and mutant Tyrp1 structures were investigated using a molecular dynamics simulation (MDS) approach. MDS results showed more flexibility in the native Tyrp1 structure. Upon mutation, the Tyrp1 protein became more rigid, which might disturb the structural conformation and catalytic function of the structure and might also play a significant role in inducing OCA3. The results obtained from this study would facilitate wet-lab research to develop potent drug therapies against OCA3.
|
319
|
Makita Y, Kobayashi N, Yoshida Y, Doi K, Mochizuki Y, Nishikata K, Matsushima A, Takahashi S, Ishii M, Takatsuki T, Bhatia R, Khadbaatar Z, Watabe H, Masuya H, Toyoda T. PosMed: Ranking genes and bioresources based on Semantic Web Association Study. Nucleic Acids Res 2013; 41:W109-14. [PMID: 23761449 PMCID: PMC3692089 DOI: 10.1093/nar/gkt474] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022] Open
Abstract
Positional MEDLINE (PosMed; http://biolod.org/PosMed) is a powerful Semantic Web Association Study engine that ranks biomedical resources such as genes, metabolites, diseases and drugs, based on the statistical significance of associations between user-specified phenotypic keywords and resources connected directly or inferentially through a Semantic Web of biological databases such as MEDLINE, OMIM, pathways, co-expressions, molecular interactions and ontology terms. Since 2005, PosMed has long been used for in silico positional cloning studies to infer candidate disease-responsible genes existing within chromosomal intervals. PosMed is redesigned as a workbench to discover possible functional interpretations for numerous genetic variants found from exome sequencing of human disease samples. We also show that the association search engine enhances the value of mouse bioresources because most knockout mouse resources have no phenotypic annotation, but can be associated inferentially to phenotypes via genes and biomedical documents. For this purpose, we established text-mining rules to the biomedical documents by careful human curation work, and created a huge amount of correct linking between genes and documents. PosMed associates any phenotypic keyword to mouse resources with 20 public databases and four original data sets as of May 2013.
Affiliation(s)
- Yuko Makita
- Bioinformatics and Systems Engineering Division, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
320
|
C GPD, Rajith B, Chakraborty C. Predicting the impact of deleterious mutations in the protein kinase domain of FGFR2 in the context of function, structure, and pathogenesis--a bioinformatics approach. Appl Biochem Biotechnol 2013; 170:1853-70. [PMID: 23754559 DOI: 10.1007/s12010-013-0315-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/14/2013] [Accepted: 05/27/2013] [Indexed: 11/26/2022]
Abstract
Fibroblast growth factor receptor 2 (FGFR2) controls a wide range of biological functions by regulating the cellular proliferation, survival, migration and differentiation. A growing body of preclinical data demonstrated that deregulation of the FGFR signalling through genetic modification was observed in various types of cancers. However, the extent to which genetic modifications interfere with gene regulation and their involvement in cancer susceptibility remains largely unknown. In this work, we performed in silico profiling of harmful non-synonymous single nucleotide polymorphisms (SNPs) in the protein kinase domain of FGFR2. Tolerance index, position-specific independent count score, change in free energy score (ΔΔG), Eris and FoldX indicated that seven mutations were found to be deleterious and may alter the protein function and structure. Furthermore, based on physico-chemical properties, two mutations K659N and R747H were found to be most deleterious in protein kinase domain and taken for further structural analysis. Docking study showed a complete loss of binding affinity followed by interference in hydrogen bonding and surrounding residues due to K659N and R747H mutations. In order to elucidate the mechanism behind the impact of mutation that can generate a ripple effect throughout the protein structure and ultimately affect the function, in-depth molecular dynamics simulation and principal component analysis were performed. The obtained results indicate that K659N and R747H mutations have a distinct effect on the dynamic behaviour of FGFR2 protein. Our strategy may be helpful for understanding SNP effects on proteins with function and their role in human genetic diseases and for the development of novel pharmacological strategies.
Affiliation(s)
- George Priya Doss C
- Medical Biotechnology Division, School of Biosciences and Technology, VIT University, Vellore, 632014, Tamil Nadu, India.
|
321
|
Cytoplasmic RNA viruses as potential vehicles for the delivery of therapeutic small RNAs. Virol J 2013; 10:185. [PMID: 23759022 PMCID: PMC3685532 DOI: 10.1186/1743-422x-10-185] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 11/06/2012] [Accepted: 05/26/2013] [Indexed: 12/21/2022] Open
Abstract
Viral vectors have become the best option for the delivery of therapeutic genes in conventional and RNA interference-based gene therapies. The current viral vectors for the delivery of small regulatory RNAs are based on DNA viruses and retroviruses/lentiviruses. Cytoplasmic RNA viruses have been excluded as viral vectors for RNAi therapy because of the nuclear localization of the microprocessor complex and the potential degradation of the viral RNA genome during the excision of any virus-encoded pre-microRNAs. However, in the last few years, the presence of several species of small RNAs (e.g., virus-derived small interfering RNAs, virus-derived short RNAs, and unusually small RNAs) in animals and cell cultures that are infected with cytoplasmic RNA viruses has suggested the existence of a non-canonical mechanism of microRNA biogenesis. Several studies have been conducted on the tick-borne encephalitis virus and on the Sindbis virus in which microRNA precursors were artificially incorporated and demonstrated the production of mature microRNAs. The ability of these viruses to recruit Drosha to the cytoplasm during infection resulted in the efficient processing of virus-encoded microRNA without the viral genome entering the nucleus. In this review, we discuss the relevance of these findings with an emphasis on the potential use of cytoplasmic RNA viruses as vehicles for the efficient delivery of therapeutic small RNAs.
|
322
|
Marcotte E, Boone C, Babu MM, Gavin AC. Network Biology editorial 2013. Molecular BioSystems 2013; 9:1557-8. [PMID: 23712464 DOI: 10.1039/c3mb90018e] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]
|
323
|
Capriotti E, Altman RB, Bromberg Y. Collective judgment predicts disease-associated single nucleotide variants. BMC Genomics 2013; 14 Suppl 3:S2. [PMID: 23819846 PMCID: PMC3839641 DOI: 10.1186/1471-2164-14-s3-s2] [Citation(s) in RCA: 186] [Impact Index Per Article: 15.5] [Indexed: 12/11/2022] Open
Abstract
Background In recent years the number of human genetic variants deposited into the publicly available databases has been increasing exponentially. The latest version of dbSNP, for example, contains ~50 million validated Single Nucleotide Variants (SNVs). SNVs make up most of human variation and are often the primary causes of disease. The non-synonymous SNVs (nsSNVs) result in single amino acid substitutions and may affect protein function, often causing disease. Although several methods for the detection of nsSNV effects have already been developed, the consistent increase in annotated data is offering the opportunity to improve prediction accuracy. Results Here we present a new approach for the detection of disease-associated nsSNVs (Meta-SNP) that integrates four existing methods: PANTHER, PhD-SNP, SIFT and SNAP. We first tested the accuracy of each method using a dataset of 35,766 disease-annotated mutations from 8,667 proteins extracted from the SwissVar database. The four methods reached overall accuracies of 64%-76% with a Matthews correlation coefficient (MCC) of 0.38-0.53. We then used the outputs of these methods to develop a machine learning based approach that discriminates between disease-associated and polymorphic variants (Meta-SNP). In testing, the combined method reached 79% overall accuracy and 0.59 MCC, ~3% higher accuracy and ~0.05 higher correlation with respect to the best-performing method. Moreover, for the hardest-to-define subset of nsSNVs, i.e. variants for which half of the predictors disagreed with the other half, Meta-SNP attained 8% higher accuracy than the best predictor. Conclusions Here we find that the Meta-SNP algorithm achieves better performance than the best single predictor. This result suggests that the methods used for the prediction of variant-disease associations are orthogonal, encoding different biologically relevant relationships. Careful combination of predictions from various resources is therefore a good strategy for the selection of high reliability predictions. Indeed, for the subset of nsSNVs where all predictors were in agreement (46% of all nsSNVs in the set), our method reached 87% overall accuracy and 0.73 MCC. Meta-SNP server is freely accessible at http://snps.biofold.org/meta-snp.
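The two figures of merit quoted in this abstract, overall accuracy and the Matthews correlation coefficient, are both functions of a binary confusion matrix. The sketch below shows the standard formulas; the counts in the usage lines are invented round numbers for illustration and are not taken from the Meta-SNP evaluation.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts: 40 disease variants and 40 neutral variants classified
# correctly, 10 of each class misclassified.
example_accuracy = accuracy(tp=40, tn=40, fp=10, fn=10)
example_mcc = matthews_corrcoef(tp=40, tn=40, fp=10, fn=10)
print(example_accuracy, example_mcc)
```

Unlike accuracy, the MCC stays informative when the two classes are of very different sizes, which is presumably why the abstract reports both.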
Affiliation(s)
- Emidio Capriotti
- Division of Informatics, Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, USA.
|
324
|
Haerty W, Ponting CP. Mutations within lncRNAs are effectively selected against in fruitfly but not in human. Genome Biol 2013; 14:R49. [PMID: 23710818 PMCID: PMC4053968 DOI: 10.1186/gb-2013-14-5-r49] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 02/25/2013] [Accepted: 05/27/2013] [Indexed: 02/07/2023] Open
Abstract
Background Previous studies in Drosophila and mammals have revealed levels of long non-coding RNA (lncRNA) sequence conservation that are intermediate between neutrally evolving and protein-coding sequence. These analyses compared conservation between species that diverged up to 75 million years ago. However, analysis of sequence polymorphisms within a species' population can provide an understanding of essentially contemporaneous selective constraints that are acting on lncRNAs and can quantify the deleterious effect of mutations occurring within these loci. Results We took advantage of polymorphisms derived from the genome sequences of 163 Drosophila melanogaster strains and 174 human individuals to calculate the distribution of fitness effects of single nucleotide polymorphisms occurring within intergenic lncRNAs and compared this to distributions for SNPs present within putatively neutral or protein-coding sequences. Our observations show that in D. melanogaster there is a significant excess of rare frequency variants within intergenic lncRNAs relative to neutrally evolving sequences, whereas selection on human intergenic lncRNAs appears to be effectively neutral. Approximately 30% of mutations within these fruitfly lncRNAs are estimated as being weakly deleterious. Conclusions These contrasting results can be attributed to the large difference in effective population sizes between the two species. Our results suggest that while the sequences of lncRNAs will be well conserved across insect species, such loci in mammals will accumulate greater proportions of deleterious changes through genetic drift.
|
325
|
Whigham BT, Allingham RR. Developments in Ocular Genetics: Annual Review. Asia-Pacific Journal of Ophthalmology (Philadelphia, Pa.) 2013; 2:177-86. [PMID: 26108111 DOI: 10.1097/apo.0b013e318294b837] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]
Abstract
PURPOSE The purpose of this study was to summarize major developments in ocular genetics over the past year. DESIGN A literature review was performed for articles relating to the genetics of eye diseases and morphology. The search focused on articles published between September 15, 2011, and September 15, 2012. METHODS PubMed and Google Scholar search tools were used to search for ocular genetics articles in the desired date range. RESULTS Major advances have been reported in numerous areas including glaucoma, age-related macular degeneration, and keratoconus. Numerous novel associations have been identified through large genome-wide association studies. In addition, numerous disease genes have been identified through next-generation sequencing technologies. CONCLUSIONS Ocular genetics continues to advance at a rapid pace and benefit from new technologies. Numerous discoveries in the past year point toward areas for continued research.
Affiliation(s)
- Benjamin T Whigham
- From the Department of Ophthalmology, Duke University Eye Center, Durham, NC
|
326
|
Abstract
Disease-causing aberrations in the normal function of a gene define that gene as a disease gene. Proving a causal link between a gene and a disease experimentally is expensive and time-consuming. Comprehensive prioritization of candidate genes prior to experimental testing drastically reduces the associated costs. Computational gene prioritization is based on various pieces of correlative evidence that associate each gene with the given disease and suggest possible causal links. A fair amount of this evidence comes from high-throughput experimentation. Thus, well-developed methods are necessary to reliably deal with the quantity of information at hand. Existing gene prioritization techniques already significantly improve the outcomes of targeted experimental studies. Faster and more reliable techniques that account for novel data types are necessary for the development of new diagnostics, treatments, and cures for many diseases.
Affiliation(s)
- Yana Bromberg
- Department of Biochemistry and Microbiology, School of Environmental and Biological Sciences, Rutgers University, New Brunswick, New Jersey, USA.
|
327
|
Mikami T, Meguro A, Teshigawara T, Takeuchi M, Uemoto R, Kawagoe T, Nomura E, Asukata Y, Ishioka M, Iwasaki M, Fukagawa K, Konomi K, Shimazaki J, Nishida T, Mizuki N. Interleukin 1 beta promoter polymorphism is associated with keratoconus in a Japanese population. Mol Vis 2013; 19:845-51. [PMID: 23592922 PMCID: PMC3626376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2012] [Accepted: 04/09/2013] [Indexed: 11/11/2022] Open
Abstract
PURPOSE Polymorphisms in the interleukin 1 alpha (IL1A) and IL1B gene regions were previously associated with keratoconus in a Korean population. In the present study, we investigated whether the IL1A and IL1B polymorphisms are associated with keratoconus in a Japanese population. METHODS A total of 169 Japanese patients with keratoconus and 390 Japanese healthy controls were recruited. We genotyped one IL1A single nucleotide polymorphism (SNP; rs2071376) and two IL1B SNPs (rs1143627 and rs16944) to compare the frequencies of alleles, genotypes, and haplotypes between cases and controls. RESULTS Statistically significant association was observed for rs1143627 (-31 T>C) in the IL1B promoter region; the T allele of rs1143627 was associated with an increased risk of keratoconus (p=0.014, corrected p value [pc]=0.043, odds ratio=1.38). The C allele of rs16944 (-511 C>T) in the IL1B promoter region had a 1.33-fold increased risk of keratoconus, although this increase did not reach statistical significance (p=0.033, pc=0.098). The TT genotype of rs1143627 was weakly associated with an increased risk of keratoconus (p=0.033, pc=0.099, odds ratio=1.52). However, no significant differences were found in the allele and genotype frequencies between the cases and controls for rs2071376 in IL1A. Regarding haplotypic diversity, the haplotype created by the T allele of rs1143627 and C allele of rs16944 was associated with a 1.72-fold increased risk of keratoconus (p=4.0×10⁻⁵, pc=1.6×10⁻⁴). CONCLUSIONS Our results replicate associations reported recently in a Korean population. Thus, IL1B may play an important role in the development of keratoconus through genetic polymorphisms.
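The odds ratios reported in this abstract come from comparing allele (or genotype) counts between cases and controls. As a rough sketch of that arithmetic, the code below computes an allelic odds ratio and a Woolf (log-based) 95% confidence interval from a 2x2 table. The allele counts are hypothetical, since the abstract does not give them, and the interval method is an assumption rather than the one the authors necessarily used.

```python
import math

def odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds ratio of carrying the risk allele in cases versus controls."""
    return (case_risk * control_other) / (case_other * control_risk)

def woolf_confidence_interval(case_risk, case_other, control_risk, control_other, z=1.96):
    """Approximate confidence interval for the odds ratio (Woolf's method)."""
    log_or = math.log(odds_ratio(case_risk, case_other, control_risk, control_other))
    standard_error = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)

# Hypothetical allele counts (not from the study):
# cases carry 60 risk and 40 other alleles, controls 50 and 50.
example_or = odds_ratio(60, 40, 50, 50)
example_low, example_high = woolf_confidence_interval(60, 40, 50, 50)
print(example_or, example_low, example_high)
```

An interval that spans 1, as it does for these invented counts, corresponds to an association that does not reach significance at the 5% level.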
Affiliation(s)
- Takenori Mikami
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan; Yokosuka Chuoh Eye Clinic, Kanagawa, Japan
- Akira Meguro
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan
- Takeshi Teshigawara
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan; Yokosuka Chuoh Eye Clinic, Kanagawa, Japan
- Masaki Takeuchi
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan
- Riyo Uemoto
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan
- Tatsukata Kawagoe
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan
- Eiichi Nomura
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan
- Yuri Asukata
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan
- Misaki Ishioka
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan; Ryogoku Eye Clinic, Tokyo, Japan
- Kenji Konomi
- Department of Ophthalmology, Tokyo Dental College, Ichikawa General Hospital, Chiba, Japan
- Jun Shimazaki
- Department of Ophthalmology, Tokyo Dental College, Ichikawa General Hospital, Chiba, Japan
- Teruo Nishida
- Department of Biomolecular Recognition and Ophthalmology, Yamaguchi University School of Medicine, Yamaguchi, Japan
- Nobuhisa Mizuki
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Kanagawa, Japan
|
328
|
Cheng F, Li W, Wang X, Zhou Y, Wu Z, Shen J, Tang Y. Adverse drug events: database construction and in silico prediction. J Chem Inf Model 2013; 53:744-52. [PMID: 23521697 DOI: 10.1021/ci4000079] [Citation(s) in RCA: 88] [Impact Index Per Article: 7.3] [Indexed: 12/29/2022]
Abstract
Adverse drug events (ADEs) are the harms associated with uses of given medications at normal dosages, which are crucial for a drug to be approved in clinical use or continue to stay on the market. Many ADEs are not identified in trials until the drug is approved for clinical use, which results in adverse morbidity and mortality. To date, millions of ADEs have been reported around the world. Methods to avoid or reduce ADEs are an important issue for drug discovery and development. Here, we report a comprehensive database of adverse drug events (namely MetaADEDB), which includes more than 520,000 drug-ADE associations among 3059 unique compounds (including 1330 drugs) and 13,200 ADE items obtained by data integration and text mining. All compounds and ADEs were annotated with the most commonly used concepts defined in Medical Subject Headings (MeSH). Meanwhile, a computational method, namely the phenotypic network inference model (PNIM), was developed for prediction of potential ADEs based on the database. The area under the receiver operating characteristic curve (AUC) is more than 0.9 by 10-fold cross validation, while the AUC value was 0.912 for an external validation set extracted from the US-FDA Adverse Events Reporting System, which indicated that the prediction capability of the method was reliable. MetaADEDB is accessible free of charge at http://www.lmmd.org/online_services/metaadedb/. The database and the method provide a useful tool to search for known side effects or predict potential side effects for a given drug or compound.
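The AUC values quoted above summarise how well predicted scores rank true drug-ADE associations above non-associations. A minimal sketch of the computation follows, using the equivalence between the AUC and the probability that a randomly chosen positive outscores a randomly chosen negative; the labels and scores are made up for illustration and have nothing to do with MetaADEDB or PNIM output.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 1 (true association) or 0 (no association)
    scores: predicted scores, higher meaning more likely to be positive
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predictions for six drug-ADE pairs.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = roc_auc(example_labels, example_scores)
print(example_auc)  # 8 of the 9 positive/negative pairs are ranked correctly
```

The pairwise form is quadratic in the number of examples, which is fine for a demonstration; a rank-based implementation would be preferable on data of the size described in the abstract.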
Affiliation(s)
- Feixiong Cheng
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
|
329
|
George Priya Doss C, Nagasundaram N, Chakraborty C, Chen L, Zhu H. Extrapolating the effect of deleterious nsSNPs in the binding adaptability of flavopiridol with CDK7 protein: a molecular dynamics approach. Hum Genomics 2013; 7:10. [PMID: 23561625 PMCID: PMC3726351 DOI: 10.1186/1479-7364-7-10] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 02/03/2013] [Accepted: 02/18/2013] [Indexed: 11/22/2022] Open
Abstract
Background Recent reports suggest the role of nonsynonymous single nucleotide polymorphisms (nsSNPs) in cyclin-dependent kinase 7 (CDK7) gene associated with defect in the DNA repair mechanism that may contribute to cancer risk. Among the various inhibitors developed so far, flavopiridol proved to be a potential antitumor drug in the phase-III clinical trial for chronic lymphocytic leukemia. Here, we described a theoretical assessment for the discovery of new drugs or drug targets in CDK7 protein owing to the changes caused by deleterious nsSNPs. Methods Three nsSNPs (I63R, H135R, and T285M) were predicted to have functional impact on protein function by SIFT, PolyPhen2, I-Mutant3, PANTHER, SNPs&GO, PhD-SNP, and screening for non-acceptable polymorphisms (SNAP). Furthermore, we analyzed the native and proposed mutant models in atomic level 10 ns simulation using the molecular dynamics (MD) approach. Finally, with the aid of Autodock 4.0 and PatchDock, we analyzed the binding efficacy of flavopiridol with CDK7 protein with respect to the deleterious mutations. Results By comparing the results of all seven prediction tools, three nsSNPs (I63R, H135R, and T285M) were predicted to have functional impact on the protein function. The results of protein stability analysis inferred that I63R and H135R exhibited less deviation in root mean square deviation in comparison with the native and T285M protein. The flexibility of all three mutant models of CDK7 protein is diverse in comparison with the native protein. Following that, the docking study revealed a change in the active site residues and a decrease in the binding affinity of flavopiridol with mutant proteins. Conclusion This theoretical approach is entirely based on computational methods, which has the ability to identify the disease-related SNPs in complex disorders by contrasting their costs and capabilities with those of the experimental methods. The identification of disease related SNPs by computational methods has the potential to create personalized tools for the diagnosis, prognosis, and treatment of diseases. Lay abstract Cell cycle regulatory protein, CDK7, is linked with DNA repair mechanism which can contribute to cancer risk. The main aim of this study is to extrapolate the relationship between the nsSNPs and their effects in drug-binding capability. In this work, we propose a new methodology which (1) efficiently identified the deleterious nsSNPs that tend to have functional effect on protein function upon mutation by computational tools, (2) analyzed the native protein and proposed mutant models in atomic level using MD approach, and (3) investigated the protein-ligand interactions to analyze the binding ability by docking analysis. This theoretical approach is entirely based on computational methods, which has the ability to identify the disease-related SNPs in complex disorders by contrasting their costs and capabilities with those of the experimental methods. Overall, this approach has the potential to create personalized tools for the diagnosis, prognosis, and treatment of diseases.
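The stability comparison in this abstract rests on the root mean square deviation (RMSD) between atomic coordinates of a simulated structure and a reference. The following sketch shows the basic formula only. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the coordinates are invented, and real MD analysis packages perform the alignment step that is omitted here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized sets of 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical two-atom "structures": every atom is displaced by 1 unit along z.
reference_frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
displaced_frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
example_rmsd = rmsd(reference_frame, displaced_frame)
print(example_rmsd)
```

Evaluating this quantity frame by frame along a trajectory gives the RMSD-versus-time curves typically used to compare native and mutant models.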
Affiliation(s)
- C George Priya Doss
- Medical Biotechnology Division, Centre for Nanobiotechnology, School of Biosciences and Technology, VIT University, Vellore 632014, Tamil Nadu 632014, India.
|
330
|
Abstract
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks." The annotations generated by the UCSC Genome Bioinformatics Group and external collaborators include gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload personal datasets in a wide variety of formats as custom annotation tracks in both browsers for research or educational purposes. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
Affiliation(s)
- Donna Karolchik
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, USA
|
331
|
Nguyen H, Luu TD, Poch O, Thompson JD. Knowledge discovery in variant databases using inductive logic programming. Bioinform Biol Insights 2013; 7:119-31. [PMID: 23589683 PMCID: PMC3615990 DOI: 10.4137/bbi.s11184] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022] Open
Abstract
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.
Affiliation(s)
- Hoan Nguyen
- Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire Illkirch, France
|
332
|
Besio R, Gioia R, Cossu F, Monzani E, Nicolis S, Cucca L, Profumo A, Casella L, Tenni R, Bolognesi M, Rossi A, Forlino A. Kinetic and structural evidences on human prolidase pathological mutants suggest strategies for enzyme functional rescue. PLoS One 2013; 8:e58792. [PMID: 23516557 PMCID: PMC3596340 DOI: 10.1371/journal.pone.0058792] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 11/21/2012] [Accepted: 02/06/2013] [Indexed: 12/17/2022] Open
Abstract
Prolidase is the only human enzyme responsible for the digestion of iminodipeptides containing proline or hydroxyproline at their C-terminal end, being a key player in extracellular matrix remodeling. Prolidase deficiency (PD) is an intractable loss-of-function disease, characterized by mutations in the prolidase gene. The exact causes of activity impairment in mutant prolidase are still unknown. We generated three recombinant prolidase forms, hRecProl-231delY, hRecProl-E412K and hRecProl-G448R, reproducing three mutations identified in homozygous PD patients. The enzymes showed very low catalytic efficiency, thermal instability and changes in protein conformation. No variation of Mn(II) cofactor affinity was detected for hRecProl-E412K; a compromised ability to bind the cofactor was found in hRecProl-231delY and Mn(II) was totally absent in hRecProl-G448R. Furthermore, local structure perturbations for all three mutants were predicted by in silico analysis. Our biochemical investigation of the three causative alleles identified perturbed folding/instability, and the consequent partial prolidase degradation, as the main reasons for enzyme inactivity. Based on the above considerations we were able to rescue part of the prolidase activity in patients’ fibroblasts through the induction of heat shock protein expression, hinting at new promising avenues for PD treatment.
Affiliation(s)
- Roberta Besio
- Department of Molecular Medicine, Biochemistry Unit, University of Pavia, Pavia, Italy
- Roberta Gioia
- Department of Molecular Medicine, Biochemistry Unit, University of Pavia, Pavia, Italy
- Federica Cossu
- Department of BioSciences, CNR-IBF and CIMAINA, University of Milano, Milano, Italy
- Enrico Monzani
- Department of Chemistry, University of Pavia, Pavia, Italy
- Lucia Cucca
- Department of Chemistry, University of Pavia, Pavia, Italy
- Luigi Casella
- Department of Chemistry, University of Pavia, Pavia, Italy
- Ruggero Tenni
- Department of Molecular Medicine, Biochemistry Unit, University of Pavia, Pavia, Italy
- Martino Bolognesi
- Department of BioSciences, CNR-IBF and CIMAINA, University of Milano, Milano, Italy
- Antonio Rossi
- Department of Molecular Medicine, Biochemistry Unit, University of Pavia, Pavia, Italy
- Antonella Forlino
- Department of Molecular Medicine, Biochemistry Unit, University of Pavia, Pavia, Italy
|
333
|
Mrozek D, Małysiak-Mrozek B, Siążnik A. search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information. BMC Bioinformatics 2013; 14:73. [PMID: 23452691 PMCID: PMC3602006 DOI: 10.1186/1471-2105-14-73] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 08/01/2012] [Accepted: 02/22/2013] [Indexed: 11/27/2022] Open
Abstract
Background Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. Results We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user’s query, advanced data searching based on the specified user’s query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. 
Conclusions search GenBank extends standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in the search GenBank is a unique feature and has a great potential. The potential will further grow in the future with the increasing density of networks of relationships between data stored in particular databases. search GenBank is available for public use at http://sgb.biotools.pl/.
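The idea of chaining eUtils calls described in this abstract can be illustrated with a minimal sketch. It only builds request URLs for the public ESearch and ELink endpoints and does not contact NCBI. The `macro` function, its arguments, and the placeholder IDs are hypothetical names introduced for illustration; they are not taken from search GenBank itself, whose internal design the abstract does not specify.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI Entrez Programming Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch request: find record IDs in `db` matching `term`."""
    return f"{EUTILS_BASE}/esearch.fcgi?" + urlencode(
        {"db": db, "term": term, "retmax": retmax}
    )


def elink_url(dbfrom, db, ids):
    """Build an ELink request: follow links from `ids` in `dbfrom` to `db`."""
    return f"{EUTILS_BASE}/elink.fcgi?" + urlencode(
        {"dbfrom": dbfrom, "db": db, "id": ",".join(ids)}
    )


def macro(term, path, ids_per_step):
    """Hypothetical 'macro': one search followed by a chain of database hops.

    `path` is a list of database names, e.g. ["gene", "protein"].
    `ids_per_step` holds placeholder IDs standing in for the IDs that each
    response would return; a real client would parse them from the reply.
    """
    urls = [esearch_url(path[0], term)]
    for (src, dst), ids in zip(zip(path, path[1:]), ids_per_step):
        urls.append(elink_url(src, dst, ids))
    return urls


# Example: search the gene database, then hop to linked protein records.
for url in macro("BRCA1[Gene Name]", ["gene", "protein"], [["672"]]):
    print(url)
```

A real client would send each request, read the returned identifiers, and feed them into the next hop instead of using fixed placeholders.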
Affiliation(s)
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Akademicka 16, Gliwice, 44-100, Poland.
|
334
|
Portales-Casamar E, Ch'ng C, Lui F, St-Georges N, Zoubarev A, Lai AY, Lee M, Kwok C, Kwok W, Tseng L, Pavlidis P. Neurocarta: aggregating and sharing disease-gene relations for the neurosciences. BMC Genomics 2013; 14:129. [PMID: 23442263 PMCID: PMC3599981 DOI: 10.1186/1471-2164-14-129] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 07/31/2012] [Accepted: 02/23/2013] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Understanding the genetic basis of diseases is key to the development of better diagnoses and treatments. Unfortunately, only a small fraction of the existing data linking genes to phenotypes is available through online public resources and, when available, it is scattered across multiple access tools. DESCRIPTION Neurocarta is a knowledgebase that consolidates information on genes and phenotypes across multiple resources and allows tracking and exploring of the associations. The system enables automatic and manual curation of evidence supporting each association, as well as user-enabled entry of their own annotations. Phenotypes are recorded using controlled vocabularies such as the Disease Ontology to facilitate computational inference and linking to external data sources. The gene-to-phenotype associations are filtered by stringent criteria to focus on the annotations most likely to be relevant. Neurocarta is constantly growing and currently holds more than 30,000 lines of evidence linking over 7,000 genes to 2,000 different phenotypes. CONCLUSIONS Neurocarta is a one-stop shop for researchers looking for candidate genes for any disorder of interest. In Neurocarta, they can review the evidence linking genes to phenotypes and filter out the evidence they're not interested in. In addition, researchers can enter their own annotations from their experiments and analyze them in the context of existing public annotations. Neurocarta's in-depth annotation of neurodevelopmental disorders makes it a unique resource for neuroscientists working on brain development.
Affiliation(s)
- Elodie Portales-Casamar
- Centre for High-Throughput Biology and Department of Psychiatry, University of British Columbia, 2125 East Mall, Vancouver, BC V6T1Z4, Canada
|
335
|
Distinct genomic aberrations between low-grade and high-grade gliomas of Chinese patients. PLoS One 2013; 8:e57168. [PMID: 23451178 PMCID: PMC3579804 DOI: 10.1371/journal.pone.0057168] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 09/25/2012] [Accepted: 01/17/2013] [Indexed: 11/19/2022] Open
Abstract
Background Glioma is a type of tumor that develops in the central nervous system, mainly the brain. Alterations of genomic sequence and sequence segments (such as copy number variations or CNV and copy neutral loss of heterozygosities or cnLOH) are thought to be a major determinant of the tumor grade. Methods We mapped genomic variations between low-grade and high-grade gliomas (LGG and HGG) in a Chinese population based on Illumina’s Beadchip and validated the results using real-time qPCR. Results At the cytoband level, we discovered: (1) unique losses in LGG on 5q, 8p and 11q, and in HGG on 6q, 11p, 13q and 19q; (2) unique gains in LGG on 1p and in HGG on 5p, 7p, 7q and 20q; and (3) cnLOH in HGG only on 3q, 8q, 10p, 14q, 15q, 17p, 17q, 18q and 21q. Subsequently, we confirmed well-characterized oncogenes among tumor-related loci (such as EGFR and KIT) and detected novel genes that gained chromosome sequences (such as AASS, HYAL4, NDUFA5 and SPAM1) in both LGG and HGG. In addition, we found gains, losses, and cnLOH in several genes, including VN1R2, VN1R4, and ZNF677, in multiple samples. Mapping grade-associated pathways and their related gene ontology (GO) terms, we classified LGG-associated functions as “arachidonic acid metabolism”, “DNA binding” and “regulation of DNA-dependent transcription” and HGG-associated functions as “neuroactive ligand-receptor interaction”, “neuronal cell body” and “defense response to bacterium”. Conclusion LGG and HGG appear to have different molecular signatures in genomic variations and our results provide invaluable information for the diagnosis and treatment of gliomas in patients with variable duration or diverse tumor differentiation.
|
336
|
Reyes-Palomares A, Rodríguez-López R, Ranea JAG, Jiménez FS, Medina MA. Global analysis of the human pathophenotypic similarity gene network merges disease module components. PLoS One 2013; 8:e56653. [PMID: 23437198 PMCID: PMC3578923 DOI: 10.1371/journal.pone.0056653] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 08/29/2012] [Accepted: 01/12/2013] [Indexed: 12/22/2022] Open
Abstract
The molecular complexity of genetic diseases requires novel approaches to break it down into coherent biological modules. For this purpose, many disease network models have been created and analyzed. We highlight two of them, "the human diseases networks" (HDN) and "the orphan disease networks" (ODN). However, in these models, each single node represents one disease or an ambiguous group of diseases. In these cases, the notion of diseases as unique entities reduces the usefulness of network-based methods. We hypothesize that using the clinical features (pathophenotypes) to define pathophenotypic connections between disease-causing genes improves our understanding of the molecular events originated by genetic disturbances. For this, we have built a pathophenotypic similarity gene network (PSGN) and compared it with the unipartite projections (based on gene-to-gene edges) similar to those used in previous network models (HDN and ODN). Unlike these disease network models, the PSGN uses semantic similarities. This pathophenotypic similarity has been calculated by comparing pathophenotypic annotations of genes (human abnormalities of HPO terms) in the "Human Phenotype Ontology". The resulting network contains 1075 genes (nodes) and 26197 significant pathophenotypic similarities (edges). A global analysis of this network reveals: unnoticed pairs of genes showing significant pathophenotypic similarity, a biologically meaningful re-arrangement of the pathological relationships between genes, correlations of biochemical interactions with higher similarity scores and functional biases in metabolic and essential genes toward the pathophenotypic specificity and the pleiotropy, respectively.
Additionally, pathophenotypic similarities and metabolic interactions of genes associated with maple syrup urine disease (MSUD) have been used to merge these genes into a coherent pathological module. Our results indicate that pathophenotypes help to identify underlying co-dependencies among disease-causing genes that are useful to describe disease modularity.
Affiliation(s)
- Armando Reyes-Palomares
- Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
- CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Rocío Rodríguez-López
- Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
- CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Juan A. G. Ranea
- Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
- CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Francisca Sánchez Jiménez
- Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
- CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Miguel Angel Medina
- Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
- CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
|
337
|
Wittkop T, TerAvest E, Evani US, Fleisch KM, Berman AE, Powell C, Shah NH, Mooney SD. STOP using just GO: a multi-ontology hypothesis generation tool for high throughput experimentation. BMC Bioinformatics 2013; 14:53. [PMID: 23409969 PMCID: PMC3635999 DOI: 10.1186/1471-2105-14-53] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 01/26/2012] [Accepted: 01/28/2013] [Indexed: 12/21/2022] Open
Abstract
Background Gene Ontology (GO) enrichment analysis remains one of the most common methods for hypothesis generation from high throughput datasets. However, we believe that researchers strive to test other hypotheses that fall outside of GO. Here, we developed and evaluated a tool for hypothesis generation from gene or protein lists using ontological concepts present in manually curated text that describes those genes and proteins. Results As a consequence we have developed the method Statistical Tracking of Ontological Phrases (STOP) that expands the realm of testable hypotheses in gene set enrichment analyses by integrating automated annotations of genes to terms from over 200 biomedical ontologies. While not as precise as manually curated terms, we find that the additional enriched concepts have value when coupled with traditional enrichment analyses using curated terms. Conclusion Multiple ontologies have been developed for gene and protein annotation; by using a dataset of both manually curated GO terms and automatically recognized concepts from curated text, we can expand the realm of hypotheses that can be discovered. The web application STOP is available at http://mooneygroup.org/stop/.
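As a point of reference for the enrichment analyses mentioned in this abstract, the following is a minimal sketch of a generic over-representation test based on the hypergeometric distribution. The abstract does not state which statistic STOP uses, so this is an assumed, commonly used choice and not the tool's own method; the function name and parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: number of genes in the background
    K: background genes annotated with the term
    n: number of genes in the study list
    k: study-list genes annotated with the term
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability of seeing k, k+1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 40 of 1000 background genes carry the term,
# and 8 of the 50 genes in the study list do.
print(enrichment_p_value(N=1000, K=40, n=50, k=8))
```

When many terms from many ontologies are tested at once, the resulting p-values would normally need a multiple-testing correction before interpretation.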
|
338
|
Zhang Z, Witham S, Petukh M, Moroy G, Miteva M, Ikeguchi Y, Alexov E. A rational free energy-based approach to understanding and targeting disease-causing missense mutations. J Am Med Inform Assoc 2013; 20:643-51. [PMID: 23408511 DOI: 10.1136/amiajnl-2012-001505] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND AND SIGNIFICANCE Intellectual disability is a condition characterized by significant limitations in cognitive abilities and social/behavioral adaptive skills and is an important reason for pediatric, neurologic, and genetic referrals. Approximately 10% of protein-encoding genes on the X chromosome are implicated in intellectual disability, and the corresponding intellectual disability is termed X-linked ID (XLID). Although few mutations and a small number of families have been identified and XLID is rare, collectively the impact of XLID is significant because patients usually are unable to fully participate in society. OBJECTIVE To reveal the molecular mechanisms of various intellectual disabilities and to suggest small molecules which by binding to the malfunctioning protein can reduce unwanted effects. METHODS Using various in silico methods we reveal the molecular mechanism of XLID in cases involving proteins with known 3D structure. The 3D structures were used to predict the effect of disease-causing missense mutations on the folding free energy, conformational dynamics, hydrogen bond network and, if appropriate, protein-protein binding free energy. RESULTS It is shown that the vast majority of XLID mutation sites are outside the active pocket and are accessible from the water phase, thus providing the opportunity to alter their effect by binding appropriate small molecules in the vicinity of the mutation site. CONCLUSIONS This observation is used to demonstrate, computationally and experimentally, that a particular condition, Snyder-Robinson syndrome caused by the G56S spermine synthase mutation, might be ameliorated by small molecule binding.
Affiliation(s)
- Zhe Zhang
- Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, Clemson, South Carolina 29634, USA
|
339
|
Pabinger S, Dander A, Fischer M, Snajder R, Sperk M, Efremova M, Krabichler B, Speicher MR, Zschocke J, Trajanoski Z. A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform 2013; 15:256-78. [PMID: 23341494 PMCID: PMC3956068 DOI: 10.1093/bib/bbs086] [Citation(s) in RCA: 335] [Impact Index Per Article: 27.9] [Indexed: 01/11/2023] Open
Abstract
Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and has led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers.
Affiliation(s)
- Stephan Pabinger
- Division for Bioinformatics, Innsbruck Medical University, Innrain 80, 6020 Innsbruck, Austria. Tel.: +43-512-9003-71401; Fax: +43-512-9003-73100;
|
340
|
Troubleshooting and deconvoluting label-free cell phenotypic assays in drug discovery. J Pharmacol Toxicol Methods 2013; 67:69-81. [PMID: 23340025 DOI: 10.1016/j.vascn.2013.01.004] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 09/12/2012] [Revised: 12/10/2012] [Accepted: 01/04/2013] [Indexed: 01/04/2023]
Abstract
INTRODUCTION Central to drug discovery and development is comprehending the target(s), potency, efficacy and safety of drug molecules using pharmacological assays. Owing to their ability to provide a holistic view of drug actions in native cells, label-free biosensor-enabled cell phenotypic assays have been emerging as new-generation phenotypic assays for drug discovery. Despite the benefits associated with wide pathway coverage, high sensitivity, high information content, non-invasiveness and real-time kinetics, label-free cell phenotypic assays are often viewed as a black box in the era of target-centric drug discovery. METHODS This article first reviews the biochemical and biological complexity of drug-target interactions, then discusses the key characteristics of label-free cell phenotypic assays and presents a five-step strategy for troubleshooting and deconvoluting the label-free cell phenotypic profiles of drugs. RESULTS Drug-target interactions are intrinsically complicated. Label-free cell phenotypic signatures of drugs mirror the innate complexity of drug-target interactions, and can be effectively deconvoluted using the five-step strategy. DISCUSSION The past decades have witnessed dramatic expansion of pharmacological assays ranging from molecular to phenotypic assays, which is coincident with the realization of the innate complexity of drug-target interactions. The clinical features of a drug are defined by how it operates at the system level and by its distinct polypharmacology, on-target, phenotypic and network pharmacology. Approaches to examine the biochemical, cellular and molecular mechanisms of action of drugs are essential to increase the efficiency of drug discovery and development. Label-free cell phenotypic assays and the troubleshooting and deconvoluting approach presented here may hold great promise in drug discovery and development.
|
341
|
Ghersi D, Singh M. Disentangling function from topology to infer the network properties of disease genes. BMC SYSTEMS BIOLOGY 2013; 7:5. [PMID: 23324116 PMCID: PMC3614482 DOI: 10.1186/1752-0509-7-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 10/22/2012] [Accepted: 01/04/2013] [Indexed: 12/20/2022]
Abstract
BACKGROUND The topological features of disease genes within interaction networks are the subject of intense study, as they shed light on common mechanisms of pathology and are useful for uncovering additional disease genes. Computational analyses typically try to uncover whether disease genes exhibit distinct network features, as compared to all genes. RESULTS We demonstrate that the functional composition of disease gene sets is an important confounding factor in these types of analyses. We consider five disease sets and show that while they indeed have distinct topological features, they are also enriched in functions that a priori exhibit distinct network properties. To address this, we develop a computational framework to assess the network properties of disease genes based on a sampling algorithm that generates control gene sets that are functionally similar to the disease set. Using our function-constrained sampling approach, we demonstrate that for most of the topological properties studied, disease genes are more similar to sets of genes with similar functional make-up than they are to randomly selected genes; this suggests that these observed differences in topological properties reflect not only the distinguishing network features of disease genes but also their functional composition. Nevertheless, we also highlight many cases where disease genes have distinct topological properties even when accounting for function. CONCLUSIONS Our approach is an important first step in extracting the residual topological differences in disease genes when accounting for function, and leads to new insights into the network properties of disease genes.
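The function-constrained sampling idea in this abstract can be sketched in a simplified form. The version below assumes each gene carries exactly one functional label, which is a strong simplification of real many-to-many annotations, and it is not the authors' algorithm. All function names, parameters and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def function_matched_sample(disease_genes, gene_function, rng):
    """Draw one control set with the same per-function counts as the disease set.

    gene_function maps every gene to a single functional label (assumed).
    Raises ValueError if a label has too few non-disease genes to sample from.
    """
    pools = defaultdict(list)
    for gene, label in gene_function.items():
        if gene not in disease_genes:
            pools[label].append(gene)
    wanted = Counter(gene_function[g] for g in disease_genes)
    control = []
    for label, count in wanted.items():
        control.extend(rng.sample(pools[label], count))
    return control


def empirical_p(observed, control_values):
    """Fraction of control sets whose statistic is at least the observed one."""
    hits = sum(v >= observed for v in control_values)
    return (hits + 1) / (len(control_values) + 1)


# Toy data: 20 genes, two functional labels, and a made-up degree per gene.
gene_function = {f"g{i}": ("kinase" if i < 10 else "tf") for i in range(20)}
degree = {f"g{i}": i % 7 + 1 for i in range(20)}
disease = {"g0", "g1", "g10"}

rng = random.Random(0)
observed = sum(degree[g] for g in disease) / len(disease)
controls = [
    function_matched_sample(disease, gene_function, rng) for _ in range(200)
]
control_means = [sum(degree[g] for g in c) / len(c) for c in controls]
print(empirical_p(observed, control_means))
```

Comparing the disease set against function-matched controls, instead of against uniformly random genes, is what separates a topological signal from one that merely reflects functional composition.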
Affiliation(s)
- Dario Ghersi
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
|
342
|
Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, Sloan CA, Rosenbloom KR, Roe G, Rhead B, Raney BJ, Pohl A, Malladi VS, Li CH, Lee BT, Learned K, Kirkup V, Hsu F, Heitner S, Harte RA, Haeussler M, Guruvadoo L, Goldman M, Giardine BM, Fujita PA, Dreszer TR, Diekhans M, Cline MS, Clawson H, Barber GP, Haussler D, Kent WJ. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res 2013; 41:D64-9. [PMID: 23155063 PMCID: PMC3531082 DOI: 10.1093/nar/gks1048] [Citation(s) in RCA: 618] [Impact Index Per Article: 51.5] [Received: 09/19/2012] [Accepted: 10/08/2012] [Indexed: 11/14/2022] Open
Abstract
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation 'tracks' are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.
Affiliation(s)
- Laurence R. Meyer
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
- Ann S. Zweig
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
- Angie S. Hinrichs
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Donna Karolchik
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Robert M. Kuhn
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Matthew Wong
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Cricket A. Sloan
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Kate R. Rosenbloom
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Greg Roe
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brooke Rhead
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brian J. Raney
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Andy Pohl
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Venkat S. Malladi
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Chin H. Li
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Brian T. Lee
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Katrina Learned
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Vanessa Kirkup
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Fan Hsu
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Steve Heitner
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Rachel A. Harte
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Luvina Guruvadoo
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Mary Goldman
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Belinda M. Giardine
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Pauline A. Fujita
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Timothy R. Dreszer
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Melissa S. Cline
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - Galt P. Barber
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - David Haussler
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| | - W. James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader, 88, 08003 Barcelona, Spain, Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802 and Howard Hughes Medical Institute, UCSC, Santa Cruz, CA 95064, USA
| |
Collapse
|
343
|
K B, Purohit R. Mutational analysis of TYR gene and its structural consequences in OCA1A. Gene 2013; 513:184-95. [DOI: 10.1016/j.gene.2012.09.128] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 06/25/2012] [Revised: 09/01/2012] [Accepted: 09/23/2012] [Indexed: 01/19/2023]
|
344
|
Masoodi TA, Al Shammari SA, Al-Muammar MN, Alhamdan AA, Talluri VR. Exploration of deleterious single nucleotide polymorphisms in late-onset Alzheimer disease susceptibility genes. Gene 2013; 512:429-37. [DOI: 10.1016/j.gene.2012.08.026] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/19/2012] [Revised: 07/27/2012] [Accepted: 08/17/2012] [Indexed: 02/03/2023]
|
345
|
Kim KJ, Hwang D, Kim WU. Systems Approach to Rheumatoid Arthritis. Journal of Rheumatic Diseases 2013. [DOI: 10.4078/jrd.2013.20.6.348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Ki-Jo Kim
- Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, The Catholic University of Korea, Suwon, Korea
- Daehee Hwang
- Center for Systems Biology of Plant Senescence and Life History, Daegu Gyeongbuk Institute of Science & Technology, Daegu, Korea
- Wan-Uk Kim
- Division of Rheumatology, Department of Internal Medicine, St. Vincent's Hospital, The Catholic University of Korea, Suwon, Korea
|
346
|
Piro RM, Molineris I, Di Cunto F, Eils R, König R. Disease-gene discovery by integration of 3D gene expression and transcription factor binding affinities. Bioinformatics 2012; 29:468-75. [PMID: 23267172 DOI: 10.1093/bioinformatics/bts720] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/19/2023]
Abstract
MOTIVATION: The computational evaluation of candidate genes for hereditary disorders is a non-trivial task. Several excellent methods for disease-gene prediction have been developed in the past 2 decades, exploiting widely differing data sources to infer disease-relevant functional relationships between candidate genes and disorders. We have shown recently that spatially mapped, i.e. 3D, gene expression data from the mouse brain can be successfully used to prioritize candidate genes for human Mendelian disorders of the central nervous system.
RESULTS: We improved our previous work 2-fold: (i) we demonstrate that condition-independent transcription factor binding affinities of the candidate genes' promoters are relevant for disease-gene prediction and can be integrated with our previous approach to significantly enhance its predictive power; and (ii) we define a novel similarity measure, termed Relative Intensity Overlap, for both 3D gene expression patterns and binding affinity profiles that better exploits their disease-relevant information content. Finally, we present novel disease-gene predictions for eight loci associated with different syndromes of unknown molecular basis that are characterized by mental retardation.
Affiliation(s)
- Rosario M Piro
- Department of Theoretical Bioinformatics, German Cancer Research Center (Deutsches Krebsforschungszentrum, DKFZ), University of Heidelberg, Im 69120 Heidelberg, Germany.
|
347
|
Guo F, Wang D, Liu Z, Lu L, Zhang W, Sun H, Zhang H, Ma J, Wu S, Li N, Jiang Y, Zhu W, Qin J, Xu P, Li D, He F. CAPER: a chromosome-assembled human proteome browsER. J Proteome Res 2012; 12:179-86. [PMID: 23256906 DOI: 10.1021/pr300831z] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
High-throughput mass spectrometry and antibody-based experiments have begun to produce a large number of proteomic data sets. Chromosome-based visualization of these data sets and their annotations can help to integrate, organize, and analyze them effectively. Therefore, we developed a web-based, user-friendly Chromosome-Assembled human Proteome browsER (CAPER). To display proteomic data sets and related annotations comprehensively, CAPER employs two distinct visualization strategies: track-view for the sequence/site information and the correspondence between proteome, transcriptome, genome, and chromosome, and heatmap-view for the qualitative and quantitative functional annotations. CAPER supports data browsing at multiple scales through Google Map-like smooth navigation, zooming, and positioning with chromosomes as the reference coordinate. Track-view and heatmap-view can be switched from one to the other, providing a high-quality user interface. Taken together, CAPER will greatly facilitate the complete annotation and functional interpretation of the human genome by proteomic approaches, thereby making a significant contribution to the Chromosome-Centric Human Proteome Project and even to human physiology/pathology research. CAPER can be accessed at http://www.bprc.ac.cn/CAPE .
Affiliation(s)
- Feifei Guo
- Institute of Basic Medical Sciences and School of Basic Medicine, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100005, China
|
348
|
Beck T, Free RC, Thorisson GA, Brookes AJ. Semantically enabling a genome-wide association study database. J Biomed Semantics 2012; 3:9. [PMID: 23244533 PMCID: PMC3579732 DOI: 10.1186/2041-1480-3-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/03/2012] [Accepted: 08/22/2012] [Indexed: 01/03/2023] Open
Abstract
Background: The amount of data generated from genome-wide association studies (GWAS) has grown rapidly, but considerations for GWAS phenotype data reuse and interchange have not kept pace. This impacts on the work of GWAS Central, a free and open access resource for the advanced querying and comparison of summary-level genetic association data. The benefits of employing ontologies for standardising and structuring data are widely accepted. The complex spectrum of observed human phenotypes (and traits), and the requirement for cross-species phenotype comparisons, calls for reflection on the most appropriate solution for the organisation of human phenotype data. The Semantic Web provides standards for the possibility of further integration of GWAS data and the ability to contribute to the web of Linked Data.
Results: A pragmatic consideration when applying phenotype ontologies to GWAS data is the ability to retrieve all data, at the most granular level possible, from querying a single ontology graph. We found the Medical Subject Headings (MeSH) terminology suitable for describing all traits (diseases and medical signs and symptoms) at various levels of granularity and the Human Phenotype Ontology (HPO) most suitable for describing phenotypic abnormalities (medical signs and symptoms) at the most granular level. Diseases within MeSH are mapped to HPO to infer the phenotypic abnormalities associated with diseases. Building on the rich semantic phenotype annotation layer, we are able to make cross-species phenotype comparisons and publish a core subset of GWAS data as RDF nanopublications.
Conclusions: We present a methodology for applying phenotype annotations to a comprehensive genome-wide association dataset and for ensuring compatibility with the Semantic Web. The annotations are used to assist with cross-species genotype and phenotype comparisons. However, further processing and deconstructions of terms may be required to facilitate automatic phenotype comparisons. The provision of GWAS nanopublications enables a new dimension for exploring GWAS data, by way of intrinsic links to related data resources within the Linked Data web. The value of such annotation and integration will grow as more biomedical resources adopt the standards of the Semantic Web.
Affiliation(s)
- Tim Beck
- Department of Genetics, University of Leicester, University Road, Leicester, UK.
|
349
|
Chiu YY, Lin CT, Huang JW, Hsu KC, Tseng JH, You SR, Yang JM. KIDFamMap: a database of kinase-inhibitor-disease family maps for kinase inhibitor selectivity and binding mechanisms. Nucleic Acids Res 2012. [PMID: 23193279 PMCID: PMC3531076 DOI: 10.1093/nar/gks1218] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 01/02/2023] Open
Abstract
Kinases play central roles in signaling pathways and are promising therapeutic targets for many diseases. Designing selective kinase inhibitors is an emergent and challenging task, because kinases share an evolutionarily conserved ATP-binding site. KIDFamMap (http://gemdock.life.nctu.edu.tw/KIDFamMap/) is the first database to explore kinase-inhibitor families (KIFs) and kinase-inhibitor-disease (KID) relationships for kinase inhibitor selectivity and mechanisms. This database includes 1208 KIFs, 962 KIDs, 55 603 kinase-inhibitor interactions (KIIs), 35 788 kinase inhibitors, 399 human protein kinases, 339 diseases and 638 disease allelic variants. Here, a KIF is defined by three criteria: (i) the kinases in the KIF have significant sequence similarity, (ii) the inhibitors in the KIF have significant topology similarity and (iii) the KIIs in the KIF have significant interaction similarity. The KIIs within a KIF are often conserved on some consensus KIDFamMap anchors, which represent conserved interactions between the kinase subsites and consensus moieties of their inhibitors. Our experimental results reveal that the members of a KIF often possess similar inhibition profiles. The KIDFamMap anchors can reflect kinase conformation types, kinase functions and kinase inhibitor selectivity. We believe that KIDFamMap provides biological insights into kinase inhibitor selectivity and binding mechanisms.
Affiliation(s)
- Yi-Yuan Chiu
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 30050, Taiwan
|
350
|
Peng K, Xu W, Zheng J, Huang K, Wang H, Tong J, Lin Z, Liu J, Cheng W, Fu D, Du P, Kibbe WA, Lin SM, Xia T. The Disease and Gene Annotations (DGA): an annotation resource for human disease. Nucleic Acids Res 2012. [PMID: 23197658 PMCID: PMC3531051 DOI: 10.1093/nar/gks1244] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Indexed: 12/22/2022] Open
Abstract
Disease and Gene Annotations database (DGA, http://dga.nubic.northwestern.edu) is a collaborative effort aiming to provide a comprehensive and integrative annotation of the human genes in disease network context by integrating computable controlled vocabulary of the Disease Ontology (DO version 3 revision 2510, which has 8043 inherited, developmental and acquired human diseases), NCBI Gene Reference Into Function (GeneRIF) and molecular interaction network (MIN). DGA integrates these resources together using semantic mappings to build an integrative set of disease-to-gene and gene-to-gene relationships with excellent coverage based on current knowledge. DGA is kept current by periodically reparsing DO, GeneRIF, and MINs. DGA provides a user-friendly and interactive web interface system enabling users to efficiently query, download and visualize the DO tree structure and annotations as a tree, a network graph or a tabular list. To facilitate integrative analysis, DGA provides a web service Application Programming Interface for integration with external analytic tools.
Affiliation(s)
- Kai Peng
- The Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
|