1
James RA, Campbell IM, Chen ES, Boone PM, Rao MA, Bainbridge MN, Lupski JR, Yang Y, Eng CM, Posey JE, Shaw CA. A visual and curatorial approach to clinical variant prioritization and disease gene discovery in genome-wide diagnostics. Genome Med 2016; 8:13. [PMID: 26838676 PMCID: PMC4736244 DOI: 10.1186/s13073-016-0261-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 10/13/2015] [Accepted: 01/05/2016] [Indexed: 12/22/2022]
Abstract
Background: Genome-wide data are increasingly important in the clinical evaluation of human disease. However, the large number of variants observed in individual patients challenges the efficiency and accuracy of diagnostic review. Recent work has shown that systematic integration of clinical phenotype data with genotype information can improve diagnostic workflows and prioritization of filtered rare variants. We have developed visually interactive, analytically transparent analysis software that leverages existing disease catalogs, such as the Online Mendelian Inheritance in Man database (OMIM) and the Human Phenotype Ontology (HPO), to integrate patient phenotype and variant data into ranked diagnostic alternatives.
Methods: Our tool, “OMIM Explorer” (http://www.omimexplorer.com), extends the biomedical application of semantic similarity methods beyond those reported in previous studies. The tool also provides a simple interface for translating free-text clinical notes into HPO terms, enabling clinical providers and geneticists to contribute phenotypes to the diagnostic process. The visual approach uses semantic similarity with multidimensional scaling to collapse high-dimensional phenotype and genotype data from an individual into a graphical format that contextualizes the patient within a low-dimensional disease map. The map proposes a differential diagnosis and algorithmically suggests potential alternatives for phenotype queries, in essence generating a computationally assisted differential diagnosis informed by the individual’s personal genome. Visual interactivity allows the user to filter and update variant rankings by interacting with intermediate results. The tool also implements an adaptive approach for disease gene discovery based on patient phenotypes.
Results: We retrospectively analyzed pilot cohort data from the Baylor Miraca Genetics Laboratory, demonstrating performance of the tool and workflow in the re-analysis of clinical exomes. Our tool assigned to clinically reported variants a median rank of 2, placing causal variants in the top 1% of filtered candidates across the 47 cohort cases with reported molecular diagnoses of exome variants in OMIM Morbidmap genes. Our tool outperformed Phen-Gen, eXtasy, PhenIX, PHIVE, and hiPHIVE in the prioritization of these clinically reported variants.
Conclusions: Our integrative paradigm can improve efficiency and, potentially, the quality of genomic medicine by more effectively utilizing available phenotype information, catalog data, and genomic knowledge.
Electronic supplementary material: The online version of this article (doi:10.1186/s13073-016-0261-8) contains supplementary material, which is available to authorized users.
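The abstract above describes ranking candidate diagnoses by the semantic similarity between a patient's phenotype terms and catalogued disease annotations. The sketch below illustrates that general idea only. It is not the OMIM Explorer method: the abstract does not name a similarity measure, so a plain Jaccard index over ancestor-closed term sets is swapped in, and the ontology, term labels, disease names, and function names are all hypothetical.

```python
# Minimal sketch: rank diseases by phenotype-set similarity on a toy ontology.
# Assumption: Jaccard overlap of ancestor-closed term sets stands in for the
# (unspecified) semantic similarity measure; all names here are made up.

def with_ancestors(terms, parents):
    """Return the given terms together with all of their ancestors."""
    closed = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(parents.get(term, ()))
    return closed


def jaccard_similarity(terms_a, terms_b, parents):
    """Jaccard index of two term sets after adding ontology ancestors."""
    closed_a = with_ancestors(terms_a, parents)
    closed_b = with_ancestors(terms_b, parents)
    union = closed_a | closed_b
    return len(closed_a & closed_b) / len(union) if union else 0.0


def rank_diseases(patient_terms, disease_annotations, parents):
    """Sort disease names by decreasing similarity to the patient's terms."""
    scored = [
        (jaccard_similarity(patient_terms, terms, parents), name)
        for name, terms in disease_annotations.items()
    ]
    # Ties are broken alphabetically so the ordering is deterministic.
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical child -> parents map (a tiny stand-in for a phenotype ontology).
toy_parents = {
    "seizure": ["neuro"],
    "ataxia": ["neuro"],
    "neuro": ["root"],
    "cardiomyopathy": ["cardiac"],
    "cardiac": ["root"],
}
toy_diseases = {
    "disease_A": {"seizure", "ataxia"},
    "disease_B": {"cardiomyopathy"},
}
toy_patient = {"seizure"}

ranking = rank_diseases(toy_patient, toy_diseases, toy_parents)
print(ranking)
```

Closing each term set under its ancestors lets two different but related terms contribute partial overlap through their shared parent, which is the basic intuition behind ontology-aware similarity.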
Affiliation(s)
- Regis A James
  - Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
- Ian M Campbell
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Edward S Chen
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Philip M Boone
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Mitchell A Rao
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Matthew N Bainbridge
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- James R Lupski
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA; Department of Pediatrics, Texas Children's Hospital, Houston, TX, USA
- Yaping Yang
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA; Baylor Miraca Genetics Laboratories, Baylor College of Medicine, Houston, TX, USA
- Christine M Eng
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA; Baylor Miraca Genetics Laboratories, Baylor College of Medicine, Houston, TX, USA
- Jennifer E Posey
  - Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Chad A Shaw
  - Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA; Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA; Department of Statistics, Rice University, Houston, TX, 77005, USA
2
Walking on a tissue-specific disease-protein-complex heterogeneous network for the discovery of disease-related protein complexes. BioMed Research International 2013; 2013:732650. [PMID: 24455720 PMCID: PMC3888695 DOI: 10.1155/2013/732650] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 09/11/2013] [Accepted: 10/07/2013] [Indexed: 11/29/2022]
Abstract
Besides pinpointing individual disease-related genes, associating protein complexes with human inherited diseases is also of great importance, because a biological function usually arises from the cooperative behaviour of multiple proteins in a protein complex. Moreover, knowledge about disease-related protein complexes could also enhance the inference of disease genes and pathogenic genetic variants. Here, we have designed a computational systems biology approach to systematically analyse potential relationships between diseases and protein complexes. First, we construct a heterogeneous network composed of a disease-disease similarity layer, a tissue-specific protein-protein interaction layer, and a protein complex membership layer. Then, we propose a random walk model on this disease-protein-complex network for identifying protein complexes that are related to a query disease. With a series of leave-one-out cross-validation experiments, we show that our method not only achieves high performance but is also robust to the choice of parameters and to the network structure. We further predict a landscape of associations between human diseases and protein complexes. This landscape can be used to facilitate the inference of disease genes, thereby benefiting studies on the pathology of diseases.
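The random walk idea in the abstract above can be illustrated with a small random walk with restart on a toy network that mixes disease, protein, and complex nodes. This is a sketch under stated assumptions, not the authors' model: the abstract does not give the transition rules or parameters, so a standard restart formulation with uniform edge weights is assumed, and the node names (`D1`, `P1`, `C1`, and so on) and the function name are hypothetical.

```python
# Minimal sketch: random walk with restart on a toy heterogeneous network.
# Assumption: a generic restart walk with a hypothetical restart probability;
# the published model's layers, weights, and jumps may differ.

def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at seeds."""
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the seed nodes.
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            total = sum(neighbours.values())
            if total == 0:
                # A node without edges returns its mass to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * prob[u] * start[n]
                continue
            # Otherwise the walker moves to a neighbour in proportion to edge weight.
            for v, weight in neighbours.items():
                nxt[v] += (1.0 - restart) * prob[u] * weight / total
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Hypothetical network: D* are diseases, P* proteins, C* protein complexes.
toy_network = {
    "D1": {"D2": 1.0},                         # disease-disease similarity
    "D2": {"D1": 1.0, "P1": 1.0},              # known disease-protein link
    "P1": {"D2": 1.0, "P2": 1.0, "C1": 1.0},   # protein-protein interaction, membership
    "P2": {"P1": 1.0, "P3": 1.0, "C1": 1.0},
    "P3": {"P2": 1.0, "C2": 1.0},
    "C1": {"P1": 1.0, "P2": 1.0},
    "C2": {"P3": 1.0},
}

scores = random_walk_with_restart(toy_network, seeds={"D1"})
complex_ranking = sorted(("C1", "C2"), key=lambda c: -scores[c])
print(complex_ranking)
```

In this toy example the query disease `D1` has no direct protein link, yet complex `C1` still scores above `C2` because it lies closer to `D1` through the similar disease `D2`.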
3
Chen Y, Wu X, Jiang R. Integrating human omics data to prioritize candidate genes. BMC Med Genomics 2013; 6:57. [PMID: 24344781 PMCID: PMC3878333 DOI: 10.1186/1755-8794-6-57] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 09/02/2013] [Accepted: 12/12/2013] [Indexed: 01/07/2023]
Abstract
Background: The identification of genes involved in human complex diseases remains a great challenge in computational systems biology. Although methods have been developed to use disease phenotypic similarities with a protein-protein interaction network for the prioritization of candidate genes, other valuable omics data sources have been largely overlooked in these methods.
Methods: With this understanding, we proposed a method called BRIDGE to prioritize candidate genes by integrating disease phenotypic similarities with such omics data as protein-protein interactions, gene sequence similarities, gene expression patterns, gene ontology annotations, and gene pathway memberships. BRIDGE utilizes a multiple regression model with lasso penalty to automatically weight different data sources and is capable of discovering genes associated with diseases whose genetic bases are completely unknown.
Results: We conducted large-scale cross-validation experiments and demonstrated that more than 60% of known disease genes can be ranked first by BRIDGE in simulated linkage intervals, suggesting the superior performance of this method. We further performed two comprehensive case studies by applying BRIDGE to predict novel genes and transcriptional networks involved in obesity and type II diabetes.
Conclusion: The proposed method provides an effective and scalable way of integrating multi-omics data to infer disease genes. Further applications of BRIDGE will help to identify novel disease genes and the underlying mechanisms of human diseases.
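The abstract above says a lasso-penalized multiple regression is used to weight data sources automatically. The sketch below shows how an L1 penalty can drive the weight of an uninformative source to exactly zero. It is a generic coordinate-descent lasso on made-up, centered toy features, not the BRIDGE implementation: the objective scaling, the penalty value `alpha`, the data, and every function name are assumptions made for illustration.

```python
# Minimal sketch: lasso via coordinate descent on toy data.
# Assumed objective: (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1.
# Columns of X stand for hypothetical data sources; values are invented.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Return lasso weights for rows X and targets y."""
    n = len(X)
    d = len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for row, target in zip(X, y):
                pred_without_j = sum(w[k] * row[k] for k in range(d) if k != j)
                rho += row[j] * (target - pred_without_j)
                z += row[j] * row[j]
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Three hypothetical sources with mutually orthogonal, centered feature columns.
toy_X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# Target depends strongly on source 0, very weakly on source 1, not on source 2.
toy_y = [2.0 * r[0] + 0.05 * r[1] for r in toy_X]

weights = lasso_coordinate_descent(toy_X, toy_y, alpha=0.1)
print(weights)
```

Because the weak contribution of the second source falls below the penalty, its weight is set to zero along with the irrelevant third source, leaving only the informative source with a nonzero (slightly shrunken) weight.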
Affiliation(s)
- Rui Jiang
  - Department of Automation, MOE Key Laboratory of Bioinformatics; Bioinformatics Division and Center for Synthetic & Systems Biology, TNLIST, Tsinghua University, Beijing 100084, China
4
Somvanshi PR, Venkatesh KV. A conceptual review on systems biology in health and diseases: from biological networks to modern therapeutics. Systems and Synthetic Biology 2013; 8:99-116. [PMID: 24592295 DOI: 10.1007/s11693-013-9125-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 08/07/2013] [Accepted: 09/10/2013] [Indexed: 12/28/2022]
Abstract
Human physiology is an ensemble of various biological processes spanning from intracellular molecular interactions to the whole-body phenotypic response. Systems biology endeavors to decipher these multi-scale biological networks and to bridge genotype and phenotype. The structure and dynamic properties of these networks are responsible for controlling and deciding the phenotypic state of a cell. Several cells and various tissues coordinate to generate an organ-level response, which further regulates the ultimate physiological state. The overall network embeds a hierarchical regulatory structure which, when unusually perturbed, can lead to an undesirable physiological state termed disease. Here, we treat the disease diagnosis problem as analogous to a fault diagnosis problem in engineering systems. Accordingly, we review the application of engineering methodologies to address human diseases from a systems biology perspective. The review highlights potential networks and modeling approaches used for analyzing human diseases. The application of such analysis is illustrated in the cases of cancer and diabetes. We put forth the concept of a cell-to-human framework comprising five modules (data mining, networking, modeling, experimental and validation) for addressing human physiology and diseases based on a paradigm of system-level analysis. The review emphasizes the importance of multi-scale biological networks and subsequent modeling and analysis for drug target identification and the design of efficient therapies.
Affiliation(s)
- Pramod Rajaram Somvanshi
  - Biosystems Engineering, Department of Chemical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, 400076, Maharashtra, India
- K V Venkatesh
  - Biosystems Engineering, Department of Chemical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, 400076, Maharashtra, India