1
Zeighami Y, Bakken TE, Nickl-Jockschat T, Peterson Z, Jegga AG, Miller JA, Schulkin J, Evans AC, Lein ES, Hawrylycz M. A comparison of anatomic and cellular transcriptome structures across 40 human brain diseases. PLoS Biol 2023; 21:e3002058. [PMID: 37079537 PMCID: PMC10118126 DOI: 10.1371/journal.pbio.3002058] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 03/30/2022] [Accepted: 03/02/2023] [Indexed: 04/21/2023] Open
Abstract
Genes associated with risk for brain disease exhibit characteristic expression patterns that reflect both anatomical and cell type relationships. Brain-wide transcriptomic patterns of disease risk genes provide a molecular-based signature, based on differential co-expression, that is often unique to that disease. Brain diseases can be compared and aggregated based on the similarity of their signatures which often associates diseases from diverse phenotypic classes. Analysis of 40 common human brain diseases identifies 5 major transcriptional patterns, representing tumor-related, neurodegenerative, psychiatric and substance abuse, and 2 mixed groups of diseases affecting basal ganglia and hypothalamus. Further, for diseases with enriched expression in cortex, single-nucleus data in the middle temporal gyrus (MTG) exhibits a cell type expression gradient separating neurodegenerative, psychiatric, and substance abuse diseases, with unique excitatory cell type expression differentiating psychiatric diseases. Through mapping of homologous cell types between mouse and human, most disease risk genes are found to act in common cell types, while having species-specific expression in those types and preserving similar phenotypic classification within species. These results describe structural and cellular transcriptomic relationships of disease risk genes in the adult brain and provide a molecular-based strategy for classifying and comparing diseases, potentially identifying novel disease relationships.
Affiliation(s)
- Yashar Zeighami
  - Douglas Research Centre, Department of Psychiatry, McGill University, Montreal, Canada
  - Montreal Neurological Institute, McGill University, Montreal, Canada
- Trygve E. Bakken
  - Allen Institute for Brain Science, Seattle, Washington, United States of America
- Thomas Nickl-Jockschat
  - Department of Psychiatry, Neuroscience and Pharmacology, Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa, United States of America
- Zeru Peterson
  - Department of Psychiatry, Neuroscience and Pharmacology, Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa, United States of America
- Anil G. Jegga
  - Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
  - Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, Ohio, United States of America
- Jeremy A. Miller
  - Allen Institute for Brain Science, Seattle, Washington, United States of America
- Jay Schulkin
  - Department of Obstetrics and Gynecology, University of Washington, Seattle, Washington, United States of America
- Alan C. Evans
  - Montreal Neurological Institute, McGill University, Montreal, Canada
- Ed S. Lein
  - Allen Institute for Brain Science, Seattle, Washington, United States of America
- Michael Hawrylycz
  - Allen Institute for Brain Science, Seattle, Washington, United States of America
  - University of Washington, Department of Genome Sciences, Seattle, Washington, United States of America
2
Howlett-Prieto Q, Oommen C, Carrithers MD, Wunsch DC, Hier DB. Subtypes of relapsing-remitting multiple sclerosis identified by network analysis. Front Digit Health 2023; 4:1063264. [PMID: 36714613 PMCID: PMC9874946 DOI: 10.3389/fdgth.2022.1063264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 12/22/2022] [Indexed: 01/12/2023] Open
Abstract
We used network analysis to identify subtypes of relapsing-remitting multiple sclerosis subjects based on their cumulative signs and symptoms. The electronic medical records of 113 subjects with relapsing-remitting multiple sclerosis were reviewed, signs and symptoms were mapped to classes in a neuro-ontology, and classes were collapsed into sixteen superclasses by subsumption. After normalization and vectorization of the data, bipartite (subject-feature) and unipartite (subject-subject) network graphs were created using NetworkX and visualized in Gephi. Degree and weighted degree were calculated for each node. Graphs were partitioned into communities using the modularity score. Feature maps visualized differences in features by community. Network analysis of the unipartite graph yielded a higher modularity score (0.49) than the bipartite graph (0.25). The bipartite network was partitioned into five communities which were named fatigue, behavioral, hypertonia/weakness, abnormal gait/sphincter, and sensory, based on feature characteristics. The unipartite network was partitioned into five communities which were named fatigue, pain, cognitive, sensory, and gait/weakness/hypertonia based on features. Although we did not identify pure subtypes (e.g., pure motor, pure sensory, etc.) in this cohort of multiple sclerosis subjects, we demonstrated that network analysis could partition these subjects into different subtype communities. Larger datasets and additional partitioning algorithms are needed to confirm these findings and elucidate their significance. This study contributes to the literature investigating subtypes of multiple sclerosis by combining feature reduction by subsumption with network analysis.
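The projection and modularity steps described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' NetworkX/Gephi pipeline: the toy subjects, the feature labels, the shared-feature edge weighting, and the function names `project_subjects` and `modularity` are all assumptions made for illustration, and the partition is supplied by hand rather than found by a community-detection algorithm.

```python
from itertools import combinations

# Hypothetical toy data: subject -> set of symptom superclasses (not from the study).
subject_features = {
    "s1": {"fatigue", "pain"},
    "s2": {"fatigue", "pain"},
    "s3": {"sensory", "gait"},
    "s4": {"sensory", "gait"},
}


def project_subjects(features):
    """Build unipartite (subject-subject) edges weighted by shared feature count."""
    edges = {}
    for a, b in combinations(sorted(features), 2):
        shared = len(features[a] & features[b])
        if shared:
            edges[(a, b)] = shared
    return edges


def modularity(edges, community):
    """Newman modularity Q of a weighted undirected graph for a node -> community map."""
    total = sum(edges.values())  # total edge weight m
    strength = {}
    for (a, b), w in edges.items():
        strength[a] = strength.get(a, 0) + w
        strength[b] = strength.get(b, 0) + w
    # Fraction of edge weight that falls inside communities.
    inside = sum(w for (a, b), w in edges.items() if community[a] == community[b]) / total
    # Expected fraction if edges were placed at random with the same node strengths.
    comm_strength = {}
    for node, k in strength.items():
        comm_strength[community[node]] = comm_strength.get(community[node], 0) + k
    expected = sum((k / (2 * total)) ** 2 for k in comm_strength.values())
    return inside - expected


edges = project_subjects(subject_features)
partition = {"s1": 0, "s2": 0, "s3": 1, "s4": 1}  # hand-picked partition
q = modularity(edges, partition)
```

A higher `q` means more edge weight falls inside the chosen communities than chance would predict, which is the sense in which the abstract compares the unipartite (0.49) and bipartite (0.25) graphs.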
Affiliation(s)
- Quentin Howlett-Prieto
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Chelsea Oommen
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Michael D. Carrithers
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Donald C. Wunsch
  - Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, United States
- Daniel B. Hier
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
  - Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, United States
  - Correspondence: Daniel B. Hier
3
Yang X, Xu W, Leng D, Wen Y, Wu L, Li R, Huang J, Bo X, He S. Exploring novel disease-disease associations based on multi-view fusion network. Comput Struct Biotechnol J 2023; 21:1807-1819. [PMID: 36923471 PMCID: PMC10009443 DOI: 10.1016/j.csbj.2023.02.038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/02/2022] [Revised: 02/02/2023] [Accepted: 02/22/2023] [Indexed: 03/06/2023] Open
Abstract
Established taxonomy systems based on disease symptoms and tissue characteristics have provided an important basis for physicians to correctly identify diseases and treat them successfully. However, these classifications tend to be based on phenotypic observations, lacking a molecular biological foundation. Therefore, there is an urgent need to integrate multi-dimensional molecular biological information or multi-omics data to redefine disease classification in order to provide a powerful perspective for understanding the molecular structure of diseases. We offer a flexible disease classification that integrates the biological process, gene expression, and symptom phenotype of diseases, and propose a disease-disease association network based on multi-view fusion. We applied the fusion approach to 223 diseases and divided them into 24 disease clusters. The contributions of internal and external edges of disease clusters were analyzed. The results of the fusion model were compared with Medical Subject Headings, a traditional and commonly used disease taxonomy. Experimental results of model performance comparison show that our approach performs better than other integration methods. The obtained clusters provided more interesting and novel disease-disease associations. This multi-view human disease association network describes relationships between diseases based on multiple molecular levels, thus breaking through the limitation of the disease classification system based on tissues and organs. This approach, which motivates clinicians and researchers to reposition the understanding of diseases and explore diagnosis and therapy strategies, extends the existing disease taxonomy. Availability of data and materials: The preprocessed dataset and source code supporting the conclusions of this article are available at GitHub repository https://github.com/yangxiaoxi89/mvHDN.
Affiliation(s)
- Xiaoxi Yang
  - Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Wenjian Xu
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
  - Rare Disease Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China
  - MOE Key Laboratory of Major Diseases in Children, Beijing 100045, China
  - Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, Beijing 100045, China
- Dongjin Leng
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yuqi Wen
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Lianlian Wu
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Ruijiang Li
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Jian Huang
  - Clinical Medicine Institute, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
4
Kalgotra P, Sharda R. When will I get out of the Hospital? Modeling Length of Stay using Comorbidity Networks. J MANAGE INFORM SYST 2022. [DOI: 10.1080/07421222.2021.1990618] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Affiliation(s)
- Pankush Kalgotra
  - Harbert College of Business, Auburn University, Auburn, AL 36849, US
- Ramesh Sharda
  - Vice Dean, Watson Graduate School of Management, Regents Professor of Management Science and Information Systems, Spears School of Business, Oklahoma State University, OK 74078, USA
5
Jing X. The Unified Medical Language System at 30 Years and How It Is Used and Published: Systematic Review and Content Analysis. JMIR Med Inform 2021; 9:e20675. [PMID: 34236337 PMCID: PMC8433943 DOI: 10.2196/20675] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/25/2020] [Revised: 11/25/2020] [Accepted: 07/02/2021] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND The Unified Medical Language System (UMLS) has been a critical tool in biomedical and health informatics, and the year 2021 marks its 30th anniversary. The UMLS brings together many broadly used vocabularies and standards in the biomedical field to facilitate interoperability among different computer systems and applications. OBJECTIVE Despite its longevity, there is no comprehensive publication analysis of the use of the UMLS. Thus, this review and analysis is conducted to provide an overview of the UMLS and its use in English-language peer-reviewed publications, with the objective of providing a comprehensive understanding of how the UMLS has been used in English-language peer-reviewed publications over the last 30 years. METHODS PubMed, ACM Digital Library, and the Nursing & Allied Health Database were used to search for studies. The primary search strategy was as follows: UMLS was used as a Medical Subject Headings term or a keyword or appeared in the title or abstract. Only English-language publications were considered. The publications were screened first, then coded and categorized iteratively, following the grounded theory. The review process followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. RESULTS A total of 943 publications were included in the final analysis. Moreover, 32 publications were categorized into 2 categories; hence the total number of publications before duplicates are removed is 975. 
After analysis and categorization of the publications, UMLS was found to be used in the following emerging themes or areas (the number of publications and their respective percentages are given in parentheses): natural language processing (230/975, 23.6%), information retrieval (125/975, 12.8%), terminology study (90/975, 9.2%), ontology and modeling (80/975, 8.2%), medical subdomains (76/975, 7.8%), other language studies (53/975, 5.4%), artificial intelligence tools and applications (46/975, 4.7%), patient care (35/975, 3.6%), data mining and knowledge discovery (25/975, 2.6%), medical education (20/975, 2.1%), degree-related theses (13/975, 1.3%), digital library (5/975, 0.5%), and the UMLS itself (150/975, 15.4%), as well as the UMLS for other purposes (27/975, 2.8%). CONCLUSIONS The UMLS has been used successfully in patient care, medical education, digital libraries, and software development, as originally planned, as well as in degree-related theses, the building of artificial intelligence tools, data mining and knowledge discovery, foundational work in methodology, and middle layers that may lead to advanced products. Natural language processing, the UMLS itself, and information retrieval are the 3 most common themes that emerged among the included publications. The results, although largely related to academia, demonstrate that UMLS achieves its intended uses successfully, in addition to achieving uses broadly beyond its original intentions.
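The counts and percentages reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers quoted in the abstract; the variable names (`theme_counts`, `included`, `double_counted`, `shares`) are chosen here for illustration.

```python
# Publications per theme, as quoted in the abstract.
theme_counts = {
    "natural language processing": 230,
    "information retrieval": 125,
    "terminology study": 90,
    "ontology and modeling": 80,
    "medical subdomains": 76,
    "other language studies": 53,
    "artificial intelligence tools and applications": 46,
    "patient care": 35,
    "data mining and knowledge discovery": 25,
    "medical education": 20,
    "degree-related theses": 13,
    "digital library": 5,
    "the UMLS itself": 150,
    "the UMLS for other purposes": 27,
}

included = 943        # publications in the final analysis
double_counted = 32   # publications assigned to 2 categories
total = included + double_counted  # denominator used for the percentages

# Percentage share of each theme, rounded to one decimal place.
shares = {theme: round(100 * n / total, 1) for theme, n in theme_counts.items()}
```

The theme counts sum to the same 975 denominator, which is why the percentages are taken over category assignments rather than over the 943 unique publications.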
Affiliation(s)
- Xia Jing
  - Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, United States
6
García Del Valle EP, Lagunes García G, Prieto Santamaría L, Zanin M, Menasalvas Ruiz E, Rodríguez-González A. Leveraging network analysis to evaluate biomedical named entity recognition tools. Sci Rep 2021; 11:13537. [PMID: 34188248 PMCID: PMC8242017 DOI: 10.1038/s41598-021-93018-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/18/2021] [Accepted: 06/18/2021] [Indexed: 02/06/2023] Open
Abstract
The ever-growing availability of biomedical text sources has resulted in a boost in clinical studies based on their exploitation. Biomedical named-entity recognition (bio-NER) techniques have evolved remarkably in recent years and their application in research is increasingly successful. Still, the disparity of tools and the limited available validation resources are barriers preventing a wider diffusion, especially within clinical practice. We here propose the use of omics data and network analysis as an alternative for the assessment of bio-NER tools. Specifically, our method introduces quality criteria based on edge overlap and community detection. The application of these criteria to four bio-NER solutions yielded comparable results to strategies based on annotated corpora, without suffering from their limitations. Our approach can constitute a guide both for the selection of the best bio-NER tool given a specific task, and for the creation and validation of novel approaches.
Affiliation(s)
- Gerardo Lagunes García
  - ETS de Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, Madrid, Spain
  - Centro de Tecnología Biomédica, ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
- Lucía Prieto Santamaría
  - Centro de Tecnología Biomédica, ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
- Massimiliano Zanin
  - Instituto de Física Interdisciplinar y Sistemas Complejos IFISC (CSIC-UIB), Campus UIB, Palma de Mallorca, Spain
- Ernestina Menasalvas Ruiz
  - ETS de Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, Madrid, Spain
  - Centro de Tecnología Biomédica, ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
- Alejandro Rodríguez-González
  - ETS de Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, Madrid, Spain
  - Centro de Tecnología Biomédica, ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
7
Díaz-Santiago E, Jabato FM, Rojano E, Seoane P, Pazos F, Perkins JR, Ranea JAG. Phenotype-genotype comorbidity analysis of patients with rare disorders provides insight into their pathological and molecular bases. PLoS Genet 2020; 16:e1009054. [PMID: 33001999 PMCID: PMC7553355 DOI: 10.1371/journal.pgen.1009054] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/12/2019] [Revised: 10/13/2020] [Accepted: 08/16/2020] [Indexed: 12/15/2022] Open
Abstract
Genetic and molecular analysis of rare disease is made difficult by the small numbers of affected patients. Phenotypic comorbidity analysis can help rectify this by combining information from individuals with similar phenotypes and looking for overlap in terms of shared genes and underlying functional systems. However, few studies have combined comorbidity analysis with genomic data. We present a computational approach that connects patient phenotypes based on phenotypic co-occurrence and uses genomic information related to the patient mutations to assign genes to the phenotypes, which are used to detect enriched functional systems. These phenotypes are clustered using network analysis to obtain functionally coherent phenotype clusters. We applied the approach to the DECIPHER database, containing phenotypic and genomic information for thousands of patients with heterogeneous rare disorders and copy number variants. Validity was demonstrated through overlap with known diseases, co-mention within the biomedical literature, semantic similarity measures, and patient cluster membership. These connected pairs formed multiple phenotype clusters, showing functional coherence, and mapped to genes and systems involved in similar pathological processes. Examples include claudin genes from the 22q11 genomic region associated with a cluster of phenotypes related to DiGeorge syndrome and genes related to the GO term anterior/posterior pattern specification associated with abnormal development. The clusters generated can help with the diagnosis of rare diseases, by suggesting additional phenotypes for a given patient and potential underlying functional systems. Other tools to find causal genes based on phenotype were also investigated. The approach has been implemented as a workflow, named PhenCo, which can be adapted to any set of patients for which phenomic and genomic data is available.
Full details of the analysis, including the clusters formed, their constituent functional systems and underlying genes are given. Code to implement the workflow is available from GitHub. Although rare diseases each affect a small number of people, taken together they affect millions. Better diagnosis and understanding of the underlying mechanisms are needed. By combining phenotypic data for many rare disease patients, we can build clusters of comorbid phenotypes that tend to co-occur. By using genomic information, we can supplement these clusters and look for related genes and functional systems, such as pathways and molecular mechanisms. We applied such an approach to thousands of rare disease patients from the DECIPHER resources. We were able to detect hundreds of pairs of comorbid phenotypes, and use them to build tens of phenotype clusters. By mapping genes to these phenotypes, based on data from the same patients, we were able to detect related genes and functional systems, such as genes mapping to the 22q11 genomic region underlying a cluster of phenotypes related to DiGeorge syndrome. To ensure that these clusters made sensible predictions, results were validated using literature co-mention, overlap with known disease and semantic similarity measures. These comorbidity patterns, along with their underlying molecular systems, can give important insights into disease mechanisms; moreover, they can be used to direct differential diagnosis of rare disease patients.
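The first step described here, pairing phenotypes that co-occur in the same patients, can be sketched as follows. This is not the PhenCo workflow: the abstract does not state the association measure, so plain co-occurrence counts with an arbitrary cutoff are used as a stand-in, and the patient identifiers, phenotype labels, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical patient -> phenotype sets (placeholder labels, not DECIPHER data).
patients = {
    "p1": {"ph_A", "ph_B", "ph_C"},
    "p2": {"ph_A", "ph_B"},
    "p3": {"ph_B", "ph_C"},
    "p4": {"ph_D"},
}


def cooccurrence_counts(patient_phenotypes):
    """Count, for each phenotype pair, how many patients show both phenotypes."""
    counts = Counter()
    for phenotypes in patient_phenotypes.values():
        for pair in combinations(sorted(phenotypes), 2):
            counts[pair] += 1
    return counts


def comorbid_pairs(counts, min_patients=2):
    """Keep pairs seen in at least `min_patients` patients (an arbitrary cutoff)."""
    return {pair for pair, n in counts.items() if n >= min_patients}


counts = cooccurrence_counts(patients)
pairs = comorbid_pairs(counts)
```

The retained pairs form the edges of a phenotype network, which could then be clustered and annotated with genes as the abstract describes.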
Affiliation(s)
- Elena Díaz-Santiago
  - Department of Molecular Biology and Biochemistry, University of Malaga, Malaga, Spain
- Fernando M. Jabato
  - Department of Molecular Biology and Biochemistry, University of Malaga, Malaga, Spain
- Elena Rojano
  - Department of Molecular Biology and Biochemistry, University of Malaga, Malaga, Spain
- Pedro Seoane
  - Department of Molecular Biology and Biochemistry, University of Malaga, Malaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain
- James R. Perkins
  - Department of Molecular Biology and Biochemistry, University of Malaga, Malaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain
  - The Biomedical Research Institute of Malaga (IBIMA), Malaga, Spain
  - * E-mail:
- Juan A. G. Ranea
  - Department of Molecular Biology and Biochemistry, University of Malaga, Malaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), ISCIII, Madrid, Spain
  - The Biomedical Research Institute of Malaga (IBIMA), Malaga, Spain
8
Chen S, Ghandikota S, Gautam Y, Mersha TB. AllergyGenDB: A literature and functional annotation-based omics database for allergic diseases. Allergy 2020; 75:1789-1793. [PMID: 32034783 DOI: 10.1111/all.14219] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/28/2019] [Revised: 01/16/2020] [Accepted: 02/04/2020] [Indexed: 11/30/2022]
Affiliation(s)
- Siqi Chen
  - Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH, USA
- Sudhir Ghandikota
  - Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH, USA
- Yadu Gautam
  - Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Tesfaye B. Mersha
  - Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
  - Department of Pediatrics, University of Cincinnati, Cincinnati, OH, USA
9
Lagunes-García G, Rodríguez-González A, Prieto-Santamaría L, García Del Valle EP, Zanin M, Menasalvas-Ruiz E. DISNET: a framework for extracting phenotypic disease information from public sources. PeerJ 2020; 8:e8580. [PMID: 32110491 PMCID: PMC7032061 DOI: 10.7717/peerj.8580] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/13/2019] [Accepted: 01/16/2020] [Indexed: 12/25/2022] Open
Abstract
Background Within the global endeavour of improving population health, one major challenge is the identification and integration of medical knowledge spread through several information sources. The creation of a comprehensive dataset of diseases and their clinical manifestations based on information from public sources is an interesting approach that allows one not only to complement and merge medical knowledge but also to increase it and thereby to interconnect existing data and analyse and relate diseases to each other. In this paper, we present DISNET (http://disnet.ctb.upm.es/), a web-based system designed to periodically extract the knowledge from signs and symptoms retrieved from medical databases, and to enable the creation of customisable disease networks. Methods We here present the main features of the DISNET system. We describe how information on diseases and their phenotypic manifestations is extracted from Wikipedia and PubMed websites; specifically, texts from these sources are processed through a combination of text mining and natural language processing techniques. Results We further present the validation of our system on Wikipedia and PubMed texts, obtaining the relevant accuracy. The final output includes the creation of a comprehensive symptoms-disease dataset, shared (free access) through the system's API. We finally describe, with some simple use cases, how a user can interact with it and extract information that could be used for subsequent analyses. Discussion DISNET allows retrieving knowledge about the signs, symptoms and diagnostic tests associated with a disease. It is not limited to a specific category (all the categories that the selected sources of information offer us) and clinical diagnosis terms. It further allows to track the evolution of those terms through time, being thus an opportunity to analyse and observe the progress of human knowledge on diseases. 
We further discussed the validation of the system, suggesting that it is good enough to be used to extract diseases and diagnostically-relevant terms. At the same time, the evaluation also revealed that improvements could be introduced to enhance the system's reliability.
Affiliation(s)
- Gerardo Lagunes-García
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
- Alejandro Rodríguez-González
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
  - Escuela Técnica Superior de Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, Madrid, Spain
- Lucía Prieto-Santamaría
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
- Massimiliano Zanin
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
- Ernestina Menasalvas-Ruiz
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, Spain
10
Ljubic B, Pavlovski M, Alshehri J, Roychoudhury S, Bajic V, Van Neste C, Obradovic Z. Comorbidity network analysis and genetics of colorectal cancer. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2020.100492] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/08/2023] Open
11
Qi M, Fan S, Wang Z, Yang X, Xie Z, Chen K, Zhang L, Lin T, Liu W, Lin X, Yan Y, Yang Y, Zhao H. Identifying Common Genes, Cell Types and Brain Regions Between Diseases of the Nervous System. Front Genet 2019; 10:1202. [PMID: 31850066 PMCID: PMC6895906 DOI: 10.3389/fgene.2019.01202] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/14/2019] [Accepted: 10/30/2019] [Indexed: 12/12/2022] Open
Abstract
Background: Diseases of the nervous system are widely considered to be caused by genetic mutations, and they have been shown to share pathogenic genes. Discovering the shared mechanisms of these diseases is useful for designing common treatments. Method: In this study, by reviewing 518 articles published after 2007 on 20 diseases of the nervous system, we compiled data on 1607 mutations occurring in 365 genes, totals that are 1.9 and 3.2 times larger than those collected in the ClinVar database, respectively. A combination with the ClinVar data gives 2434 pathogenic mutations and 424 genes. Using this information, we measured the genetic similarities between the diseases according to the number of genes causing two diseases simultaneously. Further detection was carried out on the similarity between diseases in terms of cell types. Disease-related cell types were defined as those with disease-related gene enrichment among the marker genes of cells, as ascertained by analyzing single-cell sequencing data. Enrichment profiles of the disease-related genes over 25 cell types were constructed. The disease similarity in terms of cell types was obtained by calculating the distances between the enrichment profiles of these genes. The same strategy was applied to measure the disease similarity in terms of brain regions by analyzing the gene expression data from 10 brain regions. Results: The disease similarity was first measured in terms of genes. The result indicated that the proportions of overlapped genes between diseases were significantly correlated to the DMN scores (phenotypic similarity), with a Pearson correlation coefficient of 0.40 and P-value = 6.0 × 10⁻³. The disease similarity analysis for cell types identified that the distances between enrichment profiles of the disease-related genes were negatively correlated to the DMN scores, with Spearman correlation coefficient = -0.26 (P-value = 1.5 × 10⁻²).
However, the brain region enrichment profile distances of the disease-related genes were not significantly correlated with the DMN score. Besides the similarity of diseases, this study identified novel relationships between diseases and cell types. Conclusion: We manually constructed the most comprehensive dataset to date for genes with mutations related to 20 nervous system diseases. By using this dataset, the similarities between diseases in terms of genes and cell types were found to be significantly correlated to their phenotypic similarity. However, the disease similarities in terms of brain regions were not significantly correlated with the phenotypic similarities. Thus, the phenotypic similarity between the diseases is more likely to be caused by dysfunctions of the same genes or the same types of neurons rather than the same brain regions. The data are collected into the database NeurodisM, which is available at http://biomed-ai.org/neurodism.
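The gene-level comparison described in this abstract (proportion of overlapping genes per disease pair, then correlation with a phenotypic similarity score) can be illustrated with a small sketch. It is not the authors' code: the abstract does not define the overlap proportion, so the Jaccard index is assumed here, and the disease names, gene labels, and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical disease -> causal gene sets (placeholder labels, not NeurodisM data).
disease_genes = {
    "disease_A": {"G1", "G2", "G3"},
    "disease_B": {"G2", "G3", "G4"},
    "disease_C": {"G5"},
}


def overlap_proportion(a, b):
    """Shared genes divided by all genes of the two diseases (Jaccard index, assumed)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(genes_by_disease):
    """Gene-overlap similarity for every unordered pair of diseases."""
    return {
        (d1, d2): overlap_proportion(genes_by_disease[d1], genes_by_disease[d2])
        for d1, d2 in combinations(sorted(genes_by_disease), 2)
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


similarity = pairwise_similarity(disease_genes)
```

In use, the values of `similarity` would be paired with a phenotypic similarity score for the same disease pairs and passed to `pearson`; a significance test such as the reported P-values would need an additional step not shown here.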
Affiliation(s)
- Mengling Qi
  - Sun Yat-sen Memorial Hospital, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangzhou, China
- Shichao Fan
  - Sun Yat-sen Memorial Hospital, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangzhou, China
- Zhi Wang
  - Sun Yat-sen Memorial Hospital, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangzhou, China
- Xiaoxing Yang
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Zicong Xie
  - Software Institute, Nanjing University, Nanjing, China
- Ken Chen
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Lei Zhang
  - Department of Hepatobiliary Surgery II, Zhujiang Hospital of Southern Medical University, Guangzhou, China
- Tao Lin
  - Zhongshan Medical College, Sun Yat-sen University, Guangzhou, China
- Wei Liu
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Xinlei Lin
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Yan Yan
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Yuedong Yang
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangzhou, China
12
|
Cheng L, Zhao H, Wang P, Zhou W, Luo M, Li T, Han J, Liu S, Jiang Q. Computational Methods for Identifying Similar Diseases. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 18:590-604. [PMID: 31678735 PMCID: PMC6838934 DOI: 10.1016/j.omtn.2019.09.019] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 04/25/2019] [Revised: 09/11/2019] [Accepted: 09/12/2019] [Indexed: 02/01/2023]
Abstract
Although our knowledge of human diseases has increased dramatically, the molecular basis, phenotypic traits, and therapeutic targets of most diseases still remain unclear. An increasing number of studies have observed that similar diseases often are caused by similar molecules, can be diagnosed by similar markers or phenotypes, or can be cured by similar drugs. Thus, the identification of diseases similar to known ones has attracted considerable attention worldwide. To this end, the associations between diseases at the molecular, phenotypic, and taxonomic levels were used to measure the pairwise similarity in diseases. The corresponding performance assessment strategies for these methods involving the terms “category-based,” “simulated-patient-based,” and “benchmark-data-based” were thus further emphasized. Then, frequently used methods were evaluated using a benchmark-data-based strategy. To facilitate the assessment of disease similarity scores, researchers have designed dozens of tools that implement these methods for calculating disease similarity. Currently, disease similarity has been advantageous in predicting noncoding RNA (ncRNA) function and therapeutic drugs for diseases. In this article, we review disease similarity methods, evaluation strategies, tools, and their applications in the biomedical community. We further evaluate the performance of these methods and discuss the current limitations and future trends for calculating disease similarity.
Affiliation(s)
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hengqiang Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Pingping Wang
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Wenyang Zhou
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Meng Luo
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Tianxin Li
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Junwei Han
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Shulin Liu
  - Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, Heilongjiang, China
  - Department of Microbiology, Immunology and Infectious Diseases, University of Calgary, Calgary, AB, Canada
- Qinghua Jiang
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
|
13
|
Luo L, Zheng C, Wang J, Tan M, Li Y, Xu R. Analysis of disease organ as a novel phenotype towards disease genetics understanding. J Biomed Inform 2019; 95:103235. [PMID: 31207382 PMCID: PMC6644057 DOI: 10.1016/j.jbi.2019.103235] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/03/2018] [Revised: 06/06/2019] [Accepted: 06/13/2019] [Indexed: 11/24/2022]
Abstract
Discerning the modular nature of human diseases through computational approaches calls for diverse data. The finding sites of diseases, like other disease phenotypes, possess rich information for understanding disease genetics. Yet the rich knowledge of disease finding sites has not been comprehensively investigated. In this study, we built a large-scale disease organ network (DON) based on 76,561 disease-organ associations (for 37,615 diseases and 3492 organs) extracted from the Unified Medical Language System (UMLS) Metathesaurus. We investigated how phenotypic organ similarity among diseases in DON reflects disease gene sharing. We constructed a disease genetic network (DGN) using curated disease-gene associations and demonstrated that disease pairs with higher organ similarities not only are more likely to share genes, but also tend to share more genes. Using a community detection algorithm, we showed that phenotypic disease clusters on DON significantly correlated with genetic disease clusters on DGN. We compared DON with a state-of-the-art disease phenotype network, the disease manifestation network (DMN), that we have recently constructed, and demonstrated that DON contains complementary knowledge for disease genetics understanding.
Affiliation(s)
- Lingyun Luo
  - School of Computer Science, University of South China, Hengyang, Hunan 421001, China
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Chunlei Zheng
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Jiaolong Wang
  - School of Computer Science, University of South China, Hengyang, Hunan 421001, China
- Minsheng Tan
  - School of Computer Science, University of South China, Hengyang, Hunan 421001, China
- Yanshu Li
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
- Rong Xu
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA
|
14
|
Chen Y, Xu R. Context-sensitive network analysis identifies food metabolites associated with Alzheimer's disease: an exploratory study. BMC Med Genomics 2019; 12:17. [PMID: 30704467 PMCID: PMC6357669 DOI: 10.1186/s12920-018-0459-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Diet plays an important role in Alzheimer's disease (AD) initiation, progression and outcomes. Previous studies have shown individual food-derived substances may have neuroprotective or neurotoxic effects. However, few works systematically investigate the role of food and food-derived metabolites on the development and progression of AD. METHODS In this study, we systematically investigated 7569 metabolites and identified AD-associated food metabolites using a novel network-based approach. We constructed a context-sensitive network to integrate heterogeneous chemical and genetic data, and to model context-specific inter-relationships among foods, metabolites, human genes and AD. RESULTS Our metabolite prioritization algorithm ranked 59 known AD-associated food metabolites within top 4.9%, which is significantly higher than random expectation. Interestingly, a few top-ranked food metabolites were specifically enriched in herbs and spices. Pathway enrichment analysis shows that these top-ranked herb-and-spice metabolites share many common pathways with AD, including the amyloid processing pathway, which is considered as a hallmark in AD-affected brains and has pathological roles in AD development. CONCLUSIONS Our study represents the first unbiased systems approach to characterizing the effects of food and food-derived metabolites in AD pathogenesis. Our ranking approach prioritizes the known AD-associated food metabolites, and identifies interesting relationships between AD and the food group "herbs and spices". Overall, our study provides intriguing evidence for the role of diet, as an important environmental factor, in AD etiology.
Affiliation(s)
- Yang Chen
  - Department of Population and Quantitative Health Science, School of Medicine, Case Western Reserve University, Cleveland, OH 44106 USA
- Rong Xu
  - Department of Population and Quantitative Health Science, School of Medicine, Case Western Reserve University, Cleveland, OH 44106 USA
|
15
|
Zheng C, Xu R. Large-scale mining disease comorbidity relationships from post-market drug adverse events surveillance data. BMC Bioinformatics 2018; 19:500. [PMID: 30591027 PMCID: PMC6309066 DOI: 10.1186/s12859-018-2468-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/20/2022] Open
Abstract
Background Systems approaches to studying disease relationships have wide applications in biomedical discovery, such as disease mechanism understanding and drug discovery. The FDA Adverse Event Reporting System (FAERS) contains rich information about patient diseases, medications, drug adverse events and demographics of 17 million case reports. Here, we explored this data resource to mine disease comorbidity relationships using an association rule mining algorithm and constructed a disease comorbidity network. Results We constructed a disease comorbidity network with 1059 disease nodes and 12,608 edges using association rule mining of FAERS (14,157 rules). We evaluated the performance of comorbidity mining from FAERS using known disease comorbidities of multiple sclerosis (MS), psoriasis and obesity that represent rare, moderate and common diseases, respectively. Comorbidities of MS, obesity and psoriasis obtained from our network achieved precisions of 58.6%, 73.7% and 56.2% and recalls of 87.5%, 69.2% and 72.7%, respectively. We performed comparative analysis of the disease comorbidity network with a disease semantic network, a disease genetic network and a disease treatment network. We showed that (1) disease comorbidity clusters exhibit significantly higher semantic similarity than a random network (0.18 vs 0.10); (2) disease comorbidity clusters share significantly more genes (0.46 vs 0.06); and (3) disease comorbidity clusters share significantly more drugs (0.64 vs 0.17). Finally, we demonstrated that the disease comorbidity network has potential in uncovering novel disease relationships using asthma as a case study. Conclusions Our study presented the first comprehensive attempt to build a disease comorbidity network from the FDA Adverse Event Reporting System. This network correlates well with disease semantic similarity, disease genetics and disease treatment, and has great potential in disease genetics prediction and drug discovery.
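As a rough illustration of the association rule mining mentioned in this abstract, the sketch below scores pairwise rules of the form "disease X implies disease Y" over a set of case reports using the standard support, confidence, and lift measures. The function name `pairwise_rules`, the thresholds, and the toy reports are all hypothetical; the abstract does not give the actual algorithm settings or rule sizes used on FAERS.

```python
from collections import Counter
from itertools import permutations


def pairwise_rules(reports, min_support=0.4, min_confidence=0.5):
    """Return {(lhs, rhs): (support, confidence, lift)} for rules lhs -> rhs.

    reports: list of sets, each holding the diseases listed in one case report.
    """
    n = len(reports)
    single = Counter(disease for report in reports for disease in report)
    # Ordered pairs, so that X -> Y and Y -> X are scored separately.
    pair = Counter(
        p for report in reports for p in permutations(sorted(report), 2)
    )
    rules = {}
    for (lhs, rhs), count in pair.items():
        support = count / n                    # P(lhs and rhs)
        confidence = count / single[lhs]       # P(rhs | lhs)
        lift = count * n / (single[lhs] * single[rhs])
        if support >= min_support and confidence >= min_confidence:
            rules[(lhs, rhs)] = (support, confidence, lift)
    return rules


# Hypothetical toy case reports, not FAERS data.
reports = [
    {"asthma", "rhinitis"},
    {"asthma", "rhinitis", "eczema"},
    {"asthma"},
    {"eczema"},
    {"asthma", "rhinitis"},
]

rules = pairwise_rules(reports)
for (lhs, rhs), (support, confidence, lift) in sorted(rules.items()):
    print(f"{lhs} -> {rhs}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

Each retained rule could then become an edge between two disease nodes, which is one plausible way to read "constructed a disease comorbidity network" in the abstract.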
Affiliation(s)
- Chunlei Zheng
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, 2103 Cornell Road, Cleveland, 44106, OH, USA
- Rong Xu
  - Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, 2103 Cornell Road, Cleveland, 44106, OH, USA
|
16
|
Zheng C, Xu R. The Alzheimer's comorbidity phenome: mining from a large patient database and phenome-driven genetics prediction. JAMIA Open 2018; 2:131-138. [PMID: 30944912 PMCID: PMC6434979 DOI: 10.1093/jamiaopen/ooy050] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/24/2018] [Revised: 10/23/2018] [Accepted: 12/05/2018] [Indexed: 01/08/2023] Open
Abstract
Objective Alzheimer’s disease (AD) is a severe neurodegenerative disorder and has become a global public health problem. Intensive research has been conducted for AD. But the pathophysiology of AD is still not elucidated. Disease comorbidity often associates diseases with overlapping patterns of genetic markers. This may inform a common etiology and suggest essential protein targets. US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) collects large-scale postmarketing surveillance data that provide a unique opportunity to investigate disease co-occurrence pattern. We aim to construct a heterogeneous network that integrates disease comorbidity network (DCN) from FAERS with protein–protein interaction (PPI) to prioritize the AD risk genes using network-based ranking algorithm. Materials and Methods We built a DCN based on indication data from FAERS using association rule mining. DCN was further integrated with PPI network. We used random walk with restart ranking algorithm to prioritize AD risk genes. Results We evaluated the performance of our approach using AD risk genes curated from genetic association studies. Our approach achieved an area under a receiver operating characteristic curve of 0.770. Top 500 ranked genes achieved 5.53-fold enrichment for known AD risk genes as compared to random expectation. Pathway enrichment analysis using top-ranked genes revealed that two novel pathways, ERBB and coagulation pathways, might be involved in AD pathogenesis. Conclusion We innovatively leveraged FAERS, a comprehensive data resource for FDA postmarket drug safety surveillance, for large-scale AD comorbidity mining. This exploratory study demonstrated the potential of disease-comorbidities mining from FAERS in AD genetics discovery.
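The random walk with restart ranking named in this abstract can be sketched on a toy graph. This is a generic textbook formulation (iterate p = (1 - r) W p + r p0 with a column-normalised adjacency matrix W), not the authors' implementation; the restart probability, the graph, and the function name are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visit probability from the seeds.

    adj: symmetric adjacency matrix as a list of lists.
    seeds: set of node indices where the walk restarts.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # Column-normalised transition matrix: w[i][j] = P(move from j to i).
    w = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(w[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adjacency, seeds={0})
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In a heterogeneous network of the kind described, the seed would be the disease node and the candidate genes would be ranked by their scores.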
Affiliation(s)
- Chunlei Zheng
  - Department of Population and Quantitative Health Sciences, Institute of Computational Biology, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Rong Xu
  - Department of Population and Quantitative Health Sciences, Institute of Computational Biology, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
|
17
|
Kalgotra P, Sharda R, Croff JM. Examining health disparities by gender: A multimorbidity network analysis of electronic medical record. Int J Med Inform 2017; 108:22-28. [DOI: 10.1016/j.ijmedinf.2017.09.014] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/08/2017] [Revised: 09/19/2017] [Accepted: 09/24/2017] [Indexed: 01/22/2023]
|
18
|
Cai X, Chen Y, Zheng C, Xu R. Interrogating Patient-level Genomics and Mouse Phenomics towards Understanding Cytokines in Colorectal Cancer Metastasis. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2017; 2017:227-236. [PMID: 28815134 PMCID: PMC5543389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Background: Colorectal cancer is the second leading cause of cancer-related death worldwide, and a majority of patients die from metastasis. Chronic intestinal inflammation plays an important role in tumor progression of colorectal cancer. However, few studies have systematically predicted colorectal cancer metastasis using inflammatory cytokine genes. Results: We developed a supervised machine learning approach to predict colorectal cancer tumor progression using patient-level genomic features. To better understand the role of cytokines, we integrated the metastasis-related genes from mouse phenotypic data. In addition, pathway analysis and network visualization were applied to the top significant genes ranked by feature weights of the final prediction model. The combined model of cytokines and mouse phenotypes achieved a predictive accuracy of 75.54%, higher than the model based on mouse phenotypes alone (70.42%, p-value<0.05). In addition, the combined model outperformed the model based on the existing metastasis-related epithelial-to-mesenchymal transition (EMT) genes (75.54% vs. 71.61%, p-value<0.05). We also observed that the most important cytokine gene features of our model interact with the cancer driver genes and are highly associated with the colorectal cancer metastasis signaling pathway. Conclusion: We developed a combined model using both cytokine and mouse phenotype information to predict colorectal cancer metastasis. The results suggested that the inflammatory cytokines increase the power of predicting metastasis. We also systematically demonstrated the critical role of cytokines in the progression of colorectal tumors.
Affiliation(s)
- Xiaoshu Cai
  - Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Yang Chen
  - Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Chunlei Zheng
  - Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Rong Xu
  - Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
|
19
|
Chen Y, Xu R. Context-sensitive network-based disease genetics prediction and its implications in drug discovery. Bioinformatics 2017; 33:1031-1039. [PMID: 28062449 DOI: 10.1093/bioinformatics/btw737] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 07/13/2016] [Accepted: 11/19/2016] [Indexed: 01/05/2023] Open
Abstract
Motivation Disease phenotype networks play an important role in computational approaches to identifying new disease-gene associations. Current disease phenotype networks often model disease relationships based on pairwise similarities, and therefore ignore the specific context in which two diseases are connected. In this study, we propose a new strategy to model disease associations using context-sensitive networks (CSNs). We developed a CSN-based phenome-driven approach for disease genetics prediction, and investigated the translational potential of the predicted genes in drug discovery. Results We constructed CSNs by directly connecting diseases with associated phenotypes. Here, we constructed two CSNs using different data sources; the two networks contain 26 790 and 13 822 nodes respectively. We integrated the CSNs with a genetic functional relationship network and predicted disease genes using a network-based ranking algorithm. For comparison, we built Similarity-Based disease Networks (SBN) using the same disease phenotype data. In a de novo cross validation for 3324 diseases, the CSN-based approach significantly improved the average rank from top 12.6% to top 8.8% for all tested genes compared with the SBN-based approach ( p<e-22 ). The area under the receiver operating characteristic curve for the CSN approach was also significantly higher than that for the SBN approach (0.91 versus 0.87, p<e-3 ). In addition, we predicted genes for Parkinson's disease using CSNs, and demonstrated that the top-ranked genes are highly relevant to PD pathogenesis. We pinpointed a top-ranked drug target gene for PD, and found its association with neurodegeneration supported by literature. In summary, CSNs significantly improve disease genetics prediction compared with SBNs and provide leads for potential drug targets. Availability and Implementation nlp.case.edu/public/data/. Contact rxx@case.edu.
|
20
|
Bhattacharjee D, Hossain SMM, Sultana R, Ray S. Topological Inquisition into the PPI Networks Associated with Human Diseases Through Graphlet Frequency Distribution. LECTURE NOTES IN COMPUTER SCIENCE 2017:431-437. [DOI: 10.1007/978-3-319-69900-4_55] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2025]
|
21
|
Soualmia LF, Charlet J. Efficient Results in Semantic Interoperability for Health Care. Findings from the Section on Knowledge Representation and Management. Yearb Med Inform 2016:184-187. [PMID: 27830249 DOI: 10.15265/iy-2016-051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
OBJECTIVES To summarize excellent current research in the field of Knowledge Representation and Management (KRM) within the health and medical care domain. METHOD We provide a synopsis of the 2016 IMIA selected articles as well as a related synthetic overview of the current and future field activities. A first step of the selection was performed through MEDLINE querying with a list of MeSH descriptors completed by a list of terms adapted to the KRM section. The second step of the selection was completed by the two section editors who separately evaluated the set of 1,432 articles. The third step of the selection consisted of a collective work that merged the evaluation results to retain 15 articles for peer-review. RESULTS The selection and evaluation process of this Yearbook's section on Knowledge Representation and Management has yielded four excellent and interesting articles regarding semantic interoperability for health care by gathering heterogeneous sources (knowledge and data) and auditing ontologies. In the first article, the authors present a solution based on standards and Semantic Web technologies to access distributed and heterogeneous datasets in the domain of breast cancer clinical trials. The second article describes a knowledge-based recommendation system that relies on ontologies and Semantic Web rules in the context of diet for chronic diseases. The third article relates to concept recognition and text mining to derive a model of common human diseases and a phenotypic network of common diseases. In the fourth article, the authors highlight the need for auditing SNOMED CT. They propose to use a crowd-based method for ontology engineering. CONCLUSIONS The current research activities further illustrate the continuous convergence of Knowledge Representation and Medical Informatics, with a focus this year on dedicated tools and methods to advance clinical care by proposing solutions to cope with the problem of semantic interoperability. Indeed, there is a need for powerful tools able to manage and interpret complex, large-scale and distributed datasets and knowledge bases, but also a need for user-friendly tools developed for clinicians in their daily practice.
Affiliation(s)
- L F Soualmia
  - Dr Lina F. Soualmia, Normandie Universités, Rouen University and Hospital, D2IM, LITIS EA 4108, Information Processing in Biology & Health, 1, rue de Germont, Cour Leschevin porte 21, 76031 Rouen Cedex, France, Tel: +33 232 885 869, E-mail:
|
22
|
Chen Y, Xu R. Phenome-based gene discovery provides information about Parkinson's disease drug targets. BMC Genomics 2016; 17 Suppl 5:493. [PMID: 27586503 PMCID: PMC5009520 DOI: 10.1186/s12864-016-2820-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND Parkinson disease (PD) is a severe neurodegenerative disease without curative drugs. The highly complex and heterogeneous disease mechanisms are still unclear. Detecting novel PD-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates discovering new targets for drugs. METHODS We propose a phenome-based gene prediction strategy to identify disease-associated genes for PD. We integrated multiple disease phenotype networks, a gene functional relationship network, and known PD genes to predict novel candidate genes. Then we investigated the translational potential of the predicted genes in drug discovery. RESULTS In a cross validation analysis, the average rank for 15 known PD genes is within the top 0.8%. We also tested the algorithm with an independent validation set of 669 PD-associated genes detected by genome-wide association studies. The top-ranked genes predicted by our approach are enriched for these validation genes. In addition, our approach prioritized the target genes for FDA-approved PD drugs and the drugs that have been tested for PD in clinical trials. Pathway analysis shows that the prioritized drug target genes are closely associated with PD pathogenesis. The result provides empirical evidence that our computational gene prediction approach identifies novel candidate genes for PD, and has the potential to lead to rapid drug discovery.
Affiliation(s)
- Yang Chen
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
- Rong Xu
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
|
23
|
Chen Y, Gao Z, Wang B, Xu R. Towards precision medicine-based therapies for glioblastoma: interrogating human disease genomics and mouse phenotypes. BMC Genomics 2016; 17 Suppl 7:516. [PMID: 27557118 PMCID: PMC5001238 DOI: 10.1186/s12864-016-2908-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Glioblastoma (GBM) is the most common and aggressive brain tumor. It has poor prognosis even with optimal radio- and chemo-therapies. Since GBM is highly heterogeneous, drugs that target specific molecular profiles of individual tumors may achieve maximized efficacy. Currently, the Cancer Genome Atlas (TCGA) projects have identified hundreds of GBM-associated genes. We develop a drug repositioning approach combining disease genomics and mouse phenotype data towards predicting targeted therapies for GBM. METHODS We first identified disease-specific mouse phenotypes using the most recently discovered GBM genes. Then we systematically searched all FDA-approved drugs for candidates that share similar mouse phenotype profiles with GBM. We evaluated the ranks for approved and novel GBM drugs, and compared them with an existing approach, which also uses the mouse phenotype data but not the disease genomics data. RESULTS We achieved significantly higher ranks for the approved and novel GBM drugs than the earlier approach. For all positive examples of GBM drugs, we achieved a median rank of 9.2%. 45.6% of the top predictions have been demonstrated effective in inhibiting the growth of human GBM cells. CONCLUSION We developed a computational drug repositioning approach based on both genomic and phenotypic data. Our approach prioritized existing GBM drugs and outperformed a recent approach. Overall, our approach shows potential in discovering new targeted therapies for GBM.
Affiliation(s)
- Yang Chen
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, USA
- Zhen Gao
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, USA
- Bingcheng Wang
  - Department of Pharmacology, Case Western Reserve University, Cleveland, Ohio, USA
- Rong Xu
  - Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, USA
|
24
|
Cai X, Chen Y, Gao Z, Xu R. Explore Small Molecule-induced Genome-wide Transcriptional Profiles for Novel Inflammatory Bowel Disease Drug. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2016; 2016:22-31. [PMID: 27570643 PMCID: PMC5001780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Inflammatory Bowel Disease (IBD) is a chronic and relapsing disorder that affects millions of people worldwide. Current drug options cannot cure the disease and may cause severe side effects. We developed a systematic framework to identify novel IBD drugs by exploiting millions of genomic signatures for chemical compounds. Specifically, we searched all FDA-approved drugs for candidates that share similar genomic profiles with IBD. In the evaluation experiments, our approach ranked approved IBD drugs on average within the top 26% among 858 candidates, significantly outperforming a state-of-the-art genomics-based drug repositioning method (p-value < e-8). Our approach also achieved significantly higher average precision than the state-of-the-art approach in predicting potential IBD drugs from clinical trials (0.072 vs. 0.043, p<0.1) and off-label IBD drugs (0.198 vs. 0.138, p<0.1). Furthermore, we found evidence supporting the therapeutic potential of the top-ranked drugs, such as Naloxone, in the literature and through analyzing target genes and pathways.
Affiliation(s)
- Xiaoshu Cai
  - Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Yang Chen
  - Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Zhen Gao
  - Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Rong Xu
  - Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
|
25
|
Li J, Lin X, Teng Y, Qi S, Xiao D, Zhang J, Kang Y. A Comprehensive Evaluation of Disease Phenotype Networks for Gene Prioritization. PLoS One 2016; 11:e0159457. [PMID: 27415759 PMCID: PMC4944959 DOI: 10.1371/journal.pone.0159457] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/03/2016] [Accepted: 07/01/2016] [Indexed: 12/31/2022] Open
Abstract
Identification of disease-causing genes is a fundamental challenge for human health studies. The phenotypic similarity among diseases may reflect the interactions at the molecular level, and phenotype comparison can be used to predict disease candidate genes. Online Mendelian Inheritance in Man (OMIM) is a database of human genetic diseases and related genes that has become an authoritative source of disease phenotypes. However, disease phenotypes have been described by free text; thus, standardization of phenotypic descriptions is needed before diseases can be compared. Several disease phenotype networks have been established in OMIM using different standardization methods. Two of these networks are important for phenotypic similarity analysis: the first and most commonly used network (mimMiner) is standardized by medical subject heading, and the other network (resnikHPO) is the first to be standardized by human phenotype ontology. This paper comprehensively evaluates for the first time the accuracy of these two networks in gene prioritization based on protein–protein interactions using large-scale, leave-one-out cross-validation experiments. The results show that both networks can effectively prioritize disease-causing genes, and the approach that relates two diseases using a logistic function improves prioritization performance. Tanimoto, one of four methods for normalizing resnikHPO, generates a symmetric network and it performs similarly to mimMiner. Furthermore, an integration of these two networks outperforms either network alone in gene prioritization, indicating that these two disease networks are complementary.
Affiliation(s)
- Jianhua Li
  - Department of Biomedical Informatics, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
  - Key Laboratory of Medical Image Computing of Northeastern University, Ministry of Education, Shenyang, Liaoning, China
- Xiaoyan Lin
  - Department of Biomedical Informatics, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Yueyang Teng
  - Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Shouliang Qi
  - Key Laboratory of Medical Image Computing of Northeastern University, Ministry of Education, Shenyang, Liaoning, China
  - Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Dayu Xiao
  - Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
- Jianying Zhang
  - Department of Biomedical Informatics, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
  - Border Biomedical Research Center, Department of Biological Sciences, The University of Texas at El Paso, El Paso, Texas, United States of America
- Yan Kang
  - Key Laboratory of Medical Image Computing of Northeastern University, Ministry of Education, Shenyang, Liaoning, China
  - Department of Biomedical Imaging, Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, Liaoning, China
|
26
|
Abstract
MOTIVATION: Discerning genetic contributions to diseases not only enhances our understanding of disease mechanisms, but also leads to translational opportunities for drug discovery. Recent computational approaches incorporate disease phenotypic similarities to improve the prediction power of disease gene discovery. However, most current studies used only one data source of human disease phenotype. We present an innovative and generic strategy for combining multiple different data sources of human disease phenotype and predicting disease-associated genes from integrated phenotypic and genomic data. RESULTS: To demonstrate our approach, we explored a new phenotype database from biomedical ontologies and constructed the Disease Manifestation Network (DMN). We combined DMN with mimMiner, a widely used phenotype database in disease gene prediction studies. Our approach achieved significantly improved performance over a baseline method, which used only one phenotype data source. In the leave-one-out cross-validation and de novo gene prediction analyses, our approach achieved areas under the curve of 90.7% and 90.3%, which are significantly higher than 84.2% (P < e^-4) and 81.3% (P < e^-12) for the baseline approach. We further demonstrated that our predicted genes have translational potential in drug discovery. We used Crohn's disease as an example and ranked the candidate drugs based on the rank of drug targets. Our gene prediction approach prioritized druggable genes that are likely to be associated with Crohn's disease pathogenesis, and our rank of candidate drugs successfully prioritized the Food and Drug Administration-approved drugs for Crohn's disease. We also found literature evidence to support a number of drugs among the top 200 candidates. In summary, we demonstrated that a novel strategy combining unique disease phenotype data with system approaches can lead to rapid drug discovery. AVAILABILITY AND IMPLEMENTATION: nlp.case.edu/public/data/DMN
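The area under the curve reported in this abstract can be read as the probability that a true disease gene is scored above a non-disease gene. A minimal Python sketch of that pairwise definition follows. The function name and the scores are hypothetical, and nothing here reproduces the authors' leave-one-out cross-validation pipeline.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which
    the positive item gets the higher score; ties count as half."""
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical prioritization scores: held-out known disease genes
# versus candidate genes with no known association.
held_out_gene_scores = [0.9, 0.8]
other_gene_scores = [0.1, 0.85]

auc = auc_from_scores(held_out_gene_scores, other_gene_scores)
print(auc)
```

In a leave-one-out setting, each known gene-disease association would be hidden in turn and the hidden gene scored against the remaining candidates. The sketch covers only the final scoring step.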
Affiliation(s)
- Yang Chen
  - Department of Electrical Engineering and Computer Science, Department of Epidemiology and Biostatistics and Department of Family Medicine and Community Health, Case Western Reserve University, Cleveland, OH 44106, USA
- Li Li
  - Department of Electrical Engineering and Computer Science, Department of Epidemiology and Biostatistics and Department of Family Medicine and Community Health, Case Western Reserve University, Cleveland, OH 44106, USA
- Guo-Qiang Zhang
  - Department of Electrical Engineering and Computer Science, Department of Epidemiology and Biostatistics and Department of Family Medicine and Community Health, Case Western Reserve University, Cleveland, OH 44106, USA
- Rong Xu
  - Department of Electrical Engineering and Computer Science, Department of Epidemiology and Biostatistics and Department of Family Medicine and Community Health, Case Western Reserve University, Cleveland, OH 44106, USA
|
27
|
Chen Y, Cai X, Xu R. Combining Human Disease Genetics and Mouse Model Phenotypes towards Drug Repositioning for Parkinson's disease. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2015; 2015:1851-60. [PMID: 26958284 PMCID: PMC4765695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Parkinson's disease (PD) is a severe neurodegenerative disorder without effective treatments. Here, we present a novel drug repositioning approach to predict new drugs for PD leveraging both disease genetics and large amounts of mouse model phenotypes. First, we identified PD-specific mouse phenotypes using well-studied human disease genes. Then we searched all FDA-approved drugs for candidates that share similar mouse phenotype profiles with PD. We demonstrated the validity of our approach using drugs that have been approved for PD: 10 approved PD drugs were ranked within top 10% among 1197 candidates. In predicting novel PD drugs, our approach achieved a mean average precision of 0.24, which is significantly higher (p
Affiliation(s)
- Yang Chen
  - Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Xiaoshu Cai
  - Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Rong Xu
  - Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
|
28
|
Xu R, Wang Q. PhenoPredict: A disease phenome-wide drug repositioning approach towards schizophrenia drug discovery. J Biomed Inform 2015; 56:348-55. [PMID: 26151312 PMCID: PMC4589865 DOI: 10.1016/j.jbi.2015.06.027] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 10/08/2014] [Revised: 06/26/2015] [Accepted: 06/29/2015] [Indexed: 01/26/2023]
Abstract
Schizophrenia (SCZ) is a common complex disorder with poorly understood mechanisms and no effective drug treatments. Despite the high prevalence and vast unmet medical need represented by the disease, many drug companies have moved away from the development of drugs for SCZ. Therefore, alternative strategies are needed for the discovery of truly innovative drug treatments for SCZ. Here, we present a disease phenome-driven computational drug repositioning approach for SCZ. We developed a novel drug repositioning system, PhenoPredict, by inferring drug treatments for SCZ from diseases that are phenotypically related to SCZ. The key to PhenoPredict is the availability of a comprehensive drug treatment knowledge base that we recently constructed. PhenoPredict retrieved all 18 FDA-approved SCZ drugs and ranked them highly (recall=1.0, and average ranking of 8.49%). When compared to PREDICT, one of the most comprehensive drug repositioning systems currently available, in novel predictions, PhenoPredict represented clear improvements over PREDICT in Precision-Recall (PR) curves, with a significant 98.8% improvement in the area under curve (AUC) of the PR curves. In addition, we discovered many drug candidates with mechanisms of action fundamentally different from traditional antipsychotics, some of which had published literature evidence indicating their treatment benefits in SCZ patients. In summary, although the fundamental pathophysiological mechanisms of SCZ remain unknown, integrated systems approaches to studying phenotypic connections among diseases may facilitate the discovery of innovative SCZ drugs.
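The two retrieval figures quoted in this abstract, recall and average ranking, can be illustrated with a small Python sketch. It assumes that "average ranking" means the mean rank of the known drugs expressed as a percentage of the candidate list length. That definition, the function name, and the drug labels are assumptions, not details taken from the paper.

```python
def recall_and_mean_rank_pct(ranked_candidates, known_drugs):
    """Return (recall, mean rank percentage) of known drugs in a ranked list.

    recall        = known drugs found in the list / all known drugs
    mean rank pct = average of (1-based rank / list length) * 100
                    over the known drugs that were found
    """
    if not ranked_candidates or not known_drugs:
        raise ValueError("both inputs must be non-empty")
    position = {drug: i + 1 for i, drug in enumerate(ranked_candidates)}
    found = [drug for drug in known_drugs if drug in position]
    recall = len(found) / len(known_drugs)
    if not found:
        return recall, float("nan")
    total = len(ranked_candidates)
    mean_pct = 100.0 * sum(position[d] / total for d in found) / len(found)
    return recall, mean_pct


# Hypothetical ranked output of a repositioning system (best first)
# and a hypothetical set of already-approved drugs.
ranking = [f"drug_{i}" for i in range(1, 11)]
approved = ["drug_1", "drug_3"]

recall, mean_rank_pct = recall_and_mean_rank_pct(ranking, approved)
print(recall, mean_rank_pct)
```

A lower mean rank percentage means the known drugs sit closer to the top of the list, which is the sense in which the abstract describes them as ranked highly.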
Affiliation(s)
- Rong Xu
  - Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, United States.
- QuanQiu Wang
  - ThinTek, LLC, Palo Alto, CA 94306, United States.
|