1
|
Recent advances in characterization of citrullination and its implication in human disease research: From method development to network integration. Proteomics 2023; 23:e2200286. [PMID: 36546832 PMCID: PMC10285031 DOI: 10.1002/pmic.202200286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Revised: 12/12/2022] [Accepted: 12/13/2022] [Indexed: 12/24/2022]
Abstract
Post-translational modifications (PTMs) of proteins increase the functional diversity of the proteome and have been implicated in the pathogenesis of numerous diseases. The most widely understood modifications include phosphorylation, methylation, acetylation, O-linked/N-linked glycosylation, and ubiquitination, all of which have been extensively studied and documented. Citrullination is a historically less explored, yet increasingly studied, protein PTM which has profound effects on protein conformation and protein-protein interactions. Dysregulation of protein citrullination has been associated with disease development and progression. Identification and characterization of citrullinated proteins is highly challenging for two reasons. First, the low cellular abundance of citrullinated proteins makes it difficult to identify and quantify the extent of citrullination in samples. Second, mass spectrometry (MS)-based methods are hard to develop because the corresponding mass shift is relatively small, +0.984 Da, and identical to the mass shift of deamidation. The focus of this review is to discuss recent advancements of citrullination-specific MS approaches and integration of the potential methodology for improved citrullination identification and characterization. The association of citrullination with disease networks is also highlighted.
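The +0.984 Da figure quoted in this abstract can be checked with a short calculation. The sketch below is not taken from the review; it assumes standard monoisotopic element masses and the usual residue compositions (free amino acid minus water), and compares arginine to citrulline against asparagine to aspartate, one common deamidation. The names `MONOISOTOPIC`, `RESIDUES`, and `residue_mass` are illustrative.

```python
# Monoisotopic masses (Da) of the elements involved.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.0078250319,
    "N": 14.0030740052,
    "O": 15.9949146221,
}

# Elemental compositions of amino acid residues (free amino acid minus H2O).
RESIDUES = {
    "Arg": {"C": 6, "H": 12, "N": 4, "O": 1},  # arginine
    "Cit": {"C": 6, "H": 11, "N": 3, "O": 2},  # citrulline
    "Asn": {"C": 4, "H": 6, "N": 2, "O": 2},   # asparagine
    "Asp": {"C": 4, "H": 5, "N": 1, "O": 3},   # aspartate
}


def residue_mass(name):
    """Sum the monoisotopic masses of all atoms in one residue."""
    return sum(MONOISOTOPIC[el] * count for el, count in RESIDUES[name].items())


# Both conversions replace one NH group with one O atom,
# so the two mass shifts come out the same.
citrullination_shift = residue_mass("Cit") - residue_mass("Arg")
deamidation_shift = residue_mass("Asp") - residue_mass("Asn")

print(f"citrullination: {citrullination_shift:+.4f} Da")
print(f"deamidation:    {deamidation_shift:+.4f} Da")
```

Because both shifts reduce to the same O minus N minus H difference, precursor mass alone cannot separate the two modifications, which is the difficulty the abstract describes.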
|
2
|
Using a Network-Based Analysis Approach to Investigate the Involvement of S. aureus in the Pathogenesis of Granulomatosis with Polyangiitis. Int J Mol Sci 2023; 24:ijms24031822. [PMID: 36768148 PMCID: PMC9915048 DOI: 10.3390/ijms24031822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 01/12/2023] [Accepted: 01/13/2023] [Indexed: 01/19/2023] Open
Abstract
Chronic nasal carriage of Staphylococcus aureus (SA) has been shown to be significantly higher in granulomatosis with polyangiitis (GPA) patients than in healthy subjects, and to be associated with increased endonasal activity and disease relapse. The aim of this study was to investigate SA involvement in GPA by applying a network-based analysis (NBA) approach to publicly available nasal transcriptomic data. Using these data, our NBA pipeline generated a proteinase 3 (PR3) positive ANCA associated vasculitis (AAV) disease network integrating differentially expressed genes, dysregulated transcription factors (TFs), disease-specific genes derived from GWAS studies, drug-target and protein-protein interactions. The PR3+ AAV disease network captured genes previously reported to be dysregulated in AAV. A subnetwork focussing on interactions between SA virulence factors and enriched biological processes revealed potential mechanisms for SA's involvement in PR3+ AAV. Immunosuppressant treatment reduced differential expression and absolute TF activities in this subnetwork for patients with inactive nasal disease but not for those with active nasal disease symptoms at the time of sampling. The disease network identified the key molecular signatures, highlighted the associated biological processes in PR3+ AAV, and revealed potential mechanisms for SA to affect these processes.
|
3
|
MorbidGCN: prediction of multimorbidity with a graph convolutional network based on integration of population phenotypes and disease network. Brief Bioinform 2022; 23:6627601. [PMID: 35780382 DOI: 10.1093/bib/bbac255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/14/2022] [Revised: 05/17/2022] [Accepted: 06/01/2022] [Indexed: 02/06/2023] Open
Abstract
Exploring multimorbidity relationships among diseases is of great importance for understanding their shared mechanisms, precise diagnosis and treatment. However, the landscape of multimorbidities is still far from complete due to the complex nature of multimorbidity. Although various types of biological data, such as biomolecules and clinical symptoms, have been used to identify multimorbidities, the population phenotype information (e.g. physical activity and diet) remains less explored for multimorbidity. Here, we present a graph convolutional network (GCN) model, named MorbidGCN, for multimorbidity prediction by integrating population phenotypes and disease network. Specifically, MorbidGCN treats the multimorbidity prediction as a missing link prediction problem in the disease network, where a novel feature selection method is embedded to select important phenotypes. Benchmarking results on two large-scale multimorbidity data sets, i.e. the UK Biobank (UKB) and Human Disease Network (HuDiNe) data sets, demonstrate that MorbidGCN outperforms other competitive methods. With MorbidGCN, 9742 and 14 010 novel multimorbidities are identified in the UKB and HuDiNe data sets, respectively. Moreover, we notice that the selected phenotypes that are generally differentially distributed between multimorbidity patients and single-disease patients can help interpret multimorbidities and show potential for prognosis of multimorbidities.
|
4
|
Associations between Multimorbidity Patterns and Subsequent Labor Market Marginalization among Refugees and Swedish-Born Young Adults-A Nationwide Registered-Based Cohort Study. J Pers Med 2021; 11:jpm11121305. [PMID: 34945776 PMCID: PMC8705997 DOI: 10.3390/jpm11121305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Revised: 11/19/2021] [Accepted: 11/23/2021] [Indexed: 11/16/2022] Open
Abstract
Background: Young refugees are at increased risk of labor market marginalization (LMM). We sought to examine whether the association of multimorbidity patterns and LMM differs in refugee youth compared to Swedish-born youth and identify the diagnostic groups driving this association. Methodology: We analyzed 249,245 individuals between 20–25 years, on 31 December 2011, from a combined Swedish registry. Refugees were matched 1:5 to Swedish-born youth. A multimorbidity score was computed from a network of disease co-occurrences in 2009–2011. LMM was defined as disability pension (DP) or >180 days of unemployment during 2012–2016. Relative risks (RR) of LMM were calculated for 114 diagnostic groups (2009–2011). The odds of LMM as a function of multimorbidity score were estimated using logistic regression. Results: 2841 (1.1%) individuals received DP and 16,323 (6.5%) experienced >180 annual days of unemployment during follow-up. Refugee youth had a marginally higher risk of DP (OR (95% CI): 1.59 (1.52, 1.67)) depending on their multimorbidity score compared to Swedish-born youth (OR (95% CI): 1.51 (1.48, 1.54)); no differences were found for unemployment (OR (95% CI): 1.15 (1.12, 1.17), 1.12 (1.10, 1.14), respectively). Diabetes mellitus and influenza/pneumonia elevated RR of DP in refugees (RRs (95% CI) 2.4 (1.02, 5.6) and 1.75 (0.88, 3.45), respectively); most diagnostic groups were associated with a higher risk for unemployment in refugees. Conclusion: Multimorbidity related similarly to LMM in refugees and Swedish-born youth, but different diagnoses drove these associations. Targeted prevention, screening, and early intervention strategies towards specific diagnoses may effectively reduce LMM in young adult refugees.
|
5
|
A disease network-based deep learning approach for characterizing melanoma. Int J Cancer 2021; 150:1029-1044. [PMID: 34716589 DOI: 10.1002/ijc.33860] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/26/2021] [Revised: 10/08/2021] [Accepted: 10/19/2021] [Indexed: 12/12/2022]
Abstract
Multiple types of genomic variations are present in cutaneous melanoma and some of the genomic features may have an impact on the prognosis of the disease. The access to genomics data via public repositories such as The Cancer Genome Atlas (TCGA) allows for a better understanding of melanoma at the molecular level, therefore making characterization of substantial heterogeneity in melanoma patients possible. Here, we proposed an approach that integrates genomics data, a disease network, and a deep learning model to classify melanoma patients for prognosis, assess the impact of genomic features on the classification and provide interpretation to the impactful features. We integrated genomics data into a melanoma network and applied an autoencoder model to identify subgroups in TCGA melanoma patients. The model utilizes communities identified in the network to effectively reduce the dimensionality of genomics data into a patient score profile. Based on the score profile, we identified three patient subtypes that show different survival times. Furthermore, we quantified and ranked the impact of genomic features on the patient score profile using a machine-learning technique. Follow-up analysis of the top-ranking features provided us with the biological interpretation of them at both pathway and molecular levels, such as their mutation and interactome profiles in melanoma and their involvement in pathways associated with signaling transduction, immune system and cell cycle. Taken together, we demonstrated the ability of the approach to identify disease subgroups using a deep learning model that captures the most relevant information of genomics data in the melanoma network.
|
6
|
Identifying the biomarkers and pathways associated with hepatocellular carcinoma based on an integrated analysis approach. Liver Int 2021; 41:2485-2498. [PMID: 34033190 DOI: 10.1111/liv.14972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2019] [Revised: 05/11/2021] [Accepted: 05/19/2021] [Indexed: 02/13/2023]
Abstract
BACKGROUND AND AIMS: Hepatocellular carcinoma (HCC) is one of the most common causes of cancer-related death worldwide. The molecular mechanism underlying HCC is still unclear. In this study, we conducted a comprehensive analysis to explore the genes, pathways and their interactions involved in HCC. METHODS: We analysed the gene expression datasets corresponding to 488 samples from 10 studies on HCC and identified the genes differentially expressed in HCC samples. Then, the genes were compared against Phenolyzer and GeneCards to screen those potentially associated with HCC. The features of the selected genes were explored by mapping them onto the human protein-protein interaction network, and a subnetwork related to HCC was constructed. Hub genes in this HCC-specific subnetwork were identified, and their relevance to HCC was investigated by survival analysis. RESULTS: We identified 444 differentially expressed genes (177 upregulated and 267 downregulated) related to HCC. Functional enrichment analysis revealed that pathways like p53 signalling and chemical carcinogenesis were enriched in HCC genes. In the subnetwork related to HCC, five disease modules were detected. Further analysis identified six hub genes from the HCC-specific subnetwork. Survival analysis showed that the expression levels of these genes were negatively correlated with the survival rate of HCC patients. CONCLUSIONS: Based on a systems biology framework, we identified the genes, pathways, as well as the disease-specific network related to HCC. We also found novel biomarkers whose expression patterns were correlated with progression of HCC, and they could be candidates for further investigation.
|
7
|
Enroll-HD: An Integrated Clinical Research Platform and Worldwide Observational Study for Huntington's Disease. Front Neurol 2021; 12:667420. [PMID: 34484094 PMCID: PMC8416308 DOI: 10.3389/fneur.2021.667420] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 02/13/2021] [Accepted: 06/21/2021] [Indexed: 12/20/2022] Open
Abstract
Established in July 2012, Enroll-HD is both an integrated clinical research platform and a worldwide observational study designed to meet the clinical research requirements necessary to develop therapeutics for Huntington's disease (HD). The platform offers participants a low-burden entry into HD research, providing a large, well-characterized, research-engaged cohort with associated clinical data and biosamples that facilitates recruitment into interventional trials and other research studies. Additional studies that use Enroll-HD data and/or biosamples are built into the platform to further research on biomarkers and outcome measures. Enroll-HD is now operating worldwide in 21 countries at 159 clinical sites across four continents—Europe, North America, Latin America, and Australasia—and has recruited almost 25,000 participants, generating a large, rich clinical database with associated biosamples to expedite HD research; any researcher at a verifiable research organization can access the clinical datasets and biosamples from Enroll-HD and nested studies. Important operational features of Enroll-HD include a strong emphasis on standardization, data quality, and protecting participant identity, a single worldwide study protocol, a flexible EDC system capable of integrating multiple studies, a comprehensive monitoring infrastructure, an online portal to train and certify site personnel, and standardized study documents including informed consent forms and contractual agreements.
|
8
|
Abstract
Disease interaction in multimorbid patients is relevant to treatment and prognosis, yet poorly understood. In the present work, we combine approaches from network science, machine learning and computational phenotyping to assess interactions between two or more diseases in a transparent way across the full diagnostic spectrum. We demonstrate that health states of hospitalized patients can be better characterized by including higher-order features capturing interactions between more than two diseases. We identify a meaningful set of higher-order diagnosis features that account for synergistic disease interactions in a population-wide (N = 9 M) medical claims dataset. We construct a generalized disease network where (higher-order) diagnosis features are linked if they predict similar diagnoses across the whole diagnostic spectrum. The fact that specific diagnoses are generally represented multiple times in the network allows for the identification of putatively different disease phenotypes that may reflect different disease aetiologies. At the example of obesity, we demonstrate the purely data-driven detection of two complex phenotypes of obesity. As indicated by a matched comparison between patients having these phenotypes, we show that these phenotypes show specific characteristics of what has been controversially discussed in the medical literature as metabolically healthy and unhealthy obesity, respectively. The findings also suggest that metabolically healthy patients show some progression towards more unhealthy obesity over time, a finding that is consistent with longitudinal studies indicating a transient nature of metabolically healthy obesity. The disease network is available for exploration at https://disease.network/.
|
9
|
CovMulNet19, Integrating Proteins, Diseases, Drugs, and Symptoms: A Network Medicine Approach to COVID-19. NETWORK AND SYSTEMS MEDICINE 2020; 3:130-141. [PMID: 33274348 PMCID: PMC7703682 DOI: 10.1089/nsm.2020.0011] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Accepted: 10/02/2020] [Indexed: 12/23/2022] Open
Abstract
Introduction: We introduce in this study CovMulNet19, a comprehensive COVID-19 network containing all available known interactions involving SARS-CoV-2 proteins, interacting-human proteins, diseases and symptoms that are related to these human proteins, and compounds that can potentially target them. Materials and Methods: Extensive network analysis methods, based on a bootstrap approach, allow us to prioritize a list of diseases that display a high similarity to COVID-19 and a list of drugs that could potentially be beneficial to treat patients. As a key feature of CovMulNet19, the inclusion of symptoms allows a deeper characterization of the disease pathology, representing a useful proxy for COVID-19-related molecular processes. Results: We recapitulate many of the known symptoms of the disease and we find the most similar diseases to COVID-19 reflect conditions that are risk factors in patients. In particular, the comparison between CovMulNet19 and randomized networks recovers many of the known associated comorbidities that are important risk factors for COVID-19 patients, through identified similarities with intestinal, hepatic, and neurological diseases as well as with respiratory conditions, in line with reported comorbidities. Conclusion: CovMulNet19 can be suitably used for network medicine analysis, as a valuable tool for exploring drug repurposing while accounting for the intervening multidimensional factors, from molecular interactions to symptoms.
|
10
|
Summarizing Complex Graphical Models of Multiple Chronic Conditions Using the Second Eigenvalue of Graph Laplacian: Algorithm Development and Validation. JMIR Med Inform 2020; 8:e16372. [PMID: 32554376 PMCID: PMC7330739 DOI: 10.2196/16372] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/24/2019] [Revised: 01/06/2020] [Accepted: 03/22/2020] [Indexed: 01/16/2023] Open
Abstract
Background: It is important but challenging to understand the interactions of multiple chronic conditions (MCC) and how they develop over time in patients and populations. Clinical data on MCC can now be represented using graphical models to study their interaction and identify the path toward the development of MCC. However, the current graphical models representing MCC are often complex and difficult to analyze. Therefore, it is necessary to develop improved methods for generating these models. Objective: This study aimed to summarize the complex graphical models of MCC interactions to improve comprehension and aid analysis. Methods: We examined the emergence of 5 chronic medical conditions (ie, traumatic brain injury [TBI], posttraumatic stress disorder [PTSD], depression [Depr], substance abuse [SuAb], and back pain [BaPa]) over 5 years among 257,633 veteran patients. We developed 3 algorithms that utilize the second eigenvalue of the graph Laplacian to summarize the complex graphical models of MCC by removing less significant edges. The first algorithm learns a sparse probabilistic graphical model of MCC interactions directly from the data. The second algorithm summarizes an existing probabilistic graphical model of MCC interactions when a supporting data set is available. The third algorithm, which is a variation of the second algorithm, summarizes the existing graphical model of MCC interactions with no supporting data. Finally, we examined the coappearance of the 100 most common terms in the literature of MCC to validate the performance of the proposed model. Results: The proposed summarization algorithms demonstrate considerable performance in extracting major connections among MCC without reducing the predictive accuracy of the resulting graphical models. For the model learned directly from the data, the area under the curve (AUC) performance for predicting TBI, PTSD, BaPa, SuAb, and Depr, respectively, during the next 4 years is as follows. Year 2: 79.91%, 84.04%, 78.83%, 82.50%, and 81.47%; year 3: 76.23%, 80.61%, 73.51%, 79.84%, and 77.13%; year 4: 72.38%, 78.22%, 72.96%, 77.92%, and 72.65%; and year 5: 69.51%, 76.15%, 73.04%, 76.72%, and 69.99%. This demonstrates an overall 12.07% increase in the cumulative sum of AUC in comparison with the classic multilevel temporal Bayesian network. Conclusions: Using graph summarization can improve the interpretability and the predictive power of the complex graphical models of MCC.
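The second eigenvalue of the graph Laplacian (often called the algebraic connectivity) measures how well connected a graph is, and it is zero exactly when the graph is disconnected. The sketch below is not the authors' summarization algorithm; it only illustrates, for a small unweighted graph, how that quantity can be computed and how it changes when an edge is removed. It uses shifted power iteration in plain Python as an assumed, simple numerical method, and the function name `algebraic_connectivity` is illustrative.

```python
import math


def algebraic_connectivity(n, edges, iterations=500):
    """Approximate the second-smallest eigenvalue of L = D - A.

    Nodes are 0..n-1 and edges are undirected (u, v) pairs. Power iteration
    runs on (shift * I - L), restricted to vectors orthogonal to the
    all-ones vector, which is the eigenvector for eigenvalue 0.
    """
    if n < 2:
        return 0.0

    # Dense Laplacian: degrees on the diagonal, -1 for each edge.
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0

    # Laplacian eigenvalues are at most 2 * max degree, so this shift
    # keeps (shift * I - L) positive definite.
    shift = 2.0 * max(lap[i][i] for i in range(n)) + 1.0

    def center(x):
        mean = sum(x) / n
        return [xi - mean for xi in x]

    def normalize(x):
        norm = math.sqrt(sum(xi * xi for xi in x))
        return [xi / norm for xi in x]

    def apply_laplacian(x):
        return [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]

    # Deterministic start vector; in rare cases it could be orthogonal to
    # the target eigenvector, in which case a different start is needed.
    x = normalize(center([float(i) for i in range(n)]))
    for _ in range(iterations):
        lx = apply_laplacian(x)
        x = normalize(center([shift * x[i] - lx[i] for i in range(n)]))

    # Rayleigh quotient of the converged unit vector.
    lx = apply_laplacian(x)
    return sum(x[i] * lx[i] for i in range(n))


triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]            # triangle with one edge removed
two_pairs = [(0, 1), (2, 3)]       # disconnected graph on 4 nodes

lambda_triangle = algebraic_connectivity(3, triangle)
lambda_path = algebraic_connectivity(3, path)
lambda_split = algebraic_connectivity(4, two_pairs)
```

In this toy case, dropping one edge from the triangle lowers the value from 3 to 1, and the disconnected graph gives a value near 0. An edge-pruning procedure could track this quantity to judge how much connectivity each removal costs.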
|
11
|
Abstract
Multiplexed isobaric labeling methods, such as tandem mass tags (TMT), remarkably improve the throughput of quantitative mass spectrometry. Here, we present a 27-plex TMT method coupled with two-dimensional liquid chromatography (LC/LC) for extensive peptide fractionation and high-resolution tandem mass spectrometry (MS/MS) for peptide quantification and then apply the method to profile the complex human brain proteome of Alzheimer's disease (AD). The 27-plex method combines multiplexed capacities of the 11-plex and the 16-plex TMT, as the peptides labeled by the two TMT sets display different mass and hydrophobicity, which can be well separated in LC-MS/MS. We first systematically optimized the protocol for the newly developed 16-plex TMT, including labeling reaction, desalting, and MS conditions, and then directly compared the 11-plex and 16-plex methods by analyzing the same human AD samples. Both methods yielded similar proteome coverage, analyzing >100 000 peptides in >10 000 human proteins. Furthermore, the 11-plex and 16-plex samples were mixed for a 27-plex assay, resulting in more than 8000 protein measurements within the same MS time. The 27-plex results are highly consistent with those of the individual 11-plex and 16-plex TMT analyses. We also used these proteomics data sets to compare the AD brain with the nondementia controls, discovering major AD-related proteins and revealing numerous novel protein alterations enriched in the pathways of amyloidosis, immunity, mitochondrial, and synaptic functions. Overall, our data strongly demonstrate that this new 27-plex strategy is highly feasible for routine large-scale proteomic analysis.
|
12
|
[UTILIZING GENOMIC INFORMATION FOR BIOMARKER IDENTIFICATION AND GENOMIC DRUG DISCOVERY]. ARERUGI = [ALLERGY] 2020; 69:952-957. [PMID: 33310976 DOI: 10.15036/arerugi.69.952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
|
13
|
Computer aided analysis of disease linked protein networks. Bioinformation 2019; 15:513-522. [PMID: 31485137 PMCID: PMC6704336 DOI: 10.6026/97320630015513] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/21/2019] [Revised: 04/16/2019] [Accepted: 04/17/2019] [Indexed: 12/26/2022] Open
Abstract
Proteins can interact in various ways, ranging from direct physical relationships to indirect interactions, forming a protein-protein interaction network. Identifying protein connections is critical for characterizing cellular pathways. Constructing and analyzing protein interaction networks is being developed as a powerful approach in network pharmacology for detecting unknown genes and proteins associated with diseases. The discovery of drug targets that inform therapeutic decisions is an exciting outcome of studying disease networks. Protein connections may be identified by experimental approaches and by more recent computational approaches. Because of the difficulties in analyzing in vivo protein interactions, many researchers have encouraged improving computational methods for constructing protein interaction networks. This review covers the experimental and computational approaches for identifying new interactions in a molecular mechanism, along with their advantages and disadvantages. Systematic analysis of complex biological systems, including network pharmacology and disease networks, is also discussed.
|
14
|
Network Diffusion Approach to Predict LncRNA Disease Associations Using Multi-Type Biological Networks: LION. Front Physiol 2019; 10:888. [PMID: 31379598 PMCID: PMC6646690 DOI: 10.3389/fphys.2019.00888] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/01/2019] [Accepted: 06/26/2019] [Indexed: 11/13/2022] Open
Abstract
Recently, long non-coding RNAs (lncRNAs) have attracted attention because of their emerging role in many important biological mechanisms. Accumulating evidence indicates that the dysregulation of lncRNAs is associated with complex diseases. However, only a few lncRNA-disease associations have been experimentally validated, and predicting potential lncRNAs that are associated with diseases has therefore become an important task. Current computational approaches often use known lncRNA-disease associations to predict potential lncRNA-disease links. In this work, we exploited the topology of multi-level networks to propose the LncRNA rankIng by NetwOrk DiffusioN (LION) approach to identify lncRNA-disease associations. The multi-level complex network consisted of lncRNA-protein interactions, protein-protein interactions, and protein-disease associations. We applied the network diffusion algorithm of LION to predict the lncRNA-disease associations within the multi-level network. LION achieved an AUC value of 96.8% for cardiovascular diseases, 91.9% for cancer, and 90.2% for neurological diseases by using experimentally verified lncRNAs associated with diseases. Furthermore, compared to a similar approach (TPGLDA), LION performed better for cardiovascular diseases and cancer. Given the versatile role played by lncRNAs in different biological mechanisms that are perturbed in diseases, LION's accurate prediction of lncRNA-disease associations helps in ranking lncRNAs that could function as potential biomarkers and potential drug targets.
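The abstract does not specify which diffusion scheme LION uses, so the following is only a generic illustration of network diffusion: a random walk with restart, started from a disease node and run over a toy multi-level network of disease, protein, and lncRNA nodes. All node names (`D1`, `P1`, `lnc1`, and so on), the restart probability, and the function names are hypothetical.

```python
def build_adjacency(edges):
    """Turn undirected (u, v) pairs into a node -> set-of-neighbours map."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def random_walk_with_restart(adjacency, seeds, restart=0.5, iterations=100):
    """Diffuse probability mass from the seed nodes over the network.

    At each step a walker either jumps back to a seed (probability
    `restart`) or moves to a uniformly chosen neighbour. Every node is
    assumed to have at least one neighbour.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        updated = {n: restart * start[n] for n in nodes}
        for node in nodes:
            share = (1.0 - restart) * scores[node] / len(adjacency[node])
            for neighbour in adjacency[node]:
                updated[neighbour] += share
        scores = updated
    return scores


# Hypothetical toy network: a disease linked to a chain of proteins,
# each protein bound by one lncRNA.
edges = [
    ("D1", "P1"), ("P1", "P2"), ("P2", "P3"),
    ("lnc1", "P1"), ("lnc2", "P2"), ("lnc3", "P3"),
]
adjacency = build_adjacency(edges)
scores = random_walk_with_restart(adjacency, seeds={"D1"})

# Rank lncRNAs by how much diffused mass reaches them from the disease.
ranking = sorted(
    (n for n in scores if n.startswith("lnc")),
    key=scores.get,
    reverse=True,
)
print(ranking)
```

In this toy network the lncRNAs closer to the disease node receive more of the diffused mass, which is the intuition behind using diffusion scores to prioritize candidate associations.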
|
15
|
DDOT: A Swiss Army Knife for Investigating Data-Driven Biological Ontologies. Cell Syst 2019; 8:267-273.e3. [PMID: 30878356 PMCID: PMC7042149 DOI: 10.1016/j.cels.2019.02.003] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/23/2018] [Revised: 12/08/2018] [Accepted: 02/08/2019] [Indexed: 01/08/2023]
Abstract
Systems biology requires not only genome-scale data but also methods to integrate these data into interpretable models. Previously, we developed approaches that organize omics data into a structured hierarchy of cellular components and pathways, called a "data-driven ontology." Such hierarchies recapitulate known cellular subsystems and discover new ones. To broadly facilitate this type of modeling, we report the development of a software library called the Data-Driven Ontology Toolkit (DDOT), consisting of a Python package (https://github.com/idekerlab/ddot) to assemble and analyze ontologies and a web application (http://hiview.ucsd.edu) to visualize them. Using DDOT, we programmatically assemble a compendium of ontologies for 652 diseases by integrating gene-disease mappings with a gene similarity network derived from omics data. For example, the ontology for Fanconi anemia describes known and novel disease mechanisms in its hierarchy of 194 genes and 74 subsystems. DDOT provides an easy interface to share ontologies online at the Network Data Exchange.
|
16
|
Knowledge-Based Neuroendocrine Immunomodulation (NIM) Molecular Network Construction and Its Application. Molecules 2018; 23:molecules23061312. [PMID: 29848990 PMCID: PMC6099962 DOI: 10.3390/molecules23061312] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/16/2018] [Revised: 05/24/2018] [Accepted: 05/25/2018] [Indexed: 01/23/2023] Open
Abstract
Growing evidence shows that the neuroendocrine immunomodulation (NIM) network plays an important role in maintaining and modulating body function and the homeostasis of the internal environment. The disequilibrium of NIM in the body is closely associated with many diseases. In the present study, we first collected a core dataset of NIM signaling molecules based on our knowledge and obtained 611 NIM signaling molecules. Then, we built a NIM molecular network based on the MetaCore database and analyzed the signaling transduction characteristics of the core network. We found that the endocrine system played a pivotal role in the bridge between the nervous and immune systems and the signaling transduction between the three systems was not homogeneous. Finally, employing the forest algorithm, we identified the molecular hub playing an important role in the pathogenesis of rheumatoid arthritis (RA) and Alzheimer’s disease (AD), based on the NIM molecular network constructed by us. The results showed that GSK3B, SMARCA4, PSMD7, HNF4A, PGR, RXRA, and ESRRA might be the key molecules for RA, while RARA, STAT3, STAT1, and PSMD14 might be the key molecules for AD. The molecular hub may be a potentially druggable target for these two complex diseases based on the literature. This study suggests that the NIM molecular network in this paper combined with the forest algorithm might provide a useful tool for predicting drug targets and understanding the pathogenesis of diseases. Therefore, the NIM molecular network and the corresponding online tool will not only enhance research on complex diseases and system biology, but also promote the communication of valuable clinical experience between modern medicine and Traditional Chinese Medicine (TCM).
|
17
|
The landscape of genetic susceptibility correlations among diseases and traits. J Am Med Inform Assoc 2018; 24:921-926. [PMID: 28371808 DOI: 10.1093/jamia/ocx026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/03/2016] [Accepted: 03/01/2017] [Indexed: 11/12/2022] Open
Abstract
Objective: The aim of the study was to comprehensively explore the genetic susceptibility correlations among diseases and traits from large-scale individual genotype data. Materials and Methods: Based on a knowledge base of genetic variants significantly (P < 5 × 10⁻⁸) linked with human phenotypes, genetic risk scores (GRSs) of diseases or traits were calculated for 2504 individuals with whole-genome sequencing data from the 1000 Genomes Project. Associations between diseases/traits were statistically evaluated by pairwise correlation analysis of GRSs. Overlaps between the genetic susceptibility correlations and disease comorbidity associations from hospital claims data in more than 30 million patients in the United States were assessed. Results: Correlation analysis of GRSs revealed 823 significant correlations among 78 diseases and 89 traits (false discovery rate adjusted P-value or Q-value < 0.01). Notably, GRSs were correlated in 464 associations (56.4%) even when they were combinations of distinct sets of risk variants without chromosomal linkage, suggesting the presence of genetic interactions beyond chromosome position. When 312 significant genetic susceptibility correlations between diseases were compared to nationwide disease comorbidity correlations obtained from 32 million Medicare claims in the United States, 108 overlaps (34.6%) were found that had both genetic susceptibility and epidemiologic comorbid correlations. Conclusion: The study suggests that common genetic background exists between diseases and traits with epidemiologic associations. The GRS correlation approach provides a rich source of candidate associations among diseases and traits from the genetic perspective, warranting further epidemiologic studies.
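The pairwise GRS correlation described in this abstract can be sketched on toy data. The abstract does not give the exact GRS formula, so the example assumes a common form: a weighted sum of risk-allele counts (0, 1, or 2 per variant) for each individual. The genotype matrices, weights, and function names are hypothetical, and Pearson correlation is used as one plausible choice of correlation measure.

```python
import math


def genetic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts for one individual (assumed form)."""
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical toy data: rows are individuals, columns are risk variants.
genotypes_a = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]  # variants for trait A
weights_a = [0.1, 0.2, 0.3]
genotypes_b = [[1, 2], [0, 1], [2, 0], [0, 0]]              # variants for trait B
weights_b = [0.5, 0.25]

grs_a = [genetic_risk_score(row, weights_a) for row in genotypes_a]
grs_b = [genetic_risk_score(row, weights_b) for row in genotypes_b]

# One correlation per disease/trait pair; a full analysis would repeat this
# over all pairs and correct the P-values for multiple testing.
r_ab = pearson(grs_a, grs_b)
print(f"GRS correlation between trait A and trait B: {r_ab:.3f}")
```

The two traits here use distinct variant sets, mirroring the abstract's observation that scores can correlate even when they share no risk variants.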
|
18
|
Computational study of 'HUB' microRNA in human cardiac diseases. Bioinformation 2017; 13:17-20. [PMID: 28479745 PMCID: PMC5405088 DOI: 10.6026/97320630013017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/29/2016] [Revised: 01/01/2017] [Accepted: 01/05/2017] [Indexed: 11/29/2022] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs ~22 nucleotides long that do not encode proteins but have been reported to influence gene expression in normal and abnormal health conditions. Though a large body of scientific literature on miRNAs exists, their network-level profile, linking molecules with their corresponding phenotypes, is less explored. Here, we studied a network of 191 human miRNAs reported to play a role in 30 human cardiac diseases. Our aim was to study miRNA network properties like hubness and preferred associations, using data mining, network graph theory and statistical analysis. A total of 16 miRNAs were found to have a disease node connectivity of >5 edges (i.e., they were linked to more than 5 diseases) and were considered hubs in the miRNA-cardiac disease network. Alternatively, when diseases were considered as hubs, >10 of miRNAs showed up on each ‘disease hub node’. Of all the miRNAs associated with diseases, 19 miRNAs (19/24 = 79.1% of upregulated events) were found to be upregulated in atherosclerosis. The data suggest microRNAs as early-stage biological markers in cardiac conditions, with potential for microRNA-based therapeutics.
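The hub criterion stated in this abstract (a miRNA linked to more than 5 diseases) amounts to a degree threshold on a bipartite miRNA-disease graph. The sketch below is illustrative only: the association list, the placeholder names `miR-A` and `miR-B`, and the function `find_hub_mirnas` are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical (miRNA, disease) association pairs; names are placeholders.
ASSOCIATIONS = (
    [("miR-A", f"disease_{i}") for i in range(6)]
    + [("miR-B", "disease_0"), ("miR-B", "disease_1")]
    + [("miR-A", "disease_0")]  # duplicate report of an existing edge
)


def find_hub_mirnas(associations, min_degree_exclusive=5):
    """Return {miRNA: degree} for miRNAs linked to more than the threshold
    number of distinct diseases."""
    diseases_by_mirna = defaultdict(set)
    for mirna, disease in associations:
        # A set keeps each miRNA-disease edge once, so repeated reports
        # of the same association do not inflate the degree.
        diseases_by_mirna[mirna].add(disease)
    return {
        mirna: len(diseases)
        for mirna, diseases in diseases_by_mirna.items()
        if len(diseases) > min_degree_exclusive
    }


hubs = find_hub_mirnas(ASSOCIATIONS)
```

Swapping the two elements of each pair gives the complementary view, in which diseases are the candidate hubs and degree counts the miRNAs attached to each disease.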
|
19
|
A simple and efficient algorithm for genome-wide homozygosity analysis in disease. Mol Syst Biol 2009; 5:304. [PMID: 19756043 PMCID: PMC2758715 DOI: 10.1038/msb.2009.53] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/25/2008] [Accepted: 07/07/2009] [Indexed: 11/18/2022] Open
Abstract
Here we propose a simple statistical algorithm for rapidly scoring loci associated with disease or traits due to recessive mutations or deletions using genome-wide single nucleotide polymorphism genotyping case-control data in unrelated individuals. This algorithm identifies loci by defining homozygous segments of the genome present at significantly different frequencies between cases and controls. We found that false positive loci could be effectively removed from the output of this procedure by applying different physical size thresholds for the homozygous segments. This procedure is then conducted iteratively using random sub-datasets until the number of selected loci converges. We demonstrate this method in a publicly available data set for Alzheimer's disease and identify 26 candidate risk loci in the 22 autosomes. In this data set, these loci can explain 75% of the genetic risk variability of the disease.
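One building block of the procedure in this abstract, extracting homozygous segments and filtering them by a physical size threshold, might look like the sketch below. It is not the authors' algorithm: the genotype coding (0 and 2 homozygous, 1 heterozygous), the function name `homozygous_segments`, and the toy positions are assumptions. The case-control frequency comparison and the iteration over random sub-datasets are not shown.

```python
def homozygous_segments(genotypes, positions, min_size_bp):
    """Return (start_bp, end_bp) for each run of consecutive homozygous SNPs
    whose physical span is at least min_size_bp.

    genotypes: allele counts per SNP for one individual (assumed coding:
               0 or 2 = homozygous, 1 = heterozygous).
    positions: base-pair position of each SNP, in ascending order.
    """
    segments = []
    run_start = None
    # A trailing heterozygous sentinel closes a run that reaches the last SNP.
    for i, genotype in enumerate(list(genotypes) + [1]):
        if genotype != 1 and run_start is None:
            run_start = i
        elif genotype == 1 and run_start is not None:
            first_bp, last_bp = positions[run_start], positions[i - 1]
            if last_bp - first_bp >= min_size_bp:
                segments.append((first_bp, last_bp))
            run_start = None
    return segments


# Toy data (hypothetical): 11 SNPs spaced 100 bp apart.
toy_genotypes = [0, 0, 2, 1, 2, 2, 1, 0, 0, 0, 0]
toy_positions = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100]
toy_segments = homozygous_segments(toy_genotypes, toy_positions, min_size_bp=200)
```

Raising `min_size_bp` discards shorter runs, which corresponds to the size-threshold filtering the abstract credits with removing false positive loci.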
|