1
Kupershmidt Y, Kasif S, Sharan R. SPIDER: constructing cell-type-specific protein-protein interaction networks. Bioinformatics Advances 2024; 4:vbae130. [PMID: 39346952 PMCID: PMC11438548 DOI: 10.1093/bioadv/vbae130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 08/11/2024] [Accepted: 08/28/2024] [Indexed: 10/01/2024]
Abstract
Motivation: Protein-protein interactions (PPIs) play essential roles in the buildup of cellular machinery and provide the skeleton for cellular signaling. However, these biochemical roles are context dependent and interactions may change across cell type, time, and space. In contrast, PPI detection assays are run in a single condition that may not even be an endogenous condition of the organism, resulting in static networks that do not reflect full cellular complexity. Thus, there is a need for computational methods to predict cell-type-specific interactions.
Results: Here we present SPIDER (Supervised Protein Interaction DEtectoR), a graph attention-based model for predicting cell-type-specific PPI networks. In contrast to previous attempts at this problem, which were unsupervised in nature, our model's training is guided by experimentally measured cell-type-specific networks, enhancing its performance. We evaluate our method using experimental data of cell-type-specific networks from both humans and mice, and show that it outperforms current approaches by a large margin. We further demonstrate the ability of our method to generalize the predictions to datasets of tissues lacking prior PPI experimental data. We leverage the networks predicted by the model to facilitate the identification of tissue-specific disease genes.
Availability and implementation: Our code and data are available at https://github.com/Kuper994/SPIDER.
Affiliation(s)
- Yael Kupershmidt
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
- Simon Kasif
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, United States
- Roded Sharan
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
2
Wright SN, Colton S, Schaffer LV, Pillich RT, Churas C, Pratt D, Ideker T. State of the Interactomes: an evaluation of molecular networks for generating biological insights. bioRxiv 2024:2024.04.26.587073. [PMID: 38746239 PMCID: PMC11092493 DOI: 10.1101/2024.04.26.587073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Advancements in genomic and proteomic technologies have powered the use of gene and protein networks ("interactomes") for understanding genotype-phenotype translation. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 46 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP and SIGNOR demonstrate strong interaction prediction performance. These findings provide a benchmark for interactomes across diverse network biology applications and clarify factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.
3
Yang BZ, Xiang B, Wang T, Ma S, Li CSR. Neurogenetic underpinnings of nicotine use severity: Integrating the brain transcriptomes and GWAS variants via network approaches. Psychiatry Res 2024; 334:115815. [PMID: 38422867 PMCID: PMC11017751 DOI: 10.1016/j.psychres.2024.115815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 02/19/2024] [Accepted: 02/23/2024] [Indexed: 03/02/2024]
Abstract
Our study focused on human brain transcriptomes and the genetic risks of cigarettes per day (CPD) to investigate the neurogenetic mechanisms of individual variation in nicotine use severity. We constructed whole-brain and intramodular region-specific coexpression networks using BrainSpan's transcriptomes and the CPD risk variants identified by genome-wide association studies, confirmed the associations between CPD and each gene set in the region-specific subnetworks using an independent dataset, and conducted bioinformatic analyses. Eight brain-region-specific coexpression subnetworks were identified in association with CPD: amygdala, hippocampus, medial prefrontal cortex (MPFC), orbitofrontal cortex (OPFC), dorsolateral prefrontal cortex, striatum, mediodorsal nucleus of the thalamus (MDTHAL), and primary motor cortex (M1C). Each gene set in the eight subnetworks was associated with CPD. We also identified three hub proteins, encoded by GRIN2A in the amygdala; PMCA2 in the hippocampus, MPFC, OPFC, striatum, and MDTHAL; and SV2B in M1C. Intriguingly, the pancreatic secretion pathway appeared in all the significant protein interaction subnetworks, suggesting pleiotropic effects between cigarette smoking and pancreatic diseases. The three hub proteins and genes are implicated in stress response, drug memory, calcium homeostasis, and inhibitory control. These findings provide novel evidence of the neurogenetic underpinnings of smoking severity.
Affiliation(s)
- Bao-Zhu Yang
  - Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA; Veterans Affairs Connecticut Healthcare System, West Haven, CT, USA
- Bo Xiang
  - Department of Psychiatry, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Tingting Wang
  - Department of Psychiatry, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Shuangge Ma
  - Department of Biostatistics, Yale University, New Haven, CT, USA
- Chiang-Shan R Li
  - Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA; Department of Neuroscience, Yale University School of Medicine, New Haven, CT, USA; Wu Tsai Institute, Yale University, New Haven, CT, USA
4
da Silva Rosa SC, Barzegar Behrooz A, Guedes S, Vitorino R, Ghavami S. Prioritization of genes for translation: a computational approach. Expert Rev Proteomics 2024; 21:125-147. [PMID: 38563427 DOI: 10.1080/14789450.2024.2337004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 02/21/2024] [Indexed: 04/04/2024]
Abstract
INTRODUCTION: Gene identification for genetic diseases is critical for the development of new diagnostic approaches and personalized treatment options. Prioritization of gene translation is an important consideration in the molecular biology field, allowing researchers to focus on the most promising candidates for further investigation.
AREAS COVERED: In this paper, we discussed different approaches to prioritize genes for translation, including the use of computational tools and machine learning algorithms, as well as experimental techniques such as knockdown and overexpression studies. We also explored the potential biases and limitations of these approaches and proposed strategies to improve the accuracy and reliability of gene prioritization methods. Although numerous computational methods have been developed for this purpose, there is a need for computational methods that incorporate tissue-specific information to enable more accurate prioritization of candidate genes. Such methods should provide tissue-specific predictions, insights into underlying disease mechanisms, and more accurate prioritization of genes.
EXPERT OPINION: Using advanced computational tools and machine learning algorithms to prioritize genes, we can identify potential targets for therapeutic intervention of complex diseases. This represents an up-and-coming method for drug development and personalized medicine.
Affiliation(s)
- Simone C da Silva Rosa
  - Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
- Amir Barzegar Behrooz
  - Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
  - Electrophysiology Research Center, Neuroscience Institute, Tehran University of Medical Sciences, Tehran, Iran
- Sofia Guedes
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Rui Vitorino
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
  - Department of Medical Sciences, Institute of Biomedicine-iBiMED, University of Aveiro, Aveiro, Portugal
  - UnIC@RISE, Department of Surgery and Physiology, Faculty of Medicine of the University of Porto, Porto, Portugal
- Saeid Ghavami
  - Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
  - Faculty of Medicine in Zabrze, Academia of Silesia, Katowice, Poland
  - Research Institute of Oncology and Hematology, Cancer Care Manitoba, University of Manitoba, Winnipeg, Canada
5
McClatchy DB, Powell SB, Yates JR. In vivo mapping of protein-protein interactions of schizophrenia risk factors generates an interconnected disease network. bioRxiv 2023:2023.12.12.571320. [PMID: 38168169 PMCID: PMC10759996 DOI: 10.1101/2023.12.12.571320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
Genetic analyses of Schizophrenia (SCZ) patients have identified thousands of risk factors. In silico protein-protein interaction (PPI) network analysis has provided strong evidence that disrupted PPI networks underlie SCZ pathogenesis. In this study, we performed in vivo PPI analysis of several SCZ risk factors in the rodent brain. Using endogenous antibody immunoprecipitations coupled to mass spectrometry (MS) analysis, we constructed a SCZ network comprising 1612 unique PPI with a 5% FDR. Over 90% of the PPI were novel, reflecting the lack of previous PPI MS studies in brain tissue. Our SCZ PPI network was enriched with known SCZ risk factors, which supports the hypothesis that an accumulation of disturbances in selected PPI networks underlies SCZ. We used Stable Isotope Labeling in Mammals (SILAM) to quantitate phencyclidine (PCP) perturbations in the SCZ network and found that PCP weakened most PPI but also led to some enhanced or new PPI. These findings demonstrate that quantitating PPI in perturbed biological states can reveal alterations to network biology.
6
Simonovsky E, Sharon M, Ziv M, Mauer O, Hekselman I, Jubran J, Vinogradov E, Argov CM, Basha O, Kerber L, Yogev Y, Segrè AV, Im HK, Birk O, Rokach L, Yeger-Lotem E. Predicting molecular mechanisms of hereditary diseases by using their tissue-selective manifestation. Mol Syst Biol 2023; 19:e11407. [PMID: 37232043 PMCID: PMC10407743 DOI: 10.15252/msb.202211407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 04/30/2023] [Accepted: 05/10/2023] [Indexed: 05/27/2023]
Abstract
How do aberrations in widely expressed genes lead to tissue-selective hereditary diseases? Previous attempts to answer this question were limited to testing a few candidate mechanisms. To answer this question at a larger scale, we developed "Tissue Risk Assessment of Causality by Expression" (TRACE), a machine learning approach to predict genes that underlie tissue-selective diseases and selectivity-related features. TRACE utilized 4,744 biologically interpretable tissue-specific gene features that were inferred from heterogeneous omics datasets. Application of TRACE to 1,031 disease genes uncovered known and novel selectivity-related features, the most common of which was previously overlooked. Next, we created a catalog of tissue-associated risks for 18,927 protein-coding genes (https://netbio.bgu.ac.il/trace/). As proof-of-concept, we prioritized candidate disease genes identified in 48 rare-disease patients. TRACE ranked the verified disease gene among the patient's candidate genes significantly better than gene prioritization methods that rank by gene constraint or tissue expression. Thus, tissue selectivity combined with machine learning enhances genetic and clinical understanding of hereditary diseases.
Affiliation(s)
- Eyal Simonovsky
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Moran Sharon
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Maya Ziv
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Omry Mauer
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Idan Hekselman
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Juman Jubran
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ekaterina Vinogradov
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Chanan M Argov
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Omer Basha
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Lior Kerber
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Yuval Yogev
  - Morris Kahn Laboratory of Human Genetics and the Genetics Institute at Soroka Medical Center, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel
- Ayellet V Segrè
  - Ocular Genomics Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Hae Kyung Im
  - Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, USA
- Ohad Birk
  - Morris Kahn Laboratory of Human Genetics and the Genetics Institute at Soroka Medical Center, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel
  - The National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Lior Rokach
  - Department of Software & Information Systems Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Esti Yeger-Lotem
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer Sheva, Israel
  - The National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
7
Hussein S, Vu T, Lange L, Bowler RP, Kechris KJ, Banaei-Kashani F. Effective Subject Representation based on Multi-omics Disease Networks using Graph Embedding. Proceedings. IEEE International Conference on Bioinformatics and Biomedicine 2022; 2022:1911-1918. [PMID: 36776768 PMCID: PMC9916186 DOI: 10.1109/bibm55620.2022.9995707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
The study of complex behavior of biological systems has become increasingly dependent on evolutionary network modeling. In particular, multi-omics networks capture interactions between biomolecules such as proteins and metabolites, providing a basis for predicting relationships between such biomolecules and various phenotypic traits of complex diseases. In this paper, we introduce an integrative framework that, given a multi-omics network representing a cohort of subjects, learns expressive representations for network nodes and combines the learned node representations with the biological profiles of individual subjects to produce enriched representations of the subjects. With extensive empirical evaluation using real-world multi-omics networks, we show that our proposed framework significantly outperforms existing and baseline methods in terms of subject representation accuracy, particularly when the multi-omics network representing the cohort is sparse and structured and, therefore, more informative.
Affiliation(s)
- Sundous Hussein
  - Department of Computer Science and Engineering, University of Colorado Denver, Denver, CO, USA
- Thao Vu
  - Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Leslie Lange
  - Division of Biomedical Informatics and Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Russell P Bowler
  - Division of Pulmonary, Critical Care and Sleep Medicine, National Jewish Health, Denver, CO, USA
- Katerina J Kechris
  - Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Farnoush Banaei-Kashani
  - Department of Computer Science and Engineering, University of Colorado Denver, Denver, CO, USA
8
Möller S, Saul N, Projahn E, Barrantes I, Gézsi A, Walter M, Antal P, Fuellen G. Gene co-expression analyses of health(span) across multiple species. NAR Genom Bioinform 2022; 4:lqac083. [PMID: 36458022 PMCID: PMC9706456 DOI: 10.1093/nargab/lqac083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2021] [Revised: 08/20/2022] [Accepted: 10/31/2022] [Indexed: 12/03/2022]
Abstract
Health(span)-related gene clusters/modules were recently identified based on knowledge about the cross-species genetic basis of health, to interpret transcriptomic datasets describing health-related interventions. However, the cross-species comparison of health-related observations reveals considerable heterogeneity, not least due to widely varying health(span) definitions and study designs, posing a challenge for the exploration of conserved healthspan modules and, specifically, their transfer across species. To improve the identification and exploration of conserved/transferable healthspan modules, here we apply an established workflow based on gene co-expression network analyses employing GEO/ArrayExpress data for human and animal models, and perform a comprehensive meta-study of the resulting modules related to health(span), yielding a small set of literature-backed health(span) candidate genes. For each experiment, WGCNA (weighted gene correlation network analysis) was used to infer modules of genes that correlate in their expression with a 'health phenotype score' and to determine the most-connected (hub) genes (and their interactions) for each such module. After mapping these hub genes to their human orthologs, 12 health(span) genes were identified in at least two species (ACTN3, ANK1, MRPL18, MYL1, PAXIP1, PPP1CA, SCN3B, SDCBP, SKIV2L, TUBG1, TYROBP, WIPF1), for which enrichment analysis by g:Profiler found an association with actin filament-based movement and associated organelles, as well as muscular structures. We conclude that a meta-study of hub genes from co-expression network analyses for the complex phenotype health(span), across multiple species, can yield molecular-mechanistic insights and can direct experimentalists to further investigate the contribution of individual genes and their interactions to health(span).
Affiliation(s)
- Steffen Möller
  - To whom correspondence should be addressed. Tel: +49 381 494 7361; Fax: +49 381 494 7203
- Nadine Saul
  - Humboldt-University of Berlin, Institute of Biology, Berlin, Germany
- Elias Projahn
  - Rostock University Medical Center, Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock, Germany
- Israel Barrantes
  - Rostock University Medical Center, Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock, Germany
- András Gézsi
  - Budapest University of Technology and Economics, Department of Measurement and Information Systems, Budapest, Hungary
- Michael Walter
  - Rostock University Medical Center, Institute for Clinical Chemistry and Laboratory Medicine, Rostock, Germany
- Péter Antal
  - Budapest University of Technology and Economics, Department of Measurement and Information Systems, Budapest, Hungary
- Georg Fuellen
  - Rostock University Medical Center, Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock, Germany
9
Kaczmarczyk L, Schleif M, Dittrich L, Williams RH, Koderman M, Bansal V, Rajput A, Schulte T, Jonson M, Krost C, Testaquadra FJ, Bonn S, Jackson WS. Distinct translatome changes in specific neural populations precede electroencephalographic changes in prion-infected mice. PLoS Pathog 2022; 18:e1010747. [PMID: 35960762 PMCID: PMC9401167 DOI: 10.1371/journal.ppat.1010747] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 02/23/2022] [Revised: 08/24/2022] [Accepted: 07/18/2022] [Indexed: 12/04/2022]
Abstract
Selective vulnerability is an enigmatic feature of neurodegenerative diseases (NDs), whereby a widely expressed protein causes lesions in specific cell types and brain regions. Using the RiboTag method in mice, translational responses of five neural subtypes to acquired prion disease (PrD) were measured. Pre-onset and disease onset timepoints were chosen based on longitudinal electroencephalography (EEG) that revealed a gradual increase in theta power between 10 and 18 weeks after prion injection, resembling a clinical feature of human PrD. At disease onset, marked by significantly increased theta power and histopathological lesions, mice had pronounced translatome changes in all five cell types despite appearing normal. Remarkably, at a pre-onset stage, prior to EEG and neuropathological changes, we found that 1) translatomes of astrocytes indicated reduced synthesis of ribosomal and mitochondrial components, 2) glutamatergic neurons showed increased expression of cytoskeletal genes, and 3) GABAergic neurons revealed reduced expression of circadian rhythm genes. These data demonstrate that early translatome responses to neurodegeneration emerge prior to conventional markers of disease and are cell type-specific. Therapeutic strategies may need to target multiple pathways in specific populations of cells, early in disease.
Prions are infectious agents composed of a misfolded protein. When isolated from a mammalian brain and transferred to the same host species, prions will cause the same neurodegenerative disease affecting the same brain regions and cell types. This concept of selective vulnerability is also a feature of more common types of neurodegenerative diseases, such as Alzheimer’s, Parkinson’s, and Huntington’s. To better understand the mechanisms behind selective vulnerability, we studied disease responses of five cell types with different vulnerabilities in prion-infected mice at two different disease stages. Responses were measured as changes to mRNAs undergoing translation, referred to as the translatome. Before prion-infected mice demonstrated typical disease signs, electroencephalography (a method used clinically to characterize neurodegeneration in humans) revealed brain changes resembling those in human prion diseases, and surprisingly, the translatomes of all cells were drastically changed. Furthermore, before electroencephalography changes emerged, three cell types made unique responses while the most vulnerable cell type did not. These results suggest that mechanisms causing selective vulnerability will be difficult to dissect and that therapies will likely need to be provided before clinical signs emerge and individually engage multiple cell types and their distinct molecular pathways.
Affiliation(s)
- Lech Kaczmarczyk
  - Wallenberg Center for Molecular Medicine, Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
  - German Center for Neurodegenerative Diseases, Bonn, Germany
- Melvin Schleif
  - German Center for Neurodegenerative Diseases, Bonn, Germany
- Lars Dittrich
  - German Center for Neurodegenerative Diseases, Bonn, Germany
- Maruša Koderman
  - Wallenberg Center for Molecular Medicine, Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
- Vikas Bansal
  - Institute of Medical Systems Biology, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology Hamburg (ZMNH), University Medical Center Hamburg-Eppendorf, Germany
  - German Center for Neurodegenerative Diseases, Tübingen, Germany
- Ashish Rajput
  - Institute of Medical Systems Biology, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology Hamburg (ZMNH), University Medical Center Hamburg-Eppendorf, Germany
  - Maximon AG, Zug, Switzerland
- Maria Jonson
  - Wallenberg Center for Molecular Medicine, Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
- Clemens Krost
  - German Center for Neurodegenerative Diseases, Bonn, Germany
- Stefan Bonn
  - Institute of Medical Systems Biology, Center for Biomedical AI (bAIome), Center for Molecular Neurobiology Hamburg (ZMNH), University Medical Center Hamburg-Eppendorf, Germany
- Walker S. Jackson
  - Wallenberg Center for Molecular Medicine, Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden
  - German Center for Neurodegenerative Diseases, Bonn, Germany
10
Chothani SP, Adami E, Widjaja AA, Langley SR, Viswanathan S, Pua CJ, Zhihao NT, Harmston N, D'Agostino G, Whiffin N, Mao W, Ouyang JF, Lim WW, Lim S, Lee CQE, Grubman A, Chen J, Kovalik JP, Tryggvason K, Polo JM, Ho L, Cook SA, Rackham OJL, Schafer S. A high-resolution map of human RNA translation. Mol Cell 2022; 82:2885-2899.e8. [PMID: 35841888 DOI: 10.1016/j.molcel.2022.06.023] [Citation(s) in RCA: 49] [Impact Index Per Article: 24.5] [Received: 09/19/2021] [Revised: 03/10/2022] [Accepted: 06/15/2022] [Indexed: 10/17/2022]
Abstract
Translated small open reading frames (smORFs) can have important regulatory roles and encode microproteins, yet their genome-wide identification has been challenging. We determined the ribosome locations across six primary human cell types and five tissues and detected 7,767 smORFs with translational profiles matching those of known proteins. The human genome was found to contain highly cell-type- and tissue-specific smORFs and a subset that encodes highly conserved amino acid sequences. Changes in the translational efficiency of upstream-encoded smORFs (uORFs) and the corresponding main ORFs predominantly occur in the same direction. Integration with 456 mass-spectrometry datasets confirms the presence of 603 small peptides at the protein level in humans and provides insights into the subcellular localization of these small proteins. This study provides a comprehensive atlas of high-confidence translated smORFs derived from primary human cells and tissues in order to provide a more complete understanding of the translated human genome.
Collapse
Affiliation(s)
- Sonia P Chothani
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - Eleonora Adami
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore; Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
| | - Anissa A Widjaja
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - Sarah R Langley
- Lee Kong Chian School of Medicine, Nanyang Technological University, Clinical Sciences Building, Singapore 308232, Singapore
| | - Sivakumar Viswanathan
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - Chee Jian Pua
- National Heart Research Institute Singapore (NHRIS), National Heart Centre Singapore, Singapore 169609, Singapore
| | - Nevin Tham Zhihao
- Lee Kong Chian School of Medicine, Nanyang Technological University, Clinical Sciences Building, Singapore 308232, Singapore
| | - Nathan Harmston
- Program in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore 169857, Singapore; Science Division, Yale-NUS College, Singapore 138527, Singapore
| | - Giuseppe D'Agostino
- Lee Kong Chian School of Medicine, Nanyang Technological University, Clinical Sciences Building, Singapore 308232, Singapore
| | - Nicola Whiffin
- Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
| | - Wang Mao
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - John F Ouyang
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - Wei Wen Lim
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore; National Heart Research Institute Singapore (NHRIS), National Heart Centre Singapore, Singapore 169609, Singapore
| | - Shiqi Lim
- National Heart Research Institute Singapore (NHRIS), National Heart Centre Singapore, Singapore 169609, Singapore
| | - Cheryl Q E Lee
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - Alexandra Grubman
- Department of Anatomy and Developmental Biology, Monash University, Wellington Road, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Wellington Road, Clayton, VIC 3800, Australia; Australian Regenerative Medicine Institute, Monash University, Wellington Road, Clayton, VIC 3800, Australia
| | - Joseph Chen
- Department of Anatomy and Developmental Biology, Monash University, Wellington Road, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Wellington Road, Clayton, VIC 3800, Australia; Australian Regenerative Medicine Institute, Monash University, Wellington Road, Clayton, VIC 3800, Australia
| | - J P Kovalik
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - Karl Tryggvason
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
| | - Jose M Polo
- Department of Anatomy and Developmental Biology, Monash University, Wellington Road, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Wellington Road, Clayton, VIC 3800, Australia; Australian Regenerative Medicine Institute, Monash University, Wellington Road, Clayton, VIC 3800, Australia
- Lena Ho
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore
- Stuart A Cook
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore; National Heart Research Institute Singapore (NHRIS), National Heart Centre Singapore, Singapore 169609, Singapore; London Institute of Medical Sciences, London W12 0NN, UK
- Owen J L Rackham
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore; School of Biological Sciences, University of Southampton, Southampton, UK.
- Sebastian Schafer
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore, Singapore 169857, Singapore; National Heart Research Institute Singapore (NHRIS), National Heart Centre Singapore, Singapore 169609, Singapore.
11
Devkota K, Schmidt H, Werenski M, Murphy JM, Erden M, Arsenescu V, Cowen LJ. GLIDER: Function Prediction from GLIDE-based Neighborhoods. Bioinformatics 2022; 38:3395-3406. [PMID: 35575379 PMCID: PMC9237677 DOI: 10.1093/bioinformatics/btac322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/28/2021] [Revised: 02/10/2022] [Accepted: 05/10/2022] [Indexed: 11/14/2022]
Abstract
MOTIVATION Protein function prediction, based on the patterns of connection in a Protein-Protein Interaction (or Association) network, is perhaps the most studied of the classical, fundamental inference problems for biological networks. A highly successful set of recent approaches use random walk-based low dimensional embeddings, that tend to place functionally similar proteins into coherent spatial regions. However, these approaches lose valuable local graph structure from the network when considering only the embedding. We introduce GLIDER, a method that replaces a protein-protein interaction or association network with a new graph-based similarity network. GLIDER is based on a variant of our previous GLIDE method, which was designed to predict missing links in Protein-Protein Association networks, capturing implicit local and global (i.e. embedding-based) graph properties. RESULTS GLIDER outperforms competing methods on the task of predicting GO functional labels in cross-validation on a heterogeneous collection of four Human Protein-Protein Association networks derived from the 2016 DREAM Disease Module Identification Challenge, and also on three different protein-protein association networks built from the STRING database. We show that this is due to the strong functional enrichment that is present in the local GLIDER neighborhood in multiple different types of protein-protein association networks. Furthermore, we introduce the GLIDER graph neighborhood as a way for biologists to visualize the local neighborhood of a disease gene. As an application, we look at the local GLIDER neighborhoods of a set of known Parkinson's Disease GWAS genes, rediscover many genes which have known involvement in Parkinson's disease pathways, plus suggest some new genes to study. AVAILABILITY All code is publicly available and can be accessed here: https://github.com/kap-devkota/GLIDER. SUPPLEMENTARY INFORMATION is available at Bioinformatics online.
Affiliation(s)
- Kapil Devkota
- Department of Computer Science, Tufts University, Medford, MA, 02155, USA
- Henri Schmidt
- Department of Computer Science, Tufts University, Medford, MA, 02155, USA
- Matt Werenski
- Department of Computer Science, Tufts University, Medford, MA, 02155, USA
- James M Murphy
- Department of Mathematics, Tufts University, Medford, MA, 02155, USA
- Mert Erden
- Department of Computer Science, Tufts University, Medford, MA, 02155, USA
- Victor Arsenescu
- Department of Computer Science, Tufts University, Medford, MA, 02155, USA
- Lenore J Cowen
- Department of Computer Science, Tufts University, Medford, MA, 02155, USA
12
Gonzalez-Teran B, Pittman M, Felix F, Thomas R, Richmond-Buccola D, Hüttenhain R, Choudhary K, Moroni E, Costa MW, Huang Y, Padmanabhan A, Alexanian M, Lee CY, Maven BEJ, Samse-Knapp K, Morton SU, McGregor M, Gifford CA, Seidman JG, Seidman CE, Gelb BD, Colombo G, Conklin BR, Black BL, Bruneau BG, Krogan NJ, Pollard KS, Srivastava D. Transcription factor protein interactomes reveal genetic determinants in heart disease. Cell 2022; 185:794-814.e30. [PMID: 35182466 PMCID: PMC8923057 DOI: 10.1016/j.cell.2022.01.021] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 11/25/2020] [Revised: 08/20/2021] [Accepted: 01/25/2022] [Indexed: 02/08/2023]
Abstract
Congenital heart disease (CHD) is present in 1% of live births, yet identification of causal mutations remains challenging. We hypothesized that genetic determinants for CHDs may lie in the protein interactomes of transcription factors whose mutations cause CHDs. Defining the interactomes of two transcription factors haplo-insufficient in CHD, GATA4 and TBX5, within human cardiac progenitors, and integrating the results with nearly 9,000 exomes from proband-parent trios revealed an enrichment of de novo missense variants associated with CHD within the interactomes. Scoring variants of interactome members based on residue, gene, and proband features identified likely CHD-causing genes, including the epigenetic reader GLYR1. GLYR1 and GATA4 widely co-occupied and co-activated cardiac developmental genes, and the identified GLYR1 missense variant disrupted interaction with GATA4, impairing in vitro and in vivo function in mice. This integrative proteomic and genetic approach provides a framework for prioritizing and interrogating genetic variants in heart disease.
Affiliation(s)
- Barbara Gonzalez-Teran
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Maureen Pittman
- Gladstone Institutes, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, Institute for Computational Health Sciences, and Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Franco Felix
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Desmond Richmond-Buccola
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Ruth Hüttenhain
- Gladstone Institutes, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- Mauro W Costa
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Yu Huang
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Arun Padmanabhan
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Division of Cardiology, Department of Medicine, University of California, San Francisco, CA, USA
- Michael Alexanian
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Clara Youngna Lee
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Bonnie E J Maven
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Developmental and Stem Cell Biology Graduate Program, University of California San Francisco, San Francisco, CA, USA
- Kaitlen Samse-Knapp
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Sarah U Morton
- Division of Newborn Medicine, Department of Medicine, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Michael McGregor
- Gladstone Institutes, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- Casey A Gifford
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- J G Seidman
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Christine E Seidman
- Department of Genetics, Harvard Medical School, Boston, MA, USA; Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA; Cardiovascular Division, Brigham and Women's Hospital, Boston, MA, USA
- Bruce D Gelb
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bruce R Conklin
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Brian L Black
- Cardiovascular Research Institute, University of California San Francisco, San Francisco, CA, USA
- Benoit G Bruneau
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Cardiovascular Research Institute, University of California San Francisco, San Francisco, CA, USA; Division of Cardiology, Department of Pediatrics, UCSF School of Medicine, San Francisco, CA, USA
- Nevan J Krogan
- Gladstone Institutes, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- Katherine S Pollard
- Gladstone Institutes, San Francisco, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, Institute for Computational Health Sciences, and Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
- Deepak Srivastava
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Division of Cardiology, Department of Pediatrics, UCSF School of Medicine, San Francisco, CA, USA; Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA.
13
Ziv M, Gruber G, Sharon M, Vinogradov E, Yeger-Lotem E. The TissueNet v.3 database: Protein-protein interactions in adult and embryonic human tissue contexts. J Mol Biol 2022; 434:167532. [DOI: 10.1016/j.jmb.2022.167532] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/30/2021] [Revised: 03/03/2022] [Accepted: 03/03/2022] [Indexed: 12/28/2022]
14
Badkas A, De Landtsheer S, Sauter T. Construction and contextualization approaches for protein-protein interaction networks. Comput Struct Biotechnol J 2022; 20:3280-3290. [PMID: 35832626 PMCID: PMC9251778 DOI: 10.1016/j.csbj.2022.06.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Revised: 06/15/2022] [Accepted: 06/15/2022] [Indexed: 11/17/2022]
Abstract
Protein-protein interaction network (PPIN) analysis is a widely used method to study the contextual role of proteins of interest, to predict novel disease genes, disease or functional modules, and to identify novel drug targets. PPIN-based analysis uses both generic and context-specific networks. Multiple contextualization methodologies have been described, such as shortest-path algorithms, neighborhood-based methods, and diffusion/propagation algorithms. This review discusses these methods, provides intuitive representations of PPIN contextualization, and also examines how the quality of such context-specific networks could be improved by considering additional sources of evidence. As a heuristic, we observe that tasks such as identifying disease genes, drug targets, and protein complexes should consider local neighborhoods, while uncovering disease mechanisms and discovering disease-pathways would gain from diffusion-based construction.
15
Yadav AK, Shukla R, Singh TR. Topological parameters, patterns, and motifs in biological networks. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00012-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
16
Kotlyar M, Pastrello C, Ahmed Z, Chee J, Varyova Z, Jurisica I. IID 2021: towards context-specific protein interaction analyses by increased coverage, enhanced annotation and enrichment analysis. Nucleic Acids Res 2021; 50:D640-D647. [PMID: 34755877 PMCID: PMC8728267 DOI: 10.1093/nar/gkab1034] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 09/15/2021] [Revised: 10/13/2021] [Accepted: 11/03/2021] [Indexed: 01/02/2023]
Abstract
Improved bioassays have significantly increased the rate of identifying new protein-protein interactions (PPIs), and the number of detected human PPIs has greatly exceeded early estimates of human interactome size. These new PPIs provide a more complete view of disease mechanisms but precise understanding of how PPIs affect phenotype remains a challenge. It requires knowledge of PPI context (e.g. tissues, subcellular localizations), and functional roles, especially within pathways and protein complexes. The previous IID release focused on PPI context, providing networks with comprehensive tissue, disease, cellular localization, and druggability annotations. The current update adds developmental stages to the available contexts, and provides a way of assigning context to PPIs that could not be previously annotated due to insufficient data or incompatibility with available context categories (e.g. interactions between membrane and cytoplasmic proteins). This update also annotates PPIs with conservation across species, directionality in pathways, membership in large complexes, interaction stability (i.e. stable or transient), and mutation effects. Enrichment analysis is now available for all annotations, and includes multiple options; for example, context annotations can be analyzed with respect to PPIs or network proteins. In addition to tabular view or download, IID provides online network visualization. This update is available at http://ophid.utoronto.ca/iid.
Affiliation(s)
- Max Kotlyar
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Chiara Pastrello
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zuhaib Ahmed
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Justin Chee
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zofia Varyova
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Igor Jurisica
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute and Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada; Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON M5S 1A4, Canada; Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
17
Abstract
Interpreting the effects of genetic variants is key to understanding individual susceptibility to disease and designing personalized therapeutic approaches. Modern experimental technologies are enabling the generation of massive compendia of human genome sequence data and associated molecular and phenotypic traits, together with genome-scale expression, epigenomics and other functional genomic data. Integrative computational models can leverage these data to understand variant impact, elucidate the effect of dysregulated genes on biological pathways in specific disease and tissue contexts, and interpret disease risk beyond what is feasible with experiments alone. In this Review, we discuss recent developments in machine learning algorithms for genome interpretation and for integrative molecular-level modelling of cells, tissues and organs relevant to disease. More specifically, we highlight existing methods and key challenges and opportunities in identifying specific disease-causing genetic variants and linking them to molecular pathways and, ultimately, to disease phenotypes.
18
Skinnider MA, Scott NE, Prudova A, Kerr CH, Stoynov N, Stacey RG, Chan QWT, Rattray D, Gsponer J, Foster LJ. An atlas of protein-protein interactions across mouse tissues. Cell 2021; 184:4073-4089.e17. [PMID: 34214469 DOI: 10.1016/j.cell.2021.06.003] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 01/04/2021] [Revised: 04/05/2021] [Accepted: 06/01/2021] [Indexed: 12/20/2022]
Abstract
Cellular processes arise from the dynamic organization of proteins in networks of physical interactions. Mapping the interactome has therefore been a central objective of high-throughput biology. However, the dynamics of protein interactions across physiological contexts remain poorly understood. Here, we develop a quantitative proteomic approach combining protein correlation profiling with stable isotope labeling of mammals (PCP-SILAM) to map the interactomes of seven mouse tissues. The resulting maps provide a proteome-scale survey of interactome rewiring across mammalian tissues, revealing more than 125,000 unique interactions at a quality comparable to the highest-quality human screens. We identify systematic suppression of cross-talk between the evolutionarily ancient housekeeping interactome and younger, tissue-specific modules. Rewired proteins are tightly regulated by multiple cellular mechanisms and are implicated in disease. Our study opens up new avenues to uncover regulatory mechanisms that shape in vivo interactome responses to physiological and pathophysiological stimuli in mammalian systems.
Affiliation(s)
- Michael A Skinnider
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Nichollas E Scott
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Peter Doherty Institute, Department of Microbiology and Immunology, The University of Melbourne, Melbourne, VIC 3000, Australia
- Anna Prudova
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Craig H Kerr
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Nikolay Stoynov
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- R Greg Stacey
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Queenie W T Chan
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- David Rattray
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
- Leonard J Foster
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Biochemistry & Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
19
Morgan S, Malatras A, Duguez S, Duddy W. Optimized Molecular Interaction Networks for the Study of Skeletal Muscle. J Neuromuscul Dis 2021; 8:S223-S239. [PMID: 34308911 DOI: 10.3233/jnd-210680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
BACKGROUND Molecular interaction networks (MINs) aim to capture the complex relationships between interacting molecules within a biological system. MINs can be constructed from existing knowledge of molecular functional associations, such as protein-protein binding interactions (PPI) or gene co-expression, and these different sources may be combined into a single MIN. A given MIN may be more or less optimal in its representation of the important functional relationships of molecules in a tissue. OBJECTIVE The aim of this study was to establish whether a combined MIN derived from different types of functional association could better capture muscle-relevant biology compared to its constituent single-source MINs. METHODS MINs were constructed from functional association databases for both protein-binding and gene co-expression. The networks were then compared based on the capture of muscle-relevant genes and gene ontology (GO) terms, tested in two different ways using established biological network clustering algorithms. The top performing MINs were combined to test whether an optimal MIN for skeletal muscle could be constructed. RESULTS The STRING PPI network was the best performing single-source MIN among those tested. Combining STRING with interactions from either the MyoMiner or CoXPRESSdb gene co-expression sources resulted in a combined network with improved performance relative to its constituent networks. CONCLUSION MINs constructed from multiple types of functional association can better represent the functional relationships of molecules in a given tissue. Such networks may be used to improve the analysis and interpretation of functional genomics data in the study of skeletal muscle and neuromuscular diseases. Networks and clusters described by this study, including the combinations of STRING with MyoMiner or with CoXPRESSdb, are available for download from https://www.sys-myo.com/myominer/download.php.
Affiliation(s)
- Stephen Morgan
- Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry, Northern Ireland, UK
- Apostolos Malatras
- Department of Biological Sciences, Molecular Medicine Research Center, University of Cyprus, University Avenue, Nicosia, Cyprus
- Stephanie Duguez
- Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry, Northern Ireland, UK
- William Duddy
- Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry, Northern Ireland, UK
20
Shu J, Li Y, Wang S, Xi B, Ma J. Disease gene prediction with privileged information and heteroscedastic dropout. Bioinformatics 2021; 37:i410-i417. [PMID: 34252957 PMCID: PMC8275341 DOI: 10.1093/bioinformatics/btab310] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 04/24/2021] [Indexed: 11/19/2022]
Abstract
Motivation Recently, machine learning models have achieved tremendous success in prioritizing candidate genes for genetic diseases. These models are able to accurately quantify the similarity among disease and genes based on the intuition that similar genes are more likely to be associated with similar diseases. However, the genetic features these methods rely on are often hard to collect due to high experimental cost and various other technical limitations. Existing solutions of this problem significantly increase the risk of overfitting and decrease the generalizability of the models. Results In this work, we propose a graph neural network (GNN) version of the Learning under Privileged Information paradigm to predict new disease gene associations. Unlike previous gene prioritization approaches, our model does not require the genetic features to be the same at training and test stages. If a genetic feature is hard to measure and therefore missing at the test stage, our model could still efficiently incorporate its information during the training process. To implement this, we develop a Heteroscedastic Gaussian Dropout algorithm, where the dropout probability of the GNN model is determined by another GNN model with a mirrored GNN architecture. To evaluate our method, we compared our method with four state-of-the-art methods on the Online Mendelian Inheritance in Man dataset to prioritize candidate disease genes. Extensive evaluations show that our model could improve the prediction accuracy when all the features are available compared to other methods. More importantly, our model could make very accurate predictions when >90% of the features are missing at the test stage. Availability and implementation Our method is realized with Python 3.7 and Pytorch 1.5.0 and method and data are freely available at: https://github.com/juanshu30/Disease-Gene-Prioritization-with-Privileged-Information-and-Heteroscedastic-Dropout.
Affiliation(s)
- Juan Shu
- Department of Statistics, Purdue University, West Lafayette, IN 47906, USA
- Yu Li
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong 999077, China
- Sheng Wang
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
- Bowei Xi
- Department of Statistics, Purdue University, West Lafayette, IN 47906, USA
- Jianzhu Ma
- Institute for Artificial Intelligence, Peking University, Beijing 100871, China
21
Tosadori G, Di Silvestre D, Spoto F, Mauri P, Laudanna C, Scardoni G. Analysing omics data sets with weighted nodes networks (WNNets). Sci Rep 2021; 11:14447. [PMID: 34262093 PMCID: PMC8280138 DOI: 10.1038/s41598-021-93699-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2021] [Accepted: 06/16/2021] [Indexed: 11/30/2022]
Abstract
Current trends in biomedical research indicate data integration as a fundamental step towards precision medicine. In this context, network models allow representing and analysing complex biological processes. However, although effective in unveiling network properties, these models fail in considering the individual, biochemical variations occurring at molecular level. As a consequence, the analysis of these models partially loses its predictive power. To overcome these limitations, Weighted Nodes Networks (WNNets) were developed. WNNets allow to easily and effectively weigh nodes using experimental information from multiple conditions. In this study, the characteristics of WNNets were described and a proteomics data set was modelled and analysed. Results suggested that degree, an established centrality index, may offer a novel perspective about the functional role of nodes in WNNets. Indeed, degree allowed retrieving significant differences between experimental conditions, highlighting relevant proteins, and provided a novel interpretation for degree itself, opening new perspectives in experimental data modelling and analysis. Overall, WNNets may be used to model any high-throughput experimental data set requiring weighted nodes. Finally, improving the power of the analysis by using centralities such as betweenness may provide further biological insights and unveil novel, interesting characteristics of WNNets.
Affiliation(s)
- Gabriele Tosadori
- Center for BioMedical Computing (CBMC), University of Verona, Strada le Grazie 8, 37134, Verona, Italy.
- Section of General Pathology, Department of Medicine, University of Verona, 37134, Verona, Italy.
- Dario Di Silvestre
- Institute for Biomedical Technologies, National Research Council (ITB-CNR), via F.lli Cervi 93, Segrate, 20090, Milan, Italy
- Fausto Spoto
- Department of Computer Science, University of Verona, Strada le Grazie 15, 37134, Verona, Italy
- Pierluigi Mauri
- Institute for Biomedical Technologies, National Research Council (ITB-CNR), via F.lli Cervi 93, Segrate, 20090, Milan, Italy
- Carlo Laudanna
- Section of General Pathology, Department of Medicine, University of Verona, 37134, Verona, Italy.
- Giovanni Scardoni
- Center for BioMedical Computing (CBMC), University of Verona, Strada le Grazie 8, 37134, Verona, Italy
22
Luo P, Chen B, Liao B, Wu F. Predicting disease‐associated genes: Computational methods, databases, and evaluations. WIREs Data Mining and Knowledge Discovery 2021; 11. [DOI: 10.1002/widm.1383] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/28/2019] [Accepted: 06/13/2020] [Indexed: 09/09/2024]
Abstract
Complex diseases are associated with a set of genes (called disease genes), the identification of which can help scientists uncover the mechanisms of diseases and develop new drugs and treatment strategies. Due to the huge cost and time of experimental identification techniques, many computational algorithms have been proposed to predict disease genes. Although several review publications in recent years have discussed many computational methods, some of them focus on cancer driver genes while others focus on biomolecular networks, which only cover a specific aspect of existing methods. In this review, we summarize existing methods and classify them into three categories based on their rationales. Then, the algorithms, biological data, and evaluation methods used in the computational prediction are discussed. Finally, we highlight the limitations of existing methods and point out some future directions for improving these algorithms. This review could help investigators understand the principles of existing methods, and thus develop new methods to advance the computational prediction of disease genes. This article is categorized under: Technologies > Machine Learning; Technologies > Prediction; Algorithmic Development > Biological Data Mining
Affiliation(s)
- Ping Luo
- Division of Biomedical Engineering University of Saskatchewan Saskatoon Canada
- Princess Margaret Cancer Centre University Health Network Toronto Canada
| | - Bolin Chen
- School of Computer Science and Technology Northwestern Polytechnical University China
| | - Bo Liao
- School of Mathematics and Statistics Hainan Normal University Haikou China
| | - Fang‐Xiang Wu
- Department of Mechanical Engineering and Department of Computer Science University of Saskatchewan Saskatoon Canada
|
23
|
Li Y, Burgman B, Khatri IS, Pentaparthi SR, Su Z, McGrail DJ, Li Y, Wu E, Eckhardt SG, Sahni N, Yi SS. e-MutPath: computational modeling reveals the functional landscape of genetic mutations rewiring interactome networks. Nucleic Acids Res 2021; 49:e2. [PMID: 33211847 PMCID: PMC7797045 DOI: 10.1093/nar/gkaa1015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/07/2020] [Revised: 10/07/2020] [Accepted: 10/20/2020] [Indexed: 02/06/2023]
Abstract
Understanding the functional impact of cancer somatic mutations represents a critical knowledge gap for implementing precision oncology. It has been increasingly appreciated that the interaction profile mediated by a genomic mutation provides a fundamental link between genotype and phenotype. However, specific effects on biological signaling networks for the majority of mutations are largely unknown by experimental approaches. To resolve this challenge, we developed e-MutPath (edgetic Mutation-mediated Pathway perturbations), a network-based computational method to identify candidate ‘edgetic’ mutations that perturb functional pathways. e-MutPath identifies informative paths that could be used to distinguish disease risk factors from neutral elements and to stratify disease subtypes with clinical relevance. The predicted targets are enriched in cancer vulnerability genes, known drug targets but depleted for proteins associated with side effects, demonstrating the power of network-based strategies to investigate the functional impact and perturbation profiles of genomic mutations. Together, e-MutPath represents a robust computational tool to systematically assign functions to genetic mutations, especially in the context of their specific pathway perturbation effect.
Affiliation(s)
- Yongsheng Li
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Oden Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX 78712, USA
| | - Brandon Burgman
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Interdisciplinary Life Sciences Graduate Programs (ILSGP), College of Natural Sciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Ishaani S Khatri
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Oden Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX 78712, USA
| | - Sairahul R Pentaparthi
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
| | - Zhe Su
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Oden Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX 78712, USA
| | - Daniel J McGrail
- Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Yang Li
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Science Park, Smithville, TX 78957, USA
| | - Erxi Wu
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Neuroscience Institute and Department of Neurosurgery, Baylor Scott & White Health, Temple, TX 76502, USA
- Department of Surgery, Texas A&M University Health Science Center, College of Medicine, Temple, TX 76508, USA
- Department of Pharmaceutical Sciences, Texas A&M University Health Science Center, College of Pharmacy, College Station, TX 77843, USA
| | - S Gail Eckhardt
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Interdisciplinary Life Sciences Graduate Programs (ILSGP), College of Natural Sciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Nidhi Sahni
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Science Park, Smithville, TX 78957, USA
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Program in Quantitative and Computational Biosciences (QCB), Baylor College of Medicine, Houston, TX 77030, USA
| | - S Stephen Yi
- Department of Oncology, Livestrong Cancer Institutes, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Oden Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX 78712, USA
- Interdisciplinary Life Sciences Graduate Programs (ILSGP), College of Natural Sciences, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Biomedical Engineering, Cockrell School of Engineering, The University of Texas at Austin, Austin, TX 78712, USA
|
24
|
Du Y, Cai M, Xing X, Ji J, Yang E, Wu J. PINA 3.0: mining cancer interactome. Nucleic Acids Res 2021; 49:D1351-D1357. [PMID: 33231689 PMCID: PMC7779002 DOI: 10.1093/nar/gkaa1075] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/15/2020] [Revised: 10/20/2020] [Accepted: 10/23/2020] [Indexed: 12/22/2022]
Abstract
Protein–protein interactions (PPIs) are crucial to mediate biological functions, and understanding PPIs in cancer type-specific context could help decipher the underlying molecular mechanisms of tumorigenesis and identify potential therapeutic options. Therefore, we update the Protein Interaction Network Analysis (PINA) platform to version 3.0, to integrate the unified human interactome with RNA-seq transcriptomes and mass spectrometry-based proteomes across tens of cancer types. A number of new analytical utilities were developed to help characterize the cancer context for a PPI network, which includes inferring proteins with expression specificity and identifying candidate prognosis biomarkers, putative cancer drivers, and therapeutic targets for a specific cancer type; as well as identifying pairs of co-expressing interacting proteins across cancer types. Furthermore, a brand-new web interface has been designed to integrate these new utilities within an interactive network visualization environment, which allows users to quickly and comprehensively investigate the roles of human interacting proteins in a cancer type-specific context. PINA is freely available at https://omics.bjcancer.org/pina/.
Affiliation(s)
- Yang Du
- Center for Cancer Bioinformatics, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital & Institute, Beijing 100142, China
| | - Meng Cai
- Institute of Systems Biomedicine, Department of Medical Bioinformatics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
| | - Xiaofang Xing
- Department of Gastrointestinal Translational Research, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital & Institute, Beijing 100142, China
| | - Jiafu Ji
- Gastrointestinal Cancer Center, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital & Institute, Beijing 100142, China
| | - Ence Yang
- Institute of Systems Biomedicine, Department of Medical Bioinformatics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
| | - Jianmin Wu
- Center for Cancer Bioinformatics, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital & Institute, Beijing 100142, China
- Peking University International Cancer Institute, Peking University, Beijing 100191, China
|
25
|
Scelsi MA, Napolioni V, Greicius MD, Altmann A. Network propagation of rare variants in Alzheimer's disease reveals tissue-specific hub genes and communities. PLoS Comput Biol 2021; 17:e1008517. [PMID: 33411734 PMCID: PMC7817020 DOI: 10.1371/journal.pcbi.1008517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2020] [Revised: 01/20/2021] [Accepted: 11/10/2020] [Indexed: 11/18/2022]
Abstract
State-of-the-art rare variant association testing methods aggregate the contribution of rare variants in biologically relevant genomic regions to boost statistical power. However, testing single genes separately does not consider the complex interaction landscape of genes, nor the downstream effects of non-synonymous variants on protein structure and function. Here we present the NETwork Propagation-based Assessment of Genetic Events (NETPAGE), an integrative approach aimed at investigating the biological pathways through which rare variation results in complex disease phenotypes. We applied NETPAGE to sporadic, late-onset Alzheimer's disease (AD), using whole-genome sequencing from the AD Neuroimaging Initiative (ADNI) cohort, as well as whole-exome sequencing from the AD Sequencing Project (ADSP). NETPAGE is based on network propagation, a framework that models information flow on a graph and simulates the percolation of genetic variation through tissue-specific gene interaction networks. The result of network propagation is a set of smoothed gene scores that can be tested for association with disease status through sparse regression. The application of NETPAGE to AD enabled the identification of a set of connected genes whose smoothed variation profile was robustly associated to case-control status, based on gene interactions in the hippocampus. Additionally, smoothed scores significantly correlated with risk of conversion to AD in Mild Cognitive Impairment (MCI) subjects. Lastly, we investigated tissue-specific transcriptional dysregulation of the core genes in two independent RNA-seq datasets, as well as significant enrichments in terms of gene sets with known connections to AD. We present a framework that enables enhanced genetic association testing for a wide range of traits, diseases, and sample sizes.
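The network propagation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a random walk with restart on an undirected weighted graph, which is one common formulation of network propagation and not necessarily the variant NETPAGE uses. The function name `propagate`, the `restart` parameter, and the toy graph are hypothetical.

```python
def propagate(adjacency, seed_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart (illustrative, not the NETPAGE implementation).

    adjacency: dict node -> dict neighbour -> edge weight; assumed symmetric,
               with every neighbour also present as a key.
    seed_scores: dict node -> initial score (e.g. a per-gene variant burden).
    Returns a dict of smoothed scores. On a connected graph they sum to 1.
    """
    nodes = sorted(adjacency)
    # Weighted degree of each node, used to split its mass among neighbours.
    degree = {u: sum(adjacency[u].values()) for u in nodes}
    total = sum(seed_scores.get(u, 0.0) for u in nodes)
    if total <= 0:
        raise ValueError("at least one seed score must be positive")
    p0 = {u: seed_scores.get(u, 0.0) / total for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for u in nodes:
            # Mass arriving at u: each neighbour v sends p[v] * w(u, v) / degree[v].
            inflow = sum(w * p[v] / degree[v] for v, w in adjacency[u].items())
            nxt[u] = (1.0 - restart) * inflow + restart * p0[u]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:  # stop once the scores no longer change
            break
    return p


# Toy path graph A - B - C with all of the initial signal on gene A.
toy_graph = {"A": {"B": 1.0}, "B": {"A": 1.0, "C": 1.0}, "C": {"B": 1.0}}
scores = propagate(toy_graph, {"A": 1.0})
```

On this three-gene path the seed gene A keeps the highest smoothed score and C, two steps away, the lowest. A higher `restart` value keeps the scores closer to the original seeds.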
Affiliation(s)
- Marzia Antonella Scelsi
- Centre for Medical Image Computing, Department of Medical Physics and Biomedical Engineering, University College London, London, United Kingdom
| | - Valerio Napolioni
- Functional Imaging in Neuropsychiatric Disorders (FIND) Lab, Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, United States of America
| | - Michael D Greicius
- Functional Imaging in Neuropsychiatric Disorders (FIND) Lab, Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, United States of America
| | - Andre Altmann
- Centre for Medical Image Computing, Department of Medical Physics and Biomedical Engineering, University College London, London, United Kingdom
|
26
|
Savino A, Provero P, Poli V. Differential Co-Expression Analyses Allow the Identification of Critical Signalling Pathways Altered during Tumour Transformation and Progression. Int J Mol Sci 2020; 21:E9461. [PMID: 33322692 PMCID: PMC7764314 DOI: 10.3390/ijms21249461] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/30/2020] [Revised: 12/02/2020] [Accepted: 12/09/2020] [Indexed: 02/02/2023]
Abstract
Biological systems respond to perturbations through the rewiring of molecular interactions, organised in gene regulatory networks (GRNs). Among these, the increasingly high availability of transcriptomic data makes gene co-expression networks the most exploited ones. Differential co-expression networks are useful tools to identify changes in response to an external perturbation, such as mutations predisposing to cancer development, and leading to changes in the activity of gene expression regulators or signalling. They can help explain the robustness of cancer cells to perturbations and identify promising candidates for targeted therapy, moreover providing higher specificity with respect to standard co-expression methods. Here, we comprehensively review the literature about the methods developed to assess differential co-expression and their applications to cancer biology. Via the comparison of normal and diseased conditions and of different tumour stages, studies based on these methods led to the definition of pathways involved in gene network reorganisation upon oncogenes' mutations and tumour progression, often converging on immune system signalling. A relevant implementation still lagging behind is the integration of different data types, which would greatly improve network interpretability. Most importantly, performance and predictivity evaluation of the large variety of mathematical models proposed would urgently require experimental validations and systematic comparisons. We believe that future work on differential gene co-expression networks, complemented with additional omics data and experimentally tested, will considerably improve our insights into the biology of tumours.
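As an illustration of what a differential co-expression test can look like, the sketch below compares the Pearson correlation of one gene pair between two conditions using the Fisher z-transformation. This is one classical approach, chosen here as an assumption; the review covers many methods and the abstract does not single this one out. The function names and the expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differential_coexpression(x1, y1, x2, y2):
    """Fisher z-test for a change in correlation between two conditions.

    (x1, y1): expression of genes X and Y across samples in condition 1.
    (x2, y2): the same two genes in condition 2.
    Returns (z, two-sided p-value under a normal approximation).
    """
    n1, n2 = len(x1), len(x2)
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each condition needs more than 3 samples")

    def fisher(r):
        # Clip so that atanh stays finite for perfectly correlated pairs.
        return math.atanh(max(min(r, 0.999999), -0.999999))

    z1, z2 = fisher(pearson(x1, y1)), fisher(pearson(x2, y2))
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical values: the pair is co-expressed in normal samples
# and anti-correlated in tumour samples.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_normal = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1]
y_tumour = [6.0, 5.2, 3.9, 3.1, 2.2, 0.9]
z, p = differential_coexpression(x, y_normal, x, y_tumour)
```

A positive `z` means the pair is more strongly co-expressed in the first condition. In a genome-wide setting the p-values would additionally need multiple-testing correction, which this sketch omits.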
Affiliation(s)
- Aurora Savino
- Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
| | - Paolo Provero
- Department of Neurosciences “Rita Levi Montalcini”, University of Turin, Corso Massimo D’Azeglio 52, 10126 Turin, Italy
- Center for Omics Sciences, Ospedale San Raffaele IRCCS, Via Olgettina 60, 20132 Milan, Italy
| | - Valeria Poli
- Molecular Biotechnology Center, Department of Molecular Biotechnology and Health Sciences, University of Turin, Via Nizza 52, 10126 Turin, Italy
|
27
|
Guerra C, Joshi S, Lu Y, Palini F, Ferraro Petrillo U, Rossignac J. Rank-Similarity Measures for Comparing Gene Prioritizations: A Case Study in Autism. J Comput Biol 2020; 28:283-295. [PMID: 33103913 DOI: 10.1089/cmb.2020.0244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
We discuss the challenge of comparing three gene prioritization methods: network propagation, integer linear programming rank aggregation (RA), and statistical RA. These methods are based on different biological categories and estimate disease-gene association. Previously proposed comparison schemes are based on three measures of performance: receiver operating curve, area under the curve, and median rank ratio. Although they may capture important aspects of gene prioritization performance, they may fail to capture important differences in the rankings of individual genes. We suggest that comparison schemes could be improved by also considering recently proposed measures of similarity between gene rankings. We tested this suggestion on comparison schemes for prioritizations of genes associated with autism that were obtained using brain- and tissue-specific data. Our results show the effectiveness of our measures of similarity in clustering brain regions based on their relevance to autism.
Affiliation(s)
- Concettina Guerra
- Georgia Institute of Technology College of Computing, School of Interactive Computing, Atlanta, Georgia, USA
| | - Sarang Joshi
- Georgia Institute of Technology College of Computing, School of Interactive Computing, Atlanta, Georgia, USA
| | - Yinquan Lu
- Georgia Institute of Technology College of Computing, School of Interactive Computing, Atlanta, Georgia, USA
| | - Francesco Palini
- Dipartimento di Scienze Statistiche, Università di Roma-La Sapienza, Rome, Italy
| | - Jarek Rossignac
- Georgia Institute of Technology College of Computing, School of Interactive Computing, Atlanta, Georgia, USA
|
28
|
Yu H, Hageman Blair R. Scalable module detection for attributed networks with applications to breast cancer. J Appl Stat 2020; 49:230-247. [DOI: 10.1080/02664763.2020.1803811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Han Yu
- Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | - Rachael Hageman Blair
- Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY, USA
|
29
|
Yu L, Shi Y, Zou Q, Wang S, Zheng L, Gao L. Exploring Drug Treatment Patterns Based on the Action of Drug and Multilayer Network Model. Int J Mol Sci 2020; 21:E5014. [PMID: 32708644 PMCID: PMC7404256 DOI: 10.3390/ijms21145014] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/28/2020] [Revised: 07/13/2020] [Accepted: 07/14/2020] [Indexed: 02/01/2023]
Abstract
Some drugs can be used to treat multiple diseases, suggesting potential patterns in drug treatment. Determination of drug treatment patterns can improve our understanding of the mechanisms of drug action, enabling drug repurposing. A drug can be associated with a multilayer tissue-specific protein-protein interaction (TSPPI) network for the diseases it is used to treat. Proteins usually interact with other proteins to achieve functions that cause diseases. Hence, studying drug treatment patterns is similar to studying common module structures in multilayer TSPPI networks. Therefore, we propose a network-based model to study the treatment patterns of drugs. The method was designated SDTP (studying drug treatment pattern) and was based on drug effects and a multilayer network model. To demonstrate the application of the SDTP method, we focused on analysis of trichostatin A (TSA) in leukemia, breast cancer, and prostate cancer. We constructed a TSPPI multilayer network and obtained candidate drug-target modules from the network. Gene ontology analysis provided insights into the significance of the drug-target modules and co-expression networks. Finally, two modules were obtained as potential treatment patterns for TSA. Through analysis of the significance, composition, and functions of the selected drug-target modules, we validated the feasibility and rationality of our proposed SDTP method for identifying drug treatment patterns. In summary, our novel approach used a multilayer network model to overcome the shortcomings of single-layer networks and combined the network with information on drug activity. Based on the discovered drug treatment patterns, we can predict the potential diseases that the drug can treat. That is, if a disease-related protein module has a similar structure, then the drug is likely to be a potential drug for the treatment of the disease.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi’an 710071, China
| | - Yayong Shi
- School of Computer Science and Technology, Xidian University, Xi’an 710071, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology, Chengdu 650004, China
| | - Shuhang Wang
- Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Liping Zheng
- School of Computer Science and Technology, Liaocheng University, Liaocheng 252000, China
| | - Lin Gao
- School of Computer Science and Technology, Xidian University, Xi’an 710071, China
|
30
|
Niss K, Gomez-Casado C, Hjaltelin JX, Joeris T, Agace WW, Belling KG, Brunak S. Complete Topological Mapping of a Cellular Protein Interactome Reveals Bow-Tie Motifs as Ubiquitous Connectors of Protein Complexes. Cell Rep 2020; 31:107763. [PMID: 32553166 DOI: 10.1016/j.celrep.2020.107763] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/16/2019] [Revised: 02/03/2020] [Accepted: 05/21/2020] [Indexed: 11/18/2022]
Abstract
The network topology of a protein interactome is shaped by the function of each protein, making it a resource of functional knowledge in tissues and in single cells. Today, this resource is underused, as complete network topology characterization has proved difficult for large protein interactomes. We apply a matrix visualization and decoding approach to a physical protein interactome of a dendritic cell, thereby characterizing its topology with no prior assumptions of structure. We discover 294 proteins, each forming topological motifs called "bow-ties" that tie together the majority of observed protein complexes. The central proteins of these bow-ties have unique network properties, display multifunctional capabilities, are enriched for essential proteins, and are widely expressed in other cells and tissues. Collectively, the bow-tie motifs are a pervasive and previously unnoted topological trend in cellular interactomes. As such, these results provide fundamental knowledge on how intracellular protein connectivity is organized and operates.
Affiliation(s)
- Kristoffer Niss
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Cristina Gomez-Casado
- Immunology Section, Lund University, BMC D14, 221-84 Lund, Sweden; Institute of Applied Molecular Medicine, Faculty of Medicine, San Pablo CEU University, 28925 Madrid, Spain
| | - Jessica X Hjaltelin
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Thorsten Joeris
- Immunology Section, Lund University, BMC D14, 221-84 Lund, Sweden
| | - William W Agace
- Immunology Section, Lund University, BMC D14, 221-84 Lund, Sweden; Mucosal Immunology Group, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
| | - Kirstine G Belling
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.
|
31
|
Basha O, Mauer O, Simonovsky E, Shpringer R, Yeger-Lotem E. ResponseNet v.3: revealing signaling and regulatory pathways connecting your proteins and genes across human tissues. Nucleic Acids Res 2020; 47:W242-W247. [PMID: 31114913 PMCID: PMC6602570 DOI: 10.1093/nar/gkz421] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/04/2019] [Revised: 04/23/2019] [Accepted: 05/06/2019] [Indexed: 12/13/2022]
Abstract
ResponseNet v.3 is an enhanced version of ResponseNet, a web server that is designed to highlight signaling and regulatory pathways connecting user-defined proteins and genes by using the ResponseNet network optimization approach (http://netbio.bgu.ac.il/respnet). Users run ResponseNet by defining source and target sets of proteins, genes and/or microRNAs, and by specifying a molecular interaction network (interactome). The output of ResponseNet is a sparse, high-probability interactome subnetwork that connects the two sets, thereby revealing additional molecules and interactions that are involved in the studied condition. In recent years, massive efforts were invested in profiling the transcriptomes of human tissues, enabling the inference of human tissue interactomes. ResponseNet v.3 expands ResponseNet2.0 by harnessing ∼11,600 RNA-sequenced human tissue profiles made available by the Genotype-Tissue Expression consortium, to support context-specific analysis of 44 human tissues. Thus, ResponseNet v.3 allows users to illuminate the signaling and regulatory pathways potentially active in the context of a specific tissue, and to compare them with active pathways in other tissues. In the era of precision medicine, such analyses open the door for tissue- and patient-specific analyses of pathways and diseases.
Affiliation(s)
- Omer Basha
- Department of Clinical Biochemistry & Pharmacology, Faculty of Health Sciences
| | - Omry Mauer
- Department of Clinical Biochemistry & Pharmacology, Faculty of Health Sciences
| | - Eyal Simonovsky
- Department of Clinical Biochemistry & Pharmacology, Faculty of Health Sciences
| | - Rotem Shpringer
- Department of Clinical Biochemistry & Pharmacology, Faculty of Health Sciences
| | - Esti Yeger-Lotem
- Department of Clinical Biochemistry & Pharmacology, Faculty of Health Sciences
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
|
32
|
Basha O, Argov CM, Artzy R, Zoabi Y, Hekselman I, Alfandari L, Chalifa-Caspi V, Yeger-Lotem E. Differential network analysis of multiple human tissue interactomes highlights tissue-selective processes and genetic disorder genes. Bioinformatics 2020; 36:2821-2828. [DOI: 10.1093/bioinformatics/btaa034] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/18/2019] [Revised: 01/07/2020] [Accepted: 01/16/2020] [Indexed: 01/19/2023]
Abstract
Motivation
Differential network analysis, designed to highlight network changes between conditions, is an important paradigm in network biology. However, differential network analysis methods have been typically designed to compare between two conditions and were rarely applied to multiple protein interaction networks (interactomes). Importantly, large-scale benchmarks for their evaluation have been lacking.
Results
Here, we present a framework for assessing the ability of differential network analysis of multiple human tissue interactomes to highlight tissue-selective processes and disorders. For this, we created a benchmark of 6499 curated tissue-specific Gene Ontology biological processes. We applied five methods, including four differential network analysis methods, to construct weighted interactomes for 34 tissues. Rigorous assessment of this benchmark revealed that differential analysis methods perform well in revealing tissue-selective processes (AUCs of 0.82–0.9). Next, we applied differential network analysis to illuminate the genes underlying tissue-selective hereditary disorders. For this, we curated a dataset of 1305 tissue-specific hereditary disorders and their manifesting tissues. Focusing on subnetworks containing the top 1% differential interactions in disease-relevant tissue interactomes revealed significant enrichment for disorder-causing genes in 18.6% of the cases, with a significantly high success rate for blood, nerve, muscle and heart diseases.
Summary
Altogether, we offer a framework that includes expansive manually curated datasets of tissue-selective processes and disorders to be used as benchmarks or to illuminate tissue-selective processes and genes. Our results demonstrate that differential analysis of multiple human tissue interactomes is a powerful tool for highlighting processes and genes with tissue-selective functionality and clinical impact.
Availability and implementation
Datasets are available as part of the Supplementary data.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Omer Basha
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Chanan M Argov
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Raviv Artzy
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Yazeed Zoabi
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Idan Hekselman
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Liad Alfandari
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Vered Chalifa-Caspi
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
| | - Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
|
33
|
Hekselman I, Yeger-Lotem E. Mechanisms of tissue and cell-type specificity in heritable traits and diseases. Nat Rev Genet 2020; 21:137-150. [DOI: 10.1038/s41576-019-0200-9] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Accepted: 11/12/2019] [Indexed: 02/07/2023]
|
34
|
Cowman T, Coşkun M, Grama A, Koyutürk M. Integrated querying and version control of context-specific biological networks. Database (Oxford) 2020; 2020:baaa018. [PMID: 32294194 PMCID: PMC7158887 DOI: 10.1093/database/baaa018] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/19/2019] [Revised: 01/13/2020] [Accepted: 02/21/2020] [Indexed: 01/26/2023]
Abstract
Motivation
Biomolecular data stored in public databases is increasingly specialized to organisms, context/pathology and tissue type, potentially resulting in significant overhead for analyses. These networks are often specializations of generic interaction sets, presenting opportunities for reducing storage and computational cost. Therefore, it is desirable to develop effective compression and storage techniques, along with efficient algorithms and a flexible query interface capable of operating on compressed data structures. Current graph databases offer varying levels of support for network integration. However, these solutions do not provide efficient methods for the storage and querying of versioned networks.
Results
We present VerTIoN, a framework consisting of novel data structures and associated query mechanisms for integrated querying of versioned context-specific biological networks. As a use case for our framework, we study network proximity queries in which the user can select and compose a combination of tissue-specific and generic networks. Using our compressed version tree data structure, in conjunction with state-of-the-art numerical techniques, we demonstrate real-time querying of large network databases.
Conclusion
Our results show that it is possible to support flexible queries defined on heterogeneous networks composed at query time while drastically reducing response time for multiple simultaneous queries. The flexibility offered by VerTIoN in composing integrated network versions opens significant new avenues for the utilization of ever increasing volume of context-specific network data in a broad range of biomedical applications.
Availability and implementation
VerTIoN is implemented as a C++ library and is available at http://compbio.case.edu/omics/software/vertion and https://github.com/tjcowman/vertion.
Contact
tyler.cowman@case.edu
Affiliation(s)
- Tyler Cowman
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
- Mustafa Coşkun
- Department of Computer Engineering, Abdullah Gül University, Kayseri 38080, Turkey
- Ananth Grama
- Department of Computer Science, Purdue University, West Lafayette, IN 47906, USA
- Mehmet Koyutürk
- Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
- Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
|
35
|
Mohammadi S, Davila-Velderrain J, Kellis M. Reconstruction of Cell-type-Specific Interactomes at Single-Cell Resolution. Cell Syst 2019; 9:559-568.e4. [PMID: 31786210 PMCID: PMC6943823 DOI: 10.1016/j.cels.2019.10.007] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 03/22/2019] [Revised: 07/13/2019] [Accepted: 10/22/2019] [Indexed: 01/03/2023]
Abstract
The human interactome is instrumental in the systems-level study of the cell and the contextualization of disease-associated gene perturbations. However, reference organismal interactomes do not capture the cell-type-specific context in which proteins and modules preferentially act. Here, we introduce SCINET, a computational framework that reconstructs an ensemble of cell-type-specific interactomes by integrating a global, context-independent reference interactome with a single-cell gene-expression profile. SCINET addresses technical challenges of single-cell data by robustly imputing, transforming, and normalizing the initially noisy and sparse expression of data. Inferred cell-level gene interaction probabilities and group-level interaction strengths define cell-type-specific interactomes. We use SCINET to reconstruct and analyze interactomes of the major human brain and immune cell types, revealing specificity and modularity of perturbations associated with neurodegenerative, neuropsychiatric, and autoimmune disorders. We report cell-type interactomes for brain and immune cell types, together with the SCINET package.
Affiliation(s)
- Shahin Mohammadi
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Jose Davila-Velderrain
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Manolis Kellis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
36
|
Biological Network Approaches and Applications in Rare Disease Studies. Genes (Basel) 2019; 10:genes10100797. [PMID: 31614842 PMCID: PMC6827097 DOI: 10.3390/genes10100797] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/03/2019] [Revised: 09/30/2019] [Accepted: 10/10/2019] [Indexed: 12/26/2022] Open
Abstract
Network biology has the capability to integrate, represent, interpret, and model complex biological systems by collectively accommodating biological omics data, biological interactions and associations, graph theory, statistical measures, and visualizations. Biological networks have recently been shown to be very useful for studies that decipher biological mechanisms and disease etiologies and for studies that predict therapeutic responses, at both the molecular and system levels. In this review, we briefly summarize the general framework of biological network studies, including data resources, network construction methods, statistical measures, network topological properties, and visualization tools. We also introduce several recent biological network applications and methods for the studies of rare diseases.
|
37
|
Huang H, Shangguan J, Ruan P, Liang H. Bi-level feature selection in high dimensional AFT models with applications to a genomic study. Stat Appl Genet Mol Biol 2019; 18:/j/sagmb.ahead-of-print/sagmb-2019-0016/sagmb-2019-0016.xml. [PMID: 31525158 DOI: 10.1515/sagmb-2019-0016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
We propose a new bi-level feature selection method for high dimensional accelerated failure time models by formulating the models to a single index model. The method yields sparse solutions at both the group and individual feature levels along with an expedient algorithm, which is computationally efficient and easily implemented. We analyze a genomic dataset for an illustration, and present a simulation study to show the finite sample performance of the proposed method.
Affiliation(s)
- Hailin Huang
- Department of Statistics, George Washington University, Washington, DC 20052, USA
- Jizi Shangguan
- Department of Statistics, George Washington University, Washington, DC 20052, USA
- Peifeng Ruan
- Department of Statistics, George Washington University, Washington, DC 20052, USA
- Hua Liang
- Department of Statistics, George Washington University, Washington, DC 20052, USA
|
38
|
Wang J, Hossain MS, Lyu Z, Schmutz J, Stacey G, Xu D, Joshi T. SoyCSN: Soybean context-specific network analysis and prediction based on tissue-specific transcriptome data. Plant Direct 2019; 3:e00167. [PMID: 31549018 PMCID: PMC6747016 DOI: 10.1002/pld3.167] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/05/2019] [Revised: 08/12/2019] [Accepted: 08/20/2019] [Indexed: 05/04/2023]
Abstract
The Soybean Gene Atlas project provides a comprehensive map for understanding gene expression patterns in major soybean tissues from flower, root, leaf, nodule, seed, and shoot and stem. The RNA-Seq data generated in the project serve as a valuable resource for discovering tissue-specific transcriptome behavior of soybean genes in different tissues. We developed a computational pipeline for Soybean context-specific network (SoyCSN) inference with a suite of prediction tools to analyze, annotate, retrieve, and visualize soybean context-specific networks at both transcriptome and interactome levels. BicMix and Cross-Conditions Cluster Detection algorithms were applied to detect modules based on co-expression relationships across all the tissues. Soybean context-specific interactomes were predicted by combining soybean tissue gene expression and protein-protein interaction data. Functional analyses of these predicted networks provide insights into soybean tissue specificities. For example, under symbiotic, nitrogen-fixing conditions, the constructed soybean leaf network highlights the connection between the photosynthesis function and rhizobium-legume symbiosis. SoyCSN data and all its results are publicly available via an interactive web service within the Soybean Knowledge Base (SoyKB) at http://soykb.org/SoyCSN. SoyCSN provides a useful web-based access for exploring context specificities systematically in gene regulatory mechanisms and gene relationships for soybean researchers and molecular breeders.
Affiliation(s)
- Juexin Wang
- Department of Electrical Engineering and Computer Science, University of Missouri, St. Louis, MO, USA
- Christopher S. Bond Life Sciences Center, University of Missouri, St. Louis, MO, USA
- Md Shakhawat Hossain
- Christopher S. Bond Life Sciences Center, University of Missouri, St. Louis, MO, USA
- Divisions of Plant Science and Biochemistry, University of Missouri, St. Louis, MO, USA
- Zhen Lyu
- Department of Electrical Engineering and Computer Science, University of Missouri, St. Louis, MO, USA
- Jeremy Schmutz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- DOE Joint Genome Institute, Walnut Creek, CA, USA
- Gary Stacey
- Christopher S. Bond Life Sciences Center, University of Missouri, St. Louis, MO, USA
- Divisions of Plant Science and Biochemistry, University of Missouri, St. Louis, MO, USA
- Dong Xu
- Department of Electrical Engineering and Computer Science, University of Missouri, St. Louis, MO, USA
- Christopher S. Bond Life Sciences Center, University of Missouri, St. Louis, MO, USA
- Informatics Institute, University of Missouri, St. Louis, MO, USA
- Trupti Joshi
- Christopher S. Bond Life Sciences Center, University of Missouri, St. Louis, MO, USA
- Informatics Institute, University of Missouri, St. Louis, MO, USA
- Department of Health Management and Informatics and Office of Research, School of Medicine, University of Missouri, St. Louis, MO, USA
|
39
|
Kaczmarczyk L, Bansal V, Rajput A, Rahman RU, Krzyżak W, Degen J, Poll S, Fuhrmann M, Bonn S, Jackson WS. Tagger-A Swiss army knife for multiomics to dissect cell type-specific mechanisms of gene expression in mice. PLoS Biol 2019; 17:e3000374. [PMID: 31393866 PMCID: PMC6701817 DOI: 10.1371/journal.pbio.3000374] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/13/2019] [Revised: 08/20/2019] [Accepted: 07/17/2019] [Indexed: 12/22/2022] Open
Abstract
A deep understanding of how regulation of the multiple levels of gene expression in mammalian tissues gives rise to complex phenotypes has been impeded by cellular diversity. A handful of techniques were developed to tag-select nucleic acids of interest in specific cell types, thereby enabling their capture. We expanded this strategy by developing the Tagger knock-in mouse line bearing a quad-cistronic transgene combining enrichment tools for nuclei, nascent RNA, translating mRNA, and mature microRNA (miRNA). We demonstrate that Tagger can capture the desired nucleic acids, enabling multiple omics approaches to be applied to specific cell types in vivo using a single transgenic mouse line. This Methods and Resources paper describes Tagger, a knock-in mouse line bearing a quad-cistronic transgene that enables the capture of translating mRNAs, mature miRNAs, pulse-labeled total RNA, and the nucleus, all from specific cells of complex tissues.
Affiliation(s)
- Lech Kaczmarczyk
- Wallenberg Center for Molecular Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- German Center for Neurodegenerative Diseases, Bonn, Germany
- Vikas Bansal
- Institute for Medical Systems Biology, Center for Molecular Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Ashish Rajput
- Institute for Medical Systems Biology, Center for Molecular Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Raza-ur Rahman
- Institute for Medical Systems Biology, Center for Molecular Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Wiesław Krzyżak
- Life & Medical Sciences Institute, University of Bonn, Bonn, Germany
- Joachim Degen
- Life & Medical Sciences Institute, University of Bonn, Bonn, Germany
- Stefanie Poll
- German Center for Neurodegenerative Diseases, Bonn, Germany
- Stefan Bonn
- Institute for Medical Systems Biology, Center for Molecular Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- German Center for Neurodegenerative Diseases, Tübingen, Germany
- * E-mail: (SB); (WSJ)
- Walker Scot Jackson
- Wallenberg Center for Molecular Medicine, Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- German Center for Neurodegenerative Diseases, Bonn, Germany
- * E-mail: (SB); (WSJ)
|
40
|
Xiang B, Liu K, Yu M, Liang X, Huang C, Zhang J, He W, Lei W, Chen J, Gu X, Gong K. Systematic genetic analyses of GWAS data reveal an association between the immune system and insomnia. Mol Genet Genomic Med 2019; 7:e00742. [PMID: 31094102 PMCID: PMC6625127 DOI: 10.1002/mgg3.742] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/26/2019] [Revised: 04/18/2019] [Accepted: 04/22/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Previous studies have inferred a strong genetic component for insomnia. However, the etiology of insomnia is still unclear. The aim of the current study was to explore potential biological pathways, gene networks, and brain regions associated with insomnia. METHODS Using pathways (gene sets) from Reactome, we carried out a two-stage gene set enrichment analysis strategy. From a large genome-wide association study (GWAS) of insomnia symptoms (32,155 cases/26,973 controls), significant gene sets were tested for replication in other large GWASs of insomnia complaints (32,384 cases/80,622 controls). After the network analysis of unique genes within the replicated pathways, a gene set analysis for genes in each cluster/module of the Enhancing Neuroimaging Genetics through Meta-Analysis (ENIGMA) GWAS data was performed for the volumes of the intracranial and seven subcortical regions. RESULTS A total of 31 of 1,816 Reactome pathways were identified and showed associations with insomnia risk. In addition, seven functionally and topologically interconnected clusters (Clusters 0-6) and six gene modules (named Yellow, Blue, Brown, Green, Red, and Turquoise) were associated with insomnia. Moreover, significant associations were detected between common variants of the genes in Cluster 2 and hippocampal volume (p = 0.035; family-wise error [FWE] correction) and between the Red module and intracranial volume (p = 0.047; FWE correction). Functional enrichment for genes in Cluster 2 and the Red module revealed the involvement of immune responses, nervous system development, NIK/NF-kappaB signaling, and I-kappaB kinase/NF-kappaB signaling. Core genes (UBC, UBB, and UBA52) in the interconnected functional network were found to be involved in regulating brain development. CONCLUSIONS The current study demonstrates that the immune system and the hippocampus may play central roles in neurodevelopment and insomnia risk.
Affiliation(s)
- Bo Xiang
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Kezhi Liu
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Minglan Yu
- Medical Laboratory Center, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Xuemei Liang
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Chaohua Huang
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Jin Zhang
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Wenying He
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Wei Lei
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Jing Chen
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
- Xiaochu Gu
- Clinical Laboratory, Suzhou Guangji Hospital, Suzhou, Jiangsu Province, China
- Ke Gong
- Department of Psychiatry, Nuclear Medicine and Molecular Imaging Key Laboratory of Sichuan Province, Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan Province, China
|
41
|
Dozmorov MG. Disease classification: from phenotypic similarity to integrative genomics and beyond. Brief Bioinform 2019; 20:1769-1780. [DOI: 10.1093/bib/bby049] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/21/2018] [Revised: 05/01/2018] [Indexed: 02/06/2023] Open
Abstract
A fundamental challenge of modern biomedical research is understanding how diseases that are similar on the phenotypic level are similar on the molecular level. Integration of various genomic data sets with the traditionally used phenotypic disease similarity revealed novel genetic and molecular mechanisms and blurred the distinction between monogenic (Mendelian) and complex diseases. Network-based medicine has emerged as a complementary approach for identifying disease-causing genes, genetic mediators, and disruptions in the underlying cellular functions, and for drug repositioning. The recent development of machine and deep learning methods allows for leveraging real-life information about diseases to refine genetic and phenotypic disease relationships. This review describes the historical development and recent methodological advancements for studying disease classification (nosology).
Affiliation(s)
- Mikhail G Dozmorov
- Department of Biostatistics, Virginia Commonwealth University, 830 East Main Street, Richmond, VA, USA
|
42
|
Latysheva NS, Babu MM. Molecular Signatures of Fusion Proteins in Cancer. ACS Pharmacol Transl Sci 2019; 2:122-133. [PMID: 32219217 PMCID: PMC7088938 DOI: 10.1021/acsptsci.9b00019] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 03/05/2019] [Indexed: 01/07/2023]
Abstract
Although gene fusions are recognized as driver mutations in a wide variety of cancers, the general molecular mechanisms underlying oncogenic fusion proteins are insufficiently understood. Here, we employ large-scale data integration and machine learning and (1) identify three functionally distinct subgroups of gene fusions and their molecular signatures; (2) characterize the cellular pathways rewired by fusion events across different cancers; and (3) analyze the relative importance of over 100 structural, functional, and regulatory features of ∼2200 gene fusions. We report subgroups of fusions that likely act as driver mutations and find that gene fusions disproportionately affect pathways regulating cellular shape and movement. Although fusion proteins are similar across different cancer types, they affect cancer type-specific pathways. Key indicators of fusion-forming proteins include high and nontissue specific expression, numerous splice sites, and higher centrality in protein-interaction networks. Together, these findings provide unifying and cancer type-specific trends across diverse oncogenic fusion proteins.
Affiliation(s)
- Natasha S Latysheva
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
- M Madan Babu
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
|
43
|
Abstract
Recent research increasingly shows the relevance of network-based approaches for our understanding of biological systems. Analyzing human protein interaction networks, we determined collective influencers (CI), defined as network nodes that damage the integrity of the underlying networks to the utmost degree. We found that CI proteins were enriched with essential, regulatory, signaling and disease genes as well as drug targets, indicating their biological significance. Also, by focusing on different organisms, we found that CI proteins tended to be evolutionarily conserved as CI proteins, indicating the fundamental role that collective influencers in protein interaction networks play for our understanding of regulation, diseases and evolution.
|
44
|
Kotlyar M, Pastrello C, Malik Z, Jurisica I. IID 2018 update: context-specific physical protein-protein interactions in human, model organisms and domesticated species. Nucleic Acids Res 2019; 47:D581-D589. [PMID: 30407591 PMCID: PMC6323934 DOI: 10.1093/nar/gky1037] [Citation(s) in RCA: 132] [Impact Index Per Article: 26.4] [Received: 09/15/2018] [Revised: 10/15/2018] [Accepted: 10/28/2018] [Indexed: 12/11/2022] Open
Abstract
Knowing the set of physical protein-protein interactions (PPIs) that occur in a particular context (a tissue, disease, or other condition) can provide valuable insights into key research questions. However, while the number of identified human PPIs is expanding rapidly, context information remains limited, and for most non-human species context-specific networks are completely unavailable. The Integrated Interactions Database (IID) provides one of the most comprehensive sets of context-specific human PPI networks, including networks for 133 tissues, 91 disease conditions, and many other contexts. Importantly, it also provides context-specific networks for 17 non-human species including model organisms and domesticated animals. These species are vitally important for drug discovery and agriculture. IID integrates interactions from multiple databases and datasets. It comprises over 4.8 million PPIs annotated with several types of context: tissues, subcellular localizations, diseases, and druggability information (the latter three are new annotations not available in the previous version). This update increases the number of species from 6 to 18, the number of PPIs from ∼1.5 million to ∼4.8 million, and the number of tissues from 30 to 133. IID also now supports topology and enrichment analyses of returned networks. IID is available at http://ophid.utoronto.ca/iid.
Affiliation(s)
- Max Kotlyar
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Chiara Pastrello
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Zara Malik
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Igor Jurisica
- Krembil Research Institute, University Health Network, Toronto, ON M5T 0S8, Canada
- Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON M5S 1A4, Canada
- Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
|
45
|
Pan X, Jensen LJ, Gorodkin J. Inferring disease-associated long non-coding RNAs using genome-wide tissue expression profiles. Bioinformatics 2018; 35:1494-1502. [DOI: 10.1093/bioinformatics/bty859] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/30/2018] [Revised: 08/28/2018] [Accepted: 10/04/2018] [Indexed: 11/13/2022] Open
Affiliation(s)
- Xiaoyong Pan
- Department of Veterinary and Animal Sciences, Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg C, Denmark
- Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen N, Denmark
- Lars Juhl Jensen
- Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen N, Denmark
- Jan Gorodkin
- Department of Veterinary and Animal Sciences, Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg C, Denmark
|
46
|
Siahpirani AF, Roy S. A prior-based integrative framework for functional transcriptional regulatory network inference. Nucleic Acids Res 2018; 45:e21. [PMID: 27794550 PMCID: PMC5389674 DOI: 10.1093/nar/gkw963] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 11/29/2015] [Accepted: 10/12/2016] [Indexed: 12/16/2022] Open
Abstract
Transcriptional regulatory networks specify regulatory proteins controlling the context-specific expression levels of genes. Inference of genome-wide regulatory networks is central to understanding gene regulation, but remains an open challenge. Expression-based network inference is among the most popular methods to infer regulatory networks; however, networks inferred from such methods have low overlap with experimentally derived (e.g. ChIP-chip and transcription factor (TF) knockouts) networks. Currently we have a limited understanding of this discrepancy. To address this gap, we first develop a regulatory network inference algorithm, based on probabilistic graphical models, to integrate expression with auxiliary datasets supporting a regulatory edge. Second, we comprehensively analyze our and other state-of-the-art methods on different expression perturbation datasets. Networks inferred by integrating sequence-specific motifs with expression have substantially greater agreement with experimentally derived networks, while remaining more predictive of expression than motif-based networks. Our analysis suggests natural genetic variation as the most informative perturbation for network inference, and identifies core TFs whose targets are predictable from expression. Multiple reasons make the identification of targets of other TFs difficult, including network architecture and insufficient variation of TF mRNA level. Finally, we demonstrate the utility of our inference algorithm to infer stress-specific regulatory networks and for regulator prioritization.
Affiliation(s)
- Alireza F Siahpirani
- Department of Computer Sciences, University of Wisconsin-Madison, 1210 W. Dayton St. Madison, WI 53706-1613, USA
- Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Discovery Building 330 North Orchard St. Madison, WI 53715, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, K6/446 Clinical Sciences Center 600 Highland Avenue Madison, WI 53792-4675, USA
|
47
|
Enabling Precision Medicine through Integrative Network Models. J Mol Biol 2018; 430:2913-2923. [DOI: 10.1016/j.jmb.2018.07.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/01/2018] [Revised: 06/15/2018] [Accepted: 07/03/2018] [Indexed: 11/17/2022]
|
48
|
Abstract
Motivation Understanding functions of proteins in specific human tissues is essential for insights into disease diagnostics and therapeutics, yet prediction of tissue-specific cellular function remains a critical challenge for biomedicine. Results Here, we present OhmNet, a hierarchy-aware unsupervised node feature learning approach for multi-layer networks. We build a multi-layer network, where each layer represents molecular interactions in a different human tissue. OhmNet then automatically learns a mapping of proteins, represented as nodes, to a neural embedding-based low-dimensional space of features. OhmNet encourages sharing of similar features among proteins with similar network neighborhoods and among proteins activated in similar tissues. The algorithm generalizes prior work, which generally ignores relationships between tissues, by modeling tissue organization with a rich multiscale tissue hierarchy. We use OhmNet to study multicellular function in a multi-layer protein interaction network of 107 human tissues. In 48 tissues with known tissue-specific cellular functions, OhmNet provides more accurate predictions of cellular function than alternative approaches, and also generates more accurate hypotheses about tissue-specific protein actions. We show that taking into account the tissue hierarchy leads to improved predictive power. Remarkably, we also demonstrate that it is possible to leverage the tissue hierarchy in order to effectively transfer cellular functions to a functionally uncharacterized tissue. Overall, OhmNet moves from flat networks to multiscale models able to predict a range of phenotypes spanning cellular subsystems. Availability and implementation Source code and datasets are available at http://snap.stanford.edu/ohmnet.
Affiliation(s)
- Marinka Zitnik
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Jure Leskovec
- Department of Computer Science, Stanford University, Stanford, CA, USA
|
49
|
Rouillard AD, Hurle MR, Agarwal P. Systematic interrogation of diverse Omic data reveals interpretable, robust, and generalizable transcriptomic features of clinically successful therapeutic targets. PLoS Comput Biol 2018; 14:e1006142. [PMID: 29782487 PMCID: PMC5983857 DOI: 10.1371/journal.pcbi.1006142] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 11/21/2017] [Revised: 06/01/2018] [Accepted: 04/13/2018] [Indexed: 11/19/2022] Open
Abstract
Target selection is the first and pivotal step in drug discovery. An incorrect choice may not manifest itself for many years after hundreds of millions of research dollars have been spent. We collected a set of 332 targets that succeeded or failed in phase III clinical trials, and explored whether Omic features describing the target genes could predict clinical success. We obtained features from the recently published comprehensive resource: Harmonizome. Nineteen features appeared to be significantly correlated with phase III clinical trial outcomes, but only 4 passed validation schemes that used bootstrapping or modified permutation tests to assess feature robustness and generalizability while accounting for target class selection bias. We also used classifiers to perform multivariate feature selection and found that classifiers with a single feature performed as well in cross-validation as classifiers with more features (AUROC = 0.57 and AUPR = 0.81). The two predominantly selected features were mean mRNA expression across tissues and standard deviation of expression across tissues, where successful targets tended to have lower mean expression and higher expression variance than failed targets. This finding supports the conventional wisdom that it is favorable for a target to be present in the tissue(s) affected by a disease and absent from other tissues. Overall, our results suggest that it is feasible to construct a model integrating interpretable target features to inform target selection. We anticipate deeper insights and better models in the future, as researchers can reuse the data we have provided to improve methods for handling sample biases and learn more informative features. Code, documentation, and data for this study have been deposited on GitHub at https://github.com/arouillard/omic-features-successful-targets.
Collapse
Affiliation(s)
| | - Mark R. Hurle
- Computational Biology, GSK, Collegeville, PA, United States of America
- Pankaj Agarwal
- Computational Biology, GSK, Collegeville, PA, United States of America
50
Verma SS, Ritchie MD. Another Round of "Clue" to Uncover the Mystery of Complex Traits. Genes (Basel) 2018; 9:E61. [PMID: 29370075 PMCID: PMC5852557 DOI: 10.3390/genes9020061] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/01/2017] [Revised: 12/19/2017] [Accepted: 01/15/2018] [Indexed: 12/13/2022]
Abstract
A plethora of genetic association analyses have identified several genetic risk loci. Technological and statistical advancements have now led to the identification of not only common genetic variants, but also low-frequency variants, structural variants, and environmental factors, as well as multi-omics variations that affect the phenotypic variance of complex traits in a population, thus referred to as complex trait architecture. The concept of heritability, or the proportion of phenotypic variance due to genetic inheritance, has been studied for several decades, but its application is mainly in addressing the narrow sense heritability (or additive genetic component) from Genome-Wide Association Studies (GWAS). In this commentary, we reflect on our perspective on the complexity of understanding heritability for human traits in comparison to model organisms, highlighting another round of clues beyond GWAS and an alternative approach, investigating these clues comprehensively to help in elucidating the genetic architecture of complex traits.
Affiliation(s)
- Shefali Setia Verma
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
- Marylyn D Ritchie
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.