1
Bukharina TA, Golubyatnikov VP, Furman DP. The central regulatory circuit in the gene network controlling the morphogenesis of Drosophila mechanoreceptors: an in silico analysis. Vavilovskii Zhurnal Genet Selektsii 2023; 27:746-754. [PMID: 38213705 PMCID: PMC10777295 DOI: 10.18699/vjgb-23-87] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 09/20/2023] [Accepted: 09/25/2023] [Indexed: 01/13/2024]
Abstract
Identification of the mechanisms underlying the genetic control of spatial structure formation is among the relevant tasks of developmental biology. Both experimental and theoretical approaches and methods are used for this purpose, including gene network methodology, as well as mathematical and computer modeling. Reconstruction and analysis of the gene networks that provide the formation of traits allow us to integrate the existing experimental data and to identify the key links and intra-network connections that ensure the function of networks. Mathematical and computer modeling is used to obtain the dynamic characteristics of the studied systems and to predict their state and behavior. An example of the spatial morphological structure is the Drosophila bristle pattern with a strictly defined arrangement of its components - mechanoreceptors (external sensory organs) - on the head and body. The mechanoreceptor develops from a single sensory organ parental cell (SOPC), which is isolated from the ectoderm cells of the imaginal disk. It is distinguished from its surroundings by the highest content of proneural proteins (ASC), the products of the achaete-scute proneural gene complex (AS-C). The SOPC status is determined by the gene network we previously reconstructed and the AS-C is the key component of this network. AS-C activity is controlled by its subnetwork - the central regulatory circuit (CRC) comprising seven genes: AS-C, hairy, senseless (sens), charlatan (chn), scratch (scrt), phyllopod (phyl), and extramacrochaete (emc), as well as their respective proteins. In addition, the CRC includes the accessory proteins Daughterless (DA), Groucho (GRO), Ubiquitin (UB), and Seven-in-absentia (SINA). The paper describes the results of computer modeling of different CRC operation modes. As is shown, a cell is determined as an SOPC when the ASC content increases approximately 2.5-fold relative to the level in the surrounding cells. 
The hierarchy of the effects of mutations in the CRC genes on the dynamics of ASC protein accumulation is clarified. AS-C as the main CRC component is the most significant. The mutations that decrease the ASC content by more than 40 % lead to the prohibition of SOPC segregation.
Affiliation(s)
- T A Bukharina
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Novosibirsk State University, Novosibirsk, Russia
- V P Golubyatnikov
  - Sobolev Institute of Mathematics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- D P Furman
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Novosibirsk State University, Novosibirsk, Russia
2
El-Kafrawy SA, El-Daly MM, Bajrai LH, Alandijany TA, Faizo AA, Mobashir M, Ahmed SS, Ahmed S, Alam S, Jeet R, Kamal MA, Anwer ST, Khan B, Tashkandi M, Rizvi MA, Azhar EI. Genomic profiling and network-level understanding uncover the potential genes and the pathways in hepatocellular carcinoma. Front Genet 2022; 13:880440. [PMID: 36479247 PMCID: PMC9720179 DOI: 10.3389/fgene.2022.880440] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/21/2022] [Accepted: 11/02/2022] [Indexed: 12/11/2023]
Abstract
Data integration with phenotypes such as gene expression, pathways or function, and protein-protein interactions data has proven to be a highly promising technique for improving human complex diseases, particularly cancer patient outcome prediction. Hepatocellular carcinoma is one of the most prevalent cancers, and the most common cause is chronic HBV and HCV infection, which is linked to the majority of cases, and HBV and HCV play a role in multistep carcinogenesis progression. We examined the list of known hepatocellular carcinoma biomarkers with the publicly available expression profile dataset of hepatocellular carcinoma infected with HCV from day 1 to day 10 in this study. The study covers an overexpression pattern for the selected biomarkers in clinical hepatocellular carcinoma patients, a combined investigation of these biomarkers with the gathered temporal dataset, temporal expression profiling changes, and temporal pathway enrichment following HCV infection. Following a temporal analysis, it was discovered that the early stages of HCV infection tend to be more harmful in terms of expression shifting patterns, and that there is no significant change after that, followed by a set of genes that are consistently altered. PI3K, cAMP, TGF, TNF, Rap1, NF-kB, Apoptosis, Longevity regulating pathway, signaling pathways regulating pluripotency of stem cells, Cytokine-cytokine receptor interaction, p53 signaling, Wnt signaling, Toll-like receptor signaling, and Hippo signaling pathways are just a few of the most commonly enriched pathways. The majority of these pathways are well-known for their roles in the immune system, infection and inflammation, and human illnesses like cancer. We also find that ADCY8, MYC, PTK2, CTNNB1, TP53, RB1, PRKCA, TCF7L2, PAK1, ITPR2, CYP3A4, UGT1A6, GCK, and FGFR2/3 appear to be among the prominent genes based on the networks of genes and pathways based on the copy number alterations, mutations, and structural variants study.
Affiliation(s)
- Sherif A. El-Kafrawy
  - Special Infectious Agents Unit-BSL3, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Mai M. El-Daly
  - Special Infectious Agents Unit-BSL3, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Leena H. Bajrai
  - Special Infectious Agents Unit-BSL3, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, Saudi Arabia
  - Biochemistry Department, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Thamir A. Alandijany
  - Special Infectious Agents Unit-BSL3, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Arwa A. Faizo
  - Special Infectious Agents Unit-BSL3, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Mohammad Mobashir
  - Department of Microbiology, Tumor and Cell Biology (MTC), Karolinska Institute, Stockholm, Sweden
  - Genome Biology Lab, Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Sunbul S. Ahmed
  - Genome Biology Lab, Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Sarfraz Ahmed
  - Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Shoaib Alam
  - Department of Biotechnology, Jamia Millia Islamia, New Delhi, India
- Raja Jeet
  - Botany Department, Ganesh Dutt College, Begusarai, Bihar, India
- Mohammad Amjad Kamal
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
  - King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, Bangladesh
  - Enzymoics, Hebersham, NSW, Australia
  - Novel Global Community Educational Foundation, Hebersham, NSW, Australia
- Syed Tauqeer Anwer
  - Genome Biology Lab, Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Bushra Khan
  - Genome Biology Lab, Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Manal Tashkandi
  - Department of Biochemistry, College of Science, University of Jeddah, Jeddah, Saudi Arabia
- Moshahid A. Rizvi
  - Genome Biology Lab, Department of Biosciences, Jamia Millia Islamia, New Delhi, India
- Esam Ibraheem Azhar
  - Special Infectious Agents Unit-BSL3, King Fahd Medical Research Centre, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
3
Identifying large scale interaction atlases using probabilistic graphs and external knowledge. J Clin Transl Sci 2022; 6:e27. [PMID: 35321220 PMCID: PMC8922291 DOI: 10.1017/cts.2022.18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Revised: 12/29/2021] [Accepted: 02/07/2022] [Indexed: 11/17/2022]
Abstract
Introduction: Reconstruction of gene interaction networks from experimental data provides a deep understanding of the underlying biological mechanisms. The noisy nature of the data and the large size of the network make this a very challenging task. Complex approaches handle the stochastic nature of the data but can only do this for small networks; simpler, linear models generate large networks but with less reliability. Methods: We propose a divide-and-conquer approach using probabilistic graph representations and external knowledge. We cluster the experimental data and learn an interaction network for each cluster, which are merged using the interaction network for the representative genes selected for each cluster. Results: We generated an interaction atlas for 337 human pathways yielding a network of 11,454 genes with 17,777 edges. Simulated gene expression data from this atlas formed the basis for reconstruction. Based on the area under the curve of the precision-recall curve, the proposed approach outperformed the baseline (random classifier) by ∼15-fold and conventional methods by ∼5–17-fold. The performance of the proposed workflow is significantly linked to the accuracy of the clustering step that tries to identify the modularity of the underlying biological mechanisms. Conclusions: We provide an interaction atlas generation workflow optimizing the algorithm/parameter selection. The proposed approach integrates external knowledge in the reconstruction of the interactome using probabilistic graphs. Network characterization and understanding long-range effects in interaction atlases provide means for comparative analysis with implications in biomarker discovery and therapeutic approaches. The proposed workflow is freely available at http://otulab.unl.edu/atlas.
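The abstract above evaluates reconstruction by the area under the precision-recall curve. As a rough illustration only, the sketch below computes that area in its average-precision form for a ranked list of candidate edges. The function name, the toy scores, and the choice of average precision (rather than another way of integrating the curve) are assumptions and are not taken from the paper. For a random ranking, the expected value is approximately the fraction of candidate edges that are true edges, which is why a random classifier serves as the baseline.

```python
def average_precision(scores, labels):
    """Area under the precision-recall curve in its average-precision form.

    scores: predicted confidence for each candidate edge (higher = more likely).
    labels: 1 if the candidate edge is in the reference network, else 0.
    """
    n_true = sum(labels)
    if n_true == 0:
        raise ValueError("need at least one true edge to compute precision-recall")
    # Rank candidate edges from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_true_edge) in enumerate(ranked, start=1):
        if is_true_edge:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / n_true


# Hypothetical toy example: three candidate edges, two of them real.
toy_scores = [0.9, 0.8, 0.1]
toy_labels = [1, 0, 1]
print(average_precision(toy_scores, toy_labels))
```

Ties between scores are broken arbitrarily here; a careful evaluation would handle them explicitly.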
4
Emmert-Streib F. Grand Challenges for Artificial Intelligence in Molecular Medicine. Frontiers in Molecular Medicine 2021; 1:734659. [PMID: 39087080 PMCID: PMC11285658 DOI: 10.3389/fmmed.2021.734659] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Accepted: 07/08/2021] [Indexed: 08/02/2024]
Affiliation(s)
- Frank Emmert-Streib
  - Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
  - Institute of Biosciences and Medical Technology, Tampere, Finland
5
Sudhakar P, Machiels K, Verstockt B, Korcsmaros T, Vermeire S. Computational Biology and Machine Learning Approaches to Understand Mechanistic Microbiome-Host Interactions. Front Microbiol 2021; 12:618856. [PMID: 34046017 PMCID: PMC8148342 DOI: 10.3389/fmicb.2021.618856] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/18/2020] [Accepted: 03/19/2021] [Indexed: 12/11/2022]
Abstract
The microbiome, by virtue of its interactions with the host, is implicated in various host functions including its influence on nutrition and homeostasis. Many chronic diseases such as diabetes, cancer, inflammatory bowel diseases are characterized by a disruption of microbial communities in at least one biological niche/organ system. Various molecular mechanisms between microbial and host components such as proteins, RNAs, metabolites have recently been identified, thus filling many gaps in our understanding of how the microbiome modulates host processes. Concurrently, high-throughput technologies have enabled the profiling of heterogeneous datasets capturing community level changes in the microbiome as well as the host responses. However, due to limitations in parallel sampling and analytical procedures, big gaps still exist in terms of how the microbiome mechanistically influences host functions at a system and community level. In the past decade, computational biology and machine learning methodologies have been developed with the aim of filling the existing gaps. Due to the agnostic nature of the tools, they have been applied in diverse disease contexts to analyze and infer the interactions between the microbiome and host molecular components. Some of these approaches allow the identification and analysis of affected downstream host processes. Most of the tools statistically or mechanistically integrate different types of -omic and meta -omic datasets followed by functional/biological interpretation. In this review, we provide an overview of the landscape of computational approaches for investigating mechanistic interactions between individual microbes/microbiome and the host and the opportunities for basic and clinical research. These could include but are not limited to the development of activity- and mechanism-based biomarkers, uncovering mechanisms for therapeutic interventions and generating integrated signatures to stratify patients.
Affiliation(s)
- Padhmanand Sudhakar
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
  - Earlham Institute, Norwich, United Kingdom
  - Quadram Institute Bioscience, Norwich, United Kingdom
- Kathleen Machiels
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Bram Verstockt
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
  - Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
- Tamas Korcsmaros
  - Earlham Institute, Norwich, United Kingdom
  - Quadram Institute Bioscience, Norwich, United Kingdom
- Séverine Vermeire
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
  - Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
6
Manjang K, Tripathi S, Yli-Harja O, Dehmer M, Emmert-Streib F. Graph-based exploitation of gene ontology using GOxploreR for scrutinizing biological significance. Sci Rep 2020; 10:16672. [PMID: 33028846 PMCID: PMC7542435 DOI: 10.1038/s41598-020-73326-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/20/2020] [Accepted: 08/17/2020] [Indexed: 12/12/2022]
Abstract
Gene ontology (GO) is an eminent knowledge base frequently used for providing biological interpretations for the analysis of genes or gene sets from biological, medical and clinical problems. Unfortunately, the interpretation of such results is challenging due to the large number of GO terms, their hierarchical and connected organization as directed acyclic graphs (DAGs) and the lack of tools allowing to exploit this structural information explicitly. For this reason, we developed the R package GOxploreR. The main features of GOxploreR are (I) easy and direct access to structural features of GO, (II) structure-based ranking of GO-terms, (III) mapping to reduced GO-DAGs including visualization capabilities and (IV) prioritizing of GO-terms. The underlying idea of GOxploreR is to exploit a graph-theoretical perspective of GO as manifested by its DAG-structure and the containing hierarchy levels for cumulating semantic information. That means all these features enhance the utilization of structural information of GO and complement existing analysis tools. Overall, GOxploreR provides exploratory as well as confirmatory tools for complementing any kind of analysis resulting in a list of GO-terms, e.g., from differentially expressed genes or gene sets, GWAS or biomarkers. Our R package GOxploreR is freely available from CRAN.
Affiliation(s)
- Kalifa Manjang
  - Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Shailesh Tripathi
  - Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Olli Yli-Harja
  - Computational Systems Biology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
  - Institute for Systems Biology, Seattle, WA, USA
  - Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Matthias Dehmer
  - Department of Biomedical Computer Science and Mechatronics, UMIT-The Health and Life Science University, 6060, Hall in Tyrol, Austria
  - College of Artificial Intelligence, Nankai University, Tianjin, 300350, China
- Frank Emmert-Streib
  - Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
  - Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
7
Chen ZH, You ZH, Guo ZH, Yi HC, Luo GX, Wang YB. Prediction of Drug-Target Interactions From Multi-Molecular Network Based on Deep Walk Embedding Model. Front Bioeng Biotechnol 2020; 8:338. [PMID: 32582646 PMCID: PMC7283956 DOI: 10.3389/fbioe.2020.00338] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 01/20/2020] [Accepted: 03/26/2020] [Indexed: 12/16/2022]
Abstract
Predicting drug-target interactions (DTIs) is crucial in innovative drug discovery, drug repositioning and other fields. However, there are many shortcomings for predicting DTIs using traditional biological experimental methods, such as the high-cost, time-consumption, low efficiency, and so on, which make these methods difficult to widely apply. As a supplement, the in silico method can provide helpful information for predictions of DTIs in a timely manner. In this work, a deep walk embedding method is developed for predicting DTIs from a multi-molecular network. More specifically, a multi-molecular network, also called molecular associations network, is constructed by integrating the associations among drug, protein, disease, lncRNA, and miRNA. Then, each node can be represented as a behavior feature vector by using a deep walk embedding method. Finally, we compared behavior features with traditional attribute features on an integrated dataset by using various classifiers. The experimental results revealed that the behavior feature could be performed better on different classifiers, especially on the random forest classifier. It is also demonstrated that the use of behavior information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work is not only extremely suitable for predicting DTIs, but also provides a new perspective for the prediction of other biomolecules' associations.
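For orientation, DeepWalk-style embedding starts by sampling truncated random walks over the network and then treats each walk as a "sentence" for a skip-gram model. The sketch below illustrates only the walk-sampling step on a tiny heterogeneous network. The node names, parameter values, and function name are hypothetical, the skip-gram training step is omitted, and nothing here is taken from the authors' code.

```python
import random


def generate_random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Sample truncated uniform random walks starting from every node."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # isolated node: the walk cannot continue
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical undirected toy network mixing the five node types named in the abstract.
toy_network = {
    "drug:D1": ["protein:P1", "disease:S1"],
    "protein:P1": ["drug:D1", "miRNA:M1"],
    "disease:S1": ["drug:D1", "lncRNA:L1"],
    "miRNA:M1": ["protein:P1", "lncRNA:L1"],
    "lncRNA:L1": ["disease:S1", "miRNA:M1"],
}

walks = generate_random_walks(toy_network, walk_length=6, walks_per_node=3)
print(walks[0])
```

In a full pipeline, the walks would be passed to a skip-gram implementation to obtain one vector per node, and those vectors would play the role of the "behavior features" that the abstract feeds to the classifiers.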
Affiliation(s)
- Zhan-Heng Chen
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhu-Hong You
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhen-Hao Guo
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
  - University of Chinese Academy of Sciences, Beijing, China
- Hai-Cheng Yi
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
  - University of Chinese Academy of Sciences, Beijing, China
- Gong-Xu Luo
  - The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yan-Bin Wang
  - School of Cyber Science and Technology, Zhejiang University, Hangzhou, China
8
Ni P, Wang J, Zhong P, Li Y, Wu FX, Pan Y. Constructing Disease Similarity Networks Based on Disease Module Theory. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:906-915. [PMID: 29993782 DOI: 10.1109/tcbb.2018.2817624] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 06/08/2023]
Abstract
Quantifying the associations between diseases now plays an important role in modern biology and medicine. Discovering associations between diseases could help us gain deeper insights into the pathogenic mechanisms of complex diseases and thus could lead to improvements in disease diagnosis, drug repositioning, and drug development. Due to the growing body of high-throughput biological data, a number of methods have been developed for computing similarity between diseases during the past decade. However, these methods rarely consider the interconnections of genes related to each disease in the protein-protein interaction network (PPIN). Recently, the disease module theory has been proposed, which states that disease-related genes or proteins tend to interact with each other in the same neighborhood of a PPIN. In this study, we propose a new method called ModuleSim to measure associations between diseases by using disease-gene association data and PPIN data based on disease module theory. The experimental results show that by considering the interactions between disease modules and their modularity, the disease similarity calculated by ModuleSim has a significant correlation with the disease classification of Disease Ontology (DO). Furthermore, ModuleSim outperforms four other popular methods, all of which use disease-gene association data and PPIN data to measure disease-disease associations. In addition, the disease similarity network constructed by ModuleSim suggests that ModuleSim is capable of finding potential associations between diseases.
9
Abstract
BACKGROUND A collection of disease-associated data contributes to study the association between diseases. Discovering closely related diseases plays a crucial role in revealing their common pathogenic mechanisms. This might further imply treatment that can be appropriated from one disease to another. During the past decades, a number of approaches for calculating disease similarity have been developed. However, most of them are designed to take advantage of single or few data sources, which results in their low accuracy. METHODS In this paper, we propose a novel method, called MultiSourcDSim, to calculate disease similarity by integrating multiple data sources, namely, gene-disease associations, GO biological process-disease associations and symptom-disease associations. Firstly, we establish three disease similarity networks according to the three disease-related data sources respectively. Secondly, the representation of each node is obtained by integrating the three small disease similarity networks. In the end, the learned representations are applied to calculate the similarity between diseases. RESULTS Our approach shows the best performance compared to the other three popular methods. Besides, the similarity network built by MultiSourcDSim suggests that our method can also uncover the latent relationships between diseases. CONCLUSIONS MultiSourcDSim is an efficient approach to predict similarity between diseases.
Affiliation(s)
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha, 410075 China
- Danyi Ye
  - School of Computer Science and Engineering, Central South University, Changsha, 410075 China
- Junmin Zhao
  - School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, 467000 China
- Jingpu Zhang
  - School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, 467000 China
10
Smolander J, Dehmer M, Emmert-Streib F. Comparing deep belief networks with support vector machines for classifying gene expression data from complex disorders. FEBS Open Bio 2019; 9:1232-1248. [PMID: 31074948 PMCID: PMC6609581 DOI: 10.1002/2211-5463.12652] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/12/2019] [Revised: 04/25/2019] [Accepted: 05/08/2019] [Indexed: 12/24/2022]
Abstract
Genomics data provide great opportunities for translational research and the clinical practice, for example, for predicting disease stages. However, the classification of such data is a challenging task due to their high dimensionality, noise, and heterogeneity. In recent years, deep learning classifiers generated much interest, but due to their complexity, so far, little is known about the utility of this method for genomics. In this paper, we address this problem by studying a computational diagnostics task by classification of breast cancer and inflammatory bowel disease patients based on high‐dimensional gene expression data. We provide a comprehensive analysis of the classification performance of deep belief networks (DBNs) in dependence on its multiple model parameters and in comparison with support vector machines (SVMs). Furthermore, we investigate combined classifiers that integrate DBNs with SVMs. Such a classifier utilizes a DBN as representation learner forming the input for a SVM. Overall, our results provide guidelines for the complex usage of DBN for classifying gene expression data from complex diseases.
Affiliation(s)
- Johannes Smolander
  - Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Finland
  - Turku Centre for Biotechnology, University of Turku, Finland
- Matthias Dehmer
  - Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Steyr, Austria
  - Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria
  - College of Computer and Control Engineering, Nankai University, Tianjin, China
- Frank Emmert-Streib
  - Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Finland
  - Institute of Biosciences and Medical Technology, Tampere, Finland
11
Musa A, Tripathi S, Dehmer M, Yli-Harja O, Kauffman SA, Emmert-Streib F. Systems Pharmacogenomic Landscape of Drug Similarities from LINCS data: Drug Association Networks. Sci Rep 2019; 9:7849. [PMID: 31127155 PMCID: PMC6534546 DOI: 10.1038/s41598-019-44291-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/12/2018] [Accepted: 05/08/2019] [Indexed: 02/01/2023]
Abstract
Modern research in the biomedical sciences is data-driven utilizing high-throughput technologies to generate big genomic data. The Library of Integrated Network-based Cellular Signatures (LINCS) is an example for a large-scale genomic data repository providing hundred thousands of high-dimensional gene expression measurements for thousands of drugs and dozens of cell lines. However, the remaining challenge is how to use these data effectively for pharmacogenomics. In this paper, we use LINCS data to construct drug association networks (DANs) representing the relationships between drugs. By using the Anatomical Therapeutic Chemical (ATC) classification of drugs we demonstrate that the DANs represent a systems pharmacogenomic landscape of drugs summarizing the entire LINCS repository on a genomic scale meaningfully. Here we identify the modules of the DANs as therapeutic attractors of the ATC drug classes.
Affiliation(s)
- Aliyu Musa
  - Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
  - Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Shailesh Tripathi
  - Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
  - Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Wehrgrabengasse 1-3, 4400, Steyr, Austria
- Matthias Dehmer
  - Department for Biomedical Computer Science and Mechatronics, UMIT - The Health and Lifesciences University, Eduard Wallnoefer Zentrum 1, 6060, Hall in Tyrol, Austria
  - College of Computer and Control Engineering, Nankai University, Tianjin, 300350, P.R. China
  - Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Wehrgrabengasse 1-3, 4400, Steyr, Austria
- Olli Yli-Harja
  - Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
  - Computational Systems Biology Lab, Tampere University of Technology, Korkeakoulunkatu 10, 33720, Tampere, Finland
  - Institute for Systems Biology, Seattle, WA, 98109, USA
- Frank Emmert-Streib
  - Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
  - Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
12
Muetze T, Lynn DJ. Using the Contextual Hub Analysis Tool (CHAT) in Cytoscape to Identify Contextually Relevant Network Hubs. Curr Protoc Bioinformatics 2018; 59:8.24.1-8.24.13. [DOI: 10.1002/cpbi.35] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
Affiliation(s)
- Tanja Muetze
- EMBL Australia Biomedical Informatics Group, Infection & Immunity Theme, South Australian Health and Medical Research Institute, North Terrace, Adelaide, Australia
- David J. Lynn
- EMBL Australia Biomedical Informatics Group, Infection & Immunity Theme, South Australian Health and Medical Research Institute, North Terrace, Adelaide, Australia
- School of Medicine, Flinders University, Bedford Park, Australia
13
Tripathi S, Lloyd-Price J, Ribeiro A, Yli-Harja O, Dehmer M, Emmert-Streib F. sgnesR: An R package for simulating gene expression data from an underlying real gene network structure considering delay parameters. BMC Bioinformatics 2017; 18:325. [PMID: 28676075 PMCID: PMC5496254 DOI: 10.1186/s12859-017-1731-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 05/23/2016] [Accepted: 06/15/2017] [Indexed: 01/04/2023] Open
Abstract
Background: sgnesR (Stochastic Gene Network Expression Simulator in R) is an R package that provides an interface to simulate gene expression data from a given gene network using the stochastic simulation algorithm (SSA). The package offers various options for delay parameters, which can easily be included in reactions for promoter delay, RNA delay and protein delay. A user can tune these parameters to model various types of reactions within a cell. As examples, we present two network models to generate expression profiles. We also demonstrate the inference of networks and the evaluation of association measures of edge and non-edge components from the generated expression profiles. Results: The purpose of sgnesR is to enable an easy-to-use and quick implementation for generating realistic gene expression data from biologically relevant networks that can be user selected. Conclusions: sgnesR is freely available for academic use. The R package has been tested for R 3.2.0 under Linux, Windows and Mac OS X. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1731-8) contains supplementary material, which is available to authorized users.
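For readers unfamiliar with the stochastic simulation algorithm mentioned in this abstract, the following is a minimal Python sketch of the direct-method SSA for a single gene (transcription, translation, and decay of mRNA and protein). It is not the sgnesR interface and it omits the delay parameters the package supports; the function name `gillespie` and all rate constants are hypothetical values chosen only for illustration.

```python
import random


def gillespie(k_tx=2.0, k_tl=5.0, d_m=1.0, d_p=0.5, t_end=20.0, seed=0):
    """Direct-method SSA for one gene: DNA -> mRNA -> protein, with decay.

    All rate constants are illustrative assumptions, not values from sgnesR.
    Returns a list of (time, mRNA count, protein count) tuples.
    """
    rng = random.Random(seed)
    t, mrna, prot = 0.0, 0, 0
    trajectory = [(t, mrna, prot)]
    while True:
        # Propensities of the four reactions in the current state.
        rates = [k_tx, k_tl * mrna, d_m * mrna, d_p * prot]
        total = sum(rates)
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose the reaction to fire with probability proportional to its rate.
        r = rng.random() * total
        if r < rates[0]:
            mrna += 1          # transcription
        elif r < rates[0] + rates[1]:
            prot += 1          # translation
        elif r < rates[0] + rates[1] + rates[2]:
            mrna -= 1          # mRNA decay
        else:
            prot -= 1          # protein decay
        trajectory.append((t, mrna, prot))
    return trajectory
```

Calling `gillespie(seed=1)` returns one stochastic trajectory; repeating the call with different seeds gives an ensemble of expression profiles, which is the kind of simulated data the abstract describes using for network inference.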
Affiliation(s)
- Shailesh Tripathi
- Predictive Medicine and Data Analytics Lab, Department of Signal Processing, Tampere University of Technology, Tampere, Finland
- Jason Lloyd-Price
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Harvard University, Boston, USA; Laboratory of Biosystem Dynamics, Department of Signal Processing, Tampere University of Technology, Tampere, Finland
- Andre Ribeiro
- Laboratory of Biosystem Dynamics, Department of Signal Processing, Tampere University of Technology, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland
- Olli Yli-Harja
- Institute of Biosciences and Medical Technology, Tampere, Finland; Computational Systems Biology, Department of Signal Processing, Tampere University of Technology, Tampere, Finland
- Matthias Dehmer
- Institute for Theoretical Informatics, Mathematics and Operations Research, Department of Computer Science, Universität der Bundeswehr München, Munich, Germany
- Frank Emmert-Streib
- Predictive Medicine and Data Analytics Lab, Department of Signal Processing, Tampere University of Technology, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland
14
Jinawath N, Bunbanjerdsuk S, Chayanupatkul M, Ngamphaiboon N, Asavapanumas N, Svasti J, Charoensawan V. Bridging the gap between clinicians and systems biologists: from network biology to translational biomedical research. J Transl Med 2016; 14:324. [PMID: 27876057 PMCID: PMC5120462 DOI: 10.1186/s12967-016-1078-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 08/23/2016] [Accepted: 11/08/2016] [Indexed: 01/22/2023] Open
Abstract
With the wealth of data accumulated from completely sequenced genomes and other high-throughput experiments, global studies of biological systems that simultaneously investigate multiple biological entities (e.g. genes, transcripts, proteins) have become routine. Network representation is frequently used to capture the presence of these molecules as well as their relationships. Network biology has been widely used in molecular biology and genetics, where several network properties have been shown to be functionally important. Here, we discuss how such methodology can be useful to translational biomedical research, where scientists traditionally focus on one or a small set of genes, diseases, and drug candidates at any one time. We first give an overview of the network representations frequently used in biology (what nodes and edges represent) and review their application in preclinical research to date. Using cancer as an example, we review how network biology can facilitate system-wide approaches to identify targeted small molecule inhibitors. These types of inhibitors have the potential to be more specific, resulting in high-efficacy treatments with fewer side effects compared to conventional treatments such as chemotherapy. Global analysis may provide better insight into the overall picture of human diseases, as well as identify previously overlooked problems, leading to rapid advances in medicine. From the clinicians’ point of view, it is necessary to bridge the gap between theoretical network biology and practical biomedical research, in order to improve the diagnosis, prevention, and treatment of the world’s major diseases.
Affiliation(s)
- Natini Jinawath
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Nakhon Pathom, Thailand; Program in Translational Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Sacarin Bunbanjerdsuk
- Program in Translational Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Maneerat Chayanupatkul
- Department of Physiology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand; Division of Gastroenterology and Hepatology, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
- Nuttapong Ngamphaiboon
- Medical Oncology Unit, Department of Medicine, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Nithi Asavapanumas
- Department of Physiology, Faculty of Science, Mahidol University, Bangkok, Thailand
- Jisnuson Svasti
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Nakhon Pathom, Thailand; Department of Biochemistry, Faculty of Science, Mahidol University, Bangkok, Thailand; Laboratory of Biochemistry, Chulabhorn Research Institute, Bangkok, Thailand
- Varodom Charoensawan
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Nakhon Pathom, Thailand; Department of Biochemistry, Faculty of Science, Mahidol University, Bangkok, Thailand; Systems Biology of Diseases Research Unit, Faculty of Science, Mahidol University, Bangkok, Thailand
15
Tripathi S, Moutari S, Dehmer M, Emmert-Streib F. Comparison of module detection algorithms in protein networks and investigation of the biological meaning of predicted modules. BMC Bioinformatics 2016; 17:129. [PMID: 26987731 PMCID: PMC4797184 DOI: 10.1186/s12859-016-0979-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 08/18/2015] [Accepted: 03/06/2016] [Indexed: 01/22/2023] Open
Abstract
Background: It is generally acknowledged that a functional understanding of a biological system can only be obtained by understanding the collective of molecular interactions in the form of biological networks. Protein networks are one network type of special importance, because proteins form the functional base units of every biological cell. On a mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins, allowing insight into the fundamental organizational principles of biological cells. Results: In this paper, we provide a comparative analysis of five popular and four novel module detection algorithms. We study these module prediction methods for simulated benchmark networks as well as 10 biological protein interaction networks (PINs). A particular focus of our analysis is placed on the biological meaning of the predicted modules by utilizing the Gene Ontology (GO) database as a gold standard for the definition of biological processes. Furthermore, we investigate the robustness of the results by perturbing the PINs, simulating in this way our incomplete knowledge of protein networks. Conclusions: Overall, our study reveals that there is a large heterogeneity among the different module prediction algorithms if one zooms in on the level of biological processes in the form of GO terms, and all methods are severely affected by a slight perturbation of the networks. However, we also find pathways that are enriched in multiple modules, which could provide important information about the hierarchical organization of the system.
Affiliation(s)
- Shailesh Tripathi
- Predictive Medicine and Analytics Lab, Department of Signal Processing, Tampere University of Technology, Tampere, Finland
- Salissou Moutari
- Centre for Statistical Science and Operational Research, School of Mathematics and Physics, Queen's University Belfast, Belfast, UK
- Matthias Dehmer
- Institute for Theoretical Informatics, Mathematics and Operations Research, Department of Computer Science, Universität der Bundeswehr München, Munich, Germany
- Frank Emmert-Streib
- Predictive Medicine and Analytics Lab, Department of Signal Processing, Tampere University of Technology, Tampere, Finland; Institute of Biosciences and Medical Technology, Tampere, Finland
16
Xiang Z, Sun H, Cai X, Chen D. The study on serum and urine of renal interstitial fibrosis rats induced by unilateral ureteral obstruction based on metabonomics and network analysis methods. Anal Bioanal Chem 2016; 408:2607-19. [PMID: 26873208 DOI: 10.1007/s00216-016-9368-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 10/10/2015] [Revised: 01/18/2016] [Accepted: 01/27/2016] [Indexed: 12/14/2022]
Abstract
Transmission of biological information is a biochemical process of multistep cascade from genes/proteins to metabolites. However, because most metabolites reflect the terminal information of the biochemical process, it is difficult to describe the transmission process of disease information in terms of the metabolomics strategy. In this paper, by incorporating network and metabolomics methods, an integrated approach was proposed to systematically investigate and explain the molecular mechanism of renal interstitial fibrosis. Through analysis of the network, the cascade transmission process of disease information starting from genes/proteins to metabolites was putatively identified and uncovered. The results indicated that renal fibrosis was involved in metabolic pathways of glycerophospholipid metabolism, biosynthesis of unsaturated fatty acids and arachidonic acid metabolism, riboflavin metabolism, tyrosine metabolism, and sphingolipid metabolism. These pathways involve kidney disease genes such as TGF-β1 and P2RX7. Our results showed that combining metabolomics and network analysis can provide new strategies and ideas for the interpretation of pathogenesis of disease with full consideration of "gene-protein-metabolite."
Affiliation(s)
- Zheng Xiang
- School of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China; School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325035, China
- Hao Sun
- School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325035, China
- Xiaojun Cai
- School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325035, China
- Dahui Chen
- School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325035, China
17
Dand N, Schulz R, Weale ME, Southgate L, Oakey RJ, Simpson MA, Schlitt T. Network-Informed Gene Ranking Tackles Genetic Heterogeneity in Exome-Sequencing Studies of Monogenic Disease. Hum Mutat 2015; 36:1135-44. [PMID: 26394720 PMCID: PMC4982032 DOI: 10.1002/humu.22906] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/24/2014] [Accepted: 09/09/2015] [Indexed: 11/10/2022]
Abstract
Genetic heterogeneity presents a significant challenge for the identification of monogenic disease genes. Whole-exome sequencing generates a large number of candidate disease-causing variants and typical analyses rely on deleterious variants being observed in the same gene across several unrelated affected individuals. This is less likely to occur for genetically heterogeneous diseases, making more advanced analysis methods necessary. To address this need, we present HetRank, a flexible gene-ranking method that incorporates interaction network data. We first show that different genes underlying the same monogenic disease are frequently connected in protein interaction networks. This motivates the central premise of HetRank: those genes carrying potentially pathogenic variants and whose network neighbors do so in other affected individuals are strong candidates for follow-up study. By simulating 1,000 exome sequencing studies (20,000 exomes in total), we model varying degrees of genetic heterogeneity and show that HetRank consistently prioritizes more disease-causing genes than existing analysis methods. We also demonstrate a proof-of-principle application of the method to prioritize genes causing Adams-Oliver syndrome, a genetically heterogeneous rare disease. An implementation of HetRank in R is available via the Website http://sourceforge.net/p/hetrank/.
Affiliation(s)
- Nick Dand
- Division of Genetics and Molecular Medicine, King's College London, London, UK
- Reiner Schulz
- Division of Genetics and Molecular Medicine, King's College London, London, UK
- Michael E Weale
- Division of Genetics and Molecular Medicine, King's College London, London, UK
- Laura Southgate
- Division of Genetics and Molecular Medicine, King's College London, London, UK; Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Rebecca J Oakey
- Division of Genetics and Molecular Medicine, King's College London, London, UK
- Michael A Simpson
- Division of Genetics and Molecular Medicine, King's College London, London, UK
- Thomas Schlitt
- Division of Genetics and Molecular Medicine, King's College London, London, UK; Institute for Mathematical and Molecular Biomedicine, King's College London, London, UK
18
Ahmadi M, Jafari R, Marashi SA, Farazmand A. Evidence for the relationship between the regulatory effects of microRNAs and attack robustness of biological networks. Comput Biol Med 2015; 63:83-91. [DOI: 10.1016/j.compbiomed.2015.05.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/18/2015] [Revised: 05/09/2015] [Accepted: 05/11/2015] [Indexed: 10/23/2022]
19
Glebova K, Reznik ON, Reznik AO, Mehta R, Galkin A, Baranova A, Skoblov M. siRNA technology in kidney transplantation: current status and future potential. BioDrugs 2015; 28:345-61. [PMID: 24573958 DOI: 10.1007/s40259-014-0087-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023]
Abstract
Kidney transplantation is one of the most common transplantation operations in the world, accounting for up to 50 % of all transplantation surgeries. To curtail the damage to transplanted organs that is caused by ischemia-reperfusion injury and the recipient's immune system, small interfering RNA (siRNA) technology is being explored. Importantly, the kidney as a whole is a preferential site for non-specific systemic delivery of siRNA. To date, most attempts at siRNA-based therapy for transplantation-related conditions have remained at the in vitro stage, with only a few of them being advanced into animal models. Hydrodynamic intravenous injection of naked or carrier-bound siRNAs is currently the most common route for delivery of therapeutic constructs. To our knowledge, no systematic screens for siRNA targets most relevant for kidney transplantation have been attempted so far. A majority of researchers have arrived at one or another target of interest by analyzing current literature that dissects pathological processes taking place in transplanted organs. A majority of the genes that make up the list of 53 siRNA targets that have been tested in transplantation-related models so far belong to either apoptosis- or immune rejection-centered networks. There is an opportunity for therapeutic siRNA combinations that may be delivered within the same delivery vector or injected at the same time and, by targeting more than one pathway, or by hitting the same pathways within two different key points, will augment the effects of each other.
Affiliation(s)
- Kristina Glebova
- Research Center for Medical Genetics, Russian Academy of Medical Sciences, Moscow, Russia
20
Emmert-Streib F, Tripathi S, Simoes RDM, Hawwa AF, Dehmer M. The human disease network. Systems Biomedicine 2014. [DOI: 10.4161/sysb.22816] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 11/19/2022]
21
Emmert-Streib F, Dehmer M, Haibe-Kains B. Gene regulatory networks and their applications: understanding biological and medical problems in terms of networks. Front Cell Dev Biol 2014; 2:38. [PMID: 25364745 PMCID: PMC4207011 DOI: 10.3389/fcell.2014.00038] [Citation(s) in RCA: 112] [Impact Index Per Article: 11.2] [Received: 06/07/2014] [Accepted: 07/29/2014] [Indexed: 11/13/2022] Open
Abstract
In recent years gene regulatory networks (GRNs) have attracted a lot of interest and many methods have been introduced for their statistical inference from gene expression data. However, despite their popularity, GRNs are widely misunderstood. For this reason, we provide in this paper a general discussion and perspective of gene regulatory networks. Specifically, we discuss their meaning, the consistency among different network inference methods, ensemble methods, the assessment of GRNs, the estimated number of existing GRNs and their usage in different application domains. Furthermore, we discuss open questions and necessary steps in order to utilize gene regulatory networks in a clinical context and for personalized medicine.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Laboratory, Faculty of Medicine, Health and Life Sciences, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, UK
- Matthias Dehmer
- Institute for Bioinformatics and Translational Research, UMIT, Hall in Tyrol, Austria
- Benjamin Haibe-Kains
- Bioinformatics and Computational Genomics Laboratory, Department of Medical Biophysics, Princess Margaret Cancer Centre, University of Toronto, Canada
22
Lane J, Kaartinen V. Signaling networks in palate development. Wiley Interdiscip Rev Syst Biol Med 2014; 6:271-8. [PMID: 24644145 DOI: 10.1002/wsbm.1265] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 11/26/2013] [Revised: 01/24/2014] [Accepted: 01/24/2014] [Indexed: 01/04/2023]
Abstract
Palatogenesis, the formation of the palate, is a dynamic process regulated by a complex series of context-dependent morphogenetic signaling events. Many genes involved in palatogenesis have been discovered through the use of genetically manipulated mouse models as well as from human genetic studies, but the roles of these genes and their products in signaling networks regulating palatogenesis are still poorly understood. In this review, we give a brief overview of palatogenesis and introduce key signaling cascades leading to formation of the intact palate. Moreover, we review conceptual differences between pathway biology and network biology and discuss how some of the recent technological advances, in conjunction with mouse genetic models, have contributed to our understanding of signaling networks regulating palate growth and fusion. For further resources related to this article, please visit the WIREs website. Conflict of interest: The authors have declared no conflicts of interest for this article.
Affiliation(s)
- Jamie Lane
- Department of Biologic and Materials Sciences, University of Michigan School of Dentistry, Ann Arbor, MI, USA
23
Scardoni G, Montresor A, Tosadori G, Laudanna C. Node interference and robustness: performing virtual knock-out experiments on biological networks: the case of leukocyte integrin activation network. PLoS One 2014; 9:e88938. [PMID: 24586448 PMCID: PMC3930642 DOI: 10.1371/journal.pone.0088938] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 09/11/2013] [Accepted: 01/13/2014] [Indexed: 01/13/2023] Open
Abstract
The increasing availability of large network datasets derived from high-throughput experiments requires the development of tools to extract relevant information from biological networks, and the development of computational methods capable of detecting qualitative and quantitative changes in the topological properties of biological networks is of critical relevance. We introduce the notions of node interference and node robustness as measures of the reciprocal influence between nodes within a network. We examine the theoretical significance of these new centrality-based measures by characterizing the topological relationships between nodes and groups of nodes. Node interference analysis allows the context of functional influence of single nodes to be determined topologically. Conversely, node robustness analysis allows topological identification of the nodes having the highest functional influence on a specific node. A new Cytoscape plug-in calculating these measures was developed and applied to a protein-protein interaction network specifically regulating integrin activation in human primary leukocytes. Notably, the functional effects of compounds inhibiting important protein kinases, such as SRC, HCK, FGR and JAK2, are predicted by the interference and robustness analysis, are in agreement with previous studies and are confirmed by laboratory experiments. The interference and robustness notions can be applied to a variety of different contexts, including, for instance, the identification of potential side effects of drugs or the characterization of the consequences of gene deletion or duplication or of protein degradation, opening new perspectives in biological network analysis.
Affiliation(s)
- Giovanni Scardoni
- Center for BioMedical Computing (CBMC), University of Verona, Verona, Italy
- Alessio Montresor
- Department of Pathology and Diagnostic, University of Verona, Verona, Italy
- Gabriele Tosadori
- Center for BioMedical Computing (CBMC), University of Verona, Verona, Italy
- Carlo Laudanna
- Center for BioMedical Computing (CBMC), University of Verona, Verona, Italy
- Department of Pathology and Diagnostic, University of Verona, Verona, Italy
24
de Matos Simoes R, Dehmer M, Emmert-Streib F. Interfacing cellular networks of S. cerevisiae and E. coli: connecting dynamic and genetic information. BMC Genomics 2013; 14:324. [PMID: 23663484 PMCID: PMC3698017 DOI: 10.1186/1471-2164-14-324] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 07/31/2012] [Accepted: 04/25/2013] [Indexed: 12/11/2022] Open
Abstract
Background: In recent years, various types of cellular networks have entered biology and are nowadays used ubiquitously for studying eukaryotic and prokaryotic organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, are largely unexplored. Results: We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions: Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes.
Affiliation(s)
- Ricardo de Matos Simoes
- Computational Biology and Machine Learning Laboratory, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Faculty of Medicine, Health and Life Sciences, Queen's University, 97 Lisburn Road, Belfast, UK
25
Emmert-Streib F. Structural properties and complexity of a new network class: Collatz step graphs. PLoS One 2013; 8:e56461. [PMID: 23431377 PMCID: PMC3576403 DOI: 10.1371/journal.pone.0056461] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/06/2012] [Accepted: 01/11/2013] [Indexed: 12/04/2022] Open
Abstract
In this paper, we introduce a biologically inspired model to generate complex networks. In contrast to many other construction procedures for growing networks introduced so far, our method generates networks from one-dimensional symbol sequences that are related to the so-called Collatz problem from number theory. The major purpose of the present paper is, first, to derive a symbol sequence from the Collatz problem, which we call the step sequence, and to investigate its structural properties. Second, we introduce a construction procedure for growing networks that is based on these step sequences. Third, we investigate the structural properties of this new network class, including the finite scaling and asymptotic behavior of their complexity, average shortest path lengths and clustering coefficients. Interestingly, in contrast to many other network models, including the small-world network of Watts & Strogatz, we find that CS graphs become ‘smaller’ with increasing size.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Laboratory, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Faculty of Medicine, Health and Life Sciences, Queen's University Belfast, Belfast, United Kingdom.
26
Emmert-Streib F, Abogunrin F, de Matos Simoes R, Duggan B, Ruddock MW, Reid CN, Roddy O, White L, O'Kane HF, O'Rourke D, Anderson NH, Nambirajan T, Williamson KE. Collectives of diagnostic biomarkers identify high-risk subpopulations of hematuria patients: exploiting heterogeneity in large-scale biomarker data. BMC Med 2013; 11:12. [PMID: 23327460 PMCID: PMC3570289 DOI: 10.1186/1741-7015-11-12] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/17/2012] [Accepted: 01/17/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Ineffective risk stratification can delay diagnosis of serious disease in patients with hematuria. We applied a systems biology approach to analyze clinical, demographic and biomarker measurements (n = 29) collected from 157 hematuric patients: 80 urothelial cancer (UC) and 77 controls with confounding pathologies. METHODS On the basis of biomarkers, we conducted agglomerative hierarchical clustering to identify patient and biomarker clusters. We then explored the relationship between the patient clusters and clinical characteristics using Chi-square analyses. We determined classification errors and areas under the receiver operating curve of Random Forest Classifiers (RFC) for patient subpopulations using the biomarker clusters to reduce the dimensionality of the data. RESULTS Agglomerative clustering identified five patient clusters and seven biomarker clusters. Final diagnoses categories were non-randomly distributed across the five patient clusters. In addition, two of the patient clusters were enriched with patients with 'low cancer-risk' characteristics. The biomarkers which contributed to the diagnostic classifiers for these two patient clusters were similar. In contrast, three of the patient clusters were significantly enriched with patients harboring 'high cancer-risk" characteristics including proteinuria, aggressive pathological stage and grade, and malignant cytology. Patients in these three clusters included controls, that is, patients with other serious disease and patients with cancers other than UC. Biomarkers which contributed to the diagnostic classifiers for the largest 'high cancer- risk' cluster were different than those contributing to the classifiers for the 'low cancer-risk' clusters. Biomarkers which contributed to subpopulations that were split according to smoking status, gender and medication were different. 
CONCLUSIONS The systems biology approach applied in this study allowed the hematuric patients to cluster naturally on the basis of the heterogeneity within their biomarker data, into five distinct risk subpopulations. Our findings highlight an approach with the promise to unlock the potential of biomarkers. This will be especially valuable in the field of diagnostic bladder cancer where biomarkers are urgently required. Clinicians could interpret risk classification scores in the context of clinical parameters at the time of triage. This could reduce cystoscopies and enable priority diagnosis of aggressive diseases, leading to improved patient outcomes at reduced costs.
Affiliation(s)
- Frank Emmert-Streib
- Centre for Cancer Research & Cell Biology, Queen's University Belfast, Belfast, Northern Ireland

27
Emmert-Streib F, Tripathi S, de Matos Simoes R. Harnessing the complexity of gene expression data from cancer: from single gene to structural pathway methods. Biol Direct 2012; 7:44. [PMID: 23227854 PMCID: PMC3769148 DOI: 10.1186/1745-6150-7-44] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 07/30/2012] [Accepted: 10/01/2012] [Indexed: 12/22/2022] Open
Abstract
High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable to reveal systems related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data and provide links to software implementations and tools and address also the general problem of multiple hypotheses testing. Further, we provide recommendations for the selection of such analysis methods.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Laboratory, Queen's University Belfast, Belfast, UK.

28
Komorowsky CV, Brosius FC, Pennathur S, Kretzler M. Perspectives on systems biology applications in diabetic kidney disease. J Cardiovasc Transl Res 2012; 5:491-508. [PMID: 22733404 PMCID: PMC3422674 DOI: 10.1007/s12265-012-9382-7] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 04/13/2012] [Accepted: 05/22/2012] [Indexed: 12/18/2022]
Abstract
Diabetic kidney disease (DKD) is a microvascular complication of type 1 and 2 diabetes with a devastating impact on individuals with the disease, their families, and society as a whole. DKD is the single most frequent cause of incident chronic kidney disease cases and accounts for over 40% of the population with end-stage renal disease. Contributing factors for the high prevalence are the increase in obesity and subsequent diabetes combined with an improved long-term survival with diabetes. Environment and genetic variations contribute to DKD susceptibility and progressive loss of kidney function. How the molecular mechanisms of genetic and environmental exposures interact during DKD initiation and progression is the focus of ongoing research efforts. The development of standardized, unbiased high-throughput profiling technologies of human DKD samples opens new avenues in capturing the multiple layers of DKD pathobiology. These techniques routinely interrogate analytes on a genome-wide scale generating comprehensive DKD-associated fingerprints. Linking the molecular fingerprints to deep clinical phenotypes may ultimately elucidate the intricate molecular interplay in a disease stage and subtype-specific manner. This insight will form the basis for accurate prognosis and facilitate targeted therapeutic interventions. In this review, we present ongoing efforts from large-scale data integration translating "-omics" research efforts into improved and individualized health care in DKD.
Affiliation(s)
- Claudiu V. Komorowsky
- Department of Internal Medicine, Division of Nephrology, University of Michigan Medical School, Ann Arbor, MI, USA
- Frank C. Brosius
- Department of Internal Medicine, Division of Nephrology, University of Michigan Medical School, Ann Arbor, MI, USA
- Subramaniam Pennathur
- Department of Internal Medicine, Division of Nephrology, University of Michigan Medical School, Ann Arbor, MI, USA
- Matthias Kretzler
- Department of Internal Medicine, Division of Nephrology, University of Michigan Medical School, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA

29
Emmert-Streib F, de Matos Simoes R, Tripathi S, Glazko GV, Dehmer M. A Bayesian analysis of the chromosome architecture of human disorders by integrating reductionist data. Sci Rep 2012; 2:513. [PMID: 22822426 PMCID: PMC3400933 DOI: 10.1038/srep00513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2012] [Accepted: 06/27/2012] [Indexed: 11/09/2022] Open
Abstract
In this paper, we present a Bayesian approach to estimate a chromosome and a disorder network from the Online Mendelian Inheritance in Man (OMIM) database. In contrast to other approaches, we obtain statistical rather than deterministic networks, enabling parametric control of the uncertainty of the underlying disorder-disease gene associations contained in the OMIM, on which the networks are based. From a structural investigation of the chromosome network, we identify three chromosome subgroups that reflect architectural differences in chromosome-disorder associations that are predictively exploitable for a functional analysis of diseases.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen’s University Belfast, 97 Lisburn Road, Belfast, UK.

30
Emmert-Streib F, Häkkinen A, Ribeiro AS. Detecting sequence dependent transcriptional pauses from RNA and protein number time series. BMC Bioinformatics 2012; 13:152. [PMID: 22741547 PMCID: PMC3534578 DOI: 10.1186/1471-2105-13-152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2012] [Accepted: 06/20/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Evidence suggests that in prokaryotes sequence-dependent transcriptional pauses affect the dynamics of transcription and translation, as well as of small genetic circuits. So far, a few pause-prone sequences have been identified from in vitro measurements of transcription elongation kinetics. RESULTS Using a stochastic model of gene expression at the nucleotide and codon levels with realistic parameter values, we investigate three different but related questions and present statistical methods for their analysis. First, we show that information from in vivo RNA and protein temporal numbers is sufficient to discriminate between models with and without a pause site in their coding sequence. Second, we demonstrate that a large variety of models, with pauses of various durations and locations in the template, can be separated from each other by means of hierarchical clustering and a random forest classifier. Third, we introduce an approximate likelihood function that allows the location of a pause site to be estimated. CONCLUSIONS This method can aid in detecting unknown pause-prone sequences from temporal measurements of RNA and protein numbers at a genome-wide scale and thus elucidate possible roles that these sequences play in the dynamics of genetic networks and phenotype.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen’s University Belfast, Belfast, UK

31
Assessment method for a power analysis to identify differentially expressed pathways. PLoS One 2012; 7:e37510. [PMID: 22629411 PMCID: PMC3356338 DOI: 10.1371/journal.pone.0037510] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 11/26/2011] [Accepted: 04/22/2012] [Indexed: 12/04/2022] Open
Abstract
Gene expression data can provide a very rich source of information for elucidating biological function at the pathway level if the experimental design considers the needs of the statistical analysis methods. The purpose of this paper is to provide a comparative analysis of statistical methods for detecting the differential expression of pathways (DEP). In contrast to many other studies conducted so far, we use three novel simulation types, producing a more realistic correlation structure than previous simulation methods. This also includes the generation of surrogate data from two large-scale microarray experiments from prostate cancer and ALL. As a result of our comprehensive analysis of parameter configurations, we find that each method should only be applied if certain conditions of the data from a pathway are met. Further, we provide method-specific estimates for the optimal sample size for microarray experiments aiming to identify DEP in order to avoid an underpowered design. Our study highlights the sensitivity of the studied methods to the parameters of the system.

32
de Matos Simoes R, Tripathi S, Emmert-Streib F. Organizational structure and the periphery of the gene regulatory network in B-cell lymphoma. BMC Syst Biol 2012; 6:38. [PMID: 22583750 PMCID: PMC3476434 DOI: 10.1186/1752-0509-6-38] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 11/24/2011] [Accepted: 05/14/2012] [Indexed: 12/22/2022]
Abstract
Background The physical periphery of a biological cell is mainly described by signaling pathways which are triggered by transmembrane proteins and receptors that are sentinels to control the whole gene regulatory network of a cell. However, our current knowledge about the gene regulatory mechanisms that are governed by extracellular signals is severely limited. Results The purpose of this paper is threefold. First, we infer a gene regulatory network from a large-scale B-cell lymphoma expression data set using the C3NET algorithm. Second, we provide a functional and structural analysis of the largest connected component of this network, revealing that this network component corresponds to the peripheral region of a cell. Third, we analyze the hierarchical organization of network components of the whole inferred B-cell gene regulatory network by introducing a new approach which exploits the variability within the data as well as the inferential characteristics of C3NET. As a result, we find a functional bisection of the network corresponding to different cellular components. Conclusions Overall, our study highlights the peripheral gene regulatory network of B-cells and shows that it is centered around hub transmembrane proteins located at the physical periphery of the cell. In addition, we identify a variety of novel pathological transmembrane proteins such as ion channel complexes and signaling receptors in B-cell lymphoma.
Affiliation(s)
- Ricardo de Matos Simoes
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, UK

33
Emmert-Streib F. Limitations of gene duplication models: evolution of modules in protein interaction networks. PLoS One 2012; 7:e35531. [PMID: 22530042 PMCID: PMC3329483 DOI: 10.1371/journal.pone.0035531] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/02/2011] [Accepted: 03/18/2012] [Indexed: 01/05/2023] Open
Abstract
It has been generally acknowledged that the module structure of protein interaction networks plays a crucial role with respect to the functional understanding of these networks. In this paper, we study evolutionary aspects of the module structure of protein interaction networks, which forms a mesoscopic level of description with respect to the architectural principles of networks. The purpose of this paper is to investigate limitations of well known gene duplication models by showing that these models are lacking crucial structural features present in protein interaction networks on a mesoscopic scale. This observation reveals our incomplete understanding of the structural evolution of protein networks on the module level.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, United Kingdom.

34
Wang YC, Huang SH, Lan CY, Chen BS. Prediction of phenotype-associated genes via a cellular network approach: a Candida albicans infection case study. PLoS One 2012; 7:e35339. [PMID: 22509408 PMCID: PMC3324557 DOI: 10.1371/journal.pone.0035339] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/30/2011] [Accepted: 03/15/2012] [Indexed: 02/04/2023] Open
Abstract
Candida albicans is the most prevalent opportunistic fungal pathogen in humans causing superficial and serious systemic infections. The infection process can be divided into three stages: adhesion, invasion, and host cell damage. To enhance our understanding of these C. albicans infection stages, this study aimed to predict phenotype-associated genes involved during these three infection stages and their roles in C. albicans-host interactions. In light of the principles that proteins that lie closer to one another in a protein interaction network are more likely to have similar functions, and that genes regulated by the same transcription factors tend to have similar functions, a cellular network approach was proposed to predict the phenotype-associated genes in this study. A total of 4, 12, and 3 genes were predicted as adhesion-, invasion-, and damage-associated genes during C. albicans infection, respectively. These predicted genes highlight the facts that cell surface components are critical for cell adhesion, and that morphogenesis is crucial for cell invasion. In addition, they provide targets for further investigations into the mechanisms of the three C. albicans infection stages. These results give insights into the responses elicited in C. albicans during interaction with the host, possibly instrumental in identifying novel therapies to treat C. albicans infection.
Affiliation(s)
- Yu-Chao Wang
- Laboratory of Control and Systems Biology, Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan
- Shin-Hao Huang
- Laboratory of Control and Systems Biology, Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan
- Chung-Yu Lan
- Department of Life Science, National Tsing Hua University, Hsinchu, Taiwan
- Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, Taiwan
- Bor-Sen Chen
- Laboratory of Control and Systems Biology, Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan

35
de Matos Simoes R, Emmert-Streib F. Bagging statistical network inference from large-scale gene expression data. PLoS One 2012; 7:e33624. [PMID: 22479422 PMCID: PMC3316596 DOI: 10.1371/journal.pone.0033624] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Received: 10/24/2011] [Accepted: 02/14/2012] [Indexed: 11/24/2022] Open
Abstract
Modern biology and medicine aim to uncover the molecular and cellular causes of biological functions and diseases. Gene regulatory networks (GRN) inferred from gene expression data are considered an important aid for this research because they provide a map of molecular interactions. Hence, GRNs have the potential to enable and enhance basic as well as applied research in the life sciences. In this paper, we introduce a new method called BC3NET for inferring causal gene regulatory networks from large-scale gene expression data. BC3NET is an ensemble method that is based on bagging the C3NET algorithm, which means it corresponds to a Bayesian approach with noninformative priors. In this study we demonstrate for a variety of simulated and biological gene expression data from S. cerevisiae that BC3NET is an important enhancement over other inference methods that is capable of capturing biochemical interactions from transcription regulation and protein-protein interaction sensibly. An implementation of BC3NET is freely available as an R package from the CRAN repository.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, United Kingdom

36
Emmert-Streib F, Glazko GV, Altay G, de Matos Simoes R. Statistical inference and reverse engineering of gene regulatory networks from observational expression data. Front Genet 2012; 3:8. [PMID: 22408642 PMCID: PMC3271232 DOI: 10.3389/fgene.2012.00008] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Received: 12/07/2011] [Accepted: 01/10/2012] [Indexed: 01/04/2023] Open
Abstract
In this paper, we present a systematic and conceptual overview of methods for inferring gene regulatory networks from observational gene expression data. Further, we discuss two classic approaches to infer causal structures and compare them with contemporary methods by providing a conceptual categorization thereof. We complement the above by surveying global and local evaluation measures for assessing the performance of inference algorithms.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Lab, School of Medicine, Dentistry and Biomedical Sciences, Center for Cancer Research and Cell Biology, Queen's University Belfast, Belfast, UK

37
Parametric construction of episode networks from pseudoperiodic time series based on mutual information. PLoS One 2012; 6:e27733. [PMID: 22216086 PMCID: PMC3245224 DOI: 10.1371/journal.pone.0027733] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/23/2011] [Accepted: 10/24/2011] [Indexed: 12/02/2022] Open
Abstract
Recently, the construction of networks from time series data has gained widespread interest. In this paper, we develop this area further by introducing a network construction procedure for pseudoperiodic time series. We call such networks episode networks, in which an episode corresponds to a temporal interval of a time series, and which defines a node in the network. Our model includes a number of features which distinguish it from current methods. First, the proposed construction procedure is a parametric model which allows it to adapt to the characteristics of the data; the length of an episode being the parameter. As a direct consequence, networks of minimal size containing the maximal information about the time series can be obtained. In this paper, we provide an algorithm to determine the optimal value of this parameter. Second, we employ estimates of mutual information values to define the connectivity structure among the nodes in the network to exploit efficiently the nonlinearities in the time series. Finally, we apply our method to data from electroencephalogram (EEG) experiments and demonstrate that the constructed episode networks capture discriminative information from the underlying time series that may be useful for diagnostic purposes.

38
de Matos Simoes R, Emmert-Streib F. Influence of statistical estimators of mutual information and data heterogeneity on the inference of gene regulatory networks. PLoS One 2011; 6:e29279. [PMID: 22242113 PMCID: PMC3248437 DOI: 10.1371/journal.pone.0029279] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 09/21/2011] [Accepted: 11/23/2011] [Indexed: 11/19/2022] Open
Abstract
The inference of gene regulatory networks from gene expression data is a difficult problem because the performance of the inference algorithms depends on a multitude of different factors. In this paper we study two of these. First, we investigate the influence of discrete mutual information (MI) estimators on the global and local network inference performance of the C3NET algorithm. More precisely, we study different MI estimators (Empirical, Miller-Madow, Shrink and Schürmann-Grassberger) in combination with discretization methods (equal frequency, equal width and global equal width discretization). We observe the best global and local inference performance of C3NET for the Miller-Madow estimator with an equal width discretization. Second, our numerical analysis can be considered as a systems approach because we simulate gene expression data from an underlying gene regulatory network, instead of making a distributional assumption to sample thereof. We demonstrate that despite the popularity of the latter approach, which is the traditional way of studying MI estimators, this is in fact not supported by simulated and biological expression data because of their heterogeneity. Hence, our study provides guidance for an efficient design of a simulation study in the context of network inference, supporting a systems approach.
Affiliation(s)
- Ricardo de Matos Simoes
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, United Kingdom
- Frank Emmert-Streib
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, United Kingdom

39
Mueller LAJ, Kugler KG, Graber A, Emmert-Streib F, Dehmer M. Structural measures for network biology using QuACN. BMC Bioinformatics 2011; 12:492. [PMID: 22195644 PMCID: PMC3293850 DOI: 10.1186/1471-2105-12-492] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 08/05/2011] [Accepted: 12/24/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Structural measures for networks have been extensively developed, but many of them have not yet demonstrated their sustainability. That is, it often remains unclear whether a particular measure is useful and feasible for solving a particular problem in network biology. One example is the classification of complex biological networks, for which structural measures are used that lead to a minimal classification error. Hence, there is a strong need to provide freely available software packages to calculate and demonstrate the appropriate usage of structural graph measures in network biology. RESULTS Here, we discuss topological network descriptors that are implemented in the R package QuACN and demonstrate their behavior and characteristics by applying them to a set of example graphs. Moreover, we show a representative application to illustrate their capabilities for classifying biological networks. In particular, we infer gene regulatory networks from microarray data and classify them by methods provided by QuACN. Note that QuACN is the first freely available software written in R containing a large number of structural graph measures. CONCLUSION The R package QuACN is under ongoing development and we continuously add promising groups of topological network descriptors. The package can be used to answer intriguing research questions in network biology, e.g., classifying biological data or identifying meaningful biological features, by analyzing the topology of biological networks.
Affiliation(s)
- Laurin A J Mueller
- Institute for Bioinformatics and Translational Research, Department of Biomedical Sciences and Engineering, University for Health Sciences, Medical Informatics and Technology (UMIT), EWZ 1, Hall in Tirol, Austria

40
Ben-Hamo R, Efroni S. Gene expression and network-based analysis reveals a novel role for hsa-miR-9 and drug control over the p38 network in glioblastoma multiforme progression. Genome Med 2011; 3:77. [PMID: 22122801 PMCID: PMC3308032 DOI: 10.1186/gm293] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 08/17/2011] [Revised: 11/18/2011] [Accepted: 11/28/2011] [Indexed: 11/24/2022] Open
Abstract
Background Glioblastoma multiforme (GBM) is the most common, aggressive and malignant primary tumor of the brain and is associated with one of the worst 5-year survival rates among all human cancers. Identification of molecular interactions that associate with disease progression may be key in finding novel treatments. Methods Using five independent molecular and clinical datasets with a set of computational algorithms we were able to identify a gene-gene and gene-microRNA network that significantly stratifies patient prognosis. By combining gene expression microarray data with microRNA expression levels, copy number alterations, drug response and clinical data, together with network knowledge, we were able to identify a single pathway at the core of glioblastoma. Results This network, the p38 network, and an associated microRNA, hsa-miR-9, facilitate prognostic stratification. The microRNA hsa-miR-9 correlates with network behavior and presents binding affinities with network members in a manner that suggests control over network behavior. A similar control over network behavior is possible through a set of drugs. These drugs are part of the treatment regimen for a subpopulation of the patients that participated in the TCGA study and for which the study provides clinical information. Interestingly, the patients that were treated with these specific sets of drugs, all of which target p38 network members, demonstrate highly significant stratification of prognosis. Conclusions Combined, these results call for attention to p38 network targeted treatment and present the p38 network-hsa-miR-9 control mechanism as critical in GBM progression.
Affiliation(s)
- Rotem Ben-Hamo
- The Mina and Everard Goodman Faculty of Life Science, Bar Ilan University, 1 Keren-Hayesod St, Ramat-Gan, 52900, Israel.

41
Pathway analysis of expression data: deciphering functional building blocks of complex diseases. PLoS Comput Biol 2011; 7:e1002053. [PMID: 21637797 PMCID: PMC3102754 DOI: 10.1371/journal.pcbi.1002053] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Indexed: 12/17/2022] Open