1
Yaslianifard S, Movahedi M, Yaslianifard S, Mozhgani SH. The mirror like expression of genes involved in the FOXO signaling pathway could be effective in the pathogenesis of human lymphotropic virus type 1 (HTLV-1) through disruption of the downstream pathways. BMC Res Notes 2023; 16:147. [PMID: 37461070 DOI: 10.1186/s13104-023-06423-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 07/04/2023] [Indexed: 07/20/2023]
Abstract
OBJECTIVES Human lymphotropic virus type 1 (HTLV-1) is the cause of two major diseases, ATLL and HAM/TSP in a percentage of carriers. Despite progress in understanding the pathogenesis of these two diseases, the exact pathogenesis mechanism is still not well understood. High-throughput technologies have revolutionized medical research. This study aims to investigate the mechanism of pathogenesis of these two diseases using the results of high-throughput analysis of microarray datasets. RESULTS A total of 100 differentially expressed genes were found between ATLL and HAM/TSP. After constructing protein-protein network and further analyzing, proteins including ATM, CD8, CXCR4, PIK3R1 and CD2 were found as the hub ones between ATLL and HAM/TSP. Finding the modules of the subnetwork revealed the enrichment of two common pathways including FOXO signaling pathway and Cell cycle with two common genes including ATM and CDKN2D. Unlike ATLL, ATM gene had higher expressions in HAM/TSP patients. The expression of CDKN2D was increased in ATLL patients. The results of this study could be helpful for understanding the pathogenic mechanism of these two diseases in the same signaling pathways.
Affiliation(s)
- Sahar Yaslianifard
  - Department of Biochemistry, Faculty of Biological Sciences, North Tehran Branch, Islamic Azad University, Tehran, Iran
- Monireh Movahedi
  - Department of Biochemistry, Faculty of Biological Sciences, North Tehran Branch, Islamic Azad University, Tehran, Iran
- Somayeh Yaslianifard
  - Department of Microbiology and Virology, School of Medicine, Alborz University of Medical Sciences, Karaj, Iran
  - Dietary Supplements and Probiotic Research Center, Alborz University of Medical Sciences, Karaj, Iran
- Sayed-Hamidreza Mozhgani
  - Department of Microbiology and Virology, School of Medicine, Alborz University of Medical Sciences, Karaj, Iran
  - Non-Communicable Diseases Research Center, Alborz University of Medical Sciences, Karaj, Iran
2
Yazdani A, Yazdani A, Mendez-Giraldez R, Samiei A, Kosorok MR, Schaid DJ. From classical mendelian randomization to causal networks for systematic integration of multi-omics. Front Genet 2022; 13:990486. [PMID: 36186433 PMCID: PMC9520987 DOI: 10.3389/fgene.2022.990486] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/10/2022] [Accepted: 08/17/2022] [Indexed: 11/17/2022]
Abstract
The number of studies with information at multiple biological levels of granularity, such as genomics, proteomics, and metabolomics, is increasing each year, and a biomedical question is how to systematically integrate these data to discover new biological mechanisms that have the potential to elucidate the processes of health and disease. Causal frameworks, such as Mendelian randomization (MR), provide a foundation to begin integrating data for new biological discoveries. Despite the growing number of MR applications in a wide variety of biomedical studies, there are few approaches for the systematic analysis of omic data. The large number and diverse types of molecular components involved in complex diseases interact through complex networks, and classical MR approaches targeting individual components do not consider the underlying relationships. In contrast, causal network models established in the principles of MR offer significant improvements to the classical MR framework for understanding omic data. Integration of these mostly distinct branches of statistics is a recent development, and we here review the current progress. To set the stage for causal network models, we review some recent progress in the classical MR framework. We then explain how to transition from the classical MR framework to causal networks. We discuss the identification of causal networks and evaluate the underlying assumptions. We also introduce some tests for sensitivity analysis and stability assessment of causal networks. We then review practical details to perform real data analysis and identify causal networks and highlight some of the utility of causal networks. The utilities with validated novel findings reveal the full potential of causal networks as a systems approach that will become necessary to integrate large-scale omic data.
Affiliation(s)
- Azam Yazdani
  - Center of Perioperative Genetics and Genomics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
- Akram Yazdani
  - Health Science Center at Houston, McGovern Medical School, Division of Clinical and Translational Sciences, University of Texas, Houston, TX, United States
- Raul Mendez-Giraldez
  - Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Durham, NC, United States
- Ahmad Samiei
  - Division of Pulmonary Medicine, Boston Children's Hospital, Boston, MA, United States
- Michael R Kosorok
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Daniel J Schaid
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
3
Xiang J, Meng X, Zhao Y, Wu FX, Li M. HyMM: hybrid method for disease-gene prediction by integrating multiscale module structure. Brief Bioinform 2022; 23:6547263. [PMID: 35275996 DOI: 10.1093/bib/bbac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 01/18/2022] [Accepted: 02/13/2022] [Indexed: 11/12/2022]
Abstract
MOTIVATION Identifying disease-related genes is an important issue in computational biology. Module structure widely exists in biomolecule networks, and complex diseases are usually thought to be caused by perturbations of local neighborhoods in the networks, which can provide useful insights for the study of disease-related genes. However, the mining and effective utilization of the module structure is still challenging in such issues as a disease gene prediction. RESULTS We propose a hybrid disease-gene prediction method integrating multiscale module structure (HyMM), which can utilize multiscale information from local to global structure to more effectively predict disease-related genes. HyMM extracts module partitions from local to global scales by multiscale modularity optimization with exponential sampling, and estimates the disease relatedness of genes in partitions by the abundance of disease-related genes within modules. Then, a probabilistic model for integration of gene rankings is designed in order to integrate multiple predictions derived from multiscale module partitions and network propagation, and a parameter estimation strategy based on functional information is proposed to further enhance HyMM's predictive power. By a series of experiments, we reveal the importance of module partitions at different scales, and verify the stable and good performance of HyMM compared with eight other state-of-the-arts and its further performance improvement derived from the parameter estimation. CONCLUSIONS The results confirm that HyMM is an effective framework for integrating multiscale module structure to enhance the ability to predict disease-related genes, which may provide useful insights for the study of the multiscale module structure and its application in such issues as a disease-gene prediction.
Affiliation(s)
- Ju Xiang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China; Department of Basic Medical Sciences & Academician Workstation, Changsha Medical University, Changsha, Hunan 410219, China
- Xiangmao Meng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yichao Zhao
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
- Min Li
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
4
Gao* K, Ayati* M, Koyuturk M, Calabrese JR, Ganocy SJ, Kaye NM, Lazarus HM, Christian E, Kaplan D. Protein Biomarkers in Monocytes and CD4+ Lymphocytes for Predicting Lithium Treatment Response of Bipolar Disorder: a Feasibility Study with Tyramine-Based Signal-Amplified Flow Cytometry. PSYCHOPHARMACOLOGY BULLETIN 2022; 52:8-35. [PMID: 35342205 PMCID: PMC8896753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2023]
Abstract
Purpose To determine if enhanced flow cytometry (CellPrint™) can identify intracellular proteins of lithium responsiveness in monocytes and CD4+ lymphocytes from patients with bipolar disorder. Methods Eligible bipolar I or II patients were openly treated with lithium for 16 weeks. Baseline levels of Bcl2, BDNF, calmodulin, Fyn, phospho-Fyn/phospho-Yes, GSK3β, phospho-GSK3αβ, HMGB1, iNOS, IRS2, mTor, NLRP3, PGM1, PKA C-α, PPAR-γ, phospho-RelA, and TPH1 in monocytes and CD4+ lymphocytes of lithium responders and non-responders were measured with CellPrint™. Their utility in discriminating responders from non-responders was explored. Protein-protein network and pathway enrichment analyses were conducted. Results Of the 24 intent-to-treat patients, 12 patients completed the 16-week study. Eleven of 13 responders and 8 of 11 non-responders were available for this analysis. The levels of the majority of analytes in lithium responders were lower than in non-responders in both cell types, but only the level of GSK3β in monocytes was significantly different (p = 0.034). The combination of GSK3β and phospho-GSK3αβ levels in monocytes correctly classified 11/11 responders and 5/8 non-responders. Combination of GSK3β, phospho-RelA, TPH1 and PGM1 correctly classified 10/11 responders and 6/7 non-responders, both with a likelihood of ≥ 85%. Prolactin, leptin, BDNF, neurotrophin, and epidermal growth factor/epidermal growth factor receptor signaling pathways are involved in the lithium treatment response. GSK3β and RelA genes are involved in 4 of these 5 pathways. Conclusion CellPrint™ flow cytometry was able to detect differences in multiple proteins in monocytes and CD4+ lymphocytes between lithium responders and non-responders. A large study is warranted to confirm or refute these findings.
5
Xiang J, Zhang J, Zhao Y, Wu FX, Li M. Biomedical data, computational methods and tools for evaluating disease-disease associations. Brief Bioinform 2022; 23:6522999. [PMID: 35136949 DOI: 10.1093/bib/bbac006] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/19/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 12/12/2022]
Abstract
In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, a lot of computational methods and tools/platforms have been developed to reveal intrinsic relationship between diseases, which can provide useful insights to the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatment of diseases. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups in terms of the usages of biomedical data, including disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for computation and analysis of disease-disease associations. Finally, we give a discussion and summary on the research of disease-disease associations. This review provides a systematic overview for current disease association research, which could promote the development and applications of computational methods and tools/platforms for disease-disease associations.
Affiliation(s)
- Ju Xiang
  - School of Computer Science and Engineering, Central South University, China
- Jiashuai Zhang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yichao Zhao
  - School of Computer Science and Engineering, Central South University, China
- Fang-Xiang Wu
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Min Li
  - Division of Biomedical Engineering and Department of Mechanical Engineering at University of Saskatchewan, Saskatoon, Canada
6
Yadav AK, Shukla R, Singh TR. Topological parameters, patterns, and motifs in biological networks. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00012-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
7
Alveolar Regeneration in COVID-19 Patients: A Network Perspective. Int J Mol Sci 2021; 22:ijms222011279. [PMID: 34681944 PMCID: PMC8538208 DOI: 10.3390/ijms222011279] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/17/2021] [Revised: 10/13/2021] [Accepted: 10/14/2021] [Indexed: 12/12/2022]
Abstract
A viral infection involves entry and replication of viral nucleic acid in a host organism, subsequently leading to biochemical and structural alterations in the host cell. In the case of SARS-CoV-2 viral infection, over-activation of the host immune system may lead to lung damage. Albeit the regeneration and fibrotic repair processes being the two protective host responses, prolonged injury may lead to excessive fibrosis, a pathological state that can result in lung collapse. In this review, we discuss regeneration and fibrosis processes in response to SARS-CoV-2 and provide our viewpoint on the triggering of alveolar regeneration in coronavirus disease 2019 (COVID-19) patients.
8
Vlietstra WJ, Vos R, van den Akker M, van Mulligen EM, Kors JA. Identifying disease trajectories with predicate information from a knowledge graph. J Biomed Semantics 2020; 11:9. [PMID: 32819419 PMCID: PMC7439632 DOI: 10.1186/s13326-020-00228-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/14/2020] [Accepted: 08/12/2020] [Indexed: 12/28/2022]
Abstract
BACKGROUND Knowledge graphs can represent the contents of biomedical literature and databases as subject-predicate-object triples, thereby enabling comprehensive analyses that identify e.g. relationships between diseases. Some diseases are often diagnosed in patients in specific temporal sequences, which are referred to as disease trajectories. Here, we determine whether a sequence of two diseases forms a trajectory by leveraging the predicate information from paths between (disease) proteins in a knowledge graph. Furthermore, we determine the added value of directional information of predicates for this task. To do so, we create four feature sets, based on two methods for representing indirect paths, and both with and without directional information of predicates (i.e., which protein is considered subject and which object). The added value of the directional information of predicates is quantified by comparing the classification performance of the feature sets that include or exclude it. RESULTS Our method achieved a maximum area under the ROC curve of 89.8% and 74.5% when evaluated with two different reference sets. Use of directional information of predicates significantly improved performance by 6.5 and 2.0 percentage points respectively. CONCLUSIONS Our work demonstrates that predicates between proteins can be used to identify disease trajectories. Using the directional information of predicates significantly improved performance over not using this information.
Affiliation(s)
- Wytze J. Vlietstra
  - Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 50, 3015 GE Rotterdam, the Netherlands
- Rein Vos
  - Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 50, 3015 GE Rotterdam, the Netherlands
  - Department of Methodology & Statistics, Maastricht University, PO Box 616, 6200 MD Maastricht, the Netherlands
- Marjan van den Akker
  - Institute of General Practice, Johann Wolfgang Goethe University, Theodor-Stern-Kai 7, D-60590 Frankfurt, Germany
  - Department of Family Medicine, Maastricht University, PO Box 616, 6200 MD Maastricht, the Netherlands
- Erik M. van Mulligen
  - Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 50, 3015 GE Rotterdam, the Netherlands
- Jan A. Kors
  - Department of Medical Informatics, Erasmus University Medical Center, Dr. Molewaterplein 50, 3015 GE Rotterdam, the Netherlands
9
Baldwin E, Han J, Luo W, Zhou J, An L, Liu J, Zhang HH, Li H. On fusion methods for knowledge discovery from multi-omics datasets. Comput Struct Biotechnol J 2020; 18:509-517. [PMID: 32206210 PMCID: PMC7078495 DOI: 10.1016/j.csbj.2020.02.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/01/2019] [Revised: 01/25/2020] [Accepted: 02/19/2020] [Indexed: 12/22/2022]
Abstract
Recent years have witnessed the tendency of measuring a biological sample on multiple omics scales for a comprehensive understanding of how biological activities on varying levels are perturbed by genetic variants, environments, and their interactions. This new trend raises substantial challenges to data integration and fusion, of which the latter is a specific type of integration that applies a uniform method in a scalable manner, to solve biological problems which the multi-omics measurements target. Fusion-based analysis has advanced rapidly in the past decade, thanks to application drivers and theoretical breakthroughs in mathematics, statistics, and computer science. We will briefly address these methods from methodological and mathematical perspectives and categorize them into three types of approaches: data fusion (a narrowed definition as compared to the general data fusion concept), model fusion, and mixed fusion. We will demonstrate at least one typical example in each specific category to exemplify the characteristics, principles, and applications of the methods in general, as well as discuss the gaps and potential issues for future studies.
Affiliation(s)
- Edwin Baldwin
  - Department of Biosystems Engineering, University of Arizona, United States
- Jiali Han
  - Department of Systems and Industrial Engineering, University of Arizona, United States
- Wenting Luo
  - Department of Biosystems Engineering, University of Arizona, United States
- Jin Zhou
  - Department of Epidemiology and Biostatistics, University of Arizona, United States
- Lingling An
  - Department of Biosystems Engineering, University of Arizona, United States
  - Department of Epidemiology and Biostatistics, University of Arizona, United States
- Jian Liu
  - Department of Systems and Industrial Engineering, University of Arizona, United States
- Hao Helen Zhang
  - Department of Mathematics, University of Arizona, United States
- Haiquan Li
  - Department of Biosystems Engineering, University of Arizona, United States
10
Integrative Deep Learning for Identifying Differentially Expressed (DE) Biomarkers. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2019; 2019:8418760. [PMID: 31915462 PMCID: PMC6935456 DOI: 10.1155/2019/8418760] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/05/2019] [Revised: 06/19/2019] [Accepted: 08/04/2019] [Indexed: 11/17/2022]
Abstract
As a large amount of genetic data are accumulated, an effective analytical method and a significant interpretation are required. Recently, various methods of machine learning have emerged to process genetic data. In addition, machine learning analysis tools using statistical models have been proposed. In this study, we propose adding an integrated layer to the deep learning structure, which would enable the effective analysis of genetic data and the discovery of significant biomarkers of diseases. We conducted a simulation study in order to compare the proposed method with metalogistic regression and meta-SVM methods. The objective function with lasso penalty is used for parameter estimation, and the Youden J index is used for model comparison. The simulation results indicate that the proposed method is more robust for the variance of the data than metalogistic regression and meta-SVM methods. We also conducted real data (breast cancer data (TCGA)) analysis. Based on the results of gene set enrichment analysis, we obtained that TCGA multiple omics data involve significantly enriched pathways which contain information related to breast cancer. Therefore, it is expected that the proposed method will be helpful to discover biomarkers.
11
Zitnik M, Nguyen F, Wang B, Leskovec J, Goldenberg A, Hoffman MM. Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities. AN INTERNATIONAL JOURNAL ON INFORMATION FUSION 2019; 50:71-91. [PMID: 30467459 PMCID: PMC6242341 DOI: 10.1016/j.inffus.2018.09.012] [Citation(s) in RCA: 222] [Impact Index Per Article: 44.4] [Indexed: 05/10/2023]
Abstract
New technologies have enabled the investigation of biology and human health at an unprecedented scale and in multiple dimensions. These dimensions include myriad properties describing genome, epigenome, transcriptome, microbiome, phenotype, and lifestyle. No single data type, however, can capture the complexity of all the factors relevant to understanding a phenomenon such as a disease. Integrative methods that combine data from multiple technologies have thus emerged as critical statistical and computational approaches. The key challenge in developing such approaches is the identification of effective models to provide a comprehensive and relevant systems view. An ideal method can answer a biological or medical question, identifying important features and predicting outcomes, by harnessing heterogeneous data across several dimensions of biological variation. In this Review, we describe the principles of data integration and discuss current methods and available implementations. We provide examples of successful data integration in biology and medicine. Finally, we discuss current challenges in biomedical integrative methods and our perspective on the future development of the field.
Affiliation(s)
- Marinka Zitnik
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Francis Nguyen
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, Toronto, ON, Canada
- Bo Wang
  - Hikvision Research Institute, Santa Clara, CA, USA
- Jure Leskovec
  - Department of Computer Science, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- Anna Goldenberg
  - Genetics & Genome Biology, SickKids Research Institute, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
- Michael M. Hoffman
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
12
Pavlopoulos GA, Kontou PI, Pavlopoulou A, Bouyioukos C, Markou E, Bagos PG. Bipartite graphs in systems biology and medicine: a survey of methods and applications. Gigascience 2018; 7:1-31. [PMID: 29648623 PMCID: PMC6333914 DOI: 10.1093/gigascience/giy014] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 07/09/2017] [Revised: 01/15/2018] [Accepted: 02/13/2018] [Indexed: 11/14/2022]
Abstract
The latest advances in high-throughput techniques during the past decade allowed the systems biology field to expand significantly. Today, the focus of biologists has shifted from the study of individual biological components to the study of complex biological systems and their dynamics at a larger scale. Through the discovery of novel bioentity relationships, researchers reveal new information about biological functions and processes. Graphs are widely used to represent bioentities such as proteins, genes, small molecules, ligands, and others as nodes and their connections as edges within a network. In this review, special focus is given to the usability of bipartite graphs and their impact on the field of network biology and medicine. Furthermore, their topological properties and how these can be applied to certain biological case studies are discussed. Finally, available methodologies and software are presented, and useful insights on how bipartite graphs can shape the path toward the solution of challenging biological problems are provided.
Affiliation(s)
- Georgios A Pavlopoulos
  - Lawrence Berkeley Labs, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Panagiota I Kontou
  - University of Thessaly, Department of Computer Science and Biomedical Informatics, Papasiopoulou 2–4, Lamia, 35100, Greece
- Athanasia Pavlopoulou
  - Izmir International Biomedicine and Genome Institute (iBG-Izmir), Dokuz Eylül University, 35340, Turkey
- Costas Bouyioukos
  - Université Paris Diderot, Sorbonne Paris Cité, Epigenetics and Cell Fate, UMR7216, CNRS, France
- Evripides Markou
  - University of Thessaly, Department of Computer Science and Biomedical Informatics, Papasiopoulou 2–4, Lamia, 35100, Greece
- Pantelis G Bagos
  - University of Thessaly, Department of Computer Science and Biomedical Informatics, Papasiopoulou 2–4, Lamia, 35100, Greece
13
Abstract
The diversity and huge volume of omics data have taken biology and biomedicine research and application into a big data era, much as happened in human society a decade ago. They are opening a new challenge, from horizontal data ensemble (e.g., similar types of data collected from different labs or companies) to vertical data ensemble (e.g., different types of data collected for a group of persons with matched information), which requires integrative analysis in biology and biomedicine and also calls for urgent development of data integration to address the great change from previous population-guided to new individual-guided investigations. Data integration is an effective concept for solving complex problems or understanding complicated systems. Several benchmark studies have revealed the heterogeneity and trade-offs that exist in the analysis of omics data. Integrative analysis can combine and investigate many datasets in a cost-effective, reproducible way. Current integration approaches on biological data have two modes: one is the "bottom-up integration" mode with follow-up manual integration, and the other is the "top-down integration" mode with follow-up in silico integration. This paper will first summarize the combinatory analysis approaches to give a candidate protocol for biological experiment design for effective integrative study of genomics, and then survey the data fusion approaches to give helpful instruction on computational model development for biological significance detection, which have also provided new data resources and analysis tools to support precision medicine dependent on big biomedical data. Finally, the problems and future directions are highlighted for integrative analysis of omics big data.
Affiliation(s)
- Xiang-Tian Yu
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Chinese Academy Science, Shanghai, China
- Tao Zeng
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Chinese Academy Science, Shanghai, China
14
Karim AF, Sande OJ, Tomechko SE, Ding X, Li M, Maxwell S, Ewing RM, Harding CV, Rojas RE, Chance MR, Boom WH. Proteomics and Network Analyses Reveal Inhibition of Akt-mTOR Signaling in CD4+ T Cells by Mycobacterium tuberculosis Mannose-Capped Lipoarabinomannan. Proteomics 2017; 17:1700233. [PMID: 28994205 PMCID: PMC5725663 DOI: 10.1002/pmic.201700233] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 06/15/2017] [Revised: 09/13/2017] [Indexed: 11/10/2022]
Abstract
Mycobacterium tuberculosis (Mtb) cell wall glycolipid mannose-capped lipoarabinomannan (ManLAM) inhibits CD4+ T-cell activation by inhibiting proximal T-cell receptor (TCR) signaling when activated by anti-CD3. To understand the impact of ManLAM on CD4+ T-cell function when both the TCR-CD3 complex and major costimulator CD28 are engaged, we performed label-free quantitative MS and network analysis. Mixed-effect model analysis of peptide intensity identified 149 unique peptides representing 131 proteins that were differentially regulated by ManLAM in anti-CD3- and anti-CD28-activated CD4+ T cells. Crosstalker, a novel network analysis tool, identified dysregulated translation, TCA cycle, and RNA metabolism network modules. PCNA, Akt, mTOR, and UBC were found to be bridge node proteins connecting these modules of dysregulated proteins. Altered PCNA expression and cell cycle analysis showed arrest at the G2/M phase. Western blot confirmed that ManLAM inhibited Akt and mTOR phosphorylation, and decreased expression of deubiquitinating enzymes Usp9x and Otub1. Decreased NF-κB phosphorylation suggested interference with CD28 signaling through inhibition of the Usp9x-Akt-mTOR pathway. Thus, ManLAM induced global changes in the CD4+ T-cell proteome by affecting Akt-mTOR signaling, resulting in broad functional impairment of CD4+ T-cell activation beyond inhibition of proximal TCR-CD3 signaling.
Affiliation(s)
- Ahmad F. Karim
  - Department of Medicine, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, OH, USA
  - Department of Molecular Biology & Microbiology, Case Western Reserve University, Cleveland, OH, USA
- Obondo J. Sande
  - Department of Medicine, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, OH, USA
- Sara E. Tomechko
  - Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, OH, USA
- Xuedong Ding
  - Department of Medicine, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, OH, USA
- Ming Li
  - Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, OH, USA
- Sean Maxwell
  - Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, OH, USA
- Rob M. Ewing
  - Centre for Biological Sciences, University of Southampton, Southampton, UK
- Clifford V. Harding
  - Department of Molecular Biology & Microbiology, Case Western Reserve University, Cleveland, OH, USA
  - Department of Pathology, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, OH, USA
- Roxana E. Rojas
  - Department of Molecular Biology & Microbiology, Case Western Reserve University, Cleveland, OH, USA
- Mark R. Chance
  - Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, OH, USA
  - Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- W. Henry Boom
  - Department of Medicine, University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, OH, USA
  - Department of Molecular Biology & Microbiology, Case Western Reserve University, Cleveland, OH, USA
15
Mukherjee PK, Funchain P, Retuerto M, Jurevic RJ, Fowler N, Burkey B, Eng C, Ghannoum MA. Metabolomic analysis identifies differentially produced oral metabolites, including the oncometabolite 2-hydroxyglutarate, in patients with head and neck squamous cell carcinoma. BBA Clinical 2017; 7:8-15. [PMID: 28053877 PMCID: PMC5199158 DOI: 10.1016/j.bbacli.2016.12.001] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 10/04/2016] [Revised: 11/30/2016] [Accepted: 12/15/2016] [Indexed: 01/31/2023]
Abstract
BACKGROUND Metabolomics represents a promising approach for discovering novel targets and biomarkers in head and neck squamous cell carcinoma (HNSCC). Here we used metabolomics to identify oral metabolites associated with HNSCC. METHODS Tumor and adjacent normal tissue from surgical resections and presurgical oral washes were collected from HNSCC patients, and oral washes were collected from healthy participants. Metabolite extracts of these samples were analyzed by liquid chromatography-mass spectrometry (LC/MS), LC/MS/MS and gas chromatography-MS (GC/MS). RESULTS Among 28 samples obtained from 7 HNSCC cases and 7 controls, 422 metabolites were detected (269 identified and 153 unidentified). Oral washes contained 12 and 23 metabolites in healthy controls and HNSCC patients, respectively, with phosphate and lactate being the most abundant. Small molecules related to energy metabolism were significantly elevated in HNSCC patients compared to controls. Levels of beta-alanine, alpha-hydroxyisovalerate, tryptophan, and hexanoylcarnitine were elevated in HNSCC oral washes compared to healthy controls (range 7.8- to 12.2-fold). Resection tissues contained 22 metabolites, of which eight were overproduced in tumor by 1.9- to 12-fold compared to controls. TCA cycle analogs 2-hydroxyglutarate (2-HG) and 3-GMP were detected exclusively in tumor tissues. Targeted quantification of 2-HG in a representative HNSCC patient showed an increase in tumor tissue (14.7 μg/mL), whereas it was undetectable in normal tissue. Moreover, high levels of 2-HG were detected in HNSCC cell lines but not in healthy primary oral keratinocyte cultures. CONCLUSIONS Oral metabolites related to energy metabolism were elevated in HNSCC, and acylcarnitine and 2-HG may have potential as non-invasive biomarkers. Further validation in clinical studies is warranted.
Affiliation(s)
- Pranab K. Mukherjee
  - Center for Medical Mycology, Department of Dermatology, Case Western Reserve University, University Hospitals Case Medical Center, Cleveland, OH, United States
- Pauline Funchain
  - Genomic Medicine Institute, Lerner Research Institute, Taussig Cancer Institute, United States
- Mauricio Retuerto
  - Center for Medical Mycology, Department of Dermatology, Case Western Reserve University, University Hospitals Case Medical Center, Cleveland, OH, United States
- Richard J. Jurevic
  - Diagnostic Sciences, School of Dentistry, West Virginia University, Morgantown, WV, United States
- Brian Burkey
  - Head and Neck Institute, Cleveland Clinic, Cleveland, OH, United States
- Charis Eng
  - Genomic Medicine Institute, Lerner Research Institute, Taussig Cancer Institute, United States
  - Department of Genetics and Genome Sciences, Cleveland, OH, United States
  - Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH, United States
- Mahmoud A Ghannoum
  - Center for Medical Mycology, Department of Dermatology, Case Western Reserve University, University Hospitals Case Medical Center, Cleveland, OH, United States
16
Mendes-Soares H, Chia N. Community metabolic modeling approaches to understanding the gut microbiome: Bridging biochemistry and ecology. Free Radic Biol Med 2017; 105:102-109. [PMID: 27989793 PMCID: PMC5401773 DOI: 10.1016/j.freeradbiomed.2016.12.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/23/2016] [Revised: 11/27/2016] [Accepted: 12/12/2016] [Indexed: 12/27/2022]
Abstract
Interest in the human microbiome is at an all-time high. The number of human microbiome studies is growing exponentially, as are reported associations between microbial communities and disease. However, we have not been able to translate the ever-growing amount of microbiome sequence data into better health. To do this, we need a practical means of transforming a disease-associated microbiome into a health-associated microbiome. This will require a framework that can be used to generate predictions about community dynamics within the microbiome under different conditions, predictions that can be tested and validated. In this review, using the gut microbiome to illustrate, we describe two classes of model that are currently being used to generate predictions about microbial community dynamics: ecological models and metabolic models. We outline the strengths and weaknesses of each approach and discuss the insights into the gut microbiome that have emerged from modeling thus far. We then argue that the two approaches can be combined to yield a community metabolic model, which will supply the framework needed to move from high-throughput omics data to testable predictions about how prebiotic, probiotic, and nutritional interventions affect the microbiome. We are confident that with a suitable model, researchers and clinicians will be able to harness the stream of sequence data and begin designing strategies to make targeted alterations to the microbiome and improve health.
Affiliation(s)
- Helena Mendes-Soares
  - Microbiome Program, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA; Department of Surgery, Mayo Clinic, Rochester, MN 55905, USA
- Nicholas Chia
  - Microbiome Program, Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA; Department of Surgery, Mayo Clinic, Rochester, MN 55905, USA; Department of Bioengineering and Physiology, College of Medicine, Mayo Clinic, Rochester, MN 55905, USA
17
Panis C, Pizzatti L, Souza GF, Abdelhay E. Clinical proteomics in cancer: Where we are. Cancer Lett 2016; 382:231-239. [PMID: 27561426 DOI: 10.1016/j.canlet.2016.08.014] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 06/03/2016] [Revised: 08/16/2016] [Accepted: 08/17/2016] [Indexed: 12/25/2022]
Abstract
Proteomics has emerged as a promising field in the post-genomic era. Notwithstanding the great advances provided by gene expression analysis in cancer, the lack of a correlation between gene expression and protein levels has highlighted the need for a proteomic focus on cancer. Despite increasing knowledge of cancer biology, a reliable marker to improve diagnosis, prognosis and treatment for cancer patients is not yet a reality. In this review, we address the main considerations regarding proteomics-based studies and their clinical applications in cancer research, highlighting the strengths and limitations of these studies and their application to clinical practice.
Affiliation(s)
- Carolina Panis
  - Laboratório de Células Tronco, Instituto Nacional de Câncer, INCA, Rio de Janeiro, Brazil; Laboratório de Mediadores Inflamatórios, Universidade Estadual do Oeste do Paraná, UNIOESTE, Campus Francisco Beltrão, Paraná, Brazil
- Luciana Pizzatti
  - Laboratório de Biologia Molecular e Proteômica do Sangue - LABMOPS, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Eliana Abdelhay
  - Laboratório de Células Tronco, Instituto Nacional de Câncer, INCA, Rio de Janeiro, Brazil
18
Pataskar A, Tiwari VK. Computational challenges in modeling gene regulatory events. Transcription 2016; 7:188-195. [PMID: 27390891 DOI: 10.1080/21541264.2016.1204491] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/21/2022] Open
Abstract
Cellular transcriptional programs driven by genetic and epigenetic mechanisms could be better understood by integrating "omics" data and subsequently modeling the gene-regulatory events. Toward this end, computational biology should keep pace with evolving experimental procedures and data availability. This article gives an exemplified account of the current computational challenges in molecular biology.
Affiliation(s)
- Vijay K Tiwari
  - Institute of Molecular Biology (IMB), Mainz, Germany
19
Papadopoulos T, Krochmal M, Cisek K, Fernandes M, Husi H, Stevens R, Bascands JL, Schanstra JP, Klein J. Omics databases on kidney disease: where they can be found and how to benefit from them. Clin Kidney J 2016; 9:343-52. [PMID: 27274817 PMCID: PMC4886900 DOI: 10.1093/ckj/sfv155] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 09/09/2015] [Accepted: 12/21/2015] [Indexed: 02/07/2023] Open
Abstract
In recent decades, the evolution of omics technologies has led to advances in all biological fields, creating a demand for effective storage, management and exchange of rapidly generated data and research discoveries. To address this need, the development of databases of experimental outputs has become a common part of scientific practice in order to serve as knowledge sources and data-sharing platforms, providing information about genes, transcripts, proteins or metabolites. In this review, we present currently available omics databases, with a special focus on their application in kidney research and possibly in clinical practice. Databases are divided into two categories: general databases with a broad information scope and kidney-specific databases distinctively concentrated on kidney pathologies. In research, databases can be used as a rich source of information about pathophysiological mechanisms and molecular targets. In the future, databases will support clinicians with their decisions, providing better and faster diagnoses and setting the direction towards more preventive, personalized medicine. We also provide a test case demonstrating the potential of biological databases in comparing multi-omics datasets and generating new hypotheses to answer a critical and common diagnostic problem in nephrology practice. In the future, employment of databases combined with data integration and data mining should provide powerful insights into unlocking the mysteries of kidney disease, leading to a potential impact on pharmacological intervention and therapeutic disease management.
Affiliation(s)
- Theofilos Papadopoulos
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1048, Institut of Cardiovascular and Metabolic Disease, Toulouse, France; Université Toulouse III Paul-Sabatier, Toulouse, France
- Magdalena Krochmal
  - Biotechnology Division, Biomedical Research Foundation Academy of Athens, Athens, Greece; Institute for Molecular Cardiovascular Research, Universitätsklinikum RWTH Aachen, Aachen, Germany
- Marco Fernandes
  - BHF Glasgow Cardiovascular Research Centre, University of Glasgow, Glasgow, UK
- Holger Husi
  - BHF Glasgow Cardiovascular Research Centre, University of Glasgow, Glasgow, UK
- Robert Stevens
  - School of Computer Science, University of Manchester, Manchester, UK
- Jean-Loup Bascands
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1048, Institut of Cardiovascular and Metabolic Disease, Toulouse, France; Université Toulouse III Paul-Sabatier, Toulouse, France
- Joost P Schanstra
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1048, Institut of Cardiovascular and Metabolic Disease, Toulouse, France; Université Toulouse III Paul-Sabatier, Toulouse, France
- Julie Klein
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1048, Institut of Cardiovascular and Metabolic Disease, Toulouse, France; Université Toulouse III Paul-Sabatier, Toulouse, France
20
Ansari-Pour N, Razaghi-Moghadam Z, Barneh F, Jafari M. Testis-Specific Y-Centric Protein-Protein Interaction Network Provides Clues to the Etiology of Severe Spermatogenic Failure. J Proteome Res 2016; 15:1011-22. [PMID: 26794825 DOI: 10.1021/acs.jproteome.5b01080] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/24/2023]
Abstract
Pinpointing causal genes for spermatogenic failure (SpF) on the Y chromosome has been an ever-daunting challenge with setbacks during the past decade. Since complex diseases result from the interaction of multiple genes and also display considerable missing heritability, network analysis is more likely to explicate an etiological molecular basis. We therefore took a network medicine approach by integrating interactome (protein-protein interaction (PPI)) and transcriptome data to reconstruct a Y-centric SpF network. Two sets of seed genes (Y genes and SpF-implicated genes (SIGs)) were used for network reconstruction. Since no PPI was observed among Y genes, we identified their common immediate interactors. Interestingly, 81% (N = 175) of these interactors not only interacted directly with SIGs but were also enriched for differentially expressed genes (89.6%; N = 43). The SpF network, formed mainly by the dys-regulated interactors and the two seed gene sets, comprised three modules enriched for ribosomal proteins and nuclear receptors for sex hormones. Ribosomal proteins generally showed significant dys-regulation with RPL39L, thought to be expressed at the onset of spermatogenesis, strongly down-regulated. This network is the first global PPI network pertaining to severe SpF and, if experimentally validated on independent data sets, can lead to more accurate diagnosis and potential fertility recovery of patients.
Affiliation(s)
- Naser Ansari-Pour
  - Faculty of New Sciences and Technology, University of Tehran, North Kargar Street, Tehran 143995-7131, Iran; School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran 19395-5531, Iran
- Zahra Razaghi-Moghadam
  - Faculty of New Sciences and Technology, University of Tehran, North Kargar Street, Tehran 143995-7131, Iran; School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran 19395-5531, Iran
- Farnaz Barneh
  - Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran 198396-3113, Iran
- Mohieddin Jafari
  - Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran 131694-3551, Iran; School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran 19395-5531, Iran
21
Computational Methods for Integration of Biological Data. Per Med 2016. [DOI: 10.1007/978-3-319-39349-0_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
22
Gligorijević V, Pržulj N. Methods for biological data integration: perspectives and challenges. J R Soc Interface 2015; 12:20150571. [PMID: 26490630 PMCID: PMC4685837 DOI: 10.1098/rsif.2015.0571] [Citation(s) in RCA: 157] [Impact Index Per Article: 17.4] [Received: 06/26/2015] [Accepted: 09/25/2015] [Indexed: 12/17/2022] Open
Abstract
Rapid technological advances have led to the production of different types of biological data and enabled construction of complex networks with various types of interactions between diverse biological entities. Standard network data analysis methods were shown to be limited in dealing with such heterogeneous networked data and consequently, new methods for integrative data analyses have been proposed. The integrative methods can collectively mine multiple types of biological data and produce more holistic, systems-level biological insights. We survey recent methods for collective mining (integration) of various types of networked biological data. We compare different state-of-the-art methods for data integration and highlight their advantages and disadvantages in addressing important biological problems. We identify the important computational challenges of these methods and provide a general guideline for which methods are suited for specific biological problems, or specific data types. Moreover, we propose that recent non-negative matrix factorization-based approaches may become the integration methodology of choice, as they are well suited and accurate in dealing with heterogeneous data and have many opportunities for further development.
Affiliation(s)
- Nataša Pržulj
  - Department of Computing, Imperial College London, London SW7 2AZ, UK
23
Regan K, Payne PRO. From Molecules to Patients: The Clinical Applications of Translational Bioinformatics. Yearb Med Inform 2015; 10:164-9. [PMID: 26293863 PMCID: PMC4587059 DOI: 10.15265/iy-2015-005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/14/2022] Open
Abstract
OBJECTIVE In order to realize the promise of personalized medicine, Translational Bioinformatics (TBI) research will need to continue to address implementation issues across the clinical spectrum. In this review, we aim to evaluate the expanding field of TBI towards clinical applications, and define common themes and current gaps in order to motivate future research. METHODS Here we present the state-of-the-art of clinical implementation of TBI-based tools and resources. Our thematic analyses of a targeted literature search of recent TBI-related articles ranged across topics in genomics, data management, hypothesis generation, molecular epidemiology, diagnostics, therapeutics and personalized medicine. RESULTS Open areas of clinically-relevant TBI research identified in this review include developing data standards and best practices, publicly available resources, integrative systems-level approaches, user-friendly tools for clinical support, cloud computing solutions, emerging technologies and means to address pressing legal, ethical and social issues. CONCLUSIONS There is a need for further research bridging the gap from foundational TBI-based theories and methodologies to clinical implementation. We have organized the topic themes presented in this review into four conceptual foci - domain analyses, knowledge engineering, computational architectures and computation methods alongside three stages of knowledge development in order to orient future TBI efforts to accelerate the goals of personalized medicine.
Affiliation(s)
- P R O Payne
  - Philip R.O. Payne, PhD, FACMI, The Ohio State University, Department of Biomedical Informatics, 250 Lincoln Tower, 1800 Cannon Drive, Columbus, OH 43210, USA, Tel: +1 614 292 4778
24
Okser S, Pahikkala T, Airola A, Salakoski T, Ripatti S, Aittokallio T. Regularized machine learning in the genetic prediction of complex traits. PLoS Genet 2014; 10:e1004754. [PMID: 25393026 PMCID: PMC4230844 DOI: 10.1371/journal.pgen.1004754] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.9] [Indexed: 01/14/2023] Open
Affiliation(s)
- Sebastian Okser
  - Department of Information Technology, University of Turku, Turku, Finland
  - Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Tapio Pahikkala
  - Department of Information Technology, University of Turku, Turku, Finland
  - Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Antti Airola
  - Department of Information Technology, University of Turku, Turku, Finland
  - Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Tapio Salakoski
  - Department of Information Technology, University of Turku, Turku, Finland
  - Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
- Samuli Ripatti
  - Hjelt Institute, University of Helsinki, Helsinki, Finland
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
  - Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Tero Aittokallio
  - Turku Centre for Computer Science (TUCS), University of Turku and Åbo Akademi University, Turku, Finland
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
25
Identification of possible pathogenic pathways in Behçet's disease using genome-wide association study data from two different populations. Eur J Hum Genet 2014; 23:678-87. [PMID: 25227143 DOI: 10.1038/ejhg.2014.158] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 12/27/2013] [Revised: 07/08/2014] [Accepted: 07/10/2014] [Indexed: 12/25/2022] Open
Abstract
Behçet's disease (BD) is a multi-system inflammatory disorder of unknown etiology. Two recent genome-wide association studies (GWASs) of BD confirmed a strong association with the MHC class I region and identified two non-HLA common genetic variations. In complex diseases, multiple factors may target different sets of genes in the same pathway and thus may cause the same disease phenotype. We therefore hypothesized that identification of disease-associated pathways is critical to elucidate mechanisms underlying BD, and those pathways may be conserved within and across populations. To identify the disease-associated pathways, we developed a novel methodology that combines nominally significant evidence of genetic association with current knowledge of biochemical pathways, protein-protein interaction networks, and functional information of selected SNPs. Using this methodology, we searched for the disease-related pathways in two BD GWASs in Turkish and Japanese case-control groups. We found that 6 of the top 10 identified pathways in both populations were overlapping, even though there were few significantly conserved SNPs/genes within and between populations. The probability of random occurrence of such an event was 2.24E-39. These shared pathways were focal adhesion, MAPK signaling, TGF-β signaling, ECM-receptor interaction, complement and coagulation cascades, and proteasome pathways. Even though each individual has a unique combination of factors involved in their disease development, the targeted pathways are expected to be mostly the same. Hence, the identification of shared pathways between the Turkish and the Japanese patients using GWAS data may help further elucidate the inflammatory mechanisms in BD pathogenesis.
26
Mora A, Taranta M, Zaki N, Badidi E, Cinti C, Capobianco E. Ensemble inference by integrative cancer networks. Front Genet 2014; 5:59. [PMID: 24744769 PMCID: PMC3978335 DOI: 10.3389/fgene.2014.00059] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/02/2013] [Accepted: 03/09/2014] [Indexed: 11/19/2022] Open
Affiliation(s)
- Antonio Mora
  - Laboratory of Integrative Systems Medicine, Institute of Clinical Physiology, CNR, Pisa, Italy; Bioinformatics Lab, College of Information Technology, United Arab Emirates University, Al Ain, UAE
- Monia Taranta
  - Laboratory of Experimental Oncology, Institute of Clinical Physiology, CNR, Siena, Italy
- Nazar Zaki
  - Bioinformatics Lab, College of Information Technology, United Arab Emirates University, Al Ain, UAE
- Elarbi Badidi
  - Bioinformatics Lab, College of Information Technology, United Arab Emirates University, Al Ain, UAE
- Caterina Cinti
  - Laboratory of Experimental Oncology, Institute of Clinical Physiology, CNR, Siena, Italy
- Enrico Capobianco
  - Laboratory of Integrative Systems Medicine, Institute of Clinical Physiology, CNR, Pisa, Italy; Center for Computational Science, University of Miami, Miami, FL, USA
27
Bolouri H. Modeling genomic regulatory networks with big data. Trends Genet 2014; 30:182-91. [PMID: 24630831 DOI: 10.1016/j.tig.2014.02.005] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 11/23/2013] [Revised: 02/18/2014] [Accepted: 02/19/2014] [Indexed: 02/06/2023]
Abstract
High-throughput sequencing, large-scale data generation projects, and web-based cloud computing are changing how computational biology is performed, who performs it, and what biological insights it can deliver. I review here the latest developments in available data, methods, and software, focusing on the modeling and analysis of the gene regulatory interactions in cells. Three key findings are: (i) although sophisticated computational resources are increasingly available to bench biologists, tailored ongoing education is necessary to avoid the erroneous use of these resources. (ii) Current models of the regulation of gene expression are far too simplistic and need updating. (iii) Integrative computational analysis of large-scale datasets is becoming a fundamental component of molecular biology. I discuss current and near-term opportunities and challenges related to these three points.
Affiliation(s)
- Hamid Bolouri
  - Division of Human Biology, Fred Hutchinson Cancer Research Center (FHCRC), 1100 Fairview Avenue North, PO Box 19024, Seattle, WA 98109, USA
28
Horn F, Rittweger M, Taubert J, Lysenko A, Rawlings C, Guthke R. Interactive exploration of integrated biological datasets using context-sensitive workflows. Front Genet 2014; 5:21. [PMID: 24600467 PMCID: PMC3929842 DOI: 10.3389/fgene.2014.00021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/26/2013] [Accepted: 01/21/2014] [Indexed: 11/16/2022] Open
Abstract
Network inference utilizes experimental high-throughput data for the reconstruction of molecular interaction networks where new relationships between the network entities can be predicted. Despite the increasing amount of experimental data, the parameters of each modeling technique cannot be optimized based on the experimental data alone; it must also be qualitatively assessed whether the components of the resulting network describe the experimental setting. Candidate list prioritization and validation build upon data integration and data visualization. The application of tools supporting this procedure is limited to the exploration of smaller information networks because the display and interpretation of large amounts of information is challenging regarding the computational effort and the users' experience. The Ondex software framework was extended with customizable context-sensitive menus which allow additional integration and data analysis options for a selected set of candidates during interactive data exploration. We provide new functionalities for on-the-fly data integration using InterProScan, PubMed Central literature search, and sequence-based homology search. We applied the Ondex system to the integration of publicly available data for Aspergillus nidulans and analyzed transcriptome data. We demonstrate the advantages of our approach by proposing new hypotheses for the functional annotation of specific genes of differentially expressed fungal gene clusters. Our extension of the Ondex framework makes it possible to overcome the separation between data integration and interactive analysis. More specifically, computationally demanding calculations can be performed on selected sub-networks without losing any information from the whole network. Furthermore, our extensions allow for direct access to online biological databases, which helps to keep the integrated information up-to-date.
Affiliation(s)
- Fabian Horn
  - Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
- Martin Rittweger
  - Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
- Jan Taubert
  - Department of Computational and Systems Biology, Rothamsted Research, Harpenden, UK
- Artem Lysenko
  - Department of Computational and Systems Biology, Rothamsted Research, Harpenden, UK
- Christopher Rawlings
  - Department of Computational and Systems Biology, Rothamsted Research, Harpenden, UK
- Reinhard Guthke
  - Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
29
Bakir-Gungor B, Egemen E, Sezerman OU. PANOGA: a web server for identification of SNP-targeted pathways from genome-wide association study data. Bioinformatics 2014; 30:1287-9. [PMID: 24413675 DOI: 10.1093/bioinformatics/btt743] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 12/21/2022]
Abstract
Genome-wide association studies (GWAS) have revolutionized the search for the variants underlying human complex diseases. However, in a typical GWAS, only a minority of the single-nucleotide polymorphisms (SNPs) with the strongest evidence of association is explained. One possible cause of complex diseases is altered activity of several biological pathways. Here we present a web server called Pathway and Network-Oriented GWAS Analysis (PANOGA) to reveal functionally important pathways through the identification of SNP-targeted genes within these pathways. The strength of our methodology stems from its multidimensional perspective, where we combine evidence from the following five resources: (i) genetic association information obtained through GWAS, (ii) SNP functional information, (iii) protein-protein interaction network, (iv) linkage disequilibrium and (v) biochemical pathways.
Affiliation(s)
- Burcu Bakir-Gungor
- Department of Genetics and Bioinformatics, Faculty of Arts and Sciences, Bahcesehir University, 34353, Besiktas, Istanbul, Turkey, Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Abdullah Gul University, 38039, Kayseri, Turkey, Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Tuzla, Istanbul, Turkey and Department of Biological Sciences and Bioengineering, Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Tuzla, Istanbul, Turkey
|
30
|
Integrative approaches for finding modular structure in biological networks. Nat Rev Genet 2013; 14:719-32. [PMID: 24045689 DOI: 10.1038/nrg3552] [Citation(s) in RCA: 351] [Impact Index Per Article: 31.9] [Indexed: 12/16/2022]
Abstract
A central goal of systems biology is to elucidate the structural and functional architecture of the cell. To this end, large and complex networks of molecular interactions are being rapidly generated for humans and model organisms. A recent focus of bioinformatics research has been to integrate these networks with each other and with diverse molecular profiles to identify sets of molecules and interactions that participate in a common biological function - that is, 'modules'. Here, we classify such integrative approaches into four broad categories, describe their bioinformatic principles and review their applications.
|
31
|
Ghosh S, Baloni P, Mukherjee S, Anand P, Chandra N. A multi-level multi-scale approach to study essential genes in Mycobacterium tuberculosis. BMC Systems Biology 2013; 7:132. [PMID: 24308365 PMCID: PMC4234997 DOI: 10.1186/1752-0509-7-132] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 04/25/2013] [Accepted: 11/20/2013] [Indexed: 11/10/2022]
Abstract
Background The set of indispensable genes that an organism requires to grow and sustain life are termed essential genes. There is strong interest in identifying the set of essential genes, particularly in pathogens, not only for a better understanding of pathogen biology, but also for identifying drug targets and the minimal gene set for the organism. Essentiality is inherently a systems property, and identifying essential genes requires consideration of the system as a whole. The available experimental approaches capture some aspects, but each method comes with its own limitations. Moreover, they do not explain the basis for essentiality in most cases. A powerful prediction method to recognize this gene pool, including rationalization of the known essential genes in a given organism, would be very useful. Here we describe a multi-level multi-scale approach to identify the essential gene pool in a deadly pathogen, Mycobacterium tuberculosis. Results The multi-level workflow analyses the bacterial cell by studying (a) genome-wide gene expression profiles to identify the set of genes which show consistent and significant levels of expression in multiple samples of the same condition, (b) indispensability for growth by using gene expression integrated flux balance analysis of a genome-scale metabolic model, (c) importance for maintaining the integrity and flow in a protein-protein interaction network and (d) evolutionary conservation in a set of genomes of the same ecological niche. In the gene pool identified, the functional basis for essentiality has been addressed by studying residue level conservation and the sub-structure at the ligand binding pockets, from which essential amino acid residues in that pocket have also been identified. 283 genes were identified as essential genes with high confidence. An agreement of about 73.5% is observed with the set obtained from the experimental transposon mutagenesis technique. A large proportion of the identified genes belong to the class of intermediary metabolism and respiration. Conclusions The multi-scale, multi-level approach described can be generally applied to other pathogens as well. The essential gene pool identified forms a basis for designing experiments to probe the finer functional roles of these genes and also serves as a ready shortlist for identifying drug targets.
Affiliation(s)
- Nagasuma Chandra
- Department of Biochemistry, Indian Institute of Science, Bangalore, India.
|
32
|
Camilo E, Bovolenta LA, Acencio ML, Rybarczyk-Filho JL, Castro MA, Moreira JC, Lemke N. GALANT: a Cytoscape plugin for visualizing data as functional landscapes projected onto biological networks. Bioinformatics 2013; 29:2505-6. [DOI: 10.1093/bioinformatics/btt377] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Open
|
33
|
A network perspective on unraveling the role of TRP channels in biology and disease. Pflugers Arch 2013; 466:173-82. [PMID: 23677537 DOI: 10.1007/s00424-013-1292-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 02/21/2013] [Revised: 04/22/2013] [Accepted: 05/03/2013] [Indexed: 02/08/2023]
Abstract
Transient receptor potential (TRP) channels are a large family of non-selective cation channels that mediate numerous physiological and pathophysiological processes; however, the underlying molecular mechanisms remain largely unknown. With data generated on an unprecedented scale, network-based approaches have been revolutionizing the way in which we understand biology and disease, discover disease genes, and develop therapeutic strategies. These circumstances have created opportunities to connect TRP channel research with data-intensive science. In this review, we provide an introduction to network-based approaches in biomedical science, describe the current state of TRP channel network biology, and discuss the future direction of TRP channel research. A network perspective will facilitate the discovery of latent roles and underlying mechanisms of TRP channels in biology and disease.
|
34
|
A network pharmacology approach to evaluating the efficacy of Chinese medicine using genome-wide transcriptional expression data. Evidence-Based Complementary and Alternative Medicine 2013; 2013:915343. [PMID: 23737854 PMCID: PMC3666440 DOI: 10.1155/2013/915343] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 12/18/2012] [Revised: 02/13/2013] [Accepted: 02/17/2013] [Indexed: 11/17/2022]
Abstract
Research on multicomponent drugs, such as those used in Chinese Medicine, is challenging for both mechanism dissection and drug discovery, especially with respect to approaches for systematically evaluating efficacy at a molecular level. Here, we presented a network pharmacology-based approach to evaluating the efficacy of multicomponent drugs using genome-wide transcriptional expression data and applied it to Shenmai injection (SHENMAI), a widely used Chinese Medicine composed of red ginseng (RG) and Radix Ophiopogonis (RO), in the clinical treatment of myocardial ischemia (MI). The disease network, the MI network in this case, was constructed by combining the protein-protein interactions (PPI) involved in the MI-enriched pathways. The therapeutic efficacy of SHENMAI, RG, and RO was then evaluated by a network parameter, the network recovery index (NRI), which quantitatively evaluates the overall recovery rate in the MI network. The NRIs of SHENMAI, RG, and RO were 0.865, 0.425, and 0.271, respectively [corrected], which indicated that SHENMAI exerts protective effects and that RG and RO act synergistically in treating myocardial ischemia. The successful application to SHENMAI implied that the proposed network pharmacology-based approach could help researchers to better evaluate a multicomponent drug on a systematic and molecular level.
|
35
|
de Matos Simoes R, Dehmer M, Emmert-Streib F. Interfacing cellular networks of S. cerevisiae and E. coli: connecting dynamic and genetic information. BMC Genomics 2013; 14:324. [PMID: 23663484 PMCID: PMC3698017 DOI: 10.1186/1471-2164-14-324] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 07/31/2012] [Accepted: 04/25/2013] [Indexed: 12/11/2022] Open
Abstract
Background In recent years, various types of cellular networks have penetrated biology and are nowadays used omnipresently for studying eukaryote and prokaryote organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. Results We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes.
Affiliation(s)
- Ricardo de Matos Simoes
- Computational Biology and Machine Learning Laboratory, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Faculty of Medicine, Health and Life Sciences, Queen's University, 97 Lisburn Road, Belfast, UK
|
36
|
Transcriptome data modeling for targeted plant metabolic engineering. Curr Opin Biotechnol 2013; 24:285-90. [DOI: 10.1016/j.copbio.2012.10.018] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 08/07/2012] [Revised: 10/24/2012] [Accepted: 10/29/2012] [Indexed: 12/31/2022]
|
37
|
Bakir-Gungor B, Baykan B, Ugur İseri S, Tuncer FN, Sezerman OU. Identifying SNP targeted pathways in partial epilepsies with genome-wide association study data. Epilepsy Res 2013; 105:92-102. [PMID: 23498093 DOI: 10.1016/j.eplepsyres.2013.02.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/05/2012] [Revised: 01/15/2013] [Accepted: 02/13/2013] [Indexed: 12/18/2022]
Abstract
PURPOSE In a recent genome-wide association study for partial epilepsies in the European population, a common genetic variation has been reported to affect partial epilepsy only modestly. However, in complex diseases such as partial epilepsy, multiple factors (e.g. single nucleotide polymorphisms, microRNAs, metabolic and epigenetic factors) may target different sets of genes in the same pathway, affecting its function and thus causing disease development. In this regard, we hypothesize that the pathways are critical for elucidating the mechanisms underlying partial epilepsy. METHODS Previously we had developed a novel methodology with the aim of identifying the disease-related pathways. We had combined evidence of genetic association with current knowledge of (i) biochemical pathways, (ii) protein-protein interaction networks, and (iii) the functional information of selected single nucleotide polymorphisms. In our present study, we apply this methodology to a data set on partial epilepsy, including 3445 cases and 6935 controls of European ancestry. RESULTS We have identified 30 overrepresented pathways with corrected p-values smaller than 10^-12. These pathways include complement and coagulation cascades, cell cycle, focal adhesion, extracellular matrix-receptor interaction, JAK-STAT signaling pathway, MAPK signaling pathway, proteasome, ribosome, calcium signaling and regulation of actin cytoskeleton pathways. Most of these pathways have growing scientific support in the literature as being associated with partial epilepsy. We also demonstrate that different factors affect distinct parts of the pathways, as shown here on the complement and coagulation cascades pathway with a comparison of gene expression vs. genome-wide association study. CONCLUSIONS Traditional studies on genome-wide association have not revealed strong associations in epilepsies, since these single nucleotide polymorphisms are not shared by most of the patients. Our results suggest that it is more effective to incorporate the functional effect of a single nucleotide polymorphism on the gene product, protein-protein interaction networks and functional enrichment tools into genome-wide association studies. These can then be used to determine leading molecular pathways, which cannot be detected through traditional analyses. We hope that this type of analysis brings the research community one step closer to unraveling the complex genetic structure of epilepsies.
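The pathway overrepresentation step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test followed by a Bonferroni correction, neither of which is specified in the abstract; the function names and all gene counts are hypothetical and are not taken from the study.

```python
from math import comb


def overrepresentation_p(background, pathway, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    background: number of genes in the reference set (hypothetical input)
    pathway:    number of background genes annotated to the pathway
    selected:   number of candidate (e.g. SNP-targeted) genes
    overlap:    number of candidate genes that fall in the pathway
    """
    upper = min(selected, pathway)
    # Count the ways to draw at least `overlap` pathway genes.
    favourable = sum(
        comb(pathway, i) * comb(background - pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background, selected)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1.0."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the study.
    raw = {
        "pathway_A": overrepresentation_p(20000, 100, 500, 15),
        "pathway_B": overrepresentation_p(20000, 200, 500, 6),
    }
    for name, p in bonferroni(raw).items():
        print(f"{name}: corrected p = {p:.3g}")
```

Exact integer arithmetic via `math.comb` keeps the sketch free of third-party dependencies; a statistics library would normally be preferred for large gene sets or other correction schemes.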
Affiliation(s)
- B Bakir-Gungor
- Department of Genetics and Bioinformatics, Faculty of Arts and Sciences, Bahcesehir University, Ciragan Cad. Osmanpasa Mektebi Sok., No.: 4, 34353, Besiktas, Istanbul, Turkey.
|
38
|
Okser S, Pahikkala T, Aittokallio T. Genetic variants and their interactions in disease risk prediction - machine learning and network perspectives. BioData Min 2013; 6:5. [PMID: 23448398 PMCID: PMC3606427 DOI: 10.1186/1756-0381-6-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 10/05/2012] [Accepted: 02/11/2013] [Indexed: 12/31/2022] Open
Abstract
A central challenge in systems biology and medical genetics is to understand how interactions among genetic loci contribute to complex phenotypic traits and human diseases. While most studies have so far relied on statistical modeling and association testing procedures, machine learning and predictive modeling approaches are increasingly being applied to mining genotype-phenotype relationships, including those associations that do not necessarily meet statistical significance at the level of individual variants yet still contribute to the combined predictive power at the level of variant panels. Network-based analysis of genetic variants and their interaction partners is another emerging trend by which to explore how sub-network level features contribute to complex disease processes and related phenotypes. In this review, we describe the basic concepts and algorithms behind machine learning-based genetic feature selection approaches, their potential benefits and limitations in genome-wide settings, and how physical or genetic interaction networks could be used as a priori information for providing improved predictive power and mechanistic insights into the disease networks. These developments are geared toward explaining a part of the missing heritability, and when combined with individual genomic profiling, such systems medicine approaches may also provide a principled means for tailoring personalized treatment strategies in the future.
|
39
|
Peng J, Chen J, Wang Y. Identifying cross-category relations in gene ontology and constructing genome-specific term association networks. BMC Bioinformatics 2013; 14 Suppl 2:S15. [PMID: 23368677 PMCID: PMC3549802 DOI: 10.1186/1471-2105-14-s2-s15] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022] Open
Abstract
Background Gene Ontology (GO) has been widely used in biological databases, annotation projects, and computational analyses. Although the three GO categories are structured as independent ontologies, the biological relationships across the categories are not negligible for biological reasoning and knowledge integration. However, the existing cross-category ontology term similarity measures are either developed by utilizing the GO data only or based on manually curated term name similarities, ignoring the fact that GO is evolving quickly and the gene annotations are far from complete. Results In this paper we introduce a new cross-category similarity measurement called CroGO by incorporating genome-specific gene co-function network data. The performance study showed that our measurement outperforms the existing algorithms. We also generated genome-specific term association networks for yeast and human. An enrichment-based test showed our networks are better than those generated by the other measures. Conclusions The genome-specific term association networks constructed using CroGO provided a platform to enable a more consistent use of GO. In the networks, the frequently occurring MF-centered hubs indicate that a molecular function may be shared by different genes in multiple biological processes, or that a set of genes with the same functions may participate in distinct biological processes. Common subgraphs in multiple organisms also revealed conserved GO term relationships. Software and data are available online at http://www.msu.edu/~jinchen/CroGO.
Affiliation(s)
- Jiajie Peng
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
|
40
|
Abstract
Functional analysis of post-genomics data is essential to identify the biological processes involved in a given investigation. Although most of the ontological tools available are limited to organisms with well-annotated genomes, this chapter provides an overview of two complementary tools, MapMan and GeneBins/PathExpress, that are used to perform a functional analysis of legume gene expression data. MapMan is a stand-alone tool that displays large datasets onto diagrams of metabolic pathways or other processes. Although initially developed for Arabidopsis thaliana, MapMan can be extended to other plants by assigning new sequences to their orthologs in the current classification. GeneBins and PathExpress have been developed to perform enrichment analysis of functional groups and metabolic networks, respectively. Based on the KEGG database, these tools can be used with any organism, including the main reference legumes.
Affiliation(s)
- Nicolas Goffard
- Plant Science Division, Research School of Biology, College of Medicine, Biology and Environment, The Australian National University, Canberra, ACT, Australia
|
41
|
Affiliation(s)
- Nagasuma Chandra
- Indian Institute of Science, Department of Biochemistry, Bangalore – 560012, India
- Jyothi Padiadpu
- Indian Institute of Science, Department of Biochemistry, Bangalore – 560012, India
|
42
|
Systems genetics in "-omics" era: current and future development. Theory Biosci 2012; 132:1-16. [PMID: 23138757 DOI: 10.1007/s12064-012-0168-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 03/26/2012] [Accepted: 10/25/2012] [Indexed: 02/06/2023]
Abstract
Systems genetics is an emerging discipline that integrates high-throughput expression profiling technology and systems biology approaches to reveal the molecular mechanisms of complex traits, and it will improve our understanding of gene functions in biochemical pathways and of genetic interactions between biological molecules. With the rapid advances in microarray analysis technologies, bioinformatics is extensively used in studies of gene functions, SNP-SNP genetic interactions, LD block-block interactions, miRNA-mRNA interactions, DNA-protein interactions, protein-protein interactions, and functional mapping for LD blocks. Building on a bioinformatics panel that can integrate "-omics" datasets to extract systems knowledge and useful information for explaining the molecular mechanisms of complex traits, systems genetics aims to enhance our understanding of biological processes. Systems biology has provided systems-level recognition of various biological phenomena and has laid the scientific background for the development of systems genetics. In addition, next-generation sequencing technology and post-genome-wide association studies empower the discovery of new genes and rare variants. The integration of different strategies will help to propose novel hypotheses and refine the theoretical framework of systems genetics, which will contribute to its future development and open up a whole new area of genetics.
|