1. Gupta NS, Kumar P. Perspective of artificial intelligence in healthcare data management: A journey towards precision medicine. Comput Biol Med 2023; 162:107051. [PMID: 37271113 DOI: 10.1016/j.compbiomed.2023.107051] [Received: 04/11/2023] [Revised: 05/06/2023] [Accepted: 05/20/2023] [Indexed: 06/06/2023]
Abstract
Mounting evidence has highlighted the implementation of big data handling and management in the healthcare industry to improve clinical services. Various private and public companies have generated, stored, and analyzed different types of big healthcare data, such as omics data, clinical data, electronic health records, personal health records, and sensing data, with the aim of moving toward precision medicine. Additionally, with advances in technology, researchers are keen to explore the potential of artificial intelligence and machine learning on big healthcare data to enhance the quality of patients' lives. However, seeking solutions from big healthcare data requires proper management, storage, and analysis, which imposes hindrances associated with big data handling. Herein, we briefly discuss the implications of big data handling and the role of artificial intelligence in precision medicine. Further, we highlight the potential of artificial intelligence in integrating and analyzing big data to offer personalized treatment. In addition, we briefly discuss the applications of artificial intelligence in personalized treatment, especially in neurological diseases. Lastly, we discuss the challenges and limitations of artificial intelligence in big data management and analysis that hinder precision medicine.
Affiliation(s)
- Nancy Sanjay Gupta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India
- Pravir Kumar
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University, India

2. Cao H, Hong X, Tost H, Meyer-Lindenberg A, Schwarz E. Advancing translational research in neuroscience through multi-task learning. Front Psychiatry 2022; 13:993289. [PMID: 36465289 PMCID: PMC9714033 DOI: 10.3389/fpsyt.2022.993289] [Received: 07/13/2022] [Accepted: 10/24/2022] [Indexed: 11/18/2022]
Abstract
Translational research in neuroscience is increasingly focusing on the analysis of multi-modal data, in order to account for the biological complexity of suspected disease mechanisms. Recent advances in machine learning have the potential to substantially advance such translational research through the simultaneous analysis of different data modalities. This review focuses on one such approach, so-called "multi-task learning" (MTL), and describes its potential utility for multi-modal data analyses in neuroscience. We summarize the methodological development of MTL starting from conventional machine learning, and present several scenarios that appear particularly suitable for its application. For these scenarios, we highlight different types of MTL algorithms, discuss emerging technological adaptations, and provide a step-by-step guide for readers to apply the MTL approach in their own studies. With its ability to simultaneously analyze multiple data modalities, MTL may become an important element of the analytics repertoire used in future neuroscience research and beyond.
Affiliation(s)
- Han Cao
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Xudong Hong
  - Department of Computer Vision and Machine Learning, Max Planck Institute for Informatics, Saarbrücken, Germany
  - Department of Language Science and Technology, Saarland University, Saarbrücken, Germany
- Heike Tost
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Andreas Meyer-Lindenberg
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Emanuel Schwarz
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany

3. Wu M, Yi H, Ma S. Vertical integration methods for gene expression data analysis. Brief Bioinform 2021; 22:bbaa169. [PMID: 32793970 PMCID: PMC8138889 DOI: 10.1093/bib/bbaa169] [Received: 04/24/2020] [Revised: 06/18/2020] [Accepted: 07/04/2020] [Indexed: 12/12/2022]
Abstract
Gene expression data have played an essential role in many biomedical studies. When the number of genes is large and sample size is limited, there is a 'lack of information' problem, leading to low-quality findings. To tackle this problem, both horizontal and vertical data integrations have been developed, where vertical integration methods collectively analyze data on gene expressions as well as their regulators (such as mutations, DNA methylation and miRNAs). In this article, we conduct a selective review of vertical data integration methods for gene expression data. The reviewed methods cover both marginal and joint analysis and supervised and unsupervised analysis. The main goal is to provide a sketch of the vertical data integration paradigm without digging into too many technical details. We also briefly discuss potential pitfalls, directions for future developments and application notes.
Affiliation(s)
- Mengyun Wu
  - School of Statistics and Management, Shanghai University of Finance and Economics
- Huangdi Yi
  - Department of Biostatistics at Yale University
- Shuangge Ma
  - Department of Biostatistics at Yale University

4. Cao H, Zhou J, Schwarz E. RMTL: an R library for multi-task learning. Bioinformatics 2020; 35:1797-1798. [PMID: 30256897 DOI: 10.1093/bioinformatics/bty831] [Received: 04/27/2018] [Revised: 08/20/2018] [Accepted: 09/25/2018] [Indexed: 11/14/2022]
Abstract
Motivation: Multi-task learning (MTL) is a machine learning technique for simultaneous learning of multiple related classification or regression tasks. Despite its increasing popularity, MTL algorithms are currently not available in the widely used software environment R, creating a bottleneck for their application in biomedical research. Results: We developed an efficient, easy-to-use R library for MTL (www.r-project.org) comprising 10 algorithms applicable for regression, classification, joint predictor selection, task clustering, low-rank learning and incorporation of biological networks. We demonstrate the utility of the algorithms using simulated data. Availability and implementation: The RMTL package is an open source R package and is freely available at https://github.com/transbioZI/RMTL. RMTL will also be available on cran.r-project.org. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Han Cao
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Jiayu Zhou
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
- Emanuel Schwarz
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany

5. Buendia P, Bradley RM, Taylor TJ, Schymanski EL, Patti GJ, Kabuka MR. Ontology-based metabolomics data integration with quality control. Bioanalysis 2019; 11:1139-1155. [PMID: 31179719 PMCID: PMC6661928 DOI: 10.4155/bio-2018-0303] [Received: 11/16/2018] [Accepted: 05/01/2019] [Indexed: 12/12/2022]
Abstract
Aim: The complications that arise when performing meta-analysis of datasets from multiple metabolomics studies are addressed with computational methods that ensure data quality, completeness of metadata and accurate interpretation across studies. Results & methodology: This paper presents an integrated system of quality control (QC) methods to assess metabolomics results by evaluating the data acquisition strategies and metabolite identification process when integrating datasets for meta-analysis. An ontology knowledge base and a rule-based system representing the experiment and chemical background information direct the processes involved in data integration and QC verification. A diabetes meta-analysis study using these QC methods finds putative biomarkers that differ between cohorts. Conclusion: The methods presented here ensure the validity of meta-analysis when integrating data from different metabolic profiling studies.
Affiliation(s)
- Patricia Buendia
  - INFOTECH Soft, Inc., 1201 Brickell Ave. Suite 220, Miami, FL 33131, USA
- Ray M Bradley
  - INFOTECH Soft, Inc., 1201 Brickell Ave. Suite 220, Miami, FL 33131, USA
- Thomas J Taylor
  - INFOTECH Soft, Inc., 1201 Brickell Ave. Suite 220, Miami, FL 33131, USA
- Emma L Schymanski
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Campus Belval, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
  - Eawag – Swiss Federal Institute of Aquatic Science & Technology, Überland Strasse 133, Dübendorf 8600, Switzerland
- Gary J Patti
  - Departments of Chemistry, Genetics, & Medicine, Washington University, Saint Louis, MO 63110, USA
- Mansur R Kabuka
  - INFOTECH Soft, Inc., 1201 Brickell Ave. Suite 220, Miami, FL 33131, USA

6. Huang Y, Liu J. Exclusive Sparsity Norm Minimization With Random Groups via Cone Projection. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:6145-6153. [PMID: 29994009 DOI: 10.1109/tnnls.2018.2819958] [Indexed: 06/08/2023]
Abstract
Many practical applications such as gene expression analysis, multitask learning, image recognition, signal processing, and medical data analysis pursue a sparse solution for the feature selection purpose and particularly favor the nonzeros evenly distributed in different groups. The exclusive sparsity norm has been widely used to serve to this purpose. However, it still lacks systematical studies for exclusive sparsity norm optimization. This paper offers two main contributions from the optimization perspective: 1) we provide several efficient algorithms to solve exclusive sparsity norm minimization with either smooth loss or hinge loss (nonsmooth loss). All algorithms achieve the optimal convergence rate . ( is the iteration number.) To the best of our knowledge, this is the first time to guarantee such convergence rate for the general exclusive sparsity norm minimization and 2) when the group information is unavailable to define the exclusive sparsity norm, we propose to use the random grouping scheme to construct groups and prove that if the number of groups is appropriately chosen, the nonzeros (true features) would be grouped in the ideal way with high probability. Empirical studies validate the efficiency of the proposed algorithms, and the effectiveness of random grouping scheme on the proposed exclusive support vector machine formulation.

7. Zille P, Calhoun VD, Wang YP. Enforcing Co-Expression Within a Brain-Imaging Genomics Regression Framework. IEEE Transactions on Medical Imaging 2018; 37:2561-2571. [PMID: 28678703 PMCID: PMC6415768 DOI: 10.1109/tmi.2017.2721301] [Indexed: 05/20/2023]
Abstract
Among the challenges arising in brain imaging genetic studies, estimating the potential links between neurological and genetic variability within a population is key. In this paper, we propose a multivariate, multimodal formulation for variable selection that leverages co-expression patterns across various data modalities. Our approach is based on an intuitive combination of two widely used statistical models: sparse regression and canonical correlation analysis (CCA). While the former seeks multivariate linear relationships between a given phenotype and associated observations, the latter seeks to extract co-expression patterns between sets of variables belonging to different modalities. In the following, we propose to rely on a "CCA-type" formulation in order to regularize the classical multimodal sparse regression problem (essentially incorporating both CCA and regression models within a unified formulation). The underlying motivation is to extract discriminative variables that are also co-expressed across modalities. We first show that the simplest formulation of such a model can be expressed as a special case of collaborative learning methods. After discussing its limitations, we propose an extended, more flexible formulation, and introduce a simple and efficient alternating minimization algorithm to solve the associated optimization problem. We explore the parameter space and provide some guidelines regarding parameter selection. Both the original and extended versions are then compared on a simple toy data set and a more advanced simulated imaging genomics data set in order to illustrate the benefits of the latter. Finally, we validate the proposed formulation using single nucleotide polymorphisms data and functional magnetic resonance imaging data from a population of adolescents (subjects aged 16.9 ± 1.9 years, from the Philadelphia Neurodevelopmental Cohort) for the study of learning ability. Furthermore, we carry out a significance analysis of the resulting features that allows us to carefully extract brain regions and genes linked to learning and cognitive ability.

8. Cao H, Meyer-Lindenberg A, Schwarz E. Comparative Evaluation of Machine Learning Strategies for Analyzing Big Data in Psychiatry. Int J Mol Sci 2018; 19:E3387. [PMID: 30380679 PMCID: PMC6274760 DOI: 10.3390/ijms19113387] [Received: 08/09/2018] [Revised: 10/22/2018] [Accepted: 10/25/2018] [Indexed: 12/24/2022]
Abstract
The requirement of innovative big data analytics has become a critical success factor for research in biological psychiatry. Integrative analyses across distributed data resources are considered essential for untangling the biological complexity of mental illnesses. However, little is known about algorithm properties for such integrative machine learning. Here, we performed a comparative analysis of eight machine learning algorithms for identification of reproducible biological fingerprints across data sources, using five transcriptome-wide expression datasets of schizophrenia patients and controls as a use case. We found that multi-task learning (MTL) with network structure (MTL_NET) showed superior accuracy compared to other MTL formulations as well as single task learning, and tied performance with support vector machines (SVM). Compared to SVM, MTL_NET showed significant benefits regarding the variability of accuracy estimates, as well as its robustness to cross-dataset and sampling variability. These results support the utility of this algorithm as a flexible tool for integrative machine learning in psychiatry.
Affiliation(s)
- Han Cao
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159 Mannheim, Germany
- Andreas Meyer-Lindenberg
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159 Mannheim, Germany
- Emanuel Schwarz
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159 Mannheim, Germany

9. Grapov D, Fahrmann J, Wanichthanarak K, Khoomrung S. Rise of Deep Learning for Genomic, Proteomic, and Metabolomic Data Integration in Precision Medicine. OMICS: A Journal of Integrative Biology 2018; 22:630-636. [PMID: 30124358 PMCID: PMC6207407 DOI: 10.1089/omi.2018.0097] [Indexed: 12/16/2022]
Abstract
Machine learning (ML) is being ubiquitously incorporated into everyday products such as Internet search, email spam filters, product recommendations, image classification, and speech recognition. New approaches for highly integrated manufacturing and automation such as the Industry 4.0 and the Internet of things are also converging with ML methodologies. Many approaches incorporate complex artificial neural network architectures and are collectively referred to as deep learning (DL) applications. These methods have been shown capable of representing and learning predictable relationships in many diverse forms of data and hold promise for transforming the future of omics research and applications in precision medicine. Omics and electronic health record data pose considerable challenges for DL. This is due to many factors such as low signal to noise, analytical variance, and complex data integration requirements. However, DL models have already been shown capable of both improving the ease of data encoding and predictive model performance over alternative approaches. It may not be surprising that concepts encountered in DL share similarities with those observed in biological message relay systems such as gene, protein, and metabolite networks. This expert review examines the challenges and opportunities for DL at a systems and biological scale for a precision medicine readership.
Affiliation(s)
- Dmitry Grapov
  - CDS-Creative Data Solutions LLC, Ballwin, Missouri, www.createdatasol.com
- Johannes Fahrmann
  - Department of Clinical Cancer Prevention, University of Texas MD Anderson, Houston, Texas
- Kwanjeera Wanichthanarak
  - Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Sakda Khoomrung
  - Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand

10. Deng SP, Hu W, Calhoun VD, Wang YP. Integrating Imaging Genomic Data in the Quest for Biomarkers of Schizophrenia Disease. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1480-1491. [PMID: 28880187 PMCID: PMC6207076 DOI: 10.1109/tcbb.2017.2748944] [Indexed: 05/06/2023]
Abstract
It's increasingly important but difficult to determine potential biomarkers of schizophrenia (SCZ) disease, owing to the complex pathophysiology of this disease. In this study, a network-fusion based framework was proposed to identify genetic biomarkers of the SCZ disease. A three-step feature selection was applied to single nucleotide polymorphisms (SNPs), DNA methylation, and functional magnetic resonance imaging (fMRI) data to select important features, which were then used to construct two gene networks in different states for the SNPs and DNA methylation data, respectively. Two health networks (one is for SNP data and the other is for DNA methylation data) were combined into one health network from which health minimum spanning trees (MSTs) were extracted. Two disease networks also followed the same procedures. Those genes with significant changes were determined as SCZ biomarkers by comparing MSTs in two different states and they were finally validated from five aspects. The effectiveness of the proposed discovery framework was also demonstrated by comparing with other network-based discovery methods. In summary, our approach provides a general framework for discovering gene biomarkers of the complex diseases by integrating imaging genomic data, which can be applied to the diagnosis of the complex diseases in the future.
Affiliation(s)
- Su-Ping Deng
  - Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA 70118, USA
- Wenxing Hu
  - Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA 70118, USA
- Yu-Ping Wang
  - Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA 70118, USA. Telephone: (504)865-5867, Fax: (504)862-8779

11. Zhao X, Li Y, Wu H. A novel scoring system for acute myeloid leukemia risk assessment based on the expression levels of six genes. Int J Mol Med 2018; 42:1495-1507. [PMID: 29956722 PMCID: PMC6089755 DOI: 10.3892/ijmm.2018.3739] [Received: 11/19/2017] [Accepted: 05/14/2018] [Indexed: 12/19/2022]
Abstract
Acute myeloid leukemia (AML) is the most common type of acute leukemia and is a heterogeneous clonal disorder. At present, the pathogenesis of AML and potential methods to effectively prevent AML have become areas of interest in research. In the present study, two messenger ribonucleic acid sequencing datasets of patients with AML were downloaded from the Cancer Genome Atlas and Gene Expression Omnibus databases. The differentially expressed genes (DEGs) of the poor and good prognosis groups were screened using the Linear Models for Microarray Data package, and the prognosis-related genes were screened using univariate Cox regression analysis. A total of 206 significant DEGs were identified. Following univariate and multivariate Cox regression analysis, 14 genes significantly associated with prognosis were screened and six of these genes, including triggering receptor expressed on myeloid cells 2 (TREML2), cysteine-glutamate transporter (SLC7A11), NACHT, LRR, and PYD domains-containing protein 2 (NLRP2), DNA damage-inducible transcript 4 protein (DDIT4), lymphocyte‑specific protein 1 (LSP1) and C-type lectin domain family 11 member A (CLEC11A), were used to construct model equations for risk assessment. The prognostic scoring system was used to evaluate risk for each patient, and the results showed that patients in the low-risk group had a longer survival time, compared with those in the high-risk group (P=9.59e-06 for the training dataset and P=0.00543 for the validation dataset). A total of eight main Kyoto Encyclopedia of Genes and Genomes pathways were identified, the top three of which were hematopoietic cell lineage, focal adhesion, and regulation of actin cytoskeleton. Taken together, the results showed that the scoring system established in the present study was credible and that the six genes were identified, which were significantly associated with the risk assessment of AML, offer potential as prognostic biomarkers. 
These findings may provide clues for further clarifying the pathogenesis of AML.
Affiliation(s)
- Xiaoyan Zhao
  - Department of Hematology, The First Hospital of Jiaxing, Jiaxing, Zhejiang 314000, P.R. China
- Yuan Li
  - Department of Hematology, The First Hospital of Jiaxing, Jiaxing, Zhejiang 314000, P.R. China
- Haibing Wu
  - Department of Hematology, The First Hospital of Jiaxing, Jiaxing, Zhejiang 314000, P.R. China

12. Calhoun VD. Predicting schizophrenia by fusing networks from SNPs, DNA methylation and fMRI data. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference 2017; 2016:1447-1450. [PMID: 28268598 DOI: 10.1109/embc.2016.7590981] [Indexed: 11/08/2022]
Abstract
In order to comprehensively utilize complementary information from multiple types of data for better disease diagnosis, in this study, we applied a network fusion based approach to integrating three types of data including genetic, epigenetic and neuroimaging data from a study of schizophrenia patients (SCZ). A network is a map of interactions, which contributes to investigating the connectivity of components or links between sub-units. We exploited the potential of using networks as features for discriminating SCZ from healthy controls. We first constructed a single network from each type of data. Then we built four fused networks by the network fusion method: three fused networks for each combination of two types of data and one fused network for all three data types. Based on the local consistency of network, we can predict the group of the unlabeled SCZ subjects. The group prediction method was applied to test the power of network-based features and the performance was evaluated by a 10-fold cross validation. The results show that the prediction accuracy is the highest when applying our prediction method to the fused network derived from three data types among 7 tested networks. As a conclusion, integrative approaches that can comprehensively utilize multiple types of data are more useful for diagnosis and prediction.

13. He H, Lin D, Zhang J, Wang Y, Deng HW. Biostatistics, Data Mining and Computational Modeling. Translational Bioinformatics 2016. [DOI: 10.1007/978-94-017-7543-4_2] [Indexed: 12/21/2022]

14. Zhang FT, Zhu ZH, Tong XR, Zhu ZX, Qi T, Zhu J. Mixed Linear Model Approaches of Association Mapping for Complex Traits Based on Omics Variants. Sci Rep 2015. [PMID: 26223539 PMCID: PMC5155518 DOI: 10.1038/srep10298] [Indexed: 11/29/2022]
Abstract
Precise prediction for genetic architecture of complex traits is impeded by the limited understanding on genetic effects of complex traits, especially on gene-by-gene (GxG) and gene-by-environment (GxE) interaction. In the past decades, an explosion of high throughput technologies enables omics studies at multiple levels (such as genomics, transcriptomics, proteomics, and metabolomics). The analyses of large omics data, especially two-loci interaction analysis, are very time intensive. Integrating the diverse omics data and environmental effects in the analyses also remain challenges. We proposed mixed linear model approaches using GPU (Graphic Processing Unit) computation to simultaneously dissect various genetic effects. Analyses can be performed for estimating genetic main effects, GxG epistasis effects, and GxE environment interaction effects on large-scale omics data for complex traits, and for estimating heritability of specific genetic effects. Both mouse data analyses and Monte Carlo simulations demonstrated that genetic effects and environment interaction effects could be unbiasedly estimated with high statistical power by using the proposed approaches.
Affiliation(s)
- Fu-Tao Zhang
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Zhi-Hong Zhu
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Xiao-Ran Tong
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Zhi-Xiang Zhu
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Ting Qi
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China
- Jun Zhu
  - Institute of Bioinformatics, Zhejiang University, Hangzhou, China

15.
Affiliation(s)
- Christine Nardini
  - Lazzari Bologna, Italy; Group of Clinical Genomic Networks, Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences Shanghai, China
- Paolo Tieri
  - Consiglio Nazionale delle Ricerche, Istituto per le Applicazioni del Calcolo Rome, Italy