1
Cao H, Hong X, Tost H, Meyer-Lindenberg A, Schwarz E. Advancing translational research in neuroscience through multi-task learning. Front Psychiatry 2022; 13:993289. [PMID: 36465289 PMCID: PMC9714033 DOI: 10.3389/fpsyt.2022.993289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Accepted: 10/24/2022] [Indexed: 11/18/2022] Open
Abstract
Translational research in neuroscience is increasingly focusing on the analysis of multi-modal data, in order to account for the biological complexity of suspected disease mechanisms. Recent advances in machine learning have the potential to substantially advance such translational research through the simultaneous analysis of different data modalities. This review focuses on one such approach, so-called "multi-task learning" (MTL), and describes its potential utility for multi-modal data analyses in neuroscience. We summarize the methodological development of MTL starting from conventional machine learning, and present several scenarios that appear particularly suitable for its application. For these scenarios, we highlight different types of MTL algorithms, discuss emerging technological adaptations, and provide a step-by-step guide for readers to apply the MTL approach in their own studies. With its ability to simultaneously analyze multiple data modalities, MTL may become an important element of the analytics repertoire used in future neuroscience research and beyond.
Affiliation(s)
- Han Cao
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Xudong Hong
  - Department of Computer Vision and Machine Learning, Max Planck Institute for Informatics, Saarbrücken, Germany
  - Department of Language Science and Technology, Saarland University, Saarbrücken, Germany
- Heike Tost
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Andreas Meyer-Lindenberg
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Emanuel Schwarz
  - Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
2
Mao J, Akhtar J, Zhang X, Sun L, Guan S, Li X, Chen G, Liu J, Jeon HN, Kim MS, No KT, Wang G. Comprehensive strategies of machine-learning-based quantitative structure-activity relationship models. iScience 2021; 24:103052. [PMID: 34553136 PMCID: PMC8441174 DOI: 10.1016/j.isci.2021.103052] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 11/21/2022] Open
Abstract
Early quantitative structure-activity relationship (QSAR) technologies have unsatisfactory versatility and accuracy in fields such as drug discovery because they are based on traditional machine learning and interpretive expert features. The development of Big Data and deep learning technologies has significantly improved the processing of unstructured data and unleashed the great potential of QSAR. Here we discuss the integration of wet experiments (which provide experimental data and reliable verification), molecular dynamics simulation (which provides mechanistic interpretation at the atomic/molecular levels), and machine learning (including deep learning) techniques to improve QSAR models. We first review the history of traditional QSAR and point out its problems. We then propose a better QSAR model characterized by a new iterative framework to integrate machine learning with disparate data input. Finally, we discuss the application of QSAR and machine learning to many practical research fields, including drug development and clinical trials.
Affiliation(s)
- Jiashun Mao
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055, China
- Javed Akhtar
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
- Xiao Zhang
  - Shanghai Rural Commercial Bank Co., Ltd, Shanghai 200002, China
- Liang Sun
  - Department of Physics, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
- Shenghui Guan
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055, China
- Xinyu Li
  - School of Life and Health Sciences and Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China
- Guangming Chen
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
- Jiaxin Liu
  - Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Hyeon-Nae Jeon
  - Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Min Sung Kim
  - Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Kyoung Tai No
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Guanyu Wang
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
3
Liu Q, Dou Q, Yu L, Heng PA. MS-Net: Multi-Site Network for Improving Prostate Segmentation With Heterogeneous MRI Data. IEEE Transactions on Medical Imaging 2020; 39:2713-2724. [PMID: 32078543 DOI: 10.1109/tmi.2020.2974574] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Indexed: 05/17/2023]
Abstract
Automated prostate segmentation in MRI is in high demand for computer-assisted diagnosis. Recently, a variety of deep learning methods have achieved remarkable progress in this task, usually relying on large amounts of training data. Because medical images are scarce, it is important to effectively aggregate data from multiple sites for robust model training, to alleviate the insufficiency of single-site samples. However, the prostate MRIs from different sites present heterogeneity due to the differences in scanners and imaging protocols, raising challenges for effectively aggregating multi-site data for network training. In this paper, we propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations, leveraging multiple sources of data. To compensate for the inter-site heterogeneity of different MRI datasets, we develop Domain-Specific Batch Normalization layers in the network backbone, enabling the network to estimate statistics and perform feature normalization for each site separately. Considering the difficulty of capturing the shared knowledge from multiple datasets, a novel learning paradigm, i.e., Multi-site-guided Knowledge Transfer, is proposed to enhance the kernels to extract more generic representations from multi-site data. Extensive experiments on three heterogeneous prostate MRI datasets demonstrate that our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
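The idea of normalizing features with statistics estimated separately for each site can be illustrated with a minimal NumPy sketch. This is not the MS-Net implementation: the function name `domain_specific_batch_norm`, the flat `(samples, features)` layout, and the omission of learnable scale and shift parameters are all assumptions made for illustration.

```python
import numpy as np


def domain_specific_batch_norm(features, site_ids, eps=1e-5):
    """Normalize each feature column using per-site mean and variance.

    features: array of shape (n_samples, n_features)
    site_ids: array of shape (n_samples,) giving the acquisition site of each sample
    (Hypothetical helper; learnable scale/shift parameters are left out.)
    """
    features = np.asarray(features, dtype=float)
    site_ids = np.asarray(site_ids)
    normalized = np.empty_like(features)
    for site in np.unique(site_ids):
        mask = site_ids == site
        # Statistics are estimated from this site's samples only.
        mean = features[mask].mean(axis=0)
        var = features[mask].var(axis=0)
        normalized[mask] = (features[mask] - mean) / np.sqrt(var + eps)
    return normalized
```

After this step, samples from each site have approximately zero mean and unit variance per feature, so a site-dependent offset or scale in the raw features no longer dominates the shared layers that follow.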
4
Du L, Liu F, Liu K, Yao X, Risacher SL, Han J, Guo L, Saykin AJ, Shen L. Identifying diagnosis-specific genotype-phenotype associations via joint multitask sparse canonical correlation analysis and classification. Bioinformatics 2020; 36:i371-i379. [PMID: 32657360 PMCID: PMC7355274 DOI: 10.1093/bioinformatics/btaa434] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/24/2023] Open
Abstract
MOTIVATION Brain imaging genetics studies the complex associations between genotypic data such as single nucleotide polymorphisms (SNPs) and imaging quantitative traits (QTs). Neurodegenerative disorders are usually diverse and heterogeneous, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype-phenotype associations. However, most existing SCCA methods are unsupervised, leading to an inability to identify diagnosis-specific genotype-phenotype associations. RESULTS In this article, we propose a new joint multitask learning method, named MT-SCCALR, which absorbs the merits of both SCCA and logistic regression. MT-SCCALR learns genotype-phenotype associations of multiple tasks jointly, with each task focusing on identifying one diagnosis-specific genotype-phenotype pattern. Meanwhile, MT-SCCALR not only selects relevant SNPs and imaging QTs for each diagnostic group alone, but also allows the selection of those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT-SCCALR yields better or similar canonical correlation coefficients and classification performances. In addition, it yields canonical weight patterns that are considerably more discriminative than those of competitors. This demonstrates the power and capability of MT-SCCALR in identifying diagnostically heterogeneous genotype-phenotype patterns, which would be helpful to understand the pathophysiology of brain disorders. AVAILABILITY AND IMPLEMENTATION The software is publicly available at https://github.com/dulei323/MTSCCALR. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lei Du
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Fang Liu
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Kefei Liu
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Xiaohui Yao
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Shannon L Risacher
  - Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Junwei Han
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Lei Guo
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Andrew J Saykin
  - Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
5
Dong Q, Zhang J, Li Q, Wang J, Leporé N, Thompson PM, Caselli RJ, Ye J, Wang Y. Integrating Convolutional Neural Networks and Multi-Task Dictionary Learning for Cognitive Decline Prediction with Longitudinal Images. J Alzheimers Dis 2020; 75:971-992. [PMID: 32390615 PMCID: PMC7427104 DOI: 10.3233/jad-190973] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/20/2022]
Abstract
BACKGROUND Disease progression prediction based on neuroimaging biomarkers is vital in Alzheimer's disease (AD) research. Convolutional neural networks (CNN) have proven powerful for various computer vision research by refining reliable and high-level feature maps from image patches. OBJECTIVE A key challenge in applying CNN to neuroimaging research is the limited labeled samples with high dimensional features. Another challenge is how to improve the prediction accuracy by joint analysis of multiple data sources (i.e., multiple time points or multiple biomarkers). To address these two challenges, we propose a novel multi-task learning framework based on CNN. METHODS First, we pre-trained CNN on the ImageNet dataset and transferred the knowledge from the pre-trained model to neuroimaging representation. We used this deep model as a feature extractor to generate high-level feature maps of different tasks. Then a novel unsupervised learning method, termed Multi-task Stochastic Coordinate Coding (MSCC), was proposed for learning sparse features of multi-task feature maps by using shared and individual dictionaries. Finally, Lasso regression was performed on these multi-task sparse features to predict AD progression measured by the Mini-Mental State Examination (MMSE) and the Alzheimer's Disease Assessment Scale cognitive subscale (ADAS-Cog). RESULTS We applied this novel CNN-MSCC system on the Alzheimer's Disease Neuroimaging Initiative dataset to predict future MMSE/ADAS-Cog scores. We found our method achieved superior performances compared with seven other methods. CONCLUSION Our work may add new insights into data augmentation and multi-task deep model research and facilitate the adoption of deep models in neuroimaging research.
Affiliation(s)
- Qunxi Dong
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
- Jie Zhang
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
- Qingyang Li
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
- Junwen Wang
  - Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ, 85259, USA
- Natasha Leporé
  - Department of Radiology, Children’s Hospital Los Angeles, Los Angeles, CA, USA
- Paul M. Thompson
  - Imaging Genetics Center, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA
- Jieping Ye
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Yalin Wang
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
6
Ma Q, Zhang T, Zanetti MV, Shen H, Satterthwaite TD, Wolf DH, Gur RE, Fan Y, Hu D, Busatto GF, Davatzikos C. Classification of multi-site MR images in the presence of heterogeneity using multi-task learning. Neuroimage Clin 2018; 19:476-486. [PMID: 29984156 PMCID: PMC6029565 DOI: 10.1016/j.nicl.2018.04.037] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 12/22/2017] [Revised: 04/09/2018] [Accepted: 04/28/2018] [Indexed: 12/21/2022]
Abstract
With the advent of Big Data Imaging Analytics applied to neuroimaging, datasets from multiple sites need to be pooled into larger samples. However, heterogeneity across different scanners, protocols and populations renders the task of finding underlying disease signatures challenging. The current work investigates the value of multi-task learning in finding disease signatures that generalize across studies and populations. Herein, we present a multi-task learning type of formulation, in which different tasks are from different studies and populations being pooled together. We test this approach in an MRI study of the neuroanatomy of schizophrenia (SCZ) by pooling data from 3 different sites and populations: Philadelphia, Sao Paulo and Tianjin (50 controls and 50 patients from each site), which posed integration challenges due to variability in disease chronicity, treatment exposure, and data collection. Some existing methods are also tested for comparison purposes. Experiments show that multi-task feature learning on multi-site data achieved higher classification accuracy than models trained on single-site data or on pooled data, and also outperformed the other comparison methods. Several anatomical regions were identified to be common discriminant features across sites. These included prefrontal, superior temporal, insular, anterior cingulate cortex, temporo-limbic and striatal regions consistently implicated in the pathophysiology of schizophrenia, as well as the cerebellum, precuneus, and fusiform, middle temporal, inferior parietal, postcentral, angular, lingual and middle occipital gyri. These results indicate that the proposed multi-task learning method is robust in finding consistent and reliable structural brain abnormalities associated with SCZ across different sites, in the presence of multiple sources of heterogeneity.
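A common way to formulate multi-task feature learning is to fit one linear model per site while an L2,1 penalty encourages all sites to select the same features. The sketch below assumes that formulation with a squared-error loss and a proximal gradient solver; the abstract does not state the exact objective or optimizer used in the paper, so the function names `prox_l21` and `multitask_feature_learning`, the loss, and the default hyperparameters are illustrative assumptions only.

```python
import numpy as np


def prox_l21(W, threshold):
    """Row-wise soft-thresholding: the proximal operator of
    threshold * sum_j ||W[j, :]||_2 (rows index features, columns index tasks)."""
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(row_norms, 1e-12))
    return W * scale


def multitask_feature_learning(Xs, ys, lam=0.1, n_iter=500):
    """Minimize sum_t ||X_t w_t - y_t||^2 / (2 n_t) + lam * ||W||_{2,1}.

    Xs, ys: lists with one design matrix and one target vector per task (site).
    Returns W of shape (n_features, n_tasks). Hypothetical sketch, not the
    paper's solver.
    """
    n_features = Xs[0].shape[1]
    W = np.zeros((n_features, len(Xs)))
    # The smooth loss is separable over tasks, so its gradient is Lipschitz
    # with constant max_t sigma_max(X_t)^2 / n_t.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        grad = np.column_stack([
            X.T @ (X @ W[:, t] - y) / X.shape[0]
            for t, (X, y) in enumerate(zip(Xs, ys))
        ])
        # Gradient step on the loss, then shrink whole feature rows jointly.
        W = prox_l21(W - step * grad, step * lam)
    return W
```

Because the penalty acts on whole rows of `W`, a feature is either kept for all sites or shrunk toward zero for all of them, while each site keeps its own coefficient values. This is one way to obtain discriminant features that are common across sites despite site-specific heterogeneity.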
Affiliation(s)
- Qiongmin Ma
  - College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan 410073, China; Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States; Beijing Institute of System Engineering, China
- Tianhao Zhang
  - Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States
- Marcus V Zanetti
  - Laboratory of Psychiatric Neuroimaging (LIM-21), Department and Institute of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil
- Hui Shen
  - College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan 410073, China
- Daniel H Wolf
  - Department of Psychiatry, University of Pennsylvania, Philadelphia, PA 19104, United States
- Raquel E Gur
  - Department of Psychiatry, University of Pennsylvania, Philadelphia, PA 19104, United States
- Yong Fan
  - Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States
- Dewen Hu
  - College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan 410073, China
- Geraldo F Busatto
  - Laboratory of Psychiatric Neuroimaging (LIM-21), Department and Institute of Psychiatry, Faculty of Medicine, University of São Paulo, São Paulo, Brazil
- Christos Davatzikos
  - Center for Biomedical Image Computing and Analytics, and Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, United States