1
|
Li W, Wang T, Ng WWY. Population-Based Hyperparameter Tuning With Multitask Collaboration. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:5719-5731. [PMID: 34878983 DOI: 10.1109/tnnls.2021.3130896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Population-based optimization methods are widely used for hyperparameter (HP) tuning for a given specific task. In this work, we propose the population-based hyperparameter tuning with multitask collaboration (PHTMC), which is a general multitask collaborative framework with parallel and sequential phases for population-based HP tuning methods. In the parallel HP tuning phase, a shared population for all tasks is kept and the intertask relatedness is considered to both yield a better generalization ability and avoid data bias to a single task. In the sequential HP tuning phase, a surrogate model is built for each new-added task so that the metainformation from the existing tasks can be extracted and used to help the initialization for the new task. Experimental results show significant improvements in generalization abilities yielded by neural networks trained using the PHTMC and better performances achieved by multitask metalearning. Moreover, a visualization of the solution distribution and the autoencoder's reconstruction of both the PHTMC and a single-task population-based HP tuning method is compared to analyze the property with the multitask collaboration.
Collapse
|
2
|
Large scale multi-output multi-class classification using Gaussian processes. Mach Learn 2023. [DOI: 10.1007/s10994-022-06289-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/11/2023]
Abstract
AbstractMulti-output Gaussian processes (MOGPs) can help to improve predictive performance for some output variables, by leveraging the correlation with other output variables. In this paper, our main motivation is to use multiple-output Gaussian processes to exploit correlations between outputs where each output is a multi-class classification problem. MOGPs have been mostly used for multi-output regression. There are some existing works that use MOGPs for other types of outputs, e.g., multi-output binary classification. However, MOGPs for multi-class classification has been less studied. The reason is twofold: 1) when using a softmax function, it is not clear how to scale it beyond the case of a few outputs; 2) most common type of data in multi-class classification problems consists of image data, and MOGPs are not specifically designed to image data. We thus propose a new MOGPs model called Multi-output Gaussian Processes with Augment & Reduce (MOGPs-AR) that can deal with large scale classification and downsized image input data. Large scale classification is achieved by subsampling both training data sets and classes in each output whereas downsized image input data is handled by incorporating a convolutional kernel into the new model. We show empirically that our proposed model outperforms single-output Gaussian processes in terms of different performance metrics and multi-output Gaussian processes in terms of scalability, both in synthetic and in real classification problems. We include an example with the Ommiglot dataset where we showcase the properties of our model.
Collapse
|
3
|
A survey on multi-objective hyperparameter optimization algorithms for machine learning. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10359-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
AbstractHyperparameter optimization (HPO) is a necessary step to ensure the best possible performance of Machine Learning (ML) algorithms. Several methods have been developed to perform HPO; most of these are focused on optimizing one performance measure (usually an error-based measure), and the literature on such single-objective HPO problems is vast. Recently, though, algorithms have appeared that focus on optimizing multiple conflicting objectives simultaneously. This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms, distinguishing between metaheuristic-based algorithms, metamodel-based algorithms and approaches using a mixture of both. We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
Collapse
|
6
|
Abdar M, Samami M, Dehghani Mahmoodabad S, Doan T, Mazoure B, Hashemifesharaki R, Liu L, Khosravi A, Acharya UR, Makarenkov V, Nahavandi S. Uncertainty quantification in skin cancer classification using three-way decision-based Bayesian deep learning. Comput Biol Med 2021; 135:104418. [PMID: 34052016 DOI: 10.1016/j.compbiomed.2021.104418] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2021] [Revised: 04/01/2021] [Accepted: 04/17/2021] [Indexed: 12/18/2022]
Abstract
Accurate automated medical image recognition, including classification and segmentation, is one of the most challenging tasks in medical image analysis. Recently, deep learning methods have achieved remarkable success in medical image classification and segmentation, clearly becoming the state-of-the-art methods. However, most of these methods are unable to provide uncertainty quantification (UQ) for their output, often being overconfident, which can lead to disastrous consequences. Bayesian Deep Learning (BDL) methods can be used to quantify uncertainty of traditional deep learning methods, and thus address this issue. We apply three uncertainty quantification methods to deal with uncertainty during skin cancer image classification. They are as follows: Monte Carlo (MC) dropout, Ensemble MC (EMC) dropout and Deep Ensemble (DE). To further resolve the remaining uncertainty after applying the MC, EMC and DE methods, we describe a novel hybrid dynamic BDL model, taking into account uncertainty, based on the Three-Way Decision (TWD) theory. The proposed dynamic model enables us to use different UQ methods and different deep neural networks in distinct classification phases. So, the elements of each phase can be adjusted according to the dataset under consideration. In this study, two best UQ methods (i.e., DE and EMC) are applied in two classification phases (the first and second phases) to analyze two well-known skin cancer datasets, preventing one from making overconfident decisions when it comes to diagnosing the disease. The accuracy and the F1-score of our final solution are, respectively, 88.95% and 89.00% for the first dataset, and 90.96% and 91.00% for the second dataset. Our results suggest that the proposed TWDBDL model can be used effectively at different stages of medical image analysis.
Collapse
Affiliation(s)
- Moloud Abdar
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Geelong, Australia.
| | - Maryam Samami
- Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
| | - Sajjad Dehghani Mahmoodabad
- Department of Artificial Intelligence, Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran
| | - Thang Doan
- Department of Computer Science, McGill University / Mila, Montreal, Canada
| | - Bogdan Mazoure
- Department of Computer Science, McGill University / Mila, Montreal, Canada
| | | | - Li Liu
- Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Finland
| | - Abbas Khosravi
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Geelong, Australia
| | - U Rajendra Acharya
- School of Engineering, Ngee Ann Polytechnic, Singapore; Department of Biomedical Engineering, Singapore University of Social Sciences, Singapore; Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan
| | - Vladimir Makarenkov
- Department of Computer Science, University of Quebec in Montreal, Montreal, Canada
| | - Saeid Nahavandi
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Geelong, Australia
| |
Collapse
|
7
|
Abstract
AbstractMeta-learning, or learning to learn, is a machine learning approach that utilizes prior learning experiences to expedite the learning process on unseen tasks. As a data-driven approach, meta-learning requires meta-features that represent the primary learning tasks or datasets, and are estimated traditonally as engineered dataset statistics that require expert domain knowledge tailored for every meta-task. In this paper, first, we propose a meta-feature extractor called Dataset2Vec that combines the versatility of engineered dataset meta-features with the expressivity of meta-features learned by deep neural networks. Primary learning tasks or datasets are represented as hierarchical sets, i.e., as a set of sets, esp. as a set of predictor/target pairs, and then a DeepSet architecture is employed to regress meta-features on them. Second, we propose a novel auxiliary meta-learning task with abundant data called dataset similarity learning that aims to predict if two batches stem from the same dataset or different ones. In an experiment on a large-scale hyperparameter optimization task for 120 UCI datasets with varying schemas as a meta-learning task, we show that the meta-features of Dataset2Vec outperform the expert engineered meta-features and thus demonstrate the usefulness of learned meta-features for datasets with varying schemas for the first time.
Collapse
|
8
|
Automated machine learning: Review of the state-of-the-art and opportunities for healthcare. Artif Intell Med 2020; 104:101822. [DOI: 10.1016/j.artmed.2020.101822] [Citation(s) in RCA: 197] [Impact Index Per Article: 49.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2019] [Revised: 01/17/2020] [Accepted: 02/17/2020] [Indexed: 12/13/2022]
|
10
|
Da B, Ong YS, Gupta A, Feng L, Liu H. Fast transfer Gaussian process regression with large-scale sources. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2018.11.029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|