1
Rani V, Nabi ST, Kumar M, Mittal A, Kumar K. Self-supervised Learning: A Succinct Review. Archives of Computational Methods in Engineering 2023; 30:2761-2775. [PMID: 36713767] [PMCID: PMC9857922] [DOI: 10.1007/s11831-023-09884-2]
Abstract
Machine learning has made significant advances in the field of image processing. The foundation of this success is supervised learning, which requires human-annotated labels and hence learns from labelled data, whereas unsupervised learning learns from unlabeled data. Self-supervised learning (SSL) is a type of unsupervised learning that supports downstream computer vision tasks such as object detection, image comprehension, and image segmentation. It can produce generic artificial intelligence systems at low cost using unstructured and unlabeled data. The authors of this review article present a detailed literature survey of self-supervised learning and its applications in different domains. The primary goal of the article is to demonstrate how visual features can be learned from images using self-supervised approaches. The authors also discuss the terminology used in self-supervised learning and related types of learning, such as contrastive learning and transfer learning. The article describes in detail the pipeline of self-supervised learning, including its two main phases: pretext and downstream tasks. The challenges encountered while working on self-supervised learning are discussed at the end of the article.
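The contrastive learning mentioned in this abstract can be illustrated with a minimal NT-Xent (normalized temperature-scaled cross-entropy) sketch; the toy embeddings, batch size, and temperature below are illustrative assumptions, not drawn from the review itself.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent loss over a batch of positive pairs (z1[i], z2[i]).

    z1 and z2 hold embeddings of two augmented views of the same images;
    each embedding's positive partner sits n positions away in the
    concatenated list, and every other embedding acts as a negative.
    """
    n = len(z1)
    z = z1 + z2
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner
        denom = sum(math.exp(cosine(z[i], z[k]) / tau)
                    for k in range(2 * n) if k != i)
        pos = math.exp(cosine(z[i], z[j]) / tau)
        total += -math.log(pos / denom)
    return total / (2 * n)

# Matching views give a lower loss than mismatched (shuffled) views.
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is minimized when each image's two views agree with each other and disagree with the rest of the batch, which is the pretext signal that stands in for human labels.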
Affiliation(s)
- Veenu Rani
- Department of Computational Sciences, Maharaja Ranjit Singh Punjab Technical University, Bathinda, Punjab, India
- Syed Tufael Nabi
- Department of Computational Sciences, Maharaja Ranjit Singh Punjab Technical University, Bathinda, Punjab, India
- Munish Kumar
- Department of Computational Sciences, Maharaja Ranjit Singh Punjab Technical University, Bathinda, Punjab, India
- Ajay Mittal
- University Institute of Engineering and Technology, Panjab University, Chandigarh, India
- Krishan Kumar
- University Institute of Engineering and Technology, Panjab University, Chandigarh, India
2
Learning from Demonstrations in Human–Robot Collaborative Scenarios: A Survey. Robotics 2022. [DOI: 10.3390/robotics11060126]
Abstract
Human–Robot Collaboration (HRC) is an interdisciplinary research area that has gained attention within the smart manufacturing context. To address changes within manufacturing processes, HRC seeks to combine the impressive physical capabilities of robots with the cognitive abilities of humans to design tasks with high efficiency, repeatability, and adaptability. During the implementation of an HRC cell, a key activity is robot programming that takes into account not only the robot's restrictions and the working space, but also human interactions. One of the most promising techniques is so-called Learning from Demonstration (LfD). This approach is based on a collection of learning algorithms inspired by how humans imitate behaviors to learn and acquire new skills, so that the programming task can be simplified and performed by the shop-floor operator. The aim of this work is to present a survey of this programming technique, with emphasis on collaborative scenarios rather than isolated tasks. The literature was classified and analyzed based on the main algorithms employed for skill/task learning and the level of human participation during the whole LfD process. Our analysis shows that human intervention has been poorly explored and its implications have not been carefully considered. Among the different methods of data acquisition, the prevalent method is physical guidance. Regarding data modeling, Dynamic Movement Primitives and Semantic Learning were the preferred methods for low-level and high-level task solving, respectively. This paper aims to provide guidance and insights for researchers looking for an introduction to LfD programming methods in the collaborative robotics context, and to identify research opportunities.
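A minimal sketch of the Dynamic Movement Primitives that this survey identifies as the preferred low-level method: a 1-D discrete DMP whose forcing term is fit to a single demonstration by locally weighted regression over Gaussian basis functions. The gains, basis count, and demonstration trajectory are all illustrative choices, not values from the survey.

```python
import math

class DMP1D:
    """Minimal 1-D discrete Dynamic Movement Primitive (illustrative)."""
    def __init__(self, n_basis=20, alpha=25.0, beta=6.25, ax=1.0):
        self.nb, self.a, self.b, self.ax = n_basis, alpha, beta, ax
        # Basis centers spread along the canonical variable's decay.
        self.c = [math.exp(-ax * i / (n_basis - 1)) for i in range(n_basis)]
        self.h = [n_basis / (ci ** 2) for ci in self.c]
        self.w = [0.0] * n_basis

    def _psi(self, x):
        return [math.exp(-h * (x - c) ** 2) for h, c in zip(self.h, self.c)]

    def fit(self, y, dt):
        """Learn forcing-term weights from one demonstration y(t) via LWR."""
        self.y0, self.g = y[0], y[-1]
        yd = [(y[i + 1] - y[i]) / dt for i in range(len(y) - 1)] + [0.0]
        ydd = [(yd[i + 1] - yd[i]) / dt for i in range(len(yd) - 1)] + [0.0]
        T = len(y)
        xs = [math.exp(-self.ax * i * dt) for i in range(T)]
        num, den = [0.0] * self.nb, [0.0] * self.nb
        for t in range(T):
            s = xs[t] * (self.g - self.y0)
            # Target forcing term implied by the demonstration.
            ft = ydd[t] - self.a * (self.b * (self.g - y[t]) - yd[t])
            for j, p in enumerate(self._psi(xs[t])):
                num[j] += p * s * ft
                den[j] += p * s * s
        self.w = [n / d if d > 1e-10 else 0.0 for n, d in zip(num, den)]

    def rollout(self, dt, steps):
        """Reproduce the motion: goal attractor plus learned forcing term."""
        y, yd, x, out = self.y0, 0.0, 1.0, []
        for _ in range(steps):
            psi = self._psi(x)
            f = sum(p * w for p, w in zip(psi, self.w)) / (sum(psi) + 1e-10)
            f *= x * (self.g - self.y0)
            ydd = self.a * (self.b * (self.g - y) - yd) + f
            yd += ydd * dt
            y += yd * dt
            x += -self.ax * x * dt
            out.append(y)
        return out

# A smooth demonstrated reach from 0 to 1 (e.g. from physical guidance).
dt, T = 0.01, 100
demo = [(1 - math.cos(math.pi * i / (T - 1))) / 2 for i in range(T)]
dmp = DMP1D()
dmp.fit(demo, dt)
repro = dmp.rollout(dt, T)
```

The goal attractor guarantees convergence to the demonstrated endpoint, while the decaying forcing term reshapes the path to match the operator's demonstration.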
3
Hakeem H, Feng W, Chen Z, Choong J, Brodie MJ, Fong SL, Lim KS, Wu J, Wang X, Lawn N, Ni G, Gao X, Luo M, Chen Z, Ge Z, Kwan P. Development and Validation of a Deep Learning Model for Predicting Treatment Response in Patients With Newly Diagnosed Epilepsy. JAMA Neurol 2022; 79:986-996. [PMID: 36036923] [PMCID: PMC9425285] [DOI: 10.1001/jamaneurol.2022.2514]
Abstract
Importance Selection of antiseizure medications (ASMs) for epilepsy remains largely a trial-and-error approach. Under this approach, many patients have to endure sequential trials of ineffective treatments until the "right drugs" are prescribed. Objective To develop and validate a deep learning model using readily available clinical information to predict treatment success with the first ASM for individual patients. Design, Setting, and Participants This cohort study developed and validated a prognostic model. Patients were treated between 1982 and 2020. All patients were followed up for a minimum of 1 year or until failure of the first ASM. A total of 2404 adults with epilepsy newly treated at specialist clinics in Scotland, Malaysia, Australia, and China between 1982 and 2020 were considered for inclusion, of whom 606 (25.2%) were excluded from the final cohort because of missing information in 1 or more variables. Exposures One of 7 antiseizure medications. Main Outcomes and Measures With the use of the transformer model architecture on 16 clinical factors and ASM information, this cohort study first pooled all cohorts for model training and testing. The model was trained again using the largest cohort and externally validated on the other 4 cohorts. The area under the receiver operating characteristic curve (AUROC), weighted balanced accuracy, sensitivity, and specificity of the model were all assessed for predicting treatment success based on the optimal probability cutoff. Treatment success was defined as complete seizure freedom for the first year of treatment while taking the first ASM. Performance of the transformer model was compared with other machine learning models. Results The final pooled cohort included 1798 adults (54.5% female; median age, 34 years [IQR, 24-50 years]). The transformer model that was trained using the pooled cohort had an AUROC of 0.65 (95% CI, 0.63-0.67) and a weighted balanced accuracy of 0.62 (95% CI, 0.60-0.64) on the test set. 
The model that was trained using only the largest cohort had AUROCs ranging from 0.52 to 0.60 and weighted balanced accuracies ranging from 0.51 to 0.62 in the external validation cohorts. The number of pretreatment seizures, the presence of psychiatric disorders, and electroencephalography and brain imaging findings were the most important clinical variables for the predicted outcomes in both models. The transformer model that was developed using the pooled cohort outperformed 2 of the 5 other models tested in terms of AUROC. Conclusions and Relevance In this cohort study, a deep learning model showed the feasibility of personalized prediction of response to ASMs based on clinical information. With improvement of performance, such as by incorporating genetic and imaging data, this model may potentially assist clinicians in selecting the right drug at the first trial.
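The study's headline metrics, AUROC and a classification threshold chosen at an "optimal probability cutoff", can be sketched concretely. Below, AUROC is computed via the rank-sum (Mann-Whitney) formulation and the cutoff is chosen by maximizing Youden's J (sensitivity + specificity - 1), one common definition of an optimal cutoff; the labels and scores are toy values, not the study's data, and the study does not state which cutoff criterion it used.

```python
def auroc(y_true, scores):
    """AUROC as the probability a positive outranks a negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_cutoff(y_true, scores):
    """Probability cutoff maximising Youden's J = sensitivity + specificity - 1."""
    best_c, best_j = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, y_true) if y == 1 and s >= c)
        fn = sum(1 for s, y in zip(scores, y_true) if y == 1 and s < c)
        tn = sum(1 for s, y in zip(scores, y_true) if y == 0 and s < c)
        fp = sum(1 for s, y in zip(scores, y_true) if y == 0 and s >= c)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        if sens + spec - 1 > best_j:
            best_c, best_j = c, sens + spec - 1
    return best_c

# Toy outcomes (1 = seizure freedom) and model probabilities.
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
```

On this toy data the AUROC is 8/9 and the Youden-optimal cutoff is 0.4; reported sensitivity and specificity then follow from that single threshold.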
Affiliation(s)
- Haris Hakeem
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Department of Neurology, Alfred Health, Melbourne, Victoria, Australia
- Wei Feng
- Department of Electrical and Computer Systems Engineering, Monash University, Clayton, Victoria, Australia
- Monash-Airdoc Research, Monash University, Melbourne, Victoria, Australia
- Zhibin Chen
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Jiun Choong
- Department of Electrical and Computer Systems Engineering, Monash University, Clayton, Victoria, Australia
- Martin J. Brodie
- Department of Medicine and Clinical Pharmacology, University of Glasgow, Glasgow, Scotland
- Si-Lei Fong
- Neurology Division, Department of Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
- Kheng-Seang Lim
- Neurology Division, Department of Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
- Junhong Wu
- Department of Neurology, the First Affiliated Hospital of Chongqing Medical University, Chongqing Key Laboratory of Neurology, Chongqing, China
- Xuefeng Wang
- Department of Neurology, the First Affiliated Hospital of Chongqing Medical University, Chongqing Key Laboratory of Neurology, Chongqing, China
- Nicholas Lawn
- WA Adult Epilepsy Service, Sir Charles Gairdner Hospital, Perth, Western Australia, Australia
- Guanzhong Ni
- Department of Neurology, the First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
- Xiang Gao
- Department of Pharmacy, the First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
- Mijuan Luo
- Department of Pharmacy, the First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
- Ziyi Chen
- Department of Neurology, the First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
- Zongyuan Ge
- Department of Electrical and Computer Systems Engineering, Monash University, Clayton, Victoria, Australia
- Monash-Airdoc Research, Monash University, Melbourne, Victoria, Australia
- Monash eResearch Centre, Monash University, Melbourne, Victoria, Australia
- Patrick Kwan
- Department of Neuroscience, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Department of Neurology, Alfred Health, Melbourne, Victoria, Australia
- Department of Neurology, the First Affiliated Hospital of Chongqing Medical University, Chongqing Key Laboratory of Neurology, Chongqing, China
4
RAISE: Rank-Aware Incremental Learning for Remote Sensing Object Detection. Symmetry 2022. [DOI: 10.3390/sym14051020]
Abstract
The deep learning method is widely used in remote sensing object detection on the premise that the training data have complete features. However, when data of a fixed class are added continuously, the trained detector is less able to adapt to the new instances, necessitating incremental learning (IL). IL has two tasks with knowledge-related symmetry: continuing to learn unknown knowledge and maintaining existing knowledge. Unknown knowledge is more likely to exist in the new instances, which have features dissimilar from those of the old instances and cannot be well adapted to by the detector before IL. Discarding all the old instances leads to catastrophic forgetting of existing knowledge, which can be alleviated by relearning old instances, while different subsets represent different ranges of existing knowledge and have different memory-retention effects on IL. Because the data differ in their value for IL, existing methods that do not treat them distinctly preclude the efficient absorption of useful knowledge. Therefore, a rank-aware instance-incremental learning (RAIIL) method is proposed in this article, which attends to differences in learning value through both the data-learning order and the training-loss weight. Specifically, RAIIL first computes a rank-score from the inference results and the true labels to determine the learning order, and then weights the training loss according to the rank-score to balance each instance's learning contribution. Comparative and analytical experiments conducted on two public remote sensing object detection datasets, DOTA and DIOR, verified the superiority and effectiveness of the proposed method.
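The paper's exact rank-score formula is not reproduced here; the sketch below only shows the general pattern it describes: score each instance by how poorly the pre-IL detector handles it, then use the score both to order training and to weight the loss. The scoring rule, normalization, and data are illustrative assumptions.

```python
def rank_scores(confidences, correct):
    """Illustrative rank-score: instances the current detector handles poorly
    (misdetected, or detected with low confidence) score higher, on the idea
    that they carry more unknown knowledge and deserve more learning effort."""
    return [1.0 - c if ok else 1.0 for c, ok in zip(confidences, correct)]

def curriculum(instances, scores):
    """Order instances by descending rank-score and attach per-instance loss
    weights, normalised to mean 1 so the overall loss scale is preserved."""
    mean = sum(scores) / len(scores)
    weighted = [(x, s / mean) for x, s in zip(instances, scores)]
    return sorted(weighted, key=lambda t: -t[1])

# Toy detector outputs on four new instances before IL.
insts = ["img_a", "img_b", "img_c", "img_d"]
conf = [0.95, 0.40, 0.80, 0.10]   # detection confidence on the true object
ok = [True, True, True, False]    # whether the detection was correct
order = curriculum(insts, rank_scores(conf, ok))
```

The misdetected instance is learned first with the largest loss weight, while the instance the detector already handles well contributes least, which is the distinguishing treatment the abstract argues existing methods lack.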
5
Classification and Fast Few-Shot Learning of Steel Surface Defects with Randomized Network. Applied Sciences 2022. [DOI: 10.3390/app12083967]
Abstract
Quality inspection is inevitable in the steel industry, so there are already benchmark datasets for the visual inspection of steel surface defects. In our work, we show, contrary to recent articles, that a generic state-of-the-art deep neural network is capable of almost-perfect classification of the defects in two popular benchmark datasets. However, in real-life applications new types of errors can always appear, so incremental learning based on very few example shots is challenging. In our article, we address the problems of the low number of available shots of new classes, the catastrophic forgetting of known information when tuning for new artifacts, and the long training time required for re-training or fine-tuning existing models. In the proposed new architecture we combine EfficientNet deep neural networks with randomized classifiers to provide an efficient solution to these demanding problems. The classification outperforms all other known approaches, with an accuracy of 100% or almost 100% on the two datasets, using the off-the-shelf network. The proposed few-shot learning approach shows considerably higher accuracy at a low number of shots than the other methods under testing, while its speed is significantly (at least 10 times) higher than that of its competitors. According to these results, the classification and few-shot learning of steel surface defects can be solved more efficiently than was previously possible.
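The randomized-classifier idea can be sketched as an ELM-style readout: a fixed random hidden layer followed by a ridge regression solved in closed form, which is why retraining for a new few-shot class is fast (no gradient descent). The 2-D toy features below stand in for frozen EfficientNet embeddings; the hidden size, regularization, and data are illustrative assumptions.

```python
import random
import math

random.seed(0)

def solve(A, b):
    """Naive Gauss-Jordan elimination for a small dense system Ax = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[col][col]:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

class RandomizedReadout:
    """ELM-style classifier: fixed random tanh hidden layer, ridge readout
    fitted in closed form, so adding few-shot classes needs no fine-tuning."""
    def __init__(self, in_dim, hidden=8, lam=1e-3):
        self.W = [[random.gauss(0, 1) for _ in range(in_dim)]
                  for _ in range(hidden)]
        self.lam, self.hidden = lam, hidden

    def _h(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in self.W]

    def fit(self, X, y):
        """Solve (H^T H + lam*I) beta = H^T y for the output weights."""
        H = [self._h(x) for x in X]
        n = self.hidden
        A = [[sum(H[t][i] * H[t][j] for t in range(len(H)))
              + (self.lam if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        b = [sum(H[t][i] * y[t] for t in range(len(H))) for i in range(n)]
        self.beta = solve(A, b)

    def predict(self, x):
        return 1 if sum(b * h for b, h in zip(self.beta, self._h(x))) >= 0 else -1

# Three shots per class of toy "defect embeddings", labels in {-1, +1}.
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
     [-1.0, -1.0], [-1.1, -0.8], [-0.9, -1.2]]
y = [1, 1, 1, -1, -1, -1]
clf = RandomizedReadout(in_dim=2)
clf.fit(X, y)
```

Because only the linear readout is (re)fitted, incorporating a new defect class costs one small linear solve rather than epochs of backpropagation, which is consistent with the at-least-10x speed advantage the abstract reports.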
6
AI and Clinical Decision Making: The Limitations and Risks of Computational Reductionism in Bowel Cancer Screening. Applied Sciences 2022. [DOI: 10.3390/app12073341]
Abstract
Advances in artificial intelligence in healthcare are frequently promoted as ‘solutions’ to improve the accuracy, safety, and quality of clinical decisions, treatments, and care. Despite some diagnostic success, however, AI systems rely on forms of reductive reasoning and computational determinism that embed problematic assumptions about clinical decision-making and clinical practice. Clinician autonomy, experience, and judgement are reduced to inputs and outputs framed as binary or multi-class classification problems benchmarked against a clinician’s capacity to identify or predict disease states. This paper examines this reductive reasoning in AI systems for colorectal cancer (CRC) to highlight their limitations and risks: (1) in AI systems themselves due to inherent biases in (a) retrospective training datasets and (b) embedded assumptions in underlying AI architectures and algorithms; (2) in the problematic and limited evaluations being conducted on AI systems prior to system integration in clinical practice; and (3) in marginalising socio-technical factors in the context-dependent interactions between clinicians, their patients, and the broader health system. The paper argues that to optimise benefits from AI systems and to avoid negative unintended consequences for clinical decision-making and patient care, there is a need for more nuanced and balanced approaches to AI system deployment and evaluation in CRC.
7
A Survey on Recent Advances in AI and Vision-Based Methods for Helping and Guiding Visually Impaired People. Applied Sciences 2022. [DOI: 10.3390/app12052308]
Abstract
We present in this paper the state of the art and an analysis of recent research work and achievements in the domain of AI-based and vision-based systems for helping blind and visually impaired people (BVIP). We start by highlighting the recent and tremendous importance that AI has acquired following the use of convolutional neural networks (CNNs) and their ability to solve image classification tasks efficiently. We also note that BVIP have high expectations of AI-based systems as a possible way to ease the perception of their environment and to improve their everyday life. We then set the scope of our survey: we concentrate our investigation on the use of CNNs and related methods in vision-based systems for helping BVIP. We analyze the existing surveys and study the current work (a selection of 30 case studies) along several dimensions, such as acquired data, learned models, and human–computer interfaces. We compare the different approaches and conclude by analyzing future trends in this domain.
8
Abstract
The proliferation of renewable energy source distributed generation (RES-DG) into the grid results in a time-varying inertia constant. To ensure the security of the grid under varying inertia, techniques for fast security assessment are required. In addition, considering the high penetration of RES-DG units into modern grids, security prediction using varying grid features is crucial. The computational burden of conventional time-domain security assessment techniques makes them unsuitable for real-time security prediction. This paper therefore proposes a fast security monitoring model that includes security prediction and load shedding for security control. The attributes considered in this paper are the load level, the inertia constant, the fault location, and the power dispatched from the renewable energy source generators. An incremental Naïve Bayes algorithm is applied to a training dataset developed from the responses of the grid to transient stability simulations. An additive Gaussian process regression (GPR) model is proposed to estimate the load shedding required for the predicted insecure states. Finally, an algorithm based on the nodes' security margin is proposed to determine the optimal node(s) for load shedding. The average training times of the security prediction and load-shedding estimation models are 1.2 s and 3 s, respectively. The results show that the proposed model can predict the security of the grid, estimate the amount of load shedding required, and determine the specific node for the load shedding operation.
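The incremental Naïve Bayes component can be sketched as a Gaussian model whose per-class feature statistics are updated one grid snapshot at a time (Welford's online update), so new transient-stability results refine the classifier without retraining from scratch. The feature set, values, and labels below are toy stand-ins for the paper's attributes, not its data.

```python
import math

class IncrementalGaussianNB:
    """Gaussian Naive Bayes with streaming (Welford) mean/variance updates."""
    def __init__(self, n_features):
        self.stats = {}  # class label -> [count, per-feature means, per-feature M2]
        self.nf = n_features

    def partial_fit(self, x, y):
        """Fold one new labelled snapshot into the class statistics."""
        if y not in self.stats:
            self.stats[y] = [0, [0.0] * self.nf, [0.0] * self.nf]
        s = self.stats[y]
        s[0] += 1
        for i, xi in enumerate(x):
            d = xi - s[1][i]
            s[1][i] += d / s[0]             # running mean
            s[2][i] += d * (xi - s[1][i])   # running sum of squared deviations

    def predict(self, x):
        """Pick the class with the highest log-posterior under the Gaussians."""
        best, best_lp = None, -math.inf
        total = sum(s[0] for s in self.stats.values())
        for y, (n, mu, m2) in self.stats.items():
            lp = math.log(n / total)  # class prior
            for i, xi in enumerate(x):
                var = m2[i] / n + 1e-6  # smoothed variance
                lp += -0.5 * (math.log(2 * math.pi * var)
                              + (xi - mu[i]) ** 2 / var)
            if lp > best_lp:
                best, best_lp = y, lp
        return best

# Toy features: [load level, inertia constant, RES dispatch].
nb = IncrementalGaussianNB(3)
for x, y in [([0.90, 2.0, 0.6], "insecure"), ([0.95, 1.8, 0.7], "insecure"),
             ([0.50, 5.0, 0.2], "secure"), ([0.55, 4.5, 0.3], "secure")]:
    nb.partial_fit(x, y)
```

A predicted "insecure" state would then be passed to the load-shedding estimator; only the classification stage is sketched here.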