1
Luo Z, He Y, Xue Y, Wang H, Hauskrecht M, Li T. Hierarchical Active Learning With Qualitative Feedback on Regions. IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS 2023; 53:581-589. [PMID: 37396345 PMCID: PMC10310296 DOI: 10.1109/thms.2023.3252815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Learning classification models in practice usually requires a large amount of labeled data for training. However, instance-based annotation can be inefficient for humans to perform. In this article, we propose and study a new type of human supervision that is fast to perform and useful for model learning. Instead of labeling individual instances, humans provide supervision to data regions, which are subspaces of the input data space, representing subpopulations of data. Since labeling is now performed at the region level, 0/1 labeling becomes imprecise. Thus, we design the region label to be a qualitative assessment of the class proportion, which coarsely preserves the labeling precision but is also easy for humans to provide. To identify informative regions for labeling and learning, we further devise a hierarchical active learning process that recursively constructs a region hierarchy. This process is semisupervised in the sense that it is driven by both active learning strategies and human expertise, where humans can provide discriminative features. To evaluate our framework, we conducted extensive experiments on nine datasets as well as a real user study on a survival analysis of colorectal cancer patients. The results clearly demonstrate the superiority of our region-based active learning framework over many instance-based active learning methods.
Affiliation(s)
- Zhipeng Luo
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
- Yazhou He
- Department of Oncology, West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu 610041, China
- Hongjun Wang
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
- Milos Hauskrecht
- Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260 USA
- Tianrui Li
- Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory, Sichuan 611756, China
2
Active learning for ordinal classification based on expected cost minimization. Sci Rep 2022; 12:22468. [PMID: 36577793 PMCID: PMC9797557 DOI: 10.1038/s41598-022-26844-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2022] [Accepted: 12/21/2022] [Indexed: 12/29/2022]
Abstract
To date, a large number of active learning algorithms have been proposed, but active learning methods for ordinal classification are under-researched. For ordinal classification, there is a total ordering among the data classes, and it is natural that the cost of misclassifying an instance as an adjacent class should be lower than that of misclassifying it as a more disparate class. However, existing active learning algorithms typically do not consider this ordering information in query selection. Thus, most of them do not perform satisfactorily in ordinal classification. This study proposes an active learning method for ordinal classification that considers the ordering information among classes. We design an expected cost minimization criterion that incorporates the ordering information. We also combine it with an uncertainty sampling criterion to make the queried instance more informative. Furthermore, we introduce a candidate subset selection method based on the k-means algorithm to reduce the computational overhead incurred by calculating the expected cost. Extensive experiments on nine public ordinal classification datasets demonstrate that the proposed method outperforms several baseline methods.
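The idea of combining an ordering-aware expected cost with an uncertainty term can be illustrated with a minimal sketch. It is not the criterion from the paper: the absolute-difference cost matrix, the entropy term, and the mixing weight `alpha` are all assumptions chosen for illustration, and the k-means candidate selection step is omitted.

```python
import numpy as np


def expected_cost_scores(proba: np.ndarray) -> np.ndarray:
    """Minimum expected misclassification cost per instance.

    Assumes cost[i, j] = |i - j| for true class i and predicted class j,
    so confusing adjacent classes is cheaper than confusing distant ones.
    """
    classes = np.arange(proba.shape[1])
    cost = np.abs(classes[:, None] - classes[None, :])
    # expected[n, j] = sum_i proba[n, i] * cost[i, j]
    expected = proba @ cost
    return expected.min(axis=1)


def entropy(proba: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row, used here as the uncertainty term."""
    p = np.clip(proba, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def select_query(proba: np.ndarray, alpha: float = 0.5) -> int:
    """Index of the unlabeled instance with the highest combined score.

    alpha is a hypothetical mixing weight between the two terms.
    """
    score = alpha * expected_cost_scores(proba) + (1.0 - alpha) * entropy(proba)
    return int(np.argmax(score))


# Rows are predicted class probabilities over three ordered classes.
proba = np.array([
    [0.5, 0.0, 0.5],  # mass split between the two extreme classes
    [0.5, 0.5, 0.0],  # mass split between two adjacent classes
    [1.0, 0.0, 0.0],  # confident prediction
])
print(expected_cost_scores(proba))
print(select_query(proba))
```

The first two rows have the same entropy, so a plain uncertainty criterion cannot separate them. The expected cost term prefers the first row because its probability mass sits on classes that are far apart in the ordering.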
3
Cui JW, Zhou Q, Lu S, Cheng Y, Wang J, Bai RL, Li WQ, Qian L, Chen XY, Fan Y, Huang C, Liu XQ, Tu HY, Yang JJ, Zhang L, Zhou JY, Zhong WZ, Wu YL. The Chinese Thoracic Oncology Group (CTONG) therapeutic option scoring system: a multiple-parameter framework to assess the value of lung cancer treatment options. Transl Lung Cancer Res 2021; 10:3594-3607. [PMID: 34584859 PMCID: PMC8435396 DOI: 10.21037/tlcr-21-388] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/08/2021] [Accepted: 08/17/2021] [Indexed: 12/24/2022]
Abstract
Background Currently, there is no standard context that conforms to the Chinese national framework for evaluating medical decisions regarding the treatment of lung cancer. Methods This draft was formulated after a systematic review and a focus group discussion among 20 experts, who were senior physicians with extensive clinical experience from the Chinese Thoracic Oncology Group (CTONG) task force. Subsequently, a draft and a five-point Likert scale were sent to 300 CTONG working group members. These were modified according to feedback from a four-round modified Delphi approach. Hence, the first version of the ‘Therapeutic option of lung cancer: CTONG scoring system’ was formulated. Afterward, a corresponding questionnaire was designed to collect opinions on the weight allocation of various indicators. This was issued through the WeChat platform, “Oncology News” application and e-mails from October 23, 2020, to November 25, 2020. Participants from numerous occupations in cancer-related fields from various regions of China were included in the study. Overall and subgroup analyses regarding weight allocations were performed. The differences between participant-allocated and reference weights were considered to adjust the framework. Results The framework contained four aspects and six indicators, including efficacy [progression-free survival (PFS)/overall survival (OS) and subsequent treatment], safety [treatment-related severe adverse event (SAE), dose adjustment], quality of life (QoL), and compensation. The reference weights were 50%, 5%, 10%, 5%, 10%, and 20% for each indicator. By November 25, 2020, 1,043 valid questionnaires had been obtained. The majority of the questionnaires were completed by physicians (86.5%). Subgroup analysis among the various groups showed an overall consistent trend. In addition, significant differences between the participant-allocated and reference weights were found among PFS/OS (difference: −11.5%), compensation (difference: −10.1%), and subsequent treatment (difference: 9.7%) indicators. After discussion, the final weight allocations were set at 45%, 10%, 15%, 5%, 10%, and 15% for PFS/OS, subsequent treatment, treatment-related SAE, dose adjustment, QoL, and compensation, respectively. Conclusions The CTONG scoring system, as an objective evaluation model that involves multiple parameters, is a breakthrough method for evaluating the therapeutic value of lung cancer treatment options in China, which is worthy of further verification in future clinical practice.
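The final weight allocation lends itself to a small worked example. The sketch below only shows how a weighted sum over the six indicators would be computed with the final weights stated in the abstract. The 0 to 100 per-indicator scale, the dictionary keys, and the example scores are assumptions, since the abstract does not say how individual indicators are scored.

```python
from typing import Mapping

# Final weights as stated in the abstract (they sum to 100%).
FINAL_WEIGHTS = {
    "pfs_os": 0.45,
    "subsequent_treatment": 0.10,
    "treatment_related_sae": 0.15,
    "dose_adjustment": 0.05,
    "qol": 0.10,
    "compensation": 0.15,
}


def weighted_score(indicator_scores: Mapping[str, float]) -> float:
    """Weighted sum of indicator scores.

    Assumes every indicator has already been scored on a common 0-100
    scale; that scale is a hypothetical choice for this illustration.
    """
    if set(indicator_scores) != set(FINAL_WEIGHTS):
        raise ValueError("scores must cover exactly the six indicators")
    return sum(FINAL_WEIGHTS[name] * indicator_scores[name] for name in FINAL_WEIGHTS)


# Hypothetical scores for one treatment option.
example = {
    "pfs_os": 80.0,
    "subsequent_treatment": 60.0,
    "treatment_related_sae": 70.0,
    "dose_adjustment": 90.0,
    "qol": 50.0,
    "compensation": 40.0,
}
print(round(weighted_score(example), 2))
```

Because PFS/OS carries 45% of the weight, a change in that indicator moves the total three times as much as the same change in treatment-related SAE or compensation.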
Affiliation(s)
- Jiu-Wei Cui
- Cancer Center, the First Hospital of Jilin University, Changchun, China
- Qing Zhou
- Guangdong Lung Cancer Institute, Guangdong Provincial People's Hospital & Guangdong Academy of Medical Sciences, Guangzhou, China
- Shun Lu
- Department of Shanghai Lung Cancer Center, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Ying Cheng
- Department of Medical Oncology, Jilin Province Tumour Hospital, Changchun, China
- Jie Wang
- Department of Medical Oncology, National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Ri-Lan Bai
- Cancer Center, the First Hospital of Jilin University, Changchun, China
- Wen-Qian Li
- Cancer Center, the First Hospital of Jilin University, Changchun, China
- Lei Qian
- Cancer Center, the First Hospital of Jilin University, Changchun, China
- Yun Fan
- Department of Medical Oncology, Zhejiang Cancer Hospital, Hangzhou, China
- Cheng Huang
- Department of Respiratory Medicine, Fujian Provincial Tumor Hospital, Fuzhou, China
- Xiao-Qing Liu
- Department of Pulmonary Oncology, 307 Hospital of the Academy of Military Medical Sciences, Beijing, China
- Hai-Yan Tu
- Guangdong Lung Cancer Institute, Guangdong Provincial People's Hospital & Guangdong Academy of Medical Sciences, Guangzhou, China
- Jin-Ji Yang
- Guangdong Lung Cancer Institute, Guangdong Provincial People's Hospital & Guangdong Academy of Medical Sciences, Guangzhou, China
- Li Zhang
- Department of Medical Oncology, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Jian-Ying Zhou
- Department of Respiratory Diseases, The First Affiliated Hospital of College of Medicine, Zhejiang University, Hangzhou, China
- Wen-Zhao Zhong
- Guangdong Lung Cancer Institute, Guangdong Provincial People's Hospital & Guangdong Academy of Medical Sciences, Guangzhou, China
- Yi-Long Wu
- Guangdong Lung Cancer Institute, Guangdong Provincial People's Hospital & Guangdong Academy of Medical Sciences, Guangzhou, China
4
Luo Z, Hauskrecht M. Hierarchical Active Learning with Overlapping Regions. PROCEEDINGS OF THE ... ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT. ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT 2020; 2020:1045-1054. [PMID: 33224554 DOI: 10.1145/3340531.3412022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
Learning of classification models from real-world data often requires substantial human effort devoted to instance annotation. As this process can be very time-consuming and costly, finding effective ways to reduce the annotation cost becomes critical for building such models. To address this problem we explore a new type of human feedback: region-based feedback. Briefly, a region is defined as a hypercubic subspace of the input data space and represents a subpopulation of data instances; the region's label is a human assessment of the class proportion of the data subpopulation. Using learning-from-label-proportions algorithms, one can learn instance-based classifiers from such labeled regions. In general, the key challenge is that there can be infinitely many regions one can define and query in a given data space. To minimize the number and complexity of region-based queries, we propose and develop a hierarchical active learning solution that aims at incrementally building a concise hierarchy of regions. Furthermore, to avoid building a possibly class-irrelevant region hierarchy, we propose to grow multiple different hierarchies in parallel and expand the more informative ones. Through experiments on numerous data sets, we demonstrate that methods using region-based feedback can learn very good classifiers from very few and simple queries, and hence are highly effective in reducing the human annotation effort needed for building classification models.
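Learning an instance-level classifier from region proportions can be sketched as follows. This is one simple learning-from-label-proportions formulation, assumed here for illustration and not taken from the paper: a logistic model is fitted so that its mean predicted probability inside each region matches that region's proportion label. The function names, learning rate, and iteration count are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_from_proportions(X, regions, proportions, lr=0.5, n_iter=2000):
    """Fit logistic-regression weights from region-level proportion labels.

    regions: list of index arrays, one per labeled region.
    proportions: positive-class proportion reported for each region.
    Minimizes sum over regions of (mean predicted probability - proportion)^2
    by plain gradient descent.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        grad = np.zeros_like(w)
        for idx, target in zip(regions, proportions):
            p = sigmoid(Xb[idx] @ w)
            # d/dw (mean(p) - target)^2 = 2 (mean(p) - target) * mean(p (1 - p) x)
            grad += 2.0 * (p.mean() - target) * ((p * (1.0 - p))[:, None] * Xb[idx]).mean(axis=0)
        w -= lr * grad
    return w


def predict_proba(X, w):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return sigmoid(Xb @ w)


# Two regions on a one-dimensional feature: one reported as all negative,
# one reported as all positive. No instance-level labels are used.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
regions = [np.array([0, 1]), np.array([2, 3])]
w = fit_from_proportions(X, regions, [0.0, 1.0])
print(predict_proba(X, w))
```

Pure regions like these behave much like ordinary labels. Mixed proportions constrain only the average prediction in a region, which is why the abstract stresses finding informative regions to query.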
Affiliation(s)
- Zhipeng Luo
- Department of Computer Science, University of Pittsburgh, Pittsburgh, Pennsylvania
- Milos Hauskrecht
- Department of Computer Science, University of Pittsburgh, Pittsburgh, Pennsylvania
5
Luo Z, Hauskrecht M. Region-Based Active Learning with Hierarchical and Adaptive Region Construction. PROCEEDINGS OF THE ... SIAM INTERNATIONAL CONFERENCE ON DATA MINING. SIAM INTERNATIONAL CONFERENCE ON DATA MINING 2020; 2019:441-449. [PMID: 31929950 DOI: 10.1137/1.9781611975673.50] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
Abstract
Learning of classification models in practice often relies on human annotation effort in which humans assign class labels to data instances. As this process can be very time-consuming and costly, finding effective ways to reduce the annotation cost becomes critical for building such models. To solve this problem, instead of soliciting instance-based annotation we explore region-based annotation as the human feedback. A region is defined as a hyper-cubic subspace of the input space X and it covers a subpopulation of data instances that fall into this region. Each region is labeled with a number in [0,1] (in binary classification setting), representing a human estimate of the positive (or negative) class proportion in the subpopulation. To quickly discover pure regions (in terms of class proportion) in the data, we have developed a novel active learning framework that constructs regions in a hierarchical and adaptive way. Hierarchical means that regions are incrementally built into a hierarchical tree, which is done by repeatedly splitting the input space. Adaptive means that our framework can adaptively choose the best heuristic for each of the region splits. Through experiments on numerous datasets we demonstrate that our framework can identify pure regions in very few region queries. Thus our approach is shown to be effective in learning classification models from very limited human feedback.
Affiliation(s)
- Zhipeng Luo
- Department of Computer Science, University of Pittsburgh, PA, USA
- Milos Hauskrecht
- Department of Computer Science, University of Pittsburgh, PA, USA
6
Kim LH, Lee EH, Galvez M, Aksoy M, Skare S, O’Halloran R, Edwards MSB, Holdsworth SJ, Yeom KW. Reduced field of view echo-planar imaging diffusion tensor MRI for pediatric spinal tumors. J Neurosurg Spine 2019; 31:607-615. [PMID: 31277060 PMCID: PMC6942637 DOI: 10.3171/2019.4.spine19178] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/14/2019] [Accepted: 04/01/2019] [Indexed: 11/06/2022]
Abstract
OBJECTIVE Spine MRI is a diagnostic modality for evaluating pediatric CNS tumors. Applying diffusion-weighted MRI (DWI) or diffusion tensor imaging (DTI) to the spine poses challenges due to intrinsic spinal anatomy that exacerbates various image-related artifacts, such as signal dropouts or pileups, geometrical distortions, and incomplete fat suppression. The zonal oblique multislice (ZOOM)-echo-planar imaging (EPI) technique reduces geometric distortion and image blurring by reducing the field of view (FOV) without signal aliasing into the FOV. The authors hypothesized that the ZOOM-EPI method for spine DTI in concert with conventional spinal MRI is an efficient method for augmenting the evaluation of pediatric spinal tumors. METHODS Thirty-eight consecutive patients (mean age 8 years) who underwent ZOOM-EPI spine DTI for CNS tumor workup were retrospectively identified. Patients underwent conventional spine MRI and ZOOM-EPI DTI spine MRI. Two blinded radiologists independently reviewed two sets of randomized images: conventional spine MRI without ZOOM-EPI DTI, and conventional spine MRI with ZOOM-EPI DTI. For both image sets, the reviewers scored the findings based on lesion conspicuity and diagnostic confidence using a 5-point Likert scale. The reviewers also recorded presence of tumors. Quantitative apparent diffusion coefficient (ADC) measurements of various spinal tumors were extracted. Tractography was performed in a subset of patients undergoing presurgical evaluation. RESULTS Sixteen patients demonstrated spinal tumor lesions. The readers were in moderate agreement (kappa = 0.61, 95% CI 0.30-0.91). The mean scores for conventional MRI and combined conventional MRI and DTI were as follows, respectively: 3.0 and 4.0 for lesion conspicuity (p = 0.0039), and 2.8 and 3.9 for diagnostic confidence (p < 0.001). ZOOM-EPI DTI identified new lesions in 3 patients. 
In 3 patients, tractography used for neurosurgical planning showed characteristic fiber tract projections. The mean weighted ADCs of low- and high-grade tumors were 1201 × 10⁻⁶ and 865 × 10⁻⁶ mm²/sec (p = 0.002), respectively; the mean minimum weighted ADCs were 823 × 10⁻⁶ and 474 × 10⁻⁶ mm²/sec (p = 0.0003), respectively. CONCLUSIONS Diffusion MRI with ZOOM-EPI can improve the detection of spinal lesions while providing quantitative diffusion information that helps distinguish low- from high-grade tumors. By adding a 2-minute DTI scan, quantitative diffusion information and tract profiles can reliably be obtained and serve as a useful adjunct to presurgical planning for pediatric spinal tumors.
Affiliation(s)
- Lily H. Kim
- Department of Neurosurgery, Stanford University School of Medicine, Stanford
- Edward H. Lee
- Department of Electrical Engineering, Stanford University, Stanford, California
- Michelle Galvez
- Department of Radiology, Stanford University School of Medicine, Stanford
- Murat Aksoy
- Department of Radiology, Stanford University School of Medicine, Stanford
- Stefan Skare
- Clinical Neuroscience, Karolinska Institute, Stockholm, Sweden
- Rafael O’Halloran
- Hyperfine Research Inc., Guilford, Connecticut; University of Auckland, New Zealand
- Samantha J. Holdsworth
- Department of Anatomy and Medical Imaging & Centre for Brain Research, Faculty of Medical and Health Sciences, University of Auckland, New Zealand
- Kristen W. Yeom
- Department of Radiology, Stanford University School of Medicine, Stanford
7
Luo Z, Hauskrecht M. Hierarchical Active Learning with Proportion Feedback on Regions. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES : EUROPEAN CONFERENCE, ECML PKDD ... : PROCEEDINGS. ECML PKDD (CONFERENCE) 2019; 11052:464-480. [PMID: 30740605 DOI: 10.1007/978-3-030-10928-8_28] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
Learning of classification models in practice often relies on human annotation effort in which humans assign class labels to data instances. As this process can be very time-consuming and costly, finding effective ways to reduce the annotation cost becomes critical for building such models. To solve this problem, instead of soliciting instance-based annotation we explore region-based annotation as the feedback. A region is defined as a hyper-cubic subspace of the input feature space and it covers a subpopulation of data instances that fall into this region. Each region is labeled with a number in [0,1] (in binary classification setting), representing a human estimate of the positive (or negative) class proportion in the subpopulation. To learn a classifier from region-based feedback we develop an active learning framework that hierarchically divides the input space into smaller and smaller regions. In each iteration we split the region with the highest potential to improve the classification models. This iterative process allows us to gradually learn more refined classification models from more specific regions with more accurate proportions. Through experiments on numerous datasets we demonstrate that our approach offers a new and promising active learning direction that can outperform existing active learning approaches, especially when the labeling budget is small.
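The hierarchical splitting loop can be sketched in a few lines. Everything specific below is an assumption made for illustration: the abstract does not define the "potential" of a region or the split rule, so the sketch uses region size times p(1 - p) as a stand-in potential and splits at the median of the widest feature. The `oracle` callable plays the role of the human annotator and is simulated from hidden labels in the usage example.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(eq=False)
class Region:
    lower: np.ndarray    # lower corner of the hypercube
    upper: np.ndarray    # upper corner of the hypercube
    members: np.ndarray  # indices of the instances that fall inside
    proportion: float    # annotator's estimate of the positive-class proportion


def potential(region):
    """Stand-in score: large, impure regions are the most worth splitting."""
    return len(region.members) * region.proportion * (1.0 - region.proportion)


def split_region(region, X, oracle):
    """Split at the median of the member-wise widest feature; query both halves."""
    Xr = X[region.members]
    feat = int(np.argmax(Xr.max(axis=0) - Xr.min(axis=0)))
    thr = float(np.median(Xr[:, feat]))
    go_left = Xr[:, feat] <= thr
    if go_left.all() or not go_left.any():
        return []  # cannot separate the members on this feature
    children = []
    for mask, is_left in ((go_left, True), (~go_left, False)):
        lower, upper = region.lower.copy(), region.upper.copy()
        if is_left:
            upper[feat] = thr
        else:
            lower[feat] = thr
        members = region.members[mask]
        children.append(Region(lower, upper, members, oracle(members)))
    return children


def grow_hierarchy(X, oracle, n_splits):
    """Repeatedly split the leaf with the highest potential; return the leaves."""
    X = np.asarray(X, dtype=float)
    everyone = np.arange(len(X))
    leaves = [Region(X.min(axis=0), X.max(axis=0), everyone, oracle(everyone))]
    for _ in range(n_splits):
        i = max(range(len(leaves)), key=lambda k: potential(leaves[k]))
        children = split_region(leaves[i], X, oracle)
        if not children:
            break
        leaves[i:i + 1] = children
    return leaves


# Simulated annotator: reports the true positive proportion of a region.
X = np.arange(8, dtype=float).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
leaves = grow_hierarchy(X, lambda idx: float(y[idx].mean()), n_splits=1)
print([(r.lower[0], r.upper[0], r.proportion) for r in leaves])
```

In this toy run a single split already yields two pure regions, which is the situation the abstracts in this series describe as the goal of region construction.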
Affiliation(s)
- Zhipeng Luo
- Department of Computer Science University of Pittsburgh, Pittsburgh PA 15260, USA
- Milos Hauskrecht
- Department of Computer Science University of Pittsburgh, Pittsburgh PA 15260, USA
8
Xue Y, Hauskrecht M. Active Learning of Multi-class Classification Models from Ordered Class Sets. PROCEEDINGS OF THE ... AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE. AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE 2019; 33:5589-5596. [PMID: 31750011 PMCID: PMC6867686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
In this paper, we study the problem of learning multi-class classification models from a limited set of labeled examples obtained from a human annotator. We propose a new machine learning framework that learns multi-class classification models from ordered class sets the annotator may use to express not only her top class choice but also other competing classes still under consideration. Such ordered sets of competing classes are common, for example, in various diagnostic tasks. In this paper, we first develop strategies for learning multi-class classification models from examples associated with ordered class set information. After that we develop an active learning strategy that considers such feedback. We evaluate the benefit of the framework on multiple datasets. We show that class-order feedback and active learning can reduce the annotation cost both individually and jointly.
Affiliation(s)
- Yanbing Xue
- Department of Computer Science, University of Pittsburgh,
9
Luo Z, Hauskrecht M. Hierarchical Active Learning with Group Proportion Feedback. IJCAI : PROCEEDINGS OF THE CONFERENCE 2018; 2018:2532-2538. [PMID: 30498326 DOI: 10.24963/ijcai.2018/351] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
Learning of classification models in practice often relies on nontrivial human annotation effort in which humans assign class labels to data instances. As this process can be very time-consuming and costly, finding effective ways to reduce the annotation cost becomes critical for building such models. In this work we solve this problem by exploring a new approach that actively learns classification models from groups, which are subpopulations of instances, and human feedback on the groups. Each group is labeled with a number in the [0,1] interval representing a human estimate of the proportion of instances with one of the class labels in this subpopulation. To form the groups to be annotated, we develop a hierarchical active learning framework that divides the whole population into smaller subpopulations, which allows us to gradually learn more refined models from the subpopulations and their class proportion labels. Our extensive experiments on numerous datasets show that our method is competitive and outperforms existing approaches for reducing the human annotation cost.
Affiliation(s)
- Zhipeng Luo
- Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Milos Hauskrecht
- Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260, USA
10
Luo Z, Hauskrecht M. Group-Based Active Learning of Classification Models. PROCEEDINGS OF THE ... INTERNATIONAL FLORIDA AI RESEARCH SOCIETY CONFERENCE. FLORIDA AI RESEARCH SYMPOSIUM 2017; 2017:92-97. [PMID: 28725882 PMCID: PMC5512732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Learning of classification models from real-world data often requires additional human expert effort to annotate the data. However, this process can be rather costly and finding ways of reducing the human annotation effort is critical for this task. The objective of this paper is to develop and study new ways of providing human feedback for efficient learning of classification models by labeling groups of examples. Briefly, unlike traditional active learning methods that seek feedback on individual examples, we develop a new group-based active learning framework that solicits label information on groups of multiple examples. In order to describe groups in a user-friendly way, conjunctive patterns are used to compactly represent groups. Our empirical study on 12 UCI data sets demonstrates the advantages of our approach over both classic instance-based active learning methods and existing group-based active learning methods.
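A conjunctive pattern can be pictured as a set of feature conditions that must all hold. The sketch below shows one way to represent such a pattern and collect the group it describes. The dictionary representation, the equality-only conditions, and the feature names are assumptions made for illustration and are not taken from the paper.

```python
def matches(pattern, example):
    """True if the example satisfies every condition in the conjunction."""
    return all(example.get(feature) == value for feature, value in pattern.items())


def group_members(pattern, examples):
    """Indices of the examples covered by the conjunctive pattern."""
    return [i for i, example in enumerate(examples) if matches(pattern, example)]


# Hypothetical examples with two categorical features.
examples = [
    {"age_group": "60+", "smoker": True},
    {"age_group": "60+", "smoker": False},
    {"age_group": "<60", "smoker": True},
]

# "age_group = 60+ AND smoker = True" describes a group in one readable line.
pattern = {"age_group": "60+", "smoker": True}
print(group_members(pattern, examples))
```

Adding a condition can only shrink the covered group, and the empty pattern covers every example, so patterns of increasing length naturally describe increasingly specific groups.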
Affiliation(s)
- Zhipeng Luo
- Department of Computer Science, University of Pittsburgh, PA