1
Wei Q, Zhang W. Class-incremental learning with Balanced Embedding Discrimination Maximization. Neural Netw 2024; 179:106487. [PMID: 38986188 DOI: 10.1016/j.neunet.2024.106487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 04/20/2024] [Accepted: 06/20/2024] [Indexed: 07/12/2024]
Abstract
Class-incremental learning aims to solve representation learning and classification tasks while avoiding catastrophic forgetting in scenarios where the number of categories keeps increasing. In this work, a unified method named Balanced Embedding Discrimination Maximization (BEDM) is developed to make the intermediate embedding more distinctive. Specifically, we utilize an orthogonality constraint based on a doubly-blocked Toeplitz matrix to minimize the correlation of convolution kernels, and an algorithm for similarity visualization is introduced. Furthermore, uneven samples and distribution shift among old and new tasks lead to strongly biased classifiers. To mitigate the imbalance, we propose an adaptive balance weighting in softmax to dynamically compensate for categories with insufficient samples. In addition, hybrid embedding learning is introduced to preserve knowledge from old models, which involves fewer hyper-parameters than conventional knowledge distillation. Our proposed method outperforms the existing approaches on three mainstream benchmark datasets. Moreover, we show through visualization that our method produces a more uniform similarity histogram and a more stable spectrum. Grad-CAM and t-SNE visualizations further confirm its effectiveness. Code is available at https://github.com/wqzh/BEDM.
Affiliation(s)
- Qinglai Wei
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Systems Engineering, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China
- Weiqin Zhang
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
2
Petrenko S, Hier DB, Bone MA, Obafemi-Ajayi T, Timpson EJ, Marsh WE, Speight M, Wunsch DC. Analyzing Biomedical Datasets with Symbolic Tree Adaptive Resonance Theory. INFORMATION 2024; 15:125. [DOI: 10.3390/info15030125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025] Open
Abstract
Biomedical datasets distill many mechanisms of human diseases, linking diseases to genes and phenotypes (signs and symptoms of disease), genetic mutations to altered protein structures, and altered proteins to changes in molecular functions and biological processes. It is desirable to gain new insights from these data, especially with regard to uncovering hierarchical structures relating disease variants. However, analysis to this end has proven difficult due to the complexity of the connections between multi-categorical symbolic data. This article proposes symbolic tree adaptive resonance theory (START), with additional supervised, dual-vigilance (DV-START), and distributed dual-vigilance (DDV-START) formulations, for the clustering of multi-categorical symbolic data from biomedical datasets, and demonstrates its utility in clustering variants of Charcot–Marie–Tooth disease using genomic, phenotypic, and proteomic data.
Affiliation(s)
- Sasha Petrenko
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Daniel B. Hier
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL 60607, USA
- Mary A. Bone
- Department of Science and Industry Systems, University of Southeastern Norway, 3616 Kongsberg, Norway
- Tayo Obafemi-Ajayi
- Engineering Program, Missouri State University, Springfield, MO 65897, USA
- Erik J. Timpson
- Honeywell Federal Manufacturing & Technologies, Kansas City, MO 64147, USA
- William E. Marsh
- Honeywell Federal Manufacturing & Technologies, Kansas City, MO 64147, USA
- Michael Speight
- Honeywell Federal Manufacturing & Technologies, Kansas City, MO 64147, USA
- Donald C. Wunsch
- Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
4
Baker MM, New A, Aguilar-Simon M, Al-Halah Z, Arnold SMR, Ben-Iwhiwhu E, Brna AP, Brooks E, Brown RC, Daniels Z, Daram A, Delattre F, Dellana R, Eaton E, Fu H, Grauman K, Hostetler J, Iqbal S, Kent C, Ketz N, Kolouri S, Konidaris G, Kudithipudi D, Learned-Miller E, Lee S, Littman ML, Madireddy S, Mendez JA, Nguyen EQ, Piatko C, Pilly PK, Raghavan A, Rahman A, Ramakrishnan SK, Ratzlaff N, Soltoggio A, Stone P, Sur I, Tang Z, Tiwari S, Vedder K, Wang F, Xu Z, Yanguas-Gil A, Yedidsion H, Yu S, Vallabha GK. A domain-agnostic approach for characterization of lifelong learning systems. Neural Netw 2023; 160:274-296. [PMID: 36709531 DOI: 10.1016/j.neunet.2023.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 10/11/2022] [Accepted: 01/08/2023] [Indexed: 01/21/2023]
Abstract
Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of (1) Continuous Learning, (2) Transfer and Adaptation, and (3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.
Affiliation(s)
- Megan M Baker
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd., Laurel, 20723, MD, USA
- Alexander New
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd., Laurel, 20723, MD, USA
- Mario Aguilar-Simon
- Teledyne Scientific Company - Intelligent Systems Laboratory, 19 T.W. Alexander Drive, RTP, 27709, NC, USA
- Ziad Al-Halah
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Sébastien M R Arnold
- Department of Computer Science, University of Southern California, Los Angeles, CA, USA
- Ese Ben-Iwhiwhu
- Department of Computer Science, Loughborough University, Loughborough, England, UK
- Andrew P Brna
- Teledyne Scientific Company - Intelligent Systems Laboratory, 19 T.W. Alexander Drive, RTP, 27709, NC, USA
- Ethan Brooks
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
- Ryan C Brown
- Teledyne Scientific Company - Intelligent Systems Laboratory, 19 T.W. Alexander Drive, RTP, 27709, NC, USA
- Anurag Daram
- University of Texas at San Antonio, San Antonio, TX, USA
- Fabien Delattre
- Department of Computer Science, University of Massachusetts Amherst, Amherst, MA, USA
- Ryan Dellana
- Sandia National Laboratories, Albuquerque, NM, USA
- Eric Eaton
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Haotian Fu
- Department of Computer Science, Brown University, Providence, RI, USA
- Kristen Grauman
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Shariq Iqbal
- Department of Computer Science, University of Southern California, Los Angeles, CA, USA
- Cassandra Kent
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Nicholas Ketz
- Information and Systems Sciences Laboratory, HRL Laboratories, 3011 Malibu Canyon Road, Malibu, 90265, CA, USA
- Soheil Kolouri
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA
- George Konidaris
- Department of Computer Science, Brown University, Providence, RI, USA
- Erik Learned-Miller
- Department of Computer Science, University of Massachusetts Amherst, Amherst, MA, USA
- Seungwon Lee
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Michael L Littman
- Department of Computer Science, Brown University, Providence, RI, USA
- Jorge A Mendez
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Eric Q Nguyen
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd., Laurel, 20723, MD, USA
- Christine Piatko
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd., Laurel, 20723, MD, USA
- Praveen K Pilly
- Information and Systems Sciences Laboratory, HRL Laboratories, 3011 Malibu Canyon Road, Malibu, 90265, CA, USA
- Aswin Raghavan
- SRI International, 201 Washington Rd, Princeton, NJ, USA
- Abrar Rahman
- SRI International, 201 Washington Rd, Princeton, NJ, USA
- Neale Ratzlaff
- Information and Systems Sciences Laboratory, HRL Laboratories, 3011 Malibu Canyon Road, Malibu, 90265, CA, USA
- Andrea Soltoggio
- Department of Computer Science, Loughborough University, Loughborough, England, UK
- Peter Stone
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Indranil Sur
- SRI International, 201 Washington Rd, Princeton, NJ, USA
- Zhipeng Tang
- Department of Computer Science, University of Massachusetts Amherst, Amherst, MA, USA
- Saket Tiwari
- Department of Computer Science, Brown University, Providence, RI, USA
- Kyle Vedder
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Felix Wang
- Sandia National Laboratories, Albuquerque, NM, USA
- Zifan Xu
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Harel Yedidsion
- Department of Computer Science, University of Texas at Austin, Austin, TX, USA
- Shangqun Yu
- Department of Computer Science, Brown University, Providence, RI, USA
- Gautam K Vallabha
- Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Rd., Laurel, 20723, MD, USA
6
Kudithipudi D, Aguilar-Simon M, Babb J, Bazhenov M, Blackiston D, Bongard J, Brna AP, Chakravarthi Raja S, Cheney N, Clune J, Daram A, Fusi S, Helfer P, Kay L, Ketz N, Kira Z, Kolouri S, Krichmar JL, Kriegman S, Levin M, Madireddy S, Manicka S, Marjaninejad A, McNaughton B, Miikkulainen R, Navratilova Z, Pandit T, Parker A, Pilly PK, Risi S, Sejnowski TJ, Soltoggio A, Soures N, Tolias AS, Urbina-Meléndez D, Valero-Cuevas FJ, van de Ven GM, Vogelstein JT, Wang F, Weiss R, Yanguas-Gil A, Zou X, Siegelmann H. Biological underpinnings for lifelong learning machines. NAT MACH INTELL 2022. [DOI: 10.1038/s42256-022-00452-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022]