1
|
Aggarwal M, Periwal V. Dory: Computation of persistence diagrams up to dimension two for Vietoris-Rips filtrations of large data sets. Journal of Computational Science 2024; 79:102290. [PMID: 38774487 PMCID: PMC11105796 DOI: 10.1016/j.jocs.2024.102290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2024]
Abstract
Persistent homology (PH) is an approach to topological data analysis (TDA) that computes multi-scale topologically invariant properties of high-dimensional data that are robust to noise. While PH has revealed useful patterns across various applications, computational requirements have limited applications to small data sets of a few thousand points. We present Dory, an efficient and scalable algorithm that can compute the persistent homology of sparse Vietoris-Rips complexes on larger data sets, up to and including dimension two and over the field Z₂. As an application, we compute the PH of the human genome at high resolution as revealed by a genome-wide Hi-C data set containing approximately three million points. Extant algorithms were unable to process it, whereas Dory processed it within five minutes, using less than five GB of memory. Results show that the topology of the human genome changes significantly upon treatment with auxin, a molecule that degrades cohesin, corroborating the hypothesis that cohesin plays a crucial role in loop formation in DNA.
Affiliation(s)
- Manu Aggarwal
  - Laboratory of Biological Modeling, NIDDK, National Institutes of Health, 31 Center Dr, Bethesda, 20892, MD, United States
- Vipul Periwal
  - Laboratory of Biological Modeling, NIDDK, National Institutes of Health, 31 Center Dr, Bethesda, 20892, MD, United States
|
2
|
Jeon ES, Choi H, Shukla A, Wang Y, Lee H, Buman MP, Turaga P. Topological Persistence Guided Knowledge Distillation for Wearable Sensor Data. Engineering Applications of Artificial Intelligence 2024; 130:107719. [PMID: 38282698 PMCID: PMC10810240 DOI: 10.1016/j.engappai.2023.107719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2024]
Abstract
Deep learning methods have achieved considerable success in various applications that convert wearable sensor data to actionable health insights. A common application area is activity recognition, where deep-learning methods still suffer from limitations such as sensitivity to signal quality, sensor characteristic variations, and variability between subjects. To mitigate these issues, robust features obtained by topological data analysis (TDA) have been suggested as a potential solution. However, there are two significant obstacles to using topological features in deep learning: (1) the large computational load of extracting topological features using TDA, and (2) the different signal representations obtained from deep learning and TDA, which make fusion difficult. In this paper, to enable integration of the strengths of topological methods in deep learning for time-series data, we propose to use two teacher networks - one trained on the raw time-series data, and another trained on persistence images generated by TDA methods. These two teachers are jointly used to distill a single student model, which utilizes only the raw time-series data at test time. This approach addresses both issues. The use of knowledge distillation (KD) with multiple teachers utilizes complementary information, and results in a compact model with strong supervisory features and an integrated richer representation. To assimilate desirable information from different modalities, we design new constraints, including orthogonality imposed on feature correlation maps for improving feature expressiveness and allowing the student to easily learn from the teacher. Also, we apply an annealing strategy in KD for fast saturation and better accommodation of different features, while the knowledge gap between the teachers and student is reduced. Finally, a robust student model is distilled, which at test time uses only the time-series data as input, while implicitly preserving topological features.
The experimental results demonstrate the effectiveness of the proposed method on wearable sensor data. The proposed method achieves 71.74% classification accuracy on GENEActiv with a WRN16-1 (1D CNNs) student, which outperforms baselines and takes much less processing time (less than 17 sec) than the teachers on 6k testing samples.
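As a rough illustration of the two-teacher idea described in this abstract, the sketch below combines a hard-label cross-entropy term with soft-target distillation terms from two teachers. It is a generic multi-teacher KD loss and not the authors' implementation: the function names, the weights `alpha` and `lam`, and the temperature value are all assumptions, and the orthogonality constraints and annealing strategy mentioned in the abstract are omitted.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def two_teacher_kd_loss(student_logits, ts_teacher_logits, pi_teacher_logits,
                        label, alpha=0.5, lam=0.7, temperature=4.0):
    """Hypothetical loss: cross-entropy plus weighted KD from two teachers.

    alpha balances the time-series teacher against the persistence-image
    teacher; lam balances the distillation term against the hard-label term.
    Both weights are illustrative assumptions, not values from the paper.
    """
    # Hard-label cross-entropy on the student's unsoftened prediction.
    ce = -math.log(softmax(student_logits)[label] + 1e-12)
    # Softened student prediction and soft targets from each teacher.
    s_soft = softmax(student_logits, temperature)
    kd_ts = kl_divergence(softmax(ts_teacher_logits, temperature), s_soft)
    kd_pi = kl_divergence(softmax(pi_teacher_logits, temperature), s_soft)
    # The T^2 factor keeps soft-target gradients comparable across temperatures.
    kd = (temperature ** 2) * (alpha * kd_ts + (1.0 - alpha) * kd_pi)
    return (1.0 - lam) * ce + lam * kd
```

Only the student's logits are needed at test time; the teacher logits enter the loss during training alone, which is what lets the student skip the costly persistence-image computation at inference.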
Affiliation(s)
- Eun Som Jeon
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, 85281, AZ, USA
- Hongjun Choi
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, 85281, AZ, USA
- Ankita Shukla
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, 85281, AZ, USA
- Yuan Wang
  - Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, 29208, SC, USA
- Hyunglae Lee
  - School for Engineering of Matter, Transport and Energy, Tempe, 85281, AZ, USA
- Matthew P Buman
  - College of Health Solutions, Arizona State University, Phoenix, 85004, AZ, USA
- Pavan Turaga
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, 85281, AZ, USA
|
3
|
Jeon ES, Choi H, Shukla A, Wang Y, Buman MP, Turaga P. Constrained Adaptive Distillation Based on Topological Persistence for Wearable Sensor Data. IEEE Transactions on Instrumentation and Measurement 2023; 72:2532014. [PMID: 38818128 PMCID: PMC11137740 DOI: 10.1109/tim.2023.3329818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Wearable sensor data analysis with persistence features generated by topological data analysis (TDA) has achieved great success in various applications; however, extracting topological features requires large computational and time resources. In this paper, our approach utilizes knowledge distillation (KD) with multiple teacher networks, trained on the raw time-series and on persistence images generated by TDA, respectively. However, direct transfer of knowledge to the student model from teacher models that use inputs with different characteristics results in a knowledge gap and limited performance. To address this problem, we introduce a robust framework that integrates multimodal features from two different teachers and enables a student to learn desirable knowledge effectively. To account for statistical differences between the modalities, an entropy-based constrained adaptive weighting mechanism is leveraged to automatically balance the effects of the teachers and encourage the student model to adequately adopt the knowledge from both. To assimilate dissimilar structural information generated by different styles of model for distillation, batch and channel similarities within a mini-batch are used. We demonstrate the effectiveness of the proposed method on wearable sensor data.
Affiliation(s)
- Eun Som Jeon
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA
- Hongjun Choi
  - Lawrence Livermore National Laboratory, Livermore, CA, USA
- Ankita Shukla
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA
- Yuan Wang
  - Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC 29208 USA
- Matthew P Buman
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004 USA
- Pavan Turaga
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA
|
4
|
Guo G, Zhao Y, Liu C, Fu Y, Xi X, Jin L, Shi D, Wang L, Duan Y, Huang J, Tan S, Yin G. Method for persistent topological features extraction of schizophrenia patients' electroencephalography signal based on persistent homology. Front Comput Neurosci 2022; 16:1024205. [PMID: 36277610 PMCID: PMC9579369 DOI: 10.3389/fncom.2022.1024205] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/21/2022] [Accepted: 09/21/2022] [Indexed: 11/13/2022] Open
Abstract
With the development of network science and graph theory, brain network research has unique advantages in explaining mental diseases whose neural mechanisms are unclear. Additionally, it can provide a new perspective for revealing the pathophysiological mechanisms of brain diseases at the system level. The selection of a threshold plays an important role in brain network construction, yet there are no generally accepted criteria for determining the proper threshold. Therefore, based on the topological data analysis of persistent homology theory, this study developed a multi-scale brain network modeling analysis method, which enables us to quantify various persistent topological features at different scales in a coherent manner. In this method, the Vietoris-Rips filtering algorithm is used to extract dynamic persistent topological features by gradually increasing the threshold over the full range of distances. Subsequently, the persistent topological features are visualized using barcodes and persistence diagrams. Finally, the stability of persistent topological features is analyzed by calculating the bottleneck distances and Wasserstein distances between the persistence diagrams. Experimental results show that, compared with existing methods, this method can extract the topological features of brain networks more accurately and improves the accuracy of diagnosis and classification. This work not only lays a foundation for exploring the higher-order topology of brain functional networks in schizophrenia patients, but also enhances the modeling ability of complex brain systems to better understand, analyze, and predict their dynamic behaviors.
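To make the "gradually increasing the threshold" step concrete, the sketch below computes only the dimension-0 part of a Vietoris-Rips persistence barcode (connected components) with a union-find structure. It is an illustrative toy under stated assumptions, not the method of this paper: the function name `h0_persistence` is hypothetical, the input is assumed to be a small list of points with Euclidean distances, and higher-dimensional features (loops, voids) are not handled.

```python
import math
from itertools import combinations

def h0_persistence(points):
    """Return the finite (birth, death) pairs of connected components
    in the Vietoris-Rips filtration of a small point set."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing pairwise distance.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    pairs = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at this threshold, so one of them dies.
            # Every vertex is born at threshold 0.
            parent[ri] = rj
            pairs.append((0.0, dist))
    return pairs

# Three collinear points: components merge at distances 1 and 4.
print(h0_persistence([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]))
# → [(0.0, 1.0), (0.0, 4.0)]
```

A set of n points yields n - 1 finite pairs; the one remaining component never dies and would appear as an infinite bar in the barcode.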
Affiliation(s)
- Guangxing Guo
  - College of Geography Science, Taiyuan Normal University, Jinzhong, China
  - Institute of Big Data Analysis Technology and Application, Taiyuan Normal University, Jinzhong, China
  - College of Resource and Environment, Shanxi Agricultural University, Taigu, China
- Yanli Zhao
  - Psychiatry Research Center, Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing, China
- Chenxu Liu
  - Laboratory of Data Mining and Machine Learning, College of Computer Science and Technology, Taiyuan Normal University, Jinzhong, China
- Yongcan Fu
  - Laboratory of Data Mining and Machine Learning, College of Computer Science and Technology, Taiyuan Normal University, Jinzhong, China
- Xinhua Xi
  - Laboratory of Data Mining and Machine Learning, College of Computer Science and Technology, Taiyuan Normal University, Jinzhong, China
- Lizhong Jin
  - College of Applied Science, Taiyuan University of Science and Technology, Taiyuan, China
- Dongli Shi
  - Laboratory of Data Mining and Machine Learning, College of Computer Science and Technology, Taiyuan Normal University, Jinzhong, China
- Lin Wang
  - Laboratory of Data Mining and Machine Learning, College of Computer Science and Technology, Taiyuan Normal University, Jinzhong, China
- Yonghong Duan
  - College of Resource and Environment, Shanxi Agricultural University, Taigu, China
- Jie Huang
  - Psychiatry Research Center, Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing, China
- Shuping Tan
  - Psychiatry Research Center, Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing, China
- Guimei Yin
  - Laboratory of Data Mining and Machine Learning, College of Computer Science and Technology, Taiyuan Normal University, Jinzhong, China
|
5
|
Jeon ES, Choi H, Shukla A, Wang Y, Buman MP, Turaga P. Topological Knowledge Distillation for Wearable Sensor Data. Conference Record. Asilomar Conference on Signals, Systems & Computers 2022; 2022:837-842. [PMID: 37583442 PMCID: PMC10426276 DOI: 10.1109/ieeeconf56349.2022.10052019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2023]
Abstract
Converting wearable sensor data to actionable health insights has attracted great interest in recent years. Deep learning methods have been used with considerable success in various wearable applications. However, wearable sensor data has unique issues related to sensitivity and variability between subjects, and dependency on sampling rate for analysis. To mitigate these issues, a different type of analysis using topological data analysis has shown promise as well. Topological data analysis (TDA) captures robust features, such as persistence images (PI), in complex data through the persistent homology algorithm, which holds the promise of boosting machine learning performance. However, because of the computational load required by TDA methods for large-scale data, integration and implementation have lagged behind. Further, many applications involving wearables require models to be compact enough to allow deployment on edge devices. In this context, knowledge distillation (KD) has been widely applied to generate a small model (student model) using a pre-trained high-capacity network (teacher model). In this paper, we propose a new KD strategy using two teacher models - one that uses the raw time-series and another that uses persistence images from the time-series. These two teachers then train a student using KD. In essence, the student learns from heterogeneous teachers providing different knowledge. To account for the different properties of the features from the teachers, we apply an annealing strategy and adaptive temperature in KD. Finally, a robust student model is distilled, which utilizes the time-series data only. We find that incorporating persistence features via a second teacher leads to significantly improved performance. This approach provides a unique way of fusing deep learning with topological features to develop effective models.
Affiliation(s)
- Eun Som Jeon
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA
- Hongjun Choi
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA
- Ankita Shukla
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA
- Yuan Wang
  - Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC 29208 USA
- Matthew P Buman
  - College of Health Solutions, Arizona State University, Phoenix, AZ 85004 USA
- Pavan Turaga
  - Geometric Media Lab, School of Arts, Media and Engineering and School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281 USA
|
6
|
Samani EU, Yang X, Banerjee AG. Visual Object Recognition in Indoor Environments Using Topologically Persistent Features. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3099460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
|
7
|
Moroni D, Pascali MA. Learning Topology: Bridging Computational Topology and Machine Learning. Pattern Recognition and Image Analysis 2021. [DOI: 10.1134/s1054661821030184] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
|