1
Ita K, Roshanaei S. Artificial intelligence for skin permeability prediction: deep learning. J Drug Target 2024; 32:334-346. [PMID: 38258521 DOI: 10.1080/1061186x.2024.2309574] [Received: 12/01/2023] [Accepted: 01/07/2024] [Indexed: 01/24/2024]
Abstract
BACKGROUND AND OBJECTIVE: Measuring the permeability coefficient (Kp) of xenobiotics takes significant laboratory time and effort. As an alternative to this labour-intensive procedure, predictive models have been used to describe the transport of xenobiotics across the skin. Most quantitative structure-permeability relationship (QSPR) models are derived statistically from experimental data. Recently, artificial intelligence-based computational drug delivery has attracted considerable interest. Deep learning is an umbrella term for machine-learning algorithms built on deep neural networks (DNNs). Distinct network architectures, such as convolutional neural networks (CNNs), feedforward neural networks (FNNs), and recurrent neural networks (RNNs), can be employed for prediction.
METHODS: In this project, we used a CNN, an FNN, and an RNN to predict skin permeability coefficients from a publicly available database reported by Cheruvu et al. The dataset contains 476 records of 145 chemicals, xenobiotics, and pharmaceuticals applied to human epidermis in vitro from aqueous solutions of constant concentration, either saturated in infinite-dose quantities or diluted. All computations were run in Python under the Anaconda and JupyterLab environment after importing the required Python, Keras, and TensorFlow modules.
RESULTS: The CNN, FNN, and RNN were each used to predict log Kp.
CONCLUSION: This work shows that deep learning networks can be successfully used to digitally screen and predict the skin permeability of xenobiotics.
Affiliation(s)
- Kevin Ita
  - College of Pharmacy, Touro University, Vallejo, CA, USA
2
Yuan K, Zhang X, Yang Q, Deng X, Deng Z, Liao X, Si W. Risk prediction and analysis of gallbladder polyps with deep neural network. Comput Assist Surg (Abingdon) 2024; 29:2331774. [PMID: 38520294 DOI: 10.1080/24699322.2024.2331774] [Indexed: 03/25/2024] Open
Abstract
This study analyzes the risk factors associated with the development of adenomatous and malignant polyps in the gallbladder. Adenomatous polyps of the gallbladder are considered precancerous and have a high likelihood of progressing to malignancy. Preoperatively, distinguishing among benign, adenomatous, and malignant gallbladder polyps is challenging. The objective was therefore to develop a neural network model that uses these risk factors to predict the nature of polyps before surgery and so improve diagnostic accuracy. A retrospective study was conducted on patients who underwent cholecystectomy at the Department of Hepatobiliary Surgery of the Second People's Hospital of Shenzhen between January 2017 and December 2022. The patients' clinical characteristics, laboratory results, and ultrasonographic indices were examined, and the risk variables for adenomatous and malignant polyps were used to build a neural network model for predicting polyp type. A normalized confusion matrix, a precision-recall (PR) curve, and a receiver operating characteristic (ROC) curve were used to evaluate the model. In total, 287 cases of benign gallbladder polyps, 15 cases of adenomatous polyps, and 27 cases of malignant polyps were analyzed. Hepatitis B core antibody (95% CI -0.237 to 0.061, p < 0.001), number of polyps (95% CI -0.214 to -0.052, p = 0.001), polyp size (95% CI 0.038 to 0.051, p < 0.001), wall thickness (95% CI 0.042 to 0.081, p < 0.001), and gallbladder size (95% CI 0.185 to 0.367, p < 0.001) emerged as independent predictors of adenomatous and malignant gallbladder polyps.
Based on these findings, we developed a predictive classification model for gallbladder polyps (GBPs): Predictive classification model for GBPs = -0.149 × core antibody - 0.033 × number of polyps + 0.045 × polyp size + 0.061 × wall thickness + 0.276 × gallbladder size - 4.313. The areas under the PR and ROC curves for the prediction model were 0.945 and 0.930, respectively, indicating excellent predictive capability. A polyp size of 10 mm was the optimal cutoff value for diagnosing gallbladder adenoma, with a sensitivity of 81.5% and a specificity of 60.0%. For the diagnosis of gallbladder cancer, the sensitivity and specificity were 81.5% and 92.5%, respectively. The neural network learning algorithm introduced here uses the five identified risk factors (hepatitis B core antibody, polyp number, polyp size, wall thickness, and gallbladder size) to predict the nature of gallbladder polyps. By identifying the nature of these polyps before surgery, the model can help patients make informed decisions about treatment and management, with the aim of improving patient outcomes and the overall effectiveness of care.
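The linear score reported in this abstract can be evaluated directly. The sketch below is illustrative only: the coefficients and intercept come from the abstract, while the dictionary keys, the feature encoding, and the example values are assumptions, since the abstract does not state units (other than millimetres for the polyp-size cutoff) or how the score is mapped to a polyp class.

```python
# Coefficients and intercept as reported in the abstract.
# Key names are hypothetical; units and encodings are NOT given in the source.
COEFFS = {
    "core_antibody": -0.149,
    "number_of_polyps": -0.033,
    "polyp_size": 0.045,
    "wall_thickness": 0.061,
    "gallbladder_size": 0.276,
}
INTERCEPT = -4.313


def gbp_score(features):
    """Return the linear predictor for one patient.

    `features` maps each key in COEFFS to a numeric value.
    """
    missing = set(COEFFS) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)


# Hypothetical input values, chosen only to show the call.
example = {
    "core_antibody": 1,
    "number_of_polyps": 1,
    "polyp_size": 10,
    "wall_thickness": 3,
    "gallbladder_size": 7,
}
print(round(gbp_score(example), 3))
```

The signs alone are informative: larger polyp size, wall thickness, and gallbladder size raise the score, while the core antibody and polyp count terms lower it.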
Affiliation(s)
- Kerong Yuan
  - Department of Hepatobiliary Surgery, the First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People's Hospital, Shenzhen, P.R. China
- Xiaofeng Zhang
  - School of Mechanical Engineering, Nantong University, Nantong, P.R. China
- Qian Yang
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P.R. China
- Xuesong Deng
  - Department of Hepatobiliary Surgery, the First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People's Hospital, Shenzhen, P.R. China
- Zhe Deng
  - Department of Emergency Medicine, the First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People's Hospital, Shenzhen, P.R. China
- Xiangyun Liao
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P.R. China
- Weixin Si
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, P.R. China
3
Bennett HJ, Estler K, Valenzuela K, Weinhandl JT. Predicting Knee Joint Contact Forces During Normal Walking Using Kinematic Inputs With a Long-Short Term Neural Network. J Biomech Eng 2024; 146:081004. [PMID: 38270972 DOI: 10.1115/1.4064550] [Received: 07/30/2023] [Accepted: 01/19/2024] [Indexed: 01/26/2024]
Abstract
Knee joint contact forces are commonly estimated via surrogate measures (i.e., external knee adduction moments or musculoskeletal modeling). Despite its capabilities, modeling is not optimal for clinicians or persons with limited experience. The purpose of this study was to design a novel prediction method for knee joint contact forces that is simple in terms of required inputs. This study included marker trajectories and instrumented knee forces during normal walking from the "Grand Challenge" (n = 6) and "CAMS" (n = 2) datasets. Inverse kinematics were used to derive stance-phase hip (sagittal, frontal, transverse), knee (sagittal, frontal), ankle (sagittal), and trunk (frontal) kinematics. A long short-term memory (LSTM) network was created in MATLAB to predict medial and lateral knee force waveforms from combinations of the kinematics. The network was trained on the Grand Challenge dataset and tested on the CAMS dataset. Musculoskeletal modeling forces were derived using the static optimization and joint reaction tools in OpenSim. Waveform accuracy was quantified as the proportion of variance (R²) and the root-mean-square error (RMSE) between network predictions and in vivo data. The LSTM network was highly accurate for medial forces (R² = 0.77, RMSE = 0.27 BW) and required only frontal hip and knee and sagittal hip and ankle kinematics. Modeled medial force predictions were excellent (R² = 0.77, RMSE = 0.33 BW). Lateral force predictions were poor for both methods (LSTM R² = 0.18, RMSE = 0.08 BW; modeling R² = 0.21, RMSE = 0.54 BW). The designed LSTM network outperformed most reports of musculoskeletal modeling, including those reached in this study, revealing that knee joint forces can be accurately predicted using only kinematic input variables.
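The two waveform accuracy measures named in this abstract can be sketched as follows. This is an assumption-laden illustration: the abstract does not say whether "proportion of variance" means the coefficient of determination or a squared correlation, so the coefficient of determination is used here, and the function names and sample values are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length waveforms."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("waveforms must be non-empty and equal in length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(predicted, observed):
    """Coefficient of determination: 1 - SS_residual / SS_total.

    Assumed definition; a squared Pearson correlation would give
    different values when predictions are biased.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed waveform has zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical force samples in body weights, for illustration only.
measured = [0.8, 1.6, 1.2, 1.9, 0.9]
network = [0.9, 1.4, 1.3, 1.7, 1.0]
print(rmse(network, measured), r_squared(network, measured))
```

Reporting both is useful because they answer different questions: RMSE gives the typical error magnitude in force units, while R² describes how much of the waveform shape is captured.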
Affiliation(s)
- Hunter J Bennett
  - Neuromechanics Laboratory, Old Dominion University, 1007 Student Recreation Center, Norfolk, VA 23529
- Kaileigh Estler
  - Department of Kinesiology, Recreation, and Sport Studies, The University of Tennessee, Knoxville, TN 37996
  - University of Tennessee at Knoxville
- Kevin Valenzuela
  - Department of Kinesiology, California State University, Long Beach, CA 90840
- Joshua T Weinhandl
  - Department of Kinesiology, Recreation, and Sport Studies, The University of Tennessee, Knoxville, TN 37996
4
Chen P, Fu R, Shi Y, Liu C, Yang C, Su Y, Lu T, Zhou P, He W, Guo Q, Fei C. Optimizing BP neural network algorithm for Pericarpium Citri Reticulatae (Chenpi) origin traceability based on computer vision and ultra-fast gas-phase electronic nose data fusion. Food Chem 2024; 442:138408. [PMID: 38241985 DOI: 10.1016/j.foodchem.2024.138408] [Received: 09/27/2023] [Revised: 01/03/2024] [Accepted: 01/08/2024] [Indexed: 01/21/2024]
Abstract
This study used computer vision to extract color and texture features of Pericarpium Citri Reticulatae (PCR). The ultra-fast gas-phase electronic nose (UF-GC-E-nose) technique identified 98 volatile components, including olefins, alcohols, and esters, which contribute significantly to the flavor profile of PCR. Multivariate statistical analysis was applied to the appearance traits of PCR, identifying 57 potential marker-trait factors (VIP > 1 and P < 0.05) among the 118 trait factors that can distinguish PCR from different origins. These factors include color, texture, and odor traits. By integrating multivariate statistical analysis with the BP neural network algorithm, a novel artificial intelligence algorithm was developed and optimized for tracing the origin of PCR. This algorithm achieved a 100% discrimination rate in differentiating PCR samples from various origins. This study offers a valuable reference and data support for developing intelligent algorithms that fuse data from multiple intelligent sensory technologies to achieve rapid traceability of food origins.
Affiliation(s)
- Peng Chen
  - Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing 210095, China
- Rao Fu
  - College of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Yabo Shi
  - College of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Chang Liu
  - Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing 210095, China
- Chenlu Yang
  - Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing 210095, China
- Yong Su
  - Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing 210095, China
- Tulin Lu
  - College of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Peina Zhou
  - State Key Laboratory of Natural Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 210009, China
- Weitong He
  - Jiangsu Wigroup Technologies Co., Ltd., Nanjing 210000, China
- Qiaosheng Guo
  - Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing 210095, China
- Chenghao Fei
  - Institute of Chinese Medicinal Materials, Nanjing Agricultural University, Nanjing 210095, China
5
Wei J, Dai J, Sun Y, Meng Z, Ma H, Zhou Y. TIRPnet: Risk prediction of traditional Chinese medicine ingredients based on a deep neural network. J Ethnopharmacol 2024; 325:117860. [PMID: 38316222 DOI: 10.1016/j.jep.2024.117860] [Received: 11/06/2023] [Revised: 02/01/2024] [Accepted: 02/02/2024] [Indexed: 02/07/2024]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE: Traditional Chinese medicine (TCM) has a history of over 3000 years of medical practice. Because TCM has complex ingredients and unclear pharmacological mechanisms, its risks are very difficult to predict. With the increase in the number and severity of spontaneous reports of adverse drug reactions (ADRs) to TCM, its safety has received widespread attention.
AIM OF THE STUDY: We proposed a deep learning framework to predict the probability of adverse reactions caused by TCM ingredients and validated the model using real-world data.
MATERIALS AND METHODS: Spontaneous reporting data from Jiangsu Province, China, comprising 72,561 ADR reports of TCMs, were selected as the research data. All the ingredients of these TCMs were collected from the medical website and correlated with the corresponding ADRs. A risk prediction model named TIRPnet was then constructed based on a deep neural network (DNN). Using one-hot encoded data, the model reached its best performance after fine-tuning of some hyperparameters. The ten most commonly used TCM ingredients and their ADRs were collected as the test set and served as objective criteria for evaluating performance.
RESULTS: TIRPnet was constructed as a 7-layer DNN. It achieved a sensitivity of 0.950, specificity of 0.995, accuracy of 0.994, precision of 0.708, and F1 of 0.811.
CONCLUSIONS: By learning from a large number of TCM-related spontaneous reports, TIRPnet can predict the ADRs of a single TCM ingredient, which can help doctors design safe prescriptions and provide technical support for the pharmacovigilance of TCM.
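The five indicators reported for TIRPnet are standard functions of a binary confusion matrix. The sketch below shows those definitions and checks that the reported F1 is consistent with the reported precision and sensitivity. The function names and the example counts are hypothetical; the abstract does not give the underlying confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)            # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Consistency check using only the values stated in the abstract.
print(round(f1_from(0.708, 0.950), 3))  # 0.811

# Hypothetical counts, for illustration of the function only.
print(classification_metrics(tp=8, fp=2, tn=88, fn=2))
```

The harmonic mean explains why F1 (0.811) sits well below accuracy (0.994): it is pulled toward the lower of precision and sensitivity rather than being dominated by the many true negatives.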
Affiliation(s)
- Jianxiang Wei
  - School of Management, Nanjing University of Posts and Telecommunications, Nanjing, 210003, China
- Jimin Dai
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Yuehong Sun
  - School of Mathematical Sciences, Nanjing Normal University, Nanjing, 210023, China
- Zhe Meng
  - School of Management, Nanjing University of Posts and Telecommunications, Nanjing, 210003, China
- Hengyuan Ma
  - School of Management, Nanjing University of Posts and Telecommunications, Nanjing, 210003, China
- Yujin Zhou
  - School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, 210003, China
6
Cheng N, Gao Y, Ju S, Kong X, Lyu J, Hou L, Jin L, Shen B. Serum analysis based on SERS combined with 2D convolutional neural network and Gramian angular field for breast cancer screening. Spectrochim Acta A Mol Biomol Spectrosc 2024; 312:124054. [PMID: 38382221 DOI: 10.1016/j.saa.2024.124054] [Received: 09/12/2023] [Revised: 02/08/2024] [Accepted: 02/17/2024] [Indexed: 02/23/2024]
Abstract
Breast cancer is a significant cause of death among women worldwide, and quick, accurate diagnosis is crucial for reducing mortality. Traditional diagnostic techniques based on medical imaging and pathology samples are commonly used in breast cancer screening but still have certain limitations. Surface-enhanced Raman spectroscopy (SERS) is a fast, highly sensitive, and user-friendly method that is often combined with deep learning techniques such as convolutional neural networks. This combination helps identify unique molecular spectral features, also known as "fingerprints", in biological samples such as serum, and can thereby screen accurately for cancer. The Gramian angular field (GAF) algorithm converts one-dimensional (1D) time series into two-dimensional (2D) images that can be used for data visualization, pattern recognition, and machine learning tasks. In this study, 640 serum SERS spectra from breast cancer patients and healthy volunteers were converted into 2D spectral images by the GAF technique. These images were then used to train and test a two-dimensional convolutional neural network-GAF (2D-CNN-GAF) model for breast cancer classification. We compared the performance of the 2D-CNN-GAF model with other methods, including a one-dimensional convolutional neural network (1D-CNN), support vector machine (SVM), K-nearest neighbor (KNN), and principal component analysis-linear discriminant analysis (PCA-LDA), using evaluation metrics such as accuracy, precision, sensitivity, F1-score, receiver operating characteristic (ROC) curve, and area under the curve (AUC). The 2D-CNN-GAF model outperformed the traditional models, achieving an AUC of 0.9884, an accuracy of 98.13%, a sensitivity of 98.65%, and a specificity of 97.67% for breast cancer classification.
In this study, we used conventional nano-silver sol as the SERS-enhanced substrate and a portable laser Raman spectrometer to obtain the serum SERS data. The 2D-CNN-GAF model demonstrated accurate and automatic classification of breast cancer patients and healthy volunteers. The method requires no augmentation or preprocessing of the spectral data, which simplifies the processing steps. It has great potential for accurate breast cancer screening and provides a useful reference for classifying and automatically screening other types of cancer.
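The conversion of a 1D sequence into a 2D image by a Gramian angular field can be sketched in a few lines. This is a minimal illustration of the summation variant (often called GASF), under the assumption that intensities are min-max rescaled to [-1, 1] first; the abstract does not say which GAF variant or rescaling the authors used, and the function name is hypothetical.

```python
import math


def gramian_angular_summation_field(series):
    """Map a 1D sequence to an n x n matrix with entries cos(phi_i + phi_j).

    Steps: min-max rescale to [-1, 1], encode each value as an angle
    phi = arccos(value), then take pairwise cosines of angle sums.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]
    # Clamp to guard against floating-point drift outside arccos's domain.
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    angles = [math.acos(s) for s in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


# A toy "spectrum" of five intensities yields a 5 x 5 image.
image = gramian_angular_summation_field([0.1, 0.4, 0.9, 0.3, 0.2])
print(len(image), len(image[0]))
```

The resulting matrix is symmetric, and each entry encodes a relationship between two positions in the sequence, which is what lets a 2D CNN see correlations between distant spectral bands in a single image.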
Affiliation(s)
- Nuo Cheng
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China
- Yan Gao
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China; Chinese Academy of Science, Shenzhen Institutes of Advanced and Technology, Shenzhen 518000, PR China
- Shaowei Ju
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China
- Xiangwei Kong
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China
- Jiugong Lyu
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China; School of Biological Engineering, Dalian University of Technology, Dalian 116024, PR China
- Lijie Hou
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China
- Lihong Jin
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China
- Bingjun Shen
  - School of Life Science and Technology, Changchun University of Science and Technology, Changchun 130022, PR China
7
Zhao Z, Jin Z, Wu G, Li C, Yu J. TriFNet: A triple-branch feature fusion network for pH determination by surface-enhanced Raman spectroscopy. Spectrochim Acta A Mol Biomol Spectrosc 2024; 312:124048. [PMID: 38387412 DOI: 10.1016/j.saa.2024.124048] [Received: 11/04/2023] [Revised: 02/11/2024] [Accepted: 02/14/2024] [Indexed: 02/24/2024]
Abstract
Because metabolic changes in tumor cells create an acidic tumor microenvironment, accurate pH detection in extracellular fluid can help doctors resect tumors precisely. The combination of Raman spectroscopy and deep learning provides a solution for pH detection. However, most existing studies use one-dimensional convolutional neural networks (1D-CNNs) for spectral analysis, which limits performance because feature extraction is insufficient. In this work, we propose a 2D triple-branch feature fusion network (TriFNet) for accurate pH determination using surface-enhanced Raman spectra (SERS). Specifically, we design a triple-branch network structure that converts Raman spectra into three types of images to extract complex spectral patterns extensively. In addition, an attention fusion module, which leverages the complementarity among features in both space and channel, is designed to retain the valuable information and further improve the accuracy of pH determination. On our Raman spectral dataset containing 14,137 samples, we achieved a mean absolute error (MAE) of 0.059, a standard deviation of the absolute error (SD) of 0.07, a root mean squared error (RMSE) of 0.092, and a coefficient of determination (R²) of 0.991 on the test set. Compared with other published methods, the four metrics showed an average improvement of 47%, 39%, 43%, and 6%, respectively. In addition, visualization validates the diagnostic capability of our model to correlate with biomolecular signatures, and the model is robust to different SERS chips. These results demonstrate the potential of our method as an effective Raman spectroscopy-based technology for accurate pH determination to guide surgery.
Affiliation(s)
- Zheng Zhao
  - School of Information Science and Technology, Fudan University, Shanghai 200438, China
- Ziyi Jin
  - School of Pharmacy, Fudan University, Shanghai 201203, China
- Guoqing Wu
  - School of Information Science and Technology, Fudan University, Shanghai 200438, China
- Cong Li
  - School of Pharmacy, Fudan University, Shanghai 201203, China
- Jinhua Yu
  - School of Information Science and Technology, Fudan University, Shanghai 200438, China
8
Basri KN, Yazid F, Mohd Zain MN, Md Yusof Z, Abdul Rani R, Zoolfakar AS. Artificial neural network and convolutional neural network for prediction of dental caries. Spectrochim Acta A Mol Biomol Spectrosc 2024; 312:124063. [PMID: 38394882 DOI: 10.1016/j.saa.2024.124063] [Received: 11/02/2023] [Revised: 01/12/2024] [Accepted: 02/18/2024] [Indexed: 02/25/2024]
Abstract
Dental caries is highly prevalent among children and adults and has become a global health concern. Modern dentistry focuses on preventive measures to reduce the number of dental caries cases. Machine learning coupled with UV spectroscopy can play a crucial role in detecting early-stage caries. An artificial neural network (ANN) with hyperparameter tuning was trained on spectral data for classification based on the International Caries Detection and Assessment System (ICDAS). Spectral preprocessing, namely mean centering (MC), autoscaling (AS), and Savitzky-Golay smoothing (SG), was applied to correct the spectra. The best ANN model achieved an accuracy of 0.85 with a precision of 1.00. A convolutional neural network (CNN) combined with Savitzky-Golay smoothing achieved an accuracy, precision, sensitivity, and specificity of 1.00 each on the validation data. The results show that ANN and CNN can produce robust models for early screening of dental caries.
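Two of the preprocessing steps named in this abstract, mean centering and autoscaling, can be sketched with the standard library alone. The sketch assumes the usual chemometrics convention of operating column-wise (per wavelength, across samples) and uses the sample standard deviation; the abstract does not state either choice, and the function names and toy data are hypothetical.

```python
import statistics


def mean_center(spectra):
    """Subtract each column's mean; rows are samples, columns are wavelengths."""
    columns = list(zip(*spectra))
    means = [statistics.mean(col) for col in columns]
    return [[value - m for value, m in zip(row, means)] for row in spectra]


def autoscale(spectra):
    """Mean-center each column, then divide by its sample standard deviation.

    Columns with zero variance are set to 0.0 to avoid division by zero.
    """
    columns = list(zip(*spectra))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in spectra
    ]


# Toy data: two samples, two wavelengths with very different magnitudes.
toy = [[1.0, 10.0], [3.0, 30.0]]
print(mean_center(toy))
print(autoscale(toy))
```

After autoscaling, both toy columns become identical, which shows the point of the step: each wavelength contributes on the same scale regardless of its raw intensity. Savitzky-Golay smoothing is omitted here; in practice it is commonly done with `scipy.signal.savgol_filter`.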
Affiliation(s)
- Katrul Nadia Basri
  - School of Electrical Engineering, College of Engineering, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia; Photonics Technology Lab, MIMOS Berhad, Technology Park Malaysia, 57000 Kuala Lumpur, Malaysia
- Farinawati Yazid
  - Faculty of Dentistry, Universiti Kebangsaan Malaysia, 50300 Kuala Lumpur, Malaysia
- Zalhan Md Yusof
  - Photonics Technology Lab, MIMOS Berhad, Technology Park Malaysia, 57000 Kuala Lumpur, Malaysia
- Rozina Abdul Rani
  - School of Mechanical Engineering, College of Engineering, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
- Ahmad Sabirin Zoolfakar
  - School of Electrical Engineering, College of Engineering, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
9
Yuan T, Yang J, Chi J, Yu T, Liu F. A cross-domain complex convolution neural network for undersampled magnetic resonance image reconstruction. Magn Reson Imaging 2024; 108:86-97. [PMID: 38331053 DOI: 10.1016/j.mri.2024.02.004] [Received: 12/06/2023] [Revised: 02/01/2024] [Accepted: 02/05/2024] [Indexed: 02/10/2024]
Abstract
This work introduces a new cross-domain complex convolution neural network for accurate MR image reconstruction from undersampled k-space data. Most reconstruction methods use neural networks or cascaded neural networks in the image domain, the k-space domain, or both. However, these methods encounter several challenges: 1) applying neural networks directly in the k-space domain is suboptimal for feature extraction; 2) classic image-domain networks have difficulty fully extracting texture features; and 3) existing cross-domain methods still struggle to extract and fuse features from the image and k-space domains simultaneously. We propose a novel deep-learning-based 2-D single-coil complex-valued MR reconstruction network termed TEID-Net. TEID-Net integrates three modules: 1) TE-Net, an image-domain sub-network designed to enhance contrast in input features by incorporating a Texture Enhancement Module; 2) ID-Net, an intermediate-domain sub-network that operates in the image-Fourier space and aims to reduce aliasing artifacts by leveraging the superior incoherence property of the decoupled one-dimensional signals; and 3) TEID-Net, a cross-domain reconstruction network in which ID-Nets and TE-Nets are combined and cascaded to further boost the quality of image reconstruction. Extensive experiments were conducted on the fastMRI and Calgary-Campinas datasets. The results demonstrate that TEID-Net mitigates undersampling-induced artifacts and produces high-quality image reconstructions, outperforming several state-of-the-art methods while using fewer network parameters. The cross-domain TEID-Net excels at restoring tissue structures and intricate texture details, and it is particularly well suited to regular Cartesian undersampling scenarios.
Affiliation(s)
- Tengfei Yuan
  - College of Electronics and Information, Qingdao University, Qingdao, Shandong, China
- Jie Yang
  - College of Mechanical and Electrical Engineering, Qingdao University, Qingdao, Shandong, China
- Jieru Chi
  - College of Electronics and Information, Qingdao University, Qingdao, Shandong, China
- Teng Yu
  - College of Electronics and Information, Qingdao University, Qingdao, Shandong, China
- Feng Liu
  - School of Electrical Engineering and Computer Science, University of Queensland, Brisbane, Brisbane, Australia
10
Qin D, Amariucai GT, Qiao D, Guan Y, Fu S. A comprehensive and reliable feature attribution method: Double-sided remove and reconstruct (DoRaR). Neural Netw 2024; 173:106166. [PMID: 38367355 DOI: 10.1016/j.neunet.2024.106166] [Received: 10/24/2023] [Revised: 01/28/2024] [Accepted: 02/06/2024] [Indexed: 02/19/2024]
Abstract
The limited transparency of the inner decision-making mechanism in deep neural networks (DNNs) and other machine learning (ML) models has hindered their application in several domains. To tackle this issue, feature attribution methods have been developed to identify the crucial features that heavily influence decisions made by these black-box models. However, many feature attribution methods have inherent downsides. One category suffers from the artifacts problem, which arises when out-of-distribution masked inputs are fed directly through a classifier that was originally trained on natural data points. Another category finds explanations by using jointly trained feature selectors and predictors. While avoiding the artifacts problem, this category suffers from the Encoding Prediction in the Explanation (EPITE) problem, in which the predictor's decisions rely not on the features but on the masks that select those features. These downsides undermine the credibility of attribution results. In this research, we introduce the Double-sided Remove and Reconstruct (DoRaR) feature attribution method, which combines several improvements that address these issues. Through thorough testing on MNIST, CIFAR10, and our own synthetic dataset, we demonstrate that DoRaR can effectively bypass the above issues and can aid in training a feature selector that outperforms other state-of-the-art feature attribution methods. Our code is available at https://github.com/dxq21/DoRaR.
Affiliation(s)
- Dong Qin
  - Department of Electrical and Computer Engineering, Iowa State University, 2215 Coover Hall, 2520 Osborn Drive, Ames, 50011-1046, IA, USA
- George T Amariucai
  - Department of Computer Science, Kansas State University, 2184 Engineering Hall, 1701D Platt St., Manhattan, 66506, KS, USA
- Daji Qiao
  - Department of Electrical and Computer Engineering, Iowa State University, 2215 Coover Hall, 2520 Osborn Drive, Ames, 50011-1046, IA, USA
- Yong Guan
  - Department of Electrical and Computer Engineering, Iowa State University, 2215 Coover Hall, 2520 Osborn Drive, Ames, 50011-1046, IA, USA
- Shen Fu
  - Department of Electrical and Computer Engineering, Iowa State University, 2215 Coover Hall, 2520 Osborn Drive, Ames, 50011-1046, IA, USA
11
|
Kennedy C, Crowdis T, Hu H, Vaidyanathan S, Zhang HK. Data-driven learning of chaotic dynamical systems using Discrete-Temporal Sobolev Networks. Neural Netw 2024; 173:106152. [PMID: 38359640 DOI: 10.1016/j.neunet.2024.106152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 01/01/2024] [Accepted: 01/28/2024] [Indexed: 02/17/2024]
Abstract
We introduce the Discrete-Temporal Sobolev Network (DTSN), a neural network loss function that assists dynamical system forecasting by minimizing variational differences between the network output and the training data via a temporal Sobolev norm. This approach is entirely data-driven, architecture agnostic, and does not require derivative information from the estimated system. The DTSN is particularly well suited to chaotic dynamical systems as it minimizes noise in the network output which is crucial for such sensitive systems. For our test cases we consider discrete approximations of the Lorenz-63 system and the Chua circuit. For the network architectures we use the Long Short-Term Memory (LSTM) and the Transformer. The performance of the DTSN is compared with the standard MSE loss for both architectures, as well as with the Physics Informed Neural Network (PINN) loss for the LSTM. The DTSN loss is shown to substantially improve accuracy for both architectures, while requiring less information than the PINN and without noticeably increasing computational time, thereby demonstrating its potential to improve neural network forecasting of dynamical systems.
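As a rough illustration of why a Sobolev-type loss penalizes noisy forecasts, the sketch below adds a first-difference term to the usual MSE. It is a generic discrete H^1-style penalty written under our own assumptions, not the DTSN definition from the paper; the function names and the `weight` parameter are hypothetical.

```python
def forward_differences(series):
    """First-order temporal differences x[t+1] - x[t] of a 1-D sequence."""
    return [b - a for a, b in zip(series, series[1:])]


def mean_squared_error(pred, target):
    """Plain MSE between two non-empty, equal-length sequences."""
    if not pred or len(pred) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def sobolev_style_loss(pred, target, weight=1.0):
    """MSE on the values plus a weighted MSE on their first differences.

    Illustrative only: `weight` is a hypothetical hyperparameter and the
    exact norm used by the paper may differ. Needs at least two time steps.
    """
    value_term = mean_squared_error(pred, target)
    diff_term = mean_squared_error(
        forward_differences(pred), forward_differences(target)
    )
    return value_term + weight * diff_term


# Two predictions with the same pointwise MSE against a flat target.
target = [0.0, 0.0, 0.0, 0.0]
smooth = [0.1, 0.1, 0.1, 0.1]   # constant offset
noisy = [0.1, -0.1, 0.1, -0.1]  # same magnitude, oscillating sign

print(mean_squared_error(smooth, target), mean_squared_error(noisy, target))
print(sobolev_style_loss(smooth, target), sobolev_style_loss(noisy, target))
```

Both predictions have the same MSE, but the oscillating one incurs a larger Sobolev-style loss because its temporal differences deviate from those of the target.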
Affiliation(s)
- Connor Kennedy: Department of Mathematics & Statistics, University of Massachusetts, Amherst, MA 01003, USA.
- Trace Crowdis: Department of Mathematics & Statistics, University of Massachusetts, Amherst, MA 01003, USA.
- Haoran Hu: Department of Mathematics & Statistics, University of Massachusetts, Amherst, MA 01003, USA.
- Sankaran Vaidyanathan: Department of Mathematics & Statistics, University of Massachusetts, Amherst, MA 01003, USA.
- Hong-Kun Zhang: Department of Mathematics & Statistics, University of Massachusetts, Amherst, MA 01003, USA.
|
12
|
Xie KY, Zhang CK, Lee S, He Y, Liu Y. Delay-dependent Lurie-Postnikov type Lyapunov-Krasovskii functionals for stability analysis of discrete-time delayed neural networks. Neural Netw 2024; 173:106195. [PMID: 38394998 DOI: 10.1016/j.neunet.2024.106195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 01/18/2024] [Accepted: 02/18/2024] [Indexed: 02/25/2024]
Abstract
This paper addresses the influence of time-varying delay and nonlinear activation functions with sector restrictions on the stability of discrete-time neural networks. Whereas previous works mainly focus on the influence of delay information, this paper exploits information about the nonlinear activation functions to strengthen the analysis technique based on the Lyapunov-Krasovskii functional (LKF). A class of delay-dependent Lurie-Postnikov type integral terms involving sector constraints of the nonlinear activation function is proposed to complement the LKF construction. Less conservative criteria for the stability analysis of discrete-time delayed neural networks are obtained by using the improved LKF. Numerical examples show that conservatism can be reduced by the delay-dependent integral terms involving nonlinear activation functions.
Affiliation(s)
- Ke-You Xie: School of Automation, China University of Geosciences, Wuhan 430074, China; Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan 430074, China; Engineering Research Center of Intelligent Technology for Geo-Exploration, Ministry of Education, Wuhan 430074, China.
- Chuan-Ke Zhang: School of Automation, China University of Geosciences, Wuhan 430074, China; Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan 430074, China; Engineering Research Center of Intelligent Technology for Geo-Exploration, Ministry of Education, Wuhan 430074, China.
- Sangmoon Lee: School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, South Korea.
- Yong He: School of Automation, China University of Geosciences, Wuhan 430074, China; Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan 430074, China; Engineering Research Center of Intelligent Technology for Geo-Exploration, Ministry of Education, Wuhan 430074, China.
- Yajuan Liu: School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China.
|
13
|
Hoon Yun B, Yu HY, Kim H, Myoung S, Yeo N, Choi J, Sook Chun H, Kim H, Ahn S. Geographical discrimination of Asian red pepper powders using 1H NMR spectroscopy and deep learning-based convolution neural networks. Food Chem 2024; 439:138082. [PMID: 38070234 DOI: 10.1016/j.foodchem.2023.138082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 11/24/2023] [Accepted: 11/24/2023] [Indexed: 01/10/2024]
Abstract
This study investigated an innovative approach to discriminate the geographical origins of Asian red pepper powders by analyzing one-dimensional 1H NMR spectra through a deep learning-based convolution neural network (CNN). 1H NMR spectra were collected from 300 samples originating from China, Korea, and Vietnam and used as input data. Principal component analysis - linear discriminant analysis and support vector machine models were employed for comparison. Bayesian optimization was used for hyperparameter optimization, and cross-validation was performed to prevent overfitting. As a result, all three models discriminated the origins of the test samples with over 95 % accuracy. Specifically, the CNN models achieved a 100 % accuracy rate. Gradient-weighted class activation mapping analysis verified that the CNN models recognized the origins of the samples based on variations in metabolite distributions. This research demonstrated the potential of deep learning-based classification of 1H NMR spectra as an accurate and reliable approach for determining the geographical origins of various foods.
Affiliation(s)
- Byung Hoon Yun: Department of Chemistry, Chung-Ang University, Seoul 06974, South Korea.
- Hyo-Yeon Yu: Department of Chemistry, Chung-Ang University, Seoul 06974, South Korea.
- Hyeongmin Kim: Department of Chemistry, Chung-Ang University, Seoul 06974, South Korea.
- Sangki Myoung: Department of Chemistry, Chung-Ang University, Seoul 06974, South Korea.
- Neulhwi Yeo: Department of Chemistry, Chung-Ang University, Seoul 06974, South Korea.
- Jongwon Choi: Department of Advanced Imaging, Chung-Ang University, Seoul 06974, South Korea.
- Hyang Sook Chun: Department of Food Science & Technology, Chung-Ang University, Anseong 17546, South Korea.
- Hyeonjin Kim: Department of Medical Sciences, Seoul National University, Seoul 03080, South Korea; Department of Radiology, Seoul National University Hospital, Seoul 03080, South Korea.
- Sangdoo Ahn: Department of Chemistry, Chung-Ang University, Seoul 06974, South Korea.
|
14
|
Mediavilla-Relaño J, Lázaro M. One-step Bayesian example-dependent cost classification: The OsC-MLP method. Neural Netw 2024; 173:106168. [PMID: 38382396 DOI: 10.1016/j.neunet.2024.106168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 12/19/2023] [Accepted: 02/06/2024] [Indexed: 02/23/2024]
Abstract
Example-dependent cost classification problems are those where the decision costs depend not only on the true and the attributed classes but also on the sample features. Discriminative algorithms that carry out such classification tasks must take this dependence into account. In some applications, the decision costs are known for the training set but not in production, which complicates the problem. In this paper, we introduce a new one-step Bayesian formulation to train Neural Networks and solve the above limitation for binary cases with one-step Learning Machines, avoiding the drawbacks that unknown analytical forms of the example-dependent costs create. The formulation is based on defining an artificial likelihood ratio by using the available training classification costs in its definition, and proposes a test that does not require the values of the costs for unseen samples. Furthermore, it also includes Bayesian rebalancing mechanisms to combat the negative effects of class imbalance. Experimental results support the consistency and effectiveness of the corresponding algorithms.
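To make the notion of example-dependent costs concrete, the following sketch applies the textbook minimum-expected-cost rule when the per-example costs are known. It is not the OsC-MLP method, which is designed for the harder case where costs are unavailable for unseen samples; the cost dictionaries and function names are hypothetical.

```python
def expected_cost(posterior_pos, costs, decision):
    """Expected cost of `decision` (0 or 1) for one example.

    `posterior_pos` is the estimated probability of the positive class and
    `costs[(d, y)]` is the cost of deciding d when the true class is y.
    """
    return (posterior_pos * costs[(decision, 1)]
            + (1.0 - posterior_pos) * costs[(decision, 0)])


def min_cost_decision(posterior_pos, costs):
    """Pick the decision with the lower expected cost (ties go to 0)."""
    cost0 = expected_cost(posterior_pos, costs, 0)
    cost1 = expected_cost(posterior_pos, costs, 1)
    return 1 if cost1 < cost0 else 0


# Same posterior, different per-example costs of a missed positive.
small = {(0, 0): 0.0, (1, 1): 0.0, (1, 0): 5.0, (0, 1): 10.0}
large = {(0, 0): 0.0, (1, 1): 0.0, (1, 0): 5.0, (0, 1): 1000.0}

print(min_cost_decision(0.1, small), min_cost_decision(0.1, large))
```

With identical class posteriors, the decision flips once the cost of missing a positive becomes large enough, which is why a classifier for such problems cannot rely on a single fixed threshold.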
Affiliation(s)
- Javier Mediavilla-Relaño: Signal Theory and Communications Department, Universidad Carlos III de Madrid, Avda. de la Universidad, No. 30, 28911, Leganés, Madrid, Spain.
- Marcelino Lázaro: Signal Theory and Communications Department, Universidad Carlos III de Madrid, Avda. de la Universidad, No. 30, 28911, Leganés, Madrid, Spain.
|
15
|
Shao Y, Zhang Y, Dong W, Zhang Q, Shan P, Guo J, Xu H. Enhancing adversarial attacks with resize-invariant and logical ensemble. Neural Netw 2024; 173:106194. [PMID: 38402809 DOI: 10.1016/j.neunet.2024.106194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 01/16/2024] [Accepted: 02/18/2024] [Indexed: 02/27/2024]
Abstract
In black-box scenarios, most transfer-based attacks improve the transferability of adversarial examples by optimizing the gradient calculation of the input image. Unfortunately, since the gradient information is only calculated and optimized for each pixel in the image individually, the generated adversarial examples tend to overfit the local model and have poor transferability to the target model. To tackle this issue, we propose a resize-invariant method (RIM) and a logical ensemble transformation method (LETM) to enhance the transferability of adversarial examples. Specifically, RIM is inspired by the resize-invariant property of Deep Neural Networks (DNNs). The range of resizable pixels is first divided into multiple intervals, and then the input image is randomly resized and padded within each interval. Finally, LETM performs a logical ensemble of multiple images after RIM transformation to calculate the final gradient update direction. The proposed method adequately considers the information of each pixel in the image and the surrounding pixels. The probability of duplication of image transformations is minimized and the overfitting effect of adversarial examples is effectively mitigated. Numerous experiments on the ImageNet dataset show that our approach outperforms other advanced methods and is capable of generating more transferable adversarial examples.
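The interval-wise "resize and pad" step described above might look roughly like the sketch below. It is a simplified reading of the abstract, not the authors' implementation: images are square lists of lists, resizing is nearest-neighbour, padding uses zeros, and every function name and parameter is an assumption. The logical ensemble (LETM) step is not modelled.

```python
import random


def split_into_intervals(low, high, count):
    """Split the integer range [low, high] into `count` contiguous intervals.

    Neighbouring intervals share their boundary value.
    """
    edges = [low + (high - low) * i // count for i in range(count + 1)]
    return [(edges[i], edges[i + 1]) for i in range(count)]


def resize_nearest(image, size):
    """Nearest-neighbour resize of a square 2-D list to size x size."""
    n = len(image)
    return [[image[r * n // size][c * n // size] for c in range(size)]
            for r in range(size)]


def pad_to(image, out_size, rng, fill=0):
    """Pad a square image to out_size x out_size at a random offset."""
    size = len(image)
    if size > out_size:
        raise ValueError("image is larger than the output canvas")
    top = rng.randint(0, out_size - size)
    left = rng.randint(0, out_size - size)
    canvas = [[fill] * out_size for _ in range(out_size)]
    for r in range(size):
        for c in range(size):
            canvas[top + r][left + c] = image[r][c]
    return canvas


def resize_and_pad_per_interval(image, low, high, count, out_size, seed=0):
    """Return one randomly resized-and-padded copy of `image` per interval."""
    rng = random.Random(seed)
    copies = []
    for lo, hi in split_into_intervals(low, high, count):
        size = rng.randint(lo, hi)  # both bounds are inclusive
        copies.append(pad_to(resize_nearest(image, size), out_size, rng))
    return copies


image = [[1] * 4 for _ in range(4)]
copies = resize_and_pad_per_interval(image, low=4, high=8, count=2, out_size=8)
print(len(copies), len(copies[0]), len(copies[0][0]))
```

Drawing one size per interval, rather than several sizes from the full range, keeps the transformed copies spread across scales, which matches the abstract's stated aim of minimizing duplicated transformations.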
Affiliation(s)
- Yanling Shao: School of Computer and Software, Nanyang Institute of Technology, Nanyang, 473000, China.
- Yuzhi Zhang: School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450002, China.
- Wenyong Dong: School of Computer Science, Wuhan University, Wuhan, 430072, China.
- Qikun Zhang: School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450002, China.
- Pingping Shan: School of Computer and Software, Nanyang Institute of Technology, Nanyang, 473000, China.
- Junying Guo: School of Computer and Software, Nanyang Institute of Technology, Nanyang, 473000, China.
- Hairui Xu: School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, China.
|
16
|
Nazir N, Sarwar A, Saini BS. Recent developments in denoising medical images using deep learning: An overview of models, techniques, and challenges. Micron 2024; 180:103615. [PMID: 38471391 DOI: 10.1016/j.micron.2024.103615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 02/20/2024] [Accepted: 02/26/2024] [Indexed: 03/14/2024]
Abstract
Medical imaging plays a critical role in diagnosing and treating various medical conditions. However, interpreting medical images can be challenging even for expert clinicians, as they are often degraded by noise and artifacts that can hinder the accurate identification and analysis of diseases, leading to severe consequences such as patient misdiagnosis or mortality. Various types of noise, including Gaussian, Rician, and salt-and-pepper noise, can corrupt the area of interest, limiting the precision and accuracy of algorithms. Denoising algorithms have shown potential for improving the quality of medical images by removing noise and other artifacts that obscure essential information. Deep learning has emerged as a powerful tool for image analysis and has demonstrated promising results in denoising different medical images such as MRI, CT, and PET scans. This review paper provides a comprehensive overview of state-of-the-art deep learning algorithms used for denoising medical images. A total of 120 relevant papers were reviewed, and after screening with specific inclusion and exclusion criteria, 104 papers were selected for analysis. This study aims to provide a thorough understanding for researchers in the field of intelligent denoising by presenting an extensive survey of current techniques and highlighting significant challenges that remain to be addressed. The findings of this review are expected to contribute to the development of intelligent models that enable timely and accurate diagnoses of medical disorders. It was found that 40% of the researchers used models based on deep convolutional neural networks to denoise the images, followed by encoder-decoder models (18%) and other artificial intelligence-based techniques such as DIP (15%). Transformer-based approaches were used by 13%, generative adversarial networks by 12%, and multilayer perceptrons by 2% of the researchers.
Moreover, Gaussian noise was present in 35% of the images, followed by speckle noise (16%), Poisson noise (14%), artifacts (10%), Rician noise (7%), salt-and-pepper noise (6%), impulse noise (3%), and other types of noise (9%). While the progress in developing novel models for the denoising of medical images is evident, significant work remains to be done in creating standardized denoising models that perform well across a wide spectrum of medical images. Overall, this review highlights the importance of denoising medical images and provides a comprehensive understanding of the current state-of-the-art deep learning algorithms in this field.
|
17
|
Zhou T, Ye H, Cao F. Node-personalized multi-graph convolutional networks for recommendation. Neural Netw 2024; 173:106169. [PMID: 38359642 DOI: 10.1016/j.neunet.2024.106169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 01/10/2024] [Accepted: 02/07/2024] [Indexed: 02/17/2024]
Abstract
Graph neural networks have shown powerful potential in ranking recommendation. Existing bipartite-graph methods for ranking recommendation mainly focus on homogeneous graphs and usually treat user and item nodes as the same kind of node; however, the user-item bipartite graph is always heterogeneous. Additionally, various types of nodes have varying effects on recommendations, and a good node representation can be learned by successfully differentiating nodes of the same type. In this paper, we develop a node-personalized multi-graph convolutional network (NP-MGCN) for ranking recommendation. It consists of a node importance awareness block, a graph construction module, and a node information propagation and aggregation framework. Specifically, a node importance awareness block is proposed to encode nodes using node degree information to highlight the differences between nodes. Subsequently, the Jaccard similarity and co-occurrence matrix fusion graph construction module is devised to acquire user-user and item-item graphs, enriching correlation information between users and between items. Finally, a composite hop node information propagation and aggregation framework, including single-hop and double-hop branches, is designed. The high-order connectivity is used to aggregate heterogeneous information for the single-hop branch, while the multi-hop dependency is utilized to aggregate homogeneous information for the double-hop branch. This makes user and item node embeddings more discriminative and integrates the heterogeneity of the different nodes into the model. Experiments on several datasets show that NP-MGCN achieves better recommendation performance than existing methods.
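The Jaccard part of the graph construction step can be sketched as follows: users who interacted with overlapping item sets are linked by an edge weighted with the Jaccard similarity of those sets. This covers only the Jaccard component under our own assumptions; the fusion with the co-occurrence matrix is omitted, and the `threshold` parameter and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_user_graph(interactions, threshold=0.0):
    """Build weighted user-user edges from a dict user -> set of items.

    An edge (u, v, weight) is kept when the Jaccard similarity of the two
    item sets exceeds `threshold` (a hypothetical cut-off).
    """
    users = sorted(interactions)
    edges = []
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            weight = jaccard(interactions[u], interactions[v])
            if weight > threshold:
                edges.append((u, v, weight))
    return edges


interactions = {
    "alice": {"book", "lamp", "pen"},
    "bob": {"book", "pen"},
    "carol": {"desk"},
}
print(user_user_graph(interactions))
```

An item-item graph can be built the same way by passing a dict that maps each item to the set of users who interacted with it.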
Affiliation(s)
- Tiantian Zhou: Department of Applied Mathematics, College of Sciences, China Jiliang University, Hangzhou 310018, China.
- Hailiang Ye: Department of Applied Mathematics, College of Sciences, China Jiliang University, Hangzhou 310018, China.
- Feilong Cao: Department of Applied Mathematics, College of Sciences, China Jiliang University, Hangzhou 310018, China.
|
18
|
Gao K, Liu C, Wu J, Du B, Hu W. Towards a better negative sampling strategy for dynamic graphs. Neural Netw 2024; 173:106175. [PMID: 38387201 DOI: 10.1016/j.neunet.2024.106175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/19/2023] [Accepted: 02/10/2024] [Indexed: 02/24/2024]
Abstract
As dynamic graphs have become indispensable in numerous fields due to their capacity to represent evolving relationships over time, there has been a concomitant increase in the development of Temporal Graph Neural Networks (TGNNs). When training TGNNs for dynamic graph link prediction, the commonly used negative sampling method often produces starkly contrasting samples, which can lead the model to overfit these pronounced differences and compromise its ability to generalize effectively to new data. To address this challenge, we introduce an innovative negative sampling approach named Enhanced Negative Sampling (ENS). This strategy takes into account two pervasive traits observed in dynamic graphs: (1) Historical dependence, indicating that nodes frequently reestablish connections they held in the past, and (2) Temporal proximity preference, which posits that nodes are more inclined to connect with those they have recently interacted with. Specifically, our technique employs a designed scheduling function to strategically control the progression of difficulty of the negative samples throughout the training. This ensures that the training progresses in a balanced manner, becoming incrementally challenging, and thereby enhancing TGNNs' proficiency in predicting links within dynamic graphs. In our empirical evaluation across multiple datasets, we discerned that our ENS, when integrated as a modular component, notably augments the performance of four SOTA baselines. Additionally, we further investigated the applicability of ENS in handling dynamic graphs of varied attributes. Our code is available at https://github.com/qqaazxddrr/ENS.
Affiliation(s)
- Kuang Gao: School of Computer Science, Wuhan University, China.
- Chuang Liu: School of Computer Science, Wuhan University, China.
- Jia Wu: Department of Computing, Macquarie University, Australia.
- Bo Du: School of Computer Science, Wuhan University, China.
- Wenbin Hu: School of Computer Science, Wuhan University, China; Wuhan University Shenzhen Research Institute, China.
|
19
|
Wang Z, Chen J, Gong M, Shao Z. Higher-order neurodynamical equation for simplex prediction. Neural Netw 2024; 173:106185. [PMID: 38387202 DOI: 10.1016/j.neunet.2024.106185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 02/01/2024] [Accepted: 02/15/2024] [Indexed: 02/24/2024]
Abstract
It has been demonstrated that higher-order patterns beyond pairwise relations can significantly enhance the learning capability of existing graph-based models, and the simplex is one of the primary forms for graphically representing higher-order patterns. Predicting unknown (disappeared) simplices in real-world complex networks can provide us with deeper insights, thereby assisting us in making better decisions. Nevertheless, previous efforts to predict simplices suffer from two issues: (i) they mainly focus on 2- or 3-simplices, and there are few models available for predicting simplices of arbitrary orders, and (ii) they lack the ability to analyze and learn the features of simplices from the perspective of dynamics. In this paper, we present a Higher-order Neurodynamical Equation for Simplex Prediction of arbitrary order (HNESP), which is a framework that combines neural networks and neurodynamics. Specifically, HNESP simulates the dynamical coupling process of nodes in simplicial complexes through different relations (i.e., strong pairwise relation, weak pairwise relation, and simplex) to learn node-level representations, while explaining the learning mechanism of neural networks from neurodynamics. To enrich the higher-order information contained in simplices, we exploit the entropy and normalized multivariate mutual information of different sub-structures of simplices to acquire simplex-level representations. Furthermore, simplex-level representations and a multi-layer perceptron are used to quantify the existence probability of simplices. The effectiveness of HNESP is demonstrated by extensive simulations on seven higher-order benchmarks. Experimental results show that HNESP improves the AUC values of the state-of-the-art baselines by an average of 8.32%. Our implementations will be publicly available at: https://github.com/jianruichen/HNESP.
Affiliation(s)
- Zhihui Wang: Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi'an, China; School of Computer Science, Shaanxi Normal University, Xi'an, China.
- Jianrui Chen: Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi'an, China; School of Computer Science, Shaanxi Normal University, Xi'an, China.
- Maoguo Gong: Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xi'an, China; School of Electronic Engineering, Xidian University, Xi'an, China.
- Zhongshi Shao: Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi'an, China; School of Computer Science, Shaanxi Normal University, Xi'an, China.
|
20
|
Beddar-Wiesing S, D'Inverno GA, Graziani C, Lachi V, Moallemy-Oureh A, Scarselli F, Thomas JM. Weisfeiler-Lehman goes dynamic: An analysis of the expressive power of Graph Neural Networks for attributed and dynamic graphs. Neural Netw 2024; 173:106213. [PMID: 38428377 DOI: 10.1016/j.neunet.2024.106213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Revised: 02/05/2024] [Accepted: 02/23/2024] [Indexed: 03/03/2024]
Abstract
Graph Neural Networks (GNNs) are a large class of relational models for graph processing. Recent theoretical studies on the expressive power of GNNs have focused on two issues. On the one hand, it has been proven that GNNs are as powerful as the Weisfeiler-Lehman test (1-WL) in their ability to distinguish graphs. Moreover, it has been shown that the equivalence enforced by 1-WL equals unfolding equivalence. On the other hand, GNNs turned out to be universal approximators on graphs modulo the constraints enforced by 1-WL/unfolding equivalence. However, these results only apply to Static Attributed Undirected Homogeneous Graphs (SAUHGs) with node attributes. In contrast, real-life applications often involve a much larger variety of graph types. In this paper, we conduct a theoretical analysis of the expressive power of GNNs for two other graph domains that are particularly interesting in practical applications, namely dynamic graphs and SAUHGs with edge attributes. Dynamic graphs are widely used in modern applications; hence, the study of the expressive capability of GNNs in this domain is essential for practical reasons and, in addition, it requires a new analytical approach due to the difference in the architecture of dynamic GNNs compared to static ones. On the other hand, the examination of SAUHGs is of particular relevance since they act as a standard form for all graph types: it has been shown that all graph types can be transformed without loss of information to SAUHGs with attributes on both nodes and edges. This paper considers generic GNN models and appropriate 1-WL tests for those domains. Then, the known results on the expressive power of GNNs are extended to the mentioned domains: it is proven that GNNs have the same capability as the 1-WL test, that the 1-WL equivalence equals unfolding equivalence, and that GNNs are universal approximators modulo 1-WL/unfolding equivalence.
Moreover, the proof of the approximation capability is mostly constructive and allows us to deduce hints on the architecture of GNNs that can achieve the desired approximation.
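For readers unfamiliar with the 1-WL test, the sketch below implements plain colour refinement on static, unattributed, undirected graphs. It is the classical baseline only, not the attributed or dynamic variants analysed in the paper; the function name, the `rounds` parameter, and the example graphs are our own.

```python
from collections import Counter


def wl_colour_histogram(adjacency, rounds=3):
    """Run 1-WL colour refinement and return the final colour histogram.

    `adjacency` maps each node to a list of its neighbours. A colour is a
    nested tuple (own colour, sorted neighbour colours), so histograms of
    different graphs can be compared directly. Colours grow quickly with
    `rounds`, so this form suits small illustrative graphs only.
    """
    colours = {node: () for node in adjacency}  # uniform initial colour
    for _ in range(rounds):
        colours = {
            node: (colours[node],
                   tuple(sorted(colours[nb] for nb in adjacency[node])))
            for node in adjacency
        }
    return Counter(colours.values())


# Two disjoint triangles and a 6-cycle: both 2-regular on six nodes.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
# Two disjoint 3-node paths: degrees differ from the graphs above.
two_paths = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3, 5], 5: [4]}

print(wl_colour_histogram(two_triangles) == wl_colour_histogram(hexagon))
print(wl_colour_histogram(two_paths) == wl_colour_histogram(hexagon))
```

Equal histograms mean that 1-WL cannot tell the two graphs apart, even when they are not isomorphic, as with the two triangles and the 6-cycle; different histograms prove non-isomorphism.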
Affiliation(s)
- Silvia Beddar-Wiesing: Graphs in Artificial Intelligence and Neural Networks (GAIN), University of Kassel, Germany.
- Caterina Graziani: Siena Artificial Intelligence Lab (SAILab), University of Siena, Italy.
- Veronica Lachi: Siena Artificial Intelligence Lab (SAILab), University of Siena, Italy.
- Alice Moallemy-Oureh: Graphs in Artificial Intelligence and Neural Networks (GAIN), University of Kassel, Germany.
- Franco Scarselli: Siena Artificial Intelligence Lab (SAILab), University of Siena, Italy.
- Josephine Maria Thomas: Graphs in Artificial Intelligence and Neural Networks (GAIN), University of Kassel, Germany.
|
21
|
Ju W, Fang Z, Gu Y, Liu Z, Long Q, Qiao Z, Qin Y, Shen J, Sun F, Xiao Z, Yang J, Yuan J, Zhao Y, Wang Y, Luo X, Zhang M. A Comprehensive Survey on Deep Graph Representation Learning. Neural Netw 2024; 173:106207. [PMID: 38442651 DOI: 10.1016/j.neunet.2024.106207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 01/23/2024] [Accepted: 02/21/2024] [Indexed: 03/07/2024]
Abstract
Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, which is a fundamental task that has been widely studied in a range of fields, including machine learning and data mining. Classic graph embedding methods follow the basic idea that the embedding vectors of interconnected nodes in the graph can still maintain a relatively close distance, thereby preserving the structural information between the nodes in the graph. However, this is sub-optimal because: (i) traditional methods have limited model capacity, which limits the learning performance; (ii) existing techniques typically rely on unsupervised learning strategies and fail to couple with the latest learning paradigms; (iii) representation learning and downstream tasks depend on each other and should be jointly enhanced. With the remarkable success of deep learning, deep graph representation learning has shown great potential and advantages over shallow (traditional) methods, and a large number of deep graph representation learning techniques, especially graph neural networks, have been proposed in the past decade. In this paper, we conduct a comprehensive survey of current deep graph representation learning algorithms by proposing a new taxonomy of existing state-of-the-art literature. Specifically, we systematically summarize the essential components of graph representation learning and categorize existing approaches by graph neural network architecture and by the most recent advanced learning paradigms. Moreover, this survey also presents practical and promising applications of deep graph representation learning. Last but not least, we state new perspectives and suggest challenging directions that deserve further investigation in the future.
Collapse
Affiliation(s)
- Wei Ju
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Zheng Fang
- School of Intelligence Science and Technology, Peking University, Beijing, 100871, China
| | - Yiyang Gu
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Zequn Liu
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Qingqing Long
- Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100086, China
| | - Ziyue Qiao
- Artificial Intelligence Thrust, The Hong Kong University of Science and Technology, Guangzhou, 511453, China
| | - Yifang Qin
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Jianhao Shen
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Fang Sun
- Department of Computer Science, University of California, Los Angeles, 90095, USA
| | - Zhiping Xiao
- Department of Computer Science, University of California, Los Angeles, 90095, USA
| | - Junwei Yang
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Jingyang Yuan
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Yusheng Zhao
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
| | - Yifan Wang
- School of Information Technology & Management, University of International Business and Economics, Beijing, 100029, China
| | - Xiao Luo
- Department of Computer Science, University of California, Los Angeles, 90095, USA.
| | - Ming Zhang
- School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China.
| |
|
22
|
Zhang D, Lin Z, Xuan L, Lu M, Shi B, Shi J, He F, Battino M, Zhao L, Zou X. Rapid determination of geographical authenticity and pungency intensity of the red Sichuan pepper (Zanthoxylum bungeanum) using differential pulse voltammetry and machine learning algorithms. Food Chem 2024; 439:137978. [PMID: 38048663 DOI: 10.1016/j.foodchem.2023.137978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Revised: 11/03/2023] [Accepted: 11/11/2023] [Indexed: 12/06/2023]
Abstract
The development of an analytical method for assessing pungency intensity and determining geographical origins is crucial for evaluating the quality of visually similar Zanthoxylum bungeanum pericarp (PZB). This study analyzed 210 PZB samples from 14 origins across China, focusing on origin adulteration identification and pungency intensity using a combination of differential pulse voltammetry (DPV) and machine learning algorithms. The artificial neural network (ANN) and K-nearest neighbor (KNN) algorithms provided the highest accuracy in origin identification (100%) and adulteration detection (97.9%), respectively. Moreover, the ANN excelled in predicting pungency intensity (R² = 0.918). Assessment via feature importance analysis of DPV features revealed that segments of polyphenols (0.34-0.52 V and 1.0-1.2 V) and alkylamides (1.0-1.2 V) contributed significantly to the PZB pungency intensity. These findings highlight the potential of DPV as a reliable method for assessing the quality of PZB, and offer a promising solution for ensuring the geographical authenticity of this important crop.
Affiliation(s)
- Di Zhang
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
| | - Zitao Lin
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
| | - Lilei Xuan
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
| | - Minmin Lu
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
| | - Bolin Shi
- Food and Agriculture Standardization Institute, China National Institute of Standardization, Beijing 102200, China.
| | - Jiyong Shi
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
| | - Fatao He
- Jinan Fruit Research Institute, China Federation of Supply and Marketing Co-operatives, Jinan, Shandong 250200, China
| | - Maurizio Battino
- International Research Center for Food Nutrition and Safety, Jiangsu University, Zhenjiang 212013, China; Department of Clinical Sciences, Faculty of Medicine, Polytechnic University of Marche, Ancona, Italy
| | - Lei Zhao
- Food and Agriculture Standardization Institute, China National Institute of Standardization, Beijing 102200, China
| | - Xiaobo Zou
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
| |
|
23
|
Wei C, Wang X, Ren F, Zeng Z. Quasi-synchronization for variable-order fractional complex dynamical networks with hybrid delay-dependent impulses. Neural Netw 2024; 173:106161. [PMID: 38335795 DOI: 10.1016/j.neunet.2024.106161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 12/10/2023] [Accepted: 02/01/2024] [Indexed: 02/12/2024]
Abstract
This paper addresses the problem of quasi-synchronization in heterogeneous variable-order fractional complex dynamical networks (VFCDNs) with hybrid delay-dependent impulses. Firstly, a mathematical model of VFCDNs with short memory is established under multi-weighted networks and mismatched parameters, which makes it more diverse and practical. Secondly, under the framework of the variable-order fractional derivative, a novel fractional differential inequality is proposed to handle the issue of quasi-synchronization with hybrid delay-dependent impulses. Additionally, a quasi-synchronization criterion for VFCDNs is developed using differential inclusion theory and the Lyapunov method. Finally, the practicality and feasibility of this theoretical analysis are demonstrated through numerical examples.
Affiliation(s)
- Chen Wei
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Xiaoping Wang
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China.
| | - Fangmin Ren
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Zhigang Zeng
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
| |
|
24
|
Xue G, Zhong M, Qian T, Li J. PSA-GNN: An augmented GNN framework with priori subgraph knowledge. Neural Netw 2024; 173:106155. [PMID: 38335793 DOI: 10.1016/j.neunet.2024.106155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 12/13/2023] [Accepted: 01/29/2024] [Indexed: 02/12/2024]
Abstract
Graph neural networks have become the primary graph representation learning paradigm, in which nodes update their embeddings by iteratively aggregating messages from their neighbors. However, current message-passing-based GNNs make insufficient use of higher-order subgraph information beyond 1st-order neighbors. In contrast, long-standing graph research has investigated various subgraphs, such as motifs, cliques, cores, and trusses, that contain structural information important to downstream tasks like node classification and that deserve to be preserved by GNNs. In this work, we propose to use pre-mined subgraphs as priori knowledge to extend the receptive field of GNNs and enhance their expressive power beyond the 1st-order Weisfeiler-Lehman isomorphism test. To that end, we introduce a general framework called PSA-GNN (Priori Subgraph Augmented Graph Neural Network), which augments each GNN layer with a pair of parallel convolution layers based on a bipartite graph between nodes and priori subgraphs. PSA-GNN intrinsically builds a hybrid receptive field by incorporating priori subgraphs as neighbors, while the embeddings and weights of subgraphs are trainable. Moreover, PSA-GNN can purify noisy subgraphs both heuristically before training and deterministically during training, based on a novel metric called homogeneity. Experimental results show that PSA-GNN achieves improved performance compared with state-of-the-art message-passing-based GNN models.
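As a rough illustration of the neighbor aggregation this abstract refers to, the sketch below runs plain mean-aggregation message passing on a toy path graph. It is a generic example, not PSA-GNN: the function name, the graph, and the features are all assumptions made for illustration. It shows that one round only reaches 1st-order neighbors, so information from node 0 needs two rounds to reach node 2.

```python
# Generic mean-aggregation message passing (illustrative; not PSA-GNN itself).

def message_passing_round(adjacency, features):
    """Return new node features: the mean over each node and its direct neighbors."""
    updated = {}
    for node, neighbors in adjacency.items():
        group = [node] + list(neighbors)
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in group) / len(group) for d in range(dim)
        ]
    return updated


# Toy path graph 0 - 1 - 2 with one-dimensional features (hypothetical data).
adjacency = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0], 1: [0.0], 2: [0.0]}

one_hop = message_passing_round(adjacency, features)
two_hop = message_passing_round(adjacency, one_hop)

print(one_hop[2], two_hop[2])  # node 2 only "sees" node 0 after the second round
```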
Affiliation(s)
- Guotong Xue
- School of Computer Science, Wuhan University, Wuhan, China
| | - Ming Zhong
- School of Computer Science, Wuhan University, Wuhan, China.
| | - Tieyun Qian
- School of Computer Science, Wuhan University, Wuhan, China
| | - Jianxin Li
- School of Information Technology, Deakin University, Burwood, Australia
| |
|
25
|
Agliari E, Alemanno F, Aquaro M, Barra A, Durante F, Kanter I. Hebbian dreaming for small datasets. Neural Netw 2024; 173:106174. [PMID: 38359641 DOI: 10.1016/j.neunet.2024.106174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 01/02/2024] [Accepted: 02/09/2024] [Indexed: 02/17/2024]
Abstract
The dreaming Hopfield model constitutes a generalization of the Hebbian paradigm for neural networks that is able to perform on-line learning when "awake" and also to account for off-line "sleeping" mechanisms. The latter have been shown to enhance storage in such a way that, in the long sleep-time limit, this model can reach the maximal storage capacity achievable by networks equipped with symmetric pairwise interactions. In this paper, we inspect the minimal amount of information that must be supplied to such a network to guarantee successful generalization, and we test it both on random synthetic and on standard structured datasets (i.e., MNIST, Fashion-MNIST and Olivetti). By comparing these minimal thresholds of information with those required by the standard (i.e., always "awake") Hopfield model, we prove that the present network can save up to ∼90% of the dataset size while preserving the same performance as its standard counterpart. This suggests that sleep may play a pivotal role in explaining the gap between the large volumes of data required to train artificial neural networks and the relatively small volumes needed by their biological counterparts. Further, we prove that the model's Cost function (typically used in statistical mechanics) admits a representation in terms of a standard Loss function (typically used in machine learning), and this allows us to analyze its emergent computational skills both theoretically and computationally: a quantitative picture of its capabilities as a function of its control parameters is achieved, and consistency between the two approaches is highlighted. The resulting network is an associative memory for pattern recognition tasks that learns from examples on-line, generalizes correctly (in suitable regions of its control parameters) and optimizes its storage capacity by off-line sleeping: such a reduction of the training cost can serve as inspiration for sustainable AI and for situations where data are relatively sparse.
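For readers unfamiliar with the baseline being generalized, here is a minimal sketch of the standard (always "awake") Hopfield network with Hebbian couplings. It does not implement the dreaming mechanism; the patterns, function names, and sweep count are assumptions chosen only to keep the example small.

```python
# Standard Hopfield network with Hebbian couplings (illustrative baseline only;
# the "dreaming" modification of the couplings is not shown here).

def hebbian_couplings(patterns):
    """J[i][j] = (1/N) * sum over patterns of xi_i * xi_j, with zero diagonal."""
    n = len(patterns[0])
    couplings = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    couplings[i][j] += xi[i] * xi[j] / n
    return couplings


def recall(couplings, state, sweeps=5):
    """Asynchronous sign updates for a fixed number of sweeps over all neurons."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(couplings[i][j] * s[j] for j in range(len(s)))
            if field != 0:
                s[i] = 1 if field > 0 else -1
    return s


# Two orthogonal +/-1 patterns (hypothetical toy data).
patterns = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
couplings = hebbian_couplings(patterns)

noisy = list(patterns[0])
noisy[0] = -noisy[0]  # corrupt one bit of the first pattern
restored = recall(couplings, noisy)

print(restored == patterns[0])
```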
Affiliation(s)
- Elena Agliari
- Department of Mathematics of Sapienza Università di Roma, Rome, Italy.
| | - Francesco Alemanno
- Department of Mathematics and Physics of Università del Salento, Lecce, Italy
| | - Miriam Aquaro
- Department of Mathematics of Sapienza Università di Roma, Rome, Italy
| | - Adriano Barra
- Department of Mathematics and Physics of Università del Salento, Lecce, Italy.
| | - Fabrizio Durante
- Department of Economic Sciences of Università del Salento, Lecce, Italy
| | - Ido Kanter
- Department of Physics of Bar-Ilan University, Ramat Gan, Israel
| |
|
26
|
Yang C, Huebner ES, Tian L. Prediction of suicidal ideation among preadolescent children with machine learning models: A longitudinal study. J Affect Disord 2024; 352:403-409. [PMID: 38387673 DOI: 10.1016/j.jad.2024.02.070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Revised: 02/15/2024] [Accepted: 02/19/2024] [Indexed: 02/24/2024]
Abstract
BACKGROUND Machine learning (ML) has been widely used to predict suicidal ideation (SI) in adolescents and adults. Nevertheless, accurate and efficient models of SI prediction for preadolescent children are still needed because SI is surprisingly prevalent during the transition into adolescence. This study aimed to explore the potential of ML models to predict SI among preadolescent children. METHODS A total of 4691 Chinese children (54.89% boys, M_age = 10.92 at baseline) and their parents completed relevant measures at baseline, and the children provided 6-month follow-up data for SI. The current study compared four ML models, Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), and Multilayer Perceptron (MLP), to predict SI among Chinese preadolescent children and to identify variables with predictive value based on the best-performing model. RESULTS The RF model achieved the highest discriminant performance, with an AUC of 0.92 and an accuracy of 0.93 (balanced accuracy = 0.88). Internalizing problems, externalizing problems, neuroticism, childhood maltreatment, and subjective well-being in school were the factors with the highest value in predicting SI. CONCLUSION The findings of this study suggest that ML models based on the observation and assessment of children's general characteristics and experiences in everyday life can serve as convenient screening and evaluation tools for suicide risk assessment among Chinese preadolescent children. The findings also provide insights for early intervention.
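The abstract reports both accuracy and balanced accuracy. The sketch below shows how the two metrics are computed from a binary confusion matrix and why they can differ when one class is rare. The counts are invented for illustration and are not taken from the study.

```python
def accuracy_and_balanced_accuracy(tp, fn, tn, fp):
    """Plain accuracy and balanced accuracy for a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return accuracy, (sensitivity + specificity) / 2


# Hypothetical counts: 50 positive cases out of 1000 (imbalanced classes).
acc, bal_acc = accuracy_and_balanced_accuracy(tp=30, fn=20, tn=940, fp=10)

print(round(acc, 3), round(bal_acc, 3))
```

With a rare positive class, plain accuracy is dominated by the majority class, so balanced accuracy (the mean of per-class recall) comes out lower whenever the minority class is detected less reliably.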
Affiliation(s)
- Chi Yang
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, South China Normal University, Ministry of Education, Guangzhou 510631, People's Republic of China; School of Psychology, South China Normal University, Guangzhou 510631, People's Republic of China
| | - E Scott Huebner
- Department of Psychology, University of South Carolina, Columbia, SC 29208, USA
| | - Lili Tian
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, South China Normal University, Ministry of Education, Guangzhou 510631, People's Republic of China.
| |
|
27
|
Yan J, Liu Q, Zhang M, Feng L, Ma D, Li H, Pan G. Efficient spiking neural network design via neural architecture search. Neural Netw 2024; 173:106172. [PMID: 38402808 DOI: 10.1016/j.neunet.2024.106172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 01/09/2024] [Accepted: 02/08/2024] [Indexed: 02/27/2024]
Abstract
Spiking neural networks (SNNs) are brain-inspired models that utilize discrete and sparse spikes to transmit information, thus having the property of energy efficiency. Recent advances in learning algorithms have greatly improved SNN performance due to the automation of feature engineering. While the choice of neural architecture plays a significant role in deep learning, the current SNN architectures are mainly designed manually, which is a time-consuming and error-prone process. In this paper, we propose a spiking neural architecture search (NAS) method that can automatically find efficient SNNs. To tackle the challenge of long search time faced by SNNs when utilizing NAS, the proposed NAS encodes candidate architectures in a branchless spiking supernet which significantly reduces the computation requirements in the search process. Considering that real-world tasks prefer efficient networks with optimal accuracy under a limited computational budget, we propose a Synaptic Operation (SynOps)-aware optimization to automatically find the computationally efficient subspace of the supernet. Experimental results show that, in less search time, our proposed NAS can find SNNs with higher accuracy and lower computational cost than state-of-the-art SNNs. We also conduct experiments to validate the search process and the trade-off between accuracy and computational cost.
Affiliation(s)
- Jiaqi Yan
- Zhejiang University, Hangzhou, 310027, China
| | - Qianhui Liu
- National University of Singapore, 119077, Singapore
| | - Malu Zhang
- University of Electronic Science and Technology of China, Chengdu, 611731, China
| | - Lang Feng
- Zhejiang University, Hangzhou, 310027, China
| | - De Ma
- Zhejiang University, Hangzhou, 310027, China
| | - Haizhou Li
- National University of Singapore, 119077, Singapore; The Chinese University of Hong Kong, Shenzhen, 518172, China
| | - Gang Pan
- Zhejiang University, Hangzhou, 310027, China.
| |
|
28
|
Bao LL, Zhang JS, Zhang CX. Spatial multi-attention conditional neural processes. Neural Netw 2024; 173:106201. [PMID: 38447305 DOI: 10.1016/j.neunet.2024.106201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Revised: 01/03/2024] [Accepted: 02/20/2024] [Indexed: 03/08/2024]
Abstract
Spatial prediction tasks are challenging when observed samples are sparse and prediction samples are abundant. Gaussian processes (GPs) are commonly used in spatial prediction tasks and have the advantage of measuring the uncertainty of the interpolation result. However, as the sample size increases, GPs suffer from significant overhead. Standard neural networks (NNs) provide a powerful and scalable solution for modeling spatial data, but they often overfit small sample data. Based on conditional neural processes (CNPs), which combine the advantages of GPs and NNs, we propose a new framework called Spatial Multi-Attention Conditional Neural Processes (SMACNPs) for spatial small sample prediction tasks. SMACNPs are a modular model that can predict targets by employing different attention mechanisms to extract relevant information from different forms of sample data. The task representation is inferred by measuring the spatial correlation contained in different sample points and the relationship contained in attribute variables, respectively. The distribution of the target variable is predicted by GPs parameterized by NNs. SMACNPs allow us to obtain accurate predictions of the target value while quantifying the prediction uncertainty. Experiments on spatial prediction tasks on simulated and real-world datasets demonstrate that this framework flexibly incorporates spatial context and correlation into the model, achieving state-of-the-art results in spatial small sample prediction tasks in terms of both predictive performance and reliability. For example, on the California housing dataset, our method reduces MAE by 8% and MSE by 7% compared to the second-best method. In addition, a spatiotemporal prediction task to forecast traffic speed further confirms the effectiveness and generality of our method.
Affiliation(s)
- Li-Li Bao
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an Shaanxi, 710049, China
| | - Jiang-She Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an Shaanxi, 710049, China.
| | - Chun-Xia Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an Shaanxi, 710049, China
| |
|
29
|
Jin Y, Hou L, Zhong S. Extended Dynamic Mode Decomposition with Invertible Dictionary Learning. Neural Netw 2024; 173:106177. [PMID: 38382398 DOI: 10.1016/j.neunet.2024.106177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2023] [Revised: 12/13/2023] [Accepted: 02/13/2024] [Indexed: 02/23/2024]
Abstract
The Koopman operator has received attention for providing a potentially global linearization of nonlinear dynamical systems. To estimate or control the original system, an invertibility problem is introduced into data-driven modeling, i.e., the original system's states must be reconstructed from the observables. Existing methods cannot solve this problem perfectly: only linear reconstruction, or nonlinear but lossy reconstruction, can be achieved. This paper proposes a novel data-driven modeling approach, denoted Extended Dynamic Mode Decomposition with Invertible Dictionary Learning (EDMD-IDL), to address this issue; it can be interpreted as a further extension of the classical Extended Dynamic Mode Decomposition (EDMD). An Invertible Neural Network (INN) is introduced in the proposed method, whose inverse process provides an explicit inverse of the dictionary functions, thus allowing nonlinear and lossless reconstruction. An iterative algorithm is designed to solve the extended optimization problem defined by the Koopman operator and the INN by combining a gradient-descent-based optimization algorithm with the classical EDMD method, so that the method obtains a finite-dimensional approximation of the Koopman operator. The method is tested on various canonical nonlinear dynamical systems, and it is shown that the predictions obtained in a linear fashion match the ground truth well over the long term when only the initial state is provided. Comparison experiments highlight the superiority of the proposed method over other EDMD-based methods. Notably, a typical example in fluid dynamics, the cylinder wake, illustrates the potential of the method to be extended to high-dimensional systems with tens of thousands of states. By combining the method with the Proper Orthogonal Decomposition technique, the nontrivial Kármán vortex street phenomenon is perfectly reconstructed. Our proposed method provides a new paradigm for solving the finite-dimensional approximation of the Koopman operator and applying it to data-driven modeling.
Affiliation(s)
- Yuhong Jin
- School of Astronautics, Harbin Institute of Technology, Harbin, 150001, PR China
| | - Lei Hou
- School of Astronautics, Harbin Institute of Technology, Harbin, 150001, PR China.
| | - Shun Zhong
- Department of Mechanics, Tianjin University, Tianjin, 300072, PR China
| |
|
30
|
Wan J, Xia N, Yin Y, Pan X, Hu J, Yi J. TCDformer: A transformer framework for non-stationary time series forecasting based on trend and change-point detection. Neural Netw 2024; 173:106196. [PMID: 38412739 DOI: 10.1016/j.neunet.2024.106196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/25/2024] [Accepted: 02/18/2024] [Indexed: 02/29/2024]
Abstract
Although time series prediction models based on the Transformer architecture have achieved significant advances, concerns have arisen regarding their performance on non-stationary real-world data. Traditional methods often use stabilization techniques to boost predictability, but this often results in the loss of non-stationarity, and such methods notably underperform when tackling major events in practical applications. To address this challenge, this research introduces a method named TCDformer (Trend and Change-point Detection Transformer). TCDformer first encodes abrupt changes in non-stationary time series using the local linear scaling approximation (LLSA) module. The reconstructed contextual time series is then decomposed into trend and seasonal components. The final prediction is the additive combination of a multilayer perceptron (MLP) that predicts the trend component and wavelet attention mechanisms that predict the seasonal component. Comprehensive experimental results show that, on standard time series prediction datasets, TCDformer significantly surpasses existing benchmark models, reducing MSE by 47.36% and MAE by 31.12%. This approach offers an effective framework for managing non-stationary time series, achieving a balance between performance and interpretability, which makes it especially suitable for addressing non-stationarity challenges in real-world scenarios.
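The trend/seasonal split and additive recombination mentioned in this abstract can be pictured with a very simple stand-in: a centered moving average as the trend and the remainder as the seasonal part. This is only a sketch under that assumption. It does not reproduce the LLSA module, the MLP, or the wavelet attention, and the series and window size are made up.

```python
def moving_average_trend(series, window):
    """Centered moving average over `window` points; the window shrinks at the edges."""
    half = window // 2
    trend = []
    for t in range(len(series)):
        lo = max(0, t - half)
        hi = min(len(series), t + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    return trend


def decompose(series, window):
    """Split a series into a smooth trend and the remainder (called seasonal here)."""
    trend = moving_average_trend(series, window)
    seasonal = [x - m for x, m in zip(series, trend)]
    return trend, seasonal


# Hypothetical series: a linear ramp plus a period-2 oscillation.
series = [t + (1 if t % 2 == 0 else -1) for t in range(10)]
trend, seasonal = decompose(series, window=3)

# Additive recombination recovers the original series.
recombined = [m + s for m, s in zip(trend, seasonal)]
print(recombined)
```

In a forecasting model of this kind, each component would be predicted separately and the two forecasts summed, mirroring the recombination in the last step.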
Affiliation(s)
- Jiashan Wan
- College of Computer and Information Science, Hefei University of Technology, Hefei, 230601, Anhui, China; College of Big Data and Artificial Intelligence, Anhui Institute of Information Technology, Wuhu, 241000, Anhui, China.
| | - Na Xia
- College of Computer and Information Science, Hefei University of Technology, Hefei, 230601, Anhui, China
| | - Yutao Yin
- Shenzhen Hangsheng electronics Co., Ltd., Shenzhen, 518103, Guangdong, China
| | - Xulei Pan
- College of Big Data and Artificial Intelligence, Anhui Institute of Information Technology, Wuhu, 241000, Anhui, China
| | - Jin Hu
- Shenzhen Hangsheng electronics Co., Ltd., Shenzhen, 518103, Guangdong, China
| | - Jun Yi
- College of Computer and Information Science, Hefei University of Technology, Hefei, 230601, Anhui, China
| |
|
31
|
Xiao Y, Adegoke M, Leung CS, Leung KW. Robust noise-aware algorithm for randomized neural network and its convergence properties. Neural Netw 2024; 173:106202. [PMID: 38422835 DOI: 10.1016/j.neunet.2024.106202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 12/19/2023] [Accepted: 02/20/2024] [Indexed: 03/02/2024]
Abstract
Randomized neural networks (RNNs), such as the random vector functional link network (RVFL) and the extreme learning machine (ELM), are a widely accepted and efficient approach to constructing single-hidden layer feedforward networks (SLFNs). Owing to their exceptional approximation capabilities, RNNs are used extensively in various fields. While the RNN concept has shown great promise, its performance can be unpredictable under imperfect conditions, such as weight noise and outliers. Thus, there is a need to develop more reliable and robust RNN algorithms. To address this issue, this paper proposes a new objective function that accounts for the combined effect of weight noise and training data outliers in RVFL networks. Based on the half-quadratic optimization method, we then propose a novel algorithm, named noise-aware RNN (NARNN), to optimize the proposed objective function. The convergence of the NARNN is also theoretically validated. We also discuss how to use the NARNN for ensemble deep RVFL (edRVFL) networks. Finally, we present an extension of the NARNN that concurrently addresses weight noise, stuck-at faults, and outliers. The experimental results demonstrate that the proposed algorithm outperforms a number of state-of-the-art robust RNN algorithms.
Affiliation(s)
- Yuqi Xiao
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, HKSAR, China; State Key Laboratory of Terahertz and Millimeter Waves, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, HKSAR, China; Shenzhen Key Laboratory of Millimeter Wave and Wideband Wireless Communications, CityU Shenzhen Research Institute, Shenzhen, 518057, China.
| | - Muideen Adegoke
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, HKSAR, China.
| | - Chi-Sing Leung
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, HKSAR, China.
| | - Kwok Wa Leung
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, HKSAR, China; State Key Laboratory of Terahertz and Millimeter Waves, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, HKSAR, China; Shenzhen Key Laboratory of Millimeter Wave and Wideband Wireless Communications, CityU Shenzhen Research Institute, Shenzhen, 518057, China.
| |
|
32
|
Dubinin I, Effenberger F. Fading memory as inductive bias in residual recurrent networks. Neural Netw 2024; 173:106179. [PMID: 38387205 DOI: 10.1016/j.neunet.2024.106179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 02/07/2024] [Accepted: 02/13/2024] [Indexed: 02/24/2024]
Abstract
Residual connections have been proposed as an architecture-based inductive bias to mitigate the problem of exploding and vanishing gradients and to increase task performance in both feed-forward and recurrent networks (RNNs) when trained with the backpropagation algorithm. Yet, little is known about how residual connections in RNNs influence their dynamics and fading memory properties. Here, we introduce weakly coupled residual recurrent networks (WCRNNs), in which residual connections result in well-defined Lyapunov exponents and allow for studying properties of fading memory. We investigate how the residual connections of WCRNNs influence their performance, network dynamics, and memory properties on a set of benchmark tasks. We show that several distinct forms of residual connections yield effective inductive biases that result in increased network expressivity. In particular, these are residual connections that (i) result in network dynamics in the proximity of the edge of chaos, (ii) allow networks to capitalize on characteristic spectral properties of the data, and (iii) result in heterogeneous memory properties. In addition, we demonstrate how our results can be extended to non-linear residuals and introduce a weakly coupled residual initialization scheme that can be used for Elman RNNs.
Affiliation(s)
- Igor Dubinin
- Ernst Strüngmann Institute, Deutschordenstraße 46, Frankfurt am Main, 60528, Germany; Frankfurt Institute for Advanced Studies, Ruth-Moufang-Straße 1, Frankfurt am Main, 60438, Germany.
| | - Felix Effenberger
- Ernst Strüngmann Institute, Deutschordenstraße 46, Frankfurt am Main, 60528, Germany.
| |
|
33
|
Li H, Chen X, Ditzler G, Roveda J, Li A. Knowledge distillation under ideal joint classifier assumption. Neural Netw 2024; 173:106160. [PMID: 38330746 PMCID: PMC10961204 DOI: 10.1016/j.neunet.2024.106160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/05/2023] [Accepted: 02/01/2024] [Indexed: 02/10/2024]
Abstract
Knowledge distillation constitutes a potent methodology for condensing substantial neural networks into more compact and efficient counterparts. Within this context, softmax regression representation learning serves as a widely embraced approach, leveraging a pre-established teacher network to guide the learning process of a diminutive student network. Notably, despite the extensive inquiry into the efficacy of softmax regression representation learning, the intricate underpinnings governing the knowledge transfer mechanism remain inadequately elucidated. This study introduces the 'Ideal Joint Classifier Knowledge Distillation' (IJCKD) framework, an overarching paradigm that not only furnishes a lucid and exhaustive comprehension of prevailing knowledge distillation techniques but also establishes a theoretical underpinning for prospective investigations. Employing mathematical methodologies derived from domain adaptation theory, this investigation conducts a comprehensive examination of the error boundary of the student network contingent upon the teacher network. Consequently, our framework facilitates efficient knowledge transference between teacher and student networks, thereby accommodating a diverse spectrum of applications.
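To make the teacher-student setup concrete, the sketch below computes the classical temperature-softened distillation loss, i.e. the KL divergence between teacher and student softmax outputs scaled by the squared temperature. This is the generic logit-matching formulation, not the IJCKD framework or the softmax regression representation learning discussed in the abstract. The logits and temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
loss_far = distillation_loss([0.5, 1.0, 4.0], teacher)   # student disagrees
loss_near = distillation_loss([3.5, 1.0, 0.5], teacher)  # student nearly agrees
loss_same = distillation_loss(teacher, teacher)          # identical outputs

print(loss_far, loss_near, loss_same)
```

The loss shrinks as the student's softened distribution approaches the teacher's and vanishes when the two coincide.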
Affiliation(s)
- Huayu Li
  - Department of Electrical & Computer Engineering at the University of Arizona, Tucson, 85721, AZ, USA.
- Xiwen Chen
  - School of Computing at Clemson University, Clemson, 29634, SC, USA.
- Janet Roveda
  - Department of Electrical & Computer Engineering at the University of Arizona, Tucson, 85721, AZ, USA; Department of Biomedical Engineering, The University of Arizona, Tucson, 85721, AZ, USA; BIO5 Institute, The University of Arizona, Tucson, 85721, AZ, USA.
- Ao Li
  - Department of Electrical & Computer Engineering at the University of Arizona, Tucson, 85721, AZ, USA; BIO5 Institute, The University of Arizona, Tucson, 85721, AZ, USA.
34
Zhang Z, Song Y, Chen T, He J. A regularized orthogonal activated inverse-learning neural network for regression and classification with outliers. Neural Netw 2024; 173:106208. [PMID: 38447304 DOI: 10.1016/j.neunet.2024.106208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 03/08/2024]
Abstract
A novel regularized orthogonal activated inverse-learning (ROAIL) neural network is proposed and investigated for reducing the impact of outliers in regression and classification tasks. The proposed ROAIL network does not require extensive iterative computations. Instead, it can achieve the desired results with a single step of computation, allowing for the efficient acquisition of network weights. By extending the Gegenbauer polynomials to a multivariate version, and integrating ℓ2 regularization and the Welsch loss function into the orthogonal activated inverse-learning framework, two forms of ROAIL are obtained, i.e., ℓ2-norm ROAIL (ℓ2-ROAIL) and Welsch-ROAIL (W-ROAIL). The ℓ2-ROAIL neural network is proposed to minimize the empirical and structural risk simultaneously, since taking the structural risk as part of the loss function can effectively reduce the complexity of the model and thus improve its generalization ability. The W-ROAIL neural network further improves the robustness of the ℓ2-ROAIL neural network by replacing the original two-norm in the loss function with the Welsch function. The Welsch function determines the weight of each sample according to its output error, so the influence of outliers is weakened because their weights are reduced. Both regression and classification experiments show that the W-ROAIL neural network has a strong ability to suppress the influence of outliers.
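As a rough illustration of how a Welsch-type loss down-weights outliers, the sketch below uses one common parameterization, ρ(e) = (c²/2)(1 − exp(−e²/c²)), whose implied sample weight ρ'(e)/e equals exp(−e²/c²). The scale parameter `c`, the function names, and the toy residuals are assumptions made for illustration; they are not taken from the ROAIL paper, and the sketch does not implement the ROAIL network itself.

```python
import math

def welsch_loss(error, c=1.0):
    """Welsch loss in one common form; bounded above by c**2 / 2."""
    return (c ** 2 / 2.0) * (1.0 - math.exp(-(error / c) ** 2))

def welsch_weight(error, c=1.0):
    """Sample weight implied by the loss (rho'(e) / e); decays toward 0 for large errors."""
    return math.exp(-(error / c) ** 2)

# Toy residuals (hypothetical); the last value plays the role of an outlier.
residuals = [0.0, 0.5, 1.0, 5.0]
weights = [welsch_weight(e) for e in residuals]
losses = [welsch_loss(e) for e in residuals]

for e, w, l in zip(residuals, weights, losses):
    print(f"error={e:4.1f}  weight={w:.3e}  loss={l:.4f}")
```

Because the loss saturates, a sample with a very large error contributes almost nothing to a weighted fit, which is the mechanism the abstract describes for suppressing outliers.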
Affiliation(s)
- Zhijun Zhang
  - School of Automation Science and Engineering, South China University of Technology, China.
- Yating Song
  - School of Automation Science and Engineering, South China University of Technology, China.
- Tao Chen
  - School of Automation Science and Engineering, South China University of Technology, China.
- Jie He
  - School of Automation Science and Engineering, South China University of Technology, China.
35
Villaizán-Vallelado M, Salvatori M, Carro B, Sanchez-Esguevillas AJ. Graph Neural Network contextual embedding for Deep Learning on tabular data. Neural Netw 2024; 173:106180. [PMID: 38447303 DOI: 10.1016/j.neunet.2024.106180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 01/29/2024] [Accepted: 02/13/2024] [Indexed: 03/08/2024]
Abstract
All industries are trying to leverage Artificial Intelligence (AI) based on their existing big data, which is available in so-called tabular form, where each record is composed of a number of heterogeneous continuous and categorical columns, also known as features. Deep Learning (DL) has constituted a major breakthrough for AI in fields related to human skills like natural language processing, but its applicability to tabular data has been more challenging. More classical Machine Learning (ML) models, like tree-based ensembles, usually perform better. This paper presents a novel DL model using a Graph Neural Network (GNN), more specifically an Interaction Network (IN), for contextual embedding and for modeling interactions among tabular features. Its results outperform those of a recently published survey with a DL benchmark based on seven public datasets, while also achieving competitive results when compared to boosted-tree solutions.
Affiliation(s)
- Mario Villaizán-Vallelado
  - Artificial Intelligence Laboratory (AI-Lab), Telefonica I+D, Spain; Universidad de Valladolid, Valladolid, 47011, Spain.
- Matteo Salvatori
  - Artificial Intelligence Laboratory (AI-Lab), Telefonica I+D, Spain.
- Belén Carro
  - Universidad de Valladolid, Valladolid, 47011, Spain.
36
van Nuland TDH. Noncompact uniform universal approximation. Neural Netw 2024; 173:106181. [PMID: 38412737 DOI: 10.1016/j.neunet.2024.106181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 12/28/2023] [Accepted: 02/13/2024] [Indexed: 02/29/2024]
Abstract
The universal approximation theorem is generalised to uniform convergence on the (noncompact) input space $\mathbb{R}^n$. All continuous functions that vanish at infinity can be uniformly approximated by neural networks with one hidden layer, for all activation functions $\varphi$ that are continuous, nonpolynomial, and asymptotically polynomial at $\pm\infty$. When $\varphi$ is moreover bounded, we exactly determine which functions can be uniformly approximated by neural networks, with the following unexpected results. Let $\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ denote the vector space of functions that are uniformly approximable by neural networks with $l$ hidden layers and $n$ inputs. For all $n$ and all $l \geq 2$, $\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ turns out to be an algebra under the pointwise product. If the left limit of $\varphi$ differs from its right limit (for instance, when $\varphi$ is sigmoidal), the algebra $\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ ($l \geq 2$) is independent of $\varphi$ and $l$, and equals the closed span of products of sigmoids composed with one-dimensional projections. If the left limit of $\varphi$ equals its right limit, $\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ ($l \geq 1$) equals the (real part of the) commutative resolvent algebra, a C*-algebra which is used in mathematical approaches to quantum theory. In the latter case, the algebra is independent of $l \geq 1$, whereas in the former case $\overline{\mathcal{N}_\varphi^2(\mathbb{R}^n)}$ is strictly bigger than $\overline{\mathcal{N}_\varphi^1(\mathbb{R}^n)}$.
37
Eldred C, Gay-Balmaz F, Huraka S, Putkaradze V. Lie-Poisson Neural Networks (LPNets): Data-based computing of Hamiltonian systems with symmetries. Neural Netw 2024; 173:106162. [PMID: 38335794 DOI: 10.1016/j.neunet.2024.106162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 01/22/2024] [Accepted: 02/01/2024] [Indexed: 02/12/2024]
Abstract
An accurate data-based prediction of the long-term evolution of Hamiltonian systems requires a network that preserves the appropriate structure under each time step. Every Hamiltonian system contains two essential ingredients: the Poisson bracket and the Hamiltonian. Hamiltonian systems with symmetries, whose paradigm examples are the Lie-Poisson systems, have been shown to describe a broad category of physical phenomena, from satellite motion to underwater vehicles, fluids, geophysical applications, complex fluids, and plasma physics. The Poisson bracket in these systems comes from the symmetries, while the Hamiltonian comes from the underlying physics. We view the symmetry of the system as primary, hence the Lie-Poisson bracket is known exactly, whereas the Hamiltonian is regarded as coming from physics and is considered not known, or known approximately. Using this approach, we develop a network based on transformations that exactly preserve the Poisson bracket and the special functions of the Lie-Poisson systems (Casimirs) to machine precision. We present two flavors of such systems: one, where the parameters of transformations are computed from data using a dense neural network (LPNets), and another, where the composition of transformations is used as building blocks (G-LPNets). We also show how to adapt these methods to a larger class of Poisson brackets. We apply the resulting methods to several examples, such as rigid body (satellite) motion, underwater vehicles, a particle in a magnetic field, and others. The methods developed in this paper are important for the construction of accurate data-based methods for simulating the long-term dynamics of physical systems.
Affiliation(s)
- Christopher Eldred
  - Computer Science Research Institute, Sandia National Laboratory, 1450 Innovation Pkwy SE, Albuquerque, NM, 87123, USA.
- François Gay-Balmaz
  - Division of Mathematical Sciences, Nanyang Technological University, 637371, Singapore.
- Sofiia Huraka
  - Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, T6G 2G1, Alberta, Canada.
- Vakhtang Putkaradze
  - Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, T6G 2G1, Alberta, Canada.
38
Csiszárik A, Kiss MF, Kőrösi-Szabó P, Muntag M, Papp G, Varga D. Mode combinability: Exploring convex combinations of permutation aligned models. Neural Netw 2024; 173:106204. [PMID: 38412738 DOI: 10.1016/j.neunet.2024.106204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 12/28/2023] [Accepted: 02/20/2024] [Indexed: 02/29/2024]
Abstract
We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube $[0,1]^d$ and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property: two models re-based to a common third model are also linear mode connected, and a robustness property: even with significant perturbations of the neuron matchings the resulting combinations continue to form a working model. Moreover, we analyze the functional and weight similarity of model combinations and show that such combinations are non-vacuous in the sense that there are significant functional differences between the resulting models.
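The element-wise convex combination described above can be sketched in a few lines. This is a minimal illustration that assumes the two parameter vectors have already been permutation-aligned and flattened to plain lists of floats; the function name `combine`, the toy vectors, and the random mixing coefficients are hypothetical. The sketch only covers points inside the hypercube, not its vicinity, and it does not evaluate any loss.

```python
import random

def combine(theta_a, theta_b, lam):
    """Element-wise convex combination: lam[i] * a[i] + (1 - lam[i]) * b[i]."""
    if not (len(theta_a) == len(theta_b) == len(lam)):
        raise ValueError("theta_a, theta_b and lam must have the same length")
    if any(l < 0.0 or l > 1.0 for l in lam):
        raise ValueError("each mixing coefficient must lie in [0, 1]")
    return [l * a + (1.0 - l) * b for a, b, l in zip(theta_a, theta_b, lam)]

rng = random.Random(0)
d = 6  # toy parameter count
theta_a = [rng.gauss(0.0, 1.0) for _ in range(d)]
theta_b = [rng.gauss(0.0, 1.0) for _ in range(d)]

# A random point of the hypercube [0, 1]^d gives one combined model.
lam = [rng.random() for _ in range(d)]
theta_mix = combine(theta_a, theta_b, lam)

# Ordinary linear interpolation is the special case of equal coefficients.
theta_mid = combine(theta_a, theta_b, [0.5] * d)
```

Using the same coefficient for every coordinate recovers the straight line between the two models that linear mode connectivity studies; letting each coordinate have its own coefficient gives the larger family the abstract calls mode combinability.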
Affiliation(s)
- Adrián Csiszárik
  - HUN-REN Alfréd Rényi Institute of Mathematics, Reáltanoda utca 13-15., Budapest, 1053, Hungary; Eötvös Loránd University, Pázmány Péter sétány 1/C, Budapest, 1117, Hungary.
- Melinda F Kiss
  - HUN-REN Alfréd Rényi Institute of Mathematics, Reáltanoda utca 13-15., Budapest, 1053, Hungary; Eötvös Loránd University, Pázmány Péter sétány 1/C, Budapest, 1117, Hungary.
- Péter Kőrösi-Szabó
  - HUN-REN Alfréd Rényi Institute of Mathematics, Reáltanoda utca 13-15., Budapest, 1053, Hungary.
- Márton Muntag
  - HUN-REN Alfréd Rényi Institute of Mathematics, Reáltanoda utca 13-15., Budapest, 1053, Hungary.
- Gergely Papp
  - HUN-REN Alfréd Rényi Institute of Mathematics, Reáltanoda utca 13-15., Budapest, 1053, Hungary.
- Dániel Varga
  - HUN-REN Alfréd Rényi Institute of Mathematics, Reáltanoda utca 13-15., Budapest, 1053, Hungary.
39
Sluijterman L, Cator E, Heskes T. How to evaluate uncertainty estimates in machine learning for regression? Neural Netw 2024; 173:106203. [PMID: 38442649 DOI: 10.1016/j.neunet.2024.106203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 12/22/2023] [Accepted: 02/20/2024] [Indexed: 03/07/2024]
Abstract
As neural networks become more popular, the need for accompanying uncertainty estimates increases. There are currently two main approaches to test the quality of these estimates. Most methods output a density. They can be compared by evaluating their loglikelihood on a test set. Other methods output a prediction interval directly. These methods are often tested by examining the fraction of test points that fall inside the corresponding prediction intervals. Intuitively, both approaches seem logical. However, we demonstrate through both theoretical arguments and simulations that both ways of evaluating the quality of uncertainty estimates have serious flaws. Firstly, neither approach can disentangle the separate components that jointly create the predictive uncertainty, making it difficult to evaluate the quality of the estimates of these components. Specifically, the quality of a confidence interval cannot reliably be tested by estimating the performance of a prediction interval. Secondly, the loglikelihood does not allow a comparison between methods that output a prediction interval directly and methods that output a density. A better loglikelihood also does not necessarily guarantee better prediction intervals, which is what the methods are often used for in practice. Moreover, the current approach to test prediction intervals directly has additional flaws. We show why testing a prediction or confidence interval on a single test set is fundamentally flawed. At best, marginal coverage is measured, implicitly averaging out overconfident and underconfident predictions. A much more desirable property is pointwise coverage, requiring the correct coverage for each prediction. We demonstrate through practical examples that these effects can result in favouring a method, based on the predictive uncertainty, that has undesirable behaviour of the confidence or prediction intervals. Finally, we propose a simulation-based testing approach that addresses these problems while still allowing easy comparison between different methods. This approach can be used for the development of new uncertainty quantification methods.
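The point about marginal coverage averaging out overconfident and underconfident predictions can be made concrete with a small simulation. The sketch below is a simplified stand-in, not the testing approach proposed in the paper: it assumes a hypothetical method that always returns the same interval half-width, two equally sized input regions with different noise levels, and a nominal 90% level. All constants and names are illustrative assumptions.

```python
import random

NOMINAL = 0.90
HALF_WIDTH = 2.563  # constant half-width, chosen so that marginal coverage is near 90%

def empirical_coverage(sigma, n, rng):
    """Fraction of draws y ~ N(0, sigma^2) that fall inside [-HALF_WIDTH, HALF_WIDTH]."""
    hits = sum(1 for _ in range(n) if abs(rng.gauss(0.0, sigma)) <= HALF_WIDTH)
    return hits / n

rng = random.Random(42)
n = 20000
cov_low_noise = empirical_coverage(0.5, n, rng)   # interval is far too wide here
cov_high_noise = empirical_coverage(2.0, n, rng)  # interval is too narrow here
marginal = (cov_low_noise + cov_high_noise) / 2   # regions are equally sized

print(f"low-noise region coverage:  {cov_low_noise:.3f}")
print(f"high-noise region coverage: {cov_high_noise:.3f}")
print(f"marginal coverage:          {marginal:.3f} (nominal {NOMINAL:.2f})")
```

A single test-set coverage figure would report a value close to the nominal level for this method, even though the intervals are underconfident in one region and overconfident in the other. Coverage conditional on a region is used here only as a coarse proxy for pointwise coverage.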
Affiliation(s)
- Laurens Sluijterman
  - Department of Mathematics, Radboud University, P.O. Box 9010-59, 6500 GL, Nijmegen, Netherlands.
- Eric Cator
  - Department of Mathematics, Radboud University, Netherlands.
- Tom Heskes
  - Institute for Computing and Information Sciences, Radboud University, Netherlands.
40
Asif S, Zhao M, Li Y, Tang F, Zhu Y. CGO-ensemble: Chaos game optimization algorithm-based fusion of deep neural networks for accurate Mpox detection. Neural Netw 2024; 173:106183. [PMID: 38382397 DOI: 10.1016/j.neunet.2024.106183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 12/19/2023] [Accepted: 02/15/2024] [Indexed: 02/23/2024]
Abstract
The rising global incidence of human Mpox cases necessitates prompt and accurate identification for effective disease control. Whereas previous studies have predominantly explored traditional ensemble methods for detection, we introduce a novel approach by leveraging a metaheuristic-based ensemble framework. In this research, we present an innovative CGO-Ensemble framework designed to elevate the accuracy of detecting Mpox infection in patients. Initially, we employ five transfer learning base models that integrate feature integration layers and residual blocks. These components play a crucial role in capturing significant features from the skin images, thereby enhancing the models' efficacy. In the next step, we employ a weighted averaging scheme to consolidate predictions generated by distinct models. To achieve the optimal allocation of weights for each base model in the ensemble process, we leverage the Chaos Game Optimization (CGO) algorithm. This strategic weight assignment enhances classification outcomes considerably, surpassing the performance of randomly assigned weights. Implementing this approach yields notably enhanced prediction accuracy compared to using individual models. We evaluate the effectiveness of our proposed approach through comprehensive experiments conducted on two widely recognized benchmark datasets: the Mpox Skin Lesion Dataset (MSLD) and the Mpox Skin Image Dataset (MSID). To gain insights into the decision-making process of the base models, we have performed Gradient Class Activation Mapping (Grad-CAM) analysis. The experimental results showcase the outstanding performance of the CGO-ensemble, achieving an impressive accuracy of 100% on MSLD and 94.16% on MSID. Our approach significantly outperforms other state-of-the-art optimization algorithms, traditional ensemble methods, and existing techniques in the context of Mpox detection on these datasets. These findings underscore the effectiveness and superiority of the CGO-Ensemble in accurately identifying Mpox cases, highlighting its potential in disease detection and classification.
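The weighted averaging step can be sketched as follows. This is a minimal illustration under stated assumptions: the three toy "models", their class probabilities, and the labels are made up, and a plain random search stands in for Chaos Game Optimization, which is not implemented here. The function names are hypothetical and do not come from the paper.

```python
import random

def weighted_average(prob_sets, weights):
    """Fuse per-model class probabilities using weights normalized to sum to 1."""
    total = sum(weights)
    norm = [w / total for w in weights]
    n_samples, n_classes = len(prob_sets[0]), len(prob_sets[0][0])
    return [
        [sum(norm[m] * prob_sets[m][i][k] for m in range(len(prob_sets)))
         for k in range(n_classes)]
        for i in range(n_samples)
    ]

def accuracy(fused, labels):
    """Share of samples whose highest fused probability matches the label."""
    preds = [max(range(len(p)), key=p.__getitem__) for p in fused]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def random_search(prob_sets, labels, n_trials=200, seed=0):
    """Stand-in for CGO: try random weight vectors, keep the most accurate one."""
    rng = random.Random(seed)
    best_w = [1.0] * len(prob_sets)  # start from uniform averaging
    best_acc = accuracy(weighted_average(prob_sets, best_w), labels)
    for _ in range(n_trials):
        w = [rng.random() + 1e-9 for _ in prob_sets]
        acc = accuracy(weighted_average(prob_sets, w), labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc

# Toy validation data (hypothetical): 4 samples, 2 classes, 3 base models.
labels = [0, 1, 1, 0]
model_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.7, 0.3]]
model_b = [[0.3, 0.7], [0.8, 0.2], [0.6, 0.4], [0.2, 0.8]]
model_c = [[0.6, 0.4], [0.7, 0.3], [0.3, 0.7], [0.4, 0.6]]
prob_sets = [model_a, model_b, model_c]

uniform_acc = accuracy(weighted_average(prob_sets, [1.0, 1.0, 1.0]), labels)
best_w, best_acc = random_search(prob_sets, labels)
print(f"uniform weights accuracy: {uniform_acc:.2f}")
print(f"searched weights accuracy: {best_acc:.2f}")
```

In practice the weights would be tuned on a validation split and the fused model evaluated on held-out data; any optimizer that proposes weight vectors, including a metaheuristic such as CGO, could replace the random search in this sketch.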
Affiliation(s)
- Sohaib Asif
  - School of Computer Science and Engineering, Central South University, Changsha, China.
- Ming Zhao
  - School of Computer Science and Engineering, Central South University, Changsha, China.
- Yangfan Li
  - School of Computer Science and Engineering, Central South University, Changsha, China.
- Fengxiao Tang
  - School of Computer Science and Engineering, Central South University, Changsha, China.
- Yusen Zhu
  - School of Mathematics, Hunan University, Changsha, China.
41
Lucas S, Portillo E. Methodology based on spiking neural networks for univariate time-series forecasting. Neural Netw 2024; 173:106171. [PMID: 38382399 DOI: 10.1016/j.neunet.2024.106171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 11/17/2023] [Accepted: 02/08/2024] [Indexed: 02/23/2024]
Abstract
Spiking Neural Networks (SNN) are recognised as well-suited for processing spatiotemporal information with ultra-low energy consumption. However, proposals based on SNN for classification tasks are more common than for forecasting problems. This paper therefore presents a new general training methodology for univariate time-series forecasting based on SNN. The methodology is focused on one-step-ahead forecasting problems and combines a Pulse Width Modulation based encoding-decoding algorithm with a Surrogate Gradient method as the supervised training algorithm. To validate the generality of the presented methodology, a sine wave, three UCI datasets, and one available real-world dataset are used. The results show very satisfactory forecasting performance (MAE ∈ [0.0094, 0.2891]) regardless of the characteristics of the dataset or the application field. In addition, weights can be initialised just once to achieve robust results, reinforcing the computational and energy cost advantages of SNN.
Affiliation(s)
- Sergio Lucas
  - Department of Automatic Control and Systems Engineering, Faculty of Engineering of Bilbao, University of the Basque Country (UPV/EHU), Plaza Ingeniero Torres Quevedo, 1, Bilbao, 48013, Basque Country, Spain.
- Eva Portillo
  - Department of Automatic Control and Systems Engineering, Faculty of Engineering of Bilbao, University of the Basque Country (UPV/EHU), Plaza Ingeniero Torres Quevedo, 1, Bilbao, 48013, Basque Country, Spain.
42
He L, Zhang L, Sun Q, Lin X. A generative adaptive convolutional neural network with attention mechanism for driver fatigue detection with class-imbalanced and insufficient data. Behav Brain Res 2024; 464:114898. [PMID: 38382711 DOI: 10.1016/j.bbr.2024.114898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 01/26/2024] [Accepted: 02/05/2024] [Indexed: 02/23/2024]
Abstract
Over the past few years, fatigued driving has emerged as one of the main causes of traffic accidents, necessitating the development of driver fatigue detection systems. However, many existing methods involve tedious manual parameter tuning, a process that is both time-consuming and results in task-specific models. Moreover, most research on fatigue recognition is based on class-balanced and sufficient data, and effectively "mining" meaningful information from class-imbalanced and insufficient data for fatigue recognition is still a challenge. In this paper, we propose two novel models aiming to address this issue: the attention-based residual adaptive multiscale fully convolutional network-long short term memory network (ARMFCN-LSTM) and the Generative ARMFCN-LSTM (GARMFCN-LSTM). ARMFCN-LSTM excels at automatically extracting multiscale representations through adaptive multiscale temporal convolutions, while capturing temporal dependency features through LSTM. GARMFCN-LSTM integrates Wasserstein GAN with gradient penalty (WGAN-GP) into ARMFCN-LSTM to improve driver fatigue detection performance by alleviating data scarcity and addressing class imbalances. Experimental results show that ARMFCN-LSTM achieves the highest classification accuracy of 95.84% in driver fatigue detection on the class-balanced EEG dataset (binary classification), and GARMFCN-LSTM attains an improved classification accuracy of 84.70% on the class-imbalanced EOG dataset (three-class classification), surpassing the competing methods. Therefore, the proposed models are promising for further implementation in online driver fatigue detection systems.
Affiliation(s)
- Le He
  - State Key Laboratory of Power Transmission Equipment Technology, School of Electrical Engineering, Chongqing University, Chongqing 400044, People's Republic of China.
- Li Zhang
  - State Key Laboratory of Power Transmission Equipment Technology, School of Electrical Engineering, Chongqing University, Chongqing 400044, People's Republic of China.
- Qiang Sun
  - Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, KU Leuven, Leuven, Belgium.
- XiangTian Lin
  - State Key Laboratory of Power Transmission Equipment Technology, School of Electrical Engineering, Chongqing University, Chongqing 400044, People's Republic of China.
43
Walsh J, Neupane A, Li M. Evaluation of 1D convolutional neural network in estimation of mango dry matter content. Spectrochim Acta A Mol Biomol Spectrosc 2024; 311:124003. [PMID: 38354673 DOI: 10.1016/j.saa.2024.124003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 01/30/2024] [Accepted: 02/04/2024] [Indexed: 02/16/2024]
Abstract
This study empirically validates prior claims regarding the superior performance of a Convolutional Neural Network (CNN) model for estimating mango Dry Matter Content (DMC) using Near Infrared (NIR) spectroscopy. The Partial Least Squares (PLS), Artificial Neural Network (ANN), and CNN models employed in the previous publications were compared on an equal footing, i.e., employing the same training and test data, with consideration of the effect of other practices employed in those studies, i.e., outlier removal, training set partitioning, sample ordering, and spectral pretreatment and augmentation. A new benchmark RMSEP of 0.77 %FW was achieved, which was significantly different (P<0.05) from the previously published best RMSEP for the same independent test set. This CNN model was also shown to be more robust when tested on a new season of fruit than optimised ANN and PLS models, with RMSEPs of 1.18, 2.62, and 1.87, and bias of 0.16, 2.36 and 1.56 %FW, respectively. The combination of model type and data augmentation was important, with the CNN model only slightly outperforming the ANN model when using only a second derivative pretreatment. This requirement highlights the need for chemometric input to model development. Quantifying the sensitivity of neural network model training to the use of differing seeds for pseudo-random sequence generation is also recommended. The standard deviation in RMSEP of 50 ANN and CNN models trained with differing random seeds was 0.03 and 0.02 %FW, respectively.
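For readers unfamiliar with the two statistics quoted above, the sketch below shows how RMSEP and bias are commonly computed on a prediction set. It assumes bias is defined as the mean of (predicted minus reference), and it uses a population-form (divide by n) bias-corrected error so that RMSEP² = bias² + SEP² holds exactly; these conventions and the toy dry-matter values are assumptions, not details taken from the paper.

```python
import math

def residuals(y_ref, y_pred):
    """Prediction errors, taken as predicted minus reference."""
    return [p - r for r, p in zip(y_ref, y_pred)]

def rmsep(y_ref, y_pred):
    """Root mean square error of prediction."""
    errs = residuals(y_ref, y_pred)
    return math.sqrt(sum(e * e for e in errs) / len(errs))

def bias(y_ref, y_pred):
    """Mean prediction error; a constant offset shows up here."""
    errs = residuals(y_ref, y_pred)
    return sum(errs) / len(errs)

def sep(y_ref, y_pred):
    """Bias-corrected error spread (population form, divides by n)."""
    errs = residuals(y_ref, y_pred)
    b = sum(errs) / len(errs)
    return math.sqrt(sum((e - b) ** 2 for e in errs) / len(errs))

# Toy reference values in %FW (hypothetical) and a model with a constant offset.
y_ref = [14.0, 15.5, 16.2, 17.8]
y_offset = [r + 2.0 for r in y_ref]

print(f"RMSEP = {rmsep(y_ref, y_offset):.2f}")
print(f"bias  = {bias(y_ref, y_offset):.2f}")
print(f"SEP   = {sep(y_ref, y_offset):.2f}")
```

Reporting bias alongside RMSEP, as the abstract does for the new-season test, shows how much of the error is a systematic offset rather than scatter.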
Affiliation(s)
- Jeremy Walsh
  - Central Queensland University, Rockhampton 4702, Queensland, Australia.
- Arjun Neupane
  - Central Queensland University, Rockhampton 4702, Queensland, Australia.
- Michael Li
  - Central Queensland University, Rockhampton 4702, Queensland, Australia.
44
Raj S, Mahanty B, Hait S. Coagulative removal of polystyrene microplastics from aqueous matrices using FeCl3-chitosan system: Experimental and artificial neural network modeling. J Hazard Mater 2024; 468:133818. [PMID: 38377913 DOI: 10.1016/j.jhazmat.2024.133818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/01/2024] [Accepted: 02/15/2024] [Indexed: 02/22/2024]
Abstract
Effluent from sewage treatment plants (STPs) is a significant source of microplastics (MPs) re-entering the environment. The coagulation-flocculation-sedimentation (CFS) process, as an initial tertiary treatment step, requires investigation for coagulative MPs removal from secondary-treated sewage effluents. In this study, experiments were conducted on synthetic water containing 25 mg/L polystyrene (PS) MPs using varying dosages of FeCl3 (1-10 mg/L) and chitosan (0.25-9 mg/L) to assess the effect of process parameters, such as pH (4-8), stirring speed (0-200 rpm), and settling time (10-40 min). Results revealed that ∼89.3% and 21.4% of PS removal were achieved by FeCl3 and chitosan, respectively. Further, their combination resulted in a maximum of 99.8% removal at favorable conditions: FeCl3: 2 mg/L, chitosan: 7 mg/L, pH: 6.3, stirring speed: 100 rpm, and settling time: 30 min, with a statistically significant (p < 0.05) effect. An artificial neural network (ANN) validated the experimental results with RMSE = 1.0643 and R² = 0.9997. Charge neutralization, confirmed by zeta potential, and adsorption, ascertained by field-emission scanning electron microscopy (FESEM) and Fourier-transform infrared spectroscopy (FTIR), were the primary mechanisms for efficient PS removal. For practical considerations, the application of the FeCl3-chitosan system to effluents from moving bed biofilm reactor (MBBR) and sequencing batch reactor (SBR)-based STPs, spiked with PS microbeads, showed > 98% removal at favorable conditions.
Affiliation(s)
- Shubham Raj
  - Department of Civil and Environmental Engineering, Indian Institute of Technology Patna, Bihar 801 106, India.
- Byomkesh Mahanty
  - Department of Civil and Environmental Engineering, Indian Institute of Technology Patna, Bihar 801 106, India.
- Subrata Hait
  - Department of Civil and Environmental Engineering, Indian Institute of Technology Patna, Bihar 801 106, India.
45
Li R, Gao L, Wu G, Dong J. Multiple marine algae identification based on three-dimensional fluorescence spectroscopy and multi-label convolutional neural network. Spectrochim Acta A Mol Biomol Spectrosc 2024; 311:123938. [PMID: 38330754 DOI: 10.1016/j.saa.2024.123938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/14/2023] [Accepted: 01/20/2024] [Indexed: 02/10/2024]
Abstract
Accurate identification of algal populations plays a pivotal role in monitoring seawater quality. Fluorescence-based techniques are effective tools for quickly identifying different algae. However, multiple coexisting algae and their similar photosynthetic pigments can constrain the efficacy of fluorescence methods. This study introduces a multi-label classification model that combines a specific Excitation-Emission matrix convolutional neural network (EEM-CNN) with three-dimensional (3D) fluorescence spectroscopy to detect single and mixed algal samples. Spectral data can be input directly into the model without being transformed into images. Rectangular convolutional kernels and double convolutional layers are applied to enhance the extraction of balanced and comprehensive spectral features for accurate classification. A dataset comprising 3D fluorescence spectra from eight distinct algae species representing six different algal classes was obtained, preprocessed, and augmented to create input data for the classification model. The classification model was trained and validated using 4448 sets of test samples and 60 sets of test samples, resulting in an accuracy of 0.883 and an F1 score of 0.925. This model exhibited the highest recognition accuracy in both single and mixed algae samples, outperforming comparative methods such as ML-kNN and N-PLS-DA. Furthermore, the classification results were extended to three different algae species and mixed samples of Skeletonema costatum to assess the impact of spectral similarity on multi-label classification performance. The developed classification models demonstrated robust performance across samples with varying concentrations and growth stages, highlighting the CNN's potential as a promising tool for the precise identification of marine algae.
Affiliation(s)
- Ruizhuo Li
  - Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Science, Xi'an 710119, China; College of Photoelectricity, University of Chinese Academy of Science, Beijing 100049, China.
- Limin Gao
  - Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Science, Xi'an 710119, China.
- Guojun Wu
  - Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Science, Xi'an 710119, China; Laoshan Laboratory, Qingdao 266237, Shandong, China.
- Jing Dong
  - Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Science, Xi'an 710119, China; College of Photoelectricity, University of Chinese Academy of Science, Beijing 100049, China.
46
Zhang Q, Zhao Z, Wu Z, Niu X, Zhang Y, Wang Q, Ho SSH, Li Z, Shen Z. Toxicity source apportionment of fugitive dust PM2.5-bound polycyclic aromatic hydrocarbons using multilayer perceptron neural network analysis in Guanzhong Plain urban agglomeration, China. J Hazard Mater 2024; 468:133773. [PMID: 38382337 DOI: 10.1016/j.jhazmat.2024.133773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/29/2024] [Accepted: 02/09/2024] [Indexed: 02/23/2024]
Abstract
Polycyclic aromatic hydrocarbons (PAHs) in urban fugitive dust, known for their toxicity and ability to generate reactive oxygen species (ROS), are a major public health concern. This study assessed the spatial distribution and health risks of 15 PAHs in construction dust (CD) and road dust (RD) samples collected from June to November 2021 in the cities of Tongchuan (TC), Baoji (BJ), Xianyang (XY), and Xi'an (XA) in the Guanzhong Plain, China. The average concentration of ΣPAHs in RD was 39.5 ± 20.0 μg g⁻¹, approximately twice that in CD. Four-ring PAHs from fossil fuel combustion accounted for the highest proportion of ΣPAHs in fugitive dust in all four cities. Health-related indicators, including benzo(a)pyrene toxic equivalency factors (BAPTEQ), oxidative potential (OP), and incremental lifetime cancer risk (ILCR), all indicated higher risk in RD than in CD. The multilayer perceptron neural network algorithm quantified that vehicular and industrial emissions contributed 86 % and 61 % to RD and CD BAPTEQ, respectively. For OP, biomass and coal combustion were the key sources, accounting for 31-54 %. These findings provide scientific evidence to direct efforts toward decreasing the health risks of fugitive dust in the Guanzhong Plain urban agglomeration, China.
Affiliation(s)
- Qian Zhang
  - Key Laboratory of Northwest Resource, Environment and Ecology, MOE, Xi'an University of Architecture and Technology, Xi'an 710055, China; Key Lab of Aerosol Chemistry & Physics, SKLLQG, Institute of Earth Environment, Chinese Academy of Sciences, Xi'an, China
- Ziyi Zhao
  - Key Laboratory of Northwest Resource, Environment and Ecology, MOE, Xi'an University of Architecture and Technology, Xi'an 710055, China
- Zhichun Wu
  - Key Laboratory of Northwest Resource, Environment and Ecology, MOE, Xi'an University of Architecture and Technology, Xi'an 710055, China
- Xinyi Niu
  - Department of Environmental Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Yuhang Zhang
  - Key Laboratory of Northwest Resource, Environment and Ecology, MOE, Xi'an University of Architecture and Technology, Xi'an 710055, China
- Qiyuan Wang
  - Key Lab of Aerosol Chemistry & Physics, SKLLQG, Institute of Earth Environment, Chinese Academy of Sciences, Xi'an, China
- Steven Sai Hang Ho
  - Division of Atmospheric Sciences, Desert Research Institute, Reno, NV 89512, United States
- Zhihua Li
  - Key Laboratory of Northwest Resource, Environment and Ecology, MOE, Xi'an University of Architecture and Technology, Xi'an 710055, China
- Zhenxing Shen
  - Department of Environmental Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China
|
47
|
Zárate-Rochín AM. Contemporary neurocognitive models of memory: A descriptive comparative analysis. Neuropsychologia 2024; 196:108846. [PMID: 38430963 DOI: 10.1016/j.neuropsychologia.2024.108846] [Received: 11/03/2023] [Revised: 02/27/2024] [Accepted: 02/27/2024] [Indexed: 03/05/2024]
Abstract
The great complexity involved in the study of memory has given rise to numerous hypotheses and models associated with various phenomena at different levels of analysis. This has deepened our knowledge of memory but has also made it difficult to synthesize and integrate data from different lines of research. In this context, this work presents a descriptive comparative analysis of contemporary models that address the structure and function of multiple memory systems. The main goal is to outline a panoramic view of the key elements that constitute these models in order to visualize both the current state of research and possible future directions. The elements that stand out across levels of analysis are distributed neural networks, hierarchical organization, predictive coding, homeostasis, and an evolutionary perspective.
Affiliation(s)
- Alba Marcela Zárate-Rochín
  - Instituto de Investigaciones Cerebrales, Universidad Veracruzana, Dr. Castelazo Ayala s/n, Industrial Animas, 91190, Xalapa-Enríquez, Veracruz, Mexico
|
48
|
Chang C, Liu H, Chen C, Wu L, Lv X, Xie X, Chen C. Rapid diagnosis of systemic lupus erythematosus by Raman spectroscopy combined with spiking neural network. Spectrochim Acta A Mol Biomol Spectrosc 2024; 310:123904. [PMID: 38262298 DOI: 10.1016/j.saa.2024.123904] [Received: 09/04/2023] [Revised: 11/30/2023] [Accepted: 01/15/2024] [Indexed: 01/25/2024]
Abstract
Systemic lupus erythematosus (SLE) is an autoimmune inflammatory connective tissue disease that affects multiple organs. If not diagnosed and treated in a timely manner, it can lead to nephritis and damage to the blood system and, in severe cases, to the patient's death. Therefore, correct and timely diagnosis and treatment are essential for patients. In this study, a framework based on neural network algorithms and Raman spectroscopy was established to diagnose SLE patients. First, the obtained Raman data were pre-processed by three methods (baseline correction, smoothing, and normalization) before being used as model input, and then ANN, ResNet and SNN classification models were established. The respective classification accuracies for SLE patients were 89.61%, 85.71%, and 95.65% for the three models, with corresponding AUC values of 0.8772, 0.8100, and 0.9555. The experimental results indicate that the SNN classifies well, and its number of model parameters is only 525,826, which is 414,221 fewer than that of the ResNet model. Because the network transmits information using only 0 and 1 and requires only basic operations such as summation, it replaces the floating-point multiplications of second-generation artificial neural networks with multiple additions; the network therefore has low energy consumption and is suitable for embedding in a portable Raman spectrometer for clinical diagnosis. This research highlights the significant potential for quick and precise SLE patient discrimination offered by Raman spectroscopy in conjunction with spiking neural networks.
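The claim that a spiking network needs only summation can be illustrated with a toy integrate-and-fire layer. This is a minimal sketch and not the authors' model: the neuron type (no leak, hard reset), the threshold, the weights, and the function name `if_layer_step` are all assumptions chosen for illustration.

```python
from typing import List


def if_layer_step(
    spikes_in: List[int],
    weights: List[List[float]],
    potentials: List[float],
    threshold: float = 1.0,
) -> List[int]:
    """Advance an integrate-and-fire layer by one time step.

    `weights[j][i]` connects input i to neuron j. `potentials` is updated in place.
    Returns the binary output spikes of the layer.
    """
    spikes_out = []
    for j, row in enumerate(weights):
        for w, s in zip(row, spikes_in):
            if s:  # input is 0 or 1, so the weight is simply accumulated
                potentials[j] += w
        if potentials[j] >= threshold:
            spikes_out.append(1)
            potentials[j] = 0.0  # hard reset after firing
        else:
            spikes_out.append(0)
    return spikes_out


# Hypothetical 2-input, 2-neuron layer driven by a short spike train.
weights = [[0.5, 0.5], [0.25, 0.0]]
potentials = [0.0, 0.0]
spike_train = [[1, 1], [1, 0], [1, 0], [1, 0]]
outputs = [if_layer_step(s, weights, potentials) for s in spike_train]
print(outputs)
```

Because each input is either 0 or 1, the weighted sum reduces to adding the weights of the inputs that fired, which is the source of the low energy cost the abstract describes.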
Affiliation(s)
- Chenjie Chang
  - College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
- Hao Liu
  - College of Software, Xinjiang University, Urumqi 830046, China
- Chen Chen
  - College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China; Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi 830046, China; Xinjiang Cloud Computing Application Laboratory, Karamay 834099, China; Xinjiang Aiqiside Testing Technology Co., Ltd, Urumqi 830000, China
- Lijun Wu
  - Department of Rheumatology and Immunology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi 830001, China; Xinjiang Clinical Research Center for Rheumatoid Arthritis, Urumqi 830001, China
- Xiaoyi Lv
  - College of Software, Xinjiang University, Urumqi 830046, China; Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi 830046, China
- Xiaodong Xie
  - Xinjiang Uygur Autonomous Region People's Hospital, Urumqi 830001, China
- Cheng Chen
  - College of Software, Xinjiang University, Urumqi 830046, China
|
49
|
Nagarajan B, Marques R, Aguilar E, Radeva P. Bayesian DivideMix++ for Enhanced Learning with Noisy Labels. Neural Netw 2024; 172:106122. [PMID: 38244356 DOI: 10.1016/j.neunet.2024.106122] [Received: 08/02/2023] [Revised: 12/04/2023] [Accepted: 01/09/2024] [Indexed: 01/22/2024]
Abstract
Leveraging inexpensive and human intervention-based annotating methodologies, such as crowdsourcing and web crawling, often leads to datasets with noisy labels. Noisy labels can have a detrimental impact on the performance and generalization of deep neural networks. Robust models that are able to handle and mitigate the effect of these noisy labels are thus essential. In this work, we explore the open challenges of neural network memorization and uncertainty in creating robust learning algorithms with noisy labels. To overcome them, we propose a novel framework called "Bayesian DivideMix++" with two critical components: (i) DivideMix++, which enhances robustness against memorization, and (ii) Monte-Carlo MixMatch, which improves the handling of label uncertainty. DivideMix++ improves the warm-up and augmentation pipeline by integrating self-supervised pre-training and dedicated, distinct data augmentations for loss analysis and backpropagation. Monte-Carlo MixMatch leverages uncertainty measurements to mitigate the influence of uncertain samples by reducing their weight in the MixMatch data augmentation step. We validate our proposed pipeline using four datasets encompassing various synthetic and real-world noise settings. We demonstrate the effectiveness and merits of our proposed pipeline using extensive experiments. Bayesian DivideMix++ outperforms the state-of-the-art models by considerable margins in all experiments. Our findings underscore the potential of leveraging these modifications to enhance the performance and generalization of deep neural networks in practical scenarios.
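The idea of down-weighting uncertain samples can be sketched as follows. The abstract does not give the uncertainty measure or the weighting formula, so this example assumes one plausible scheme: several stochastic forward passes per sample (for example with dropout kept active), normalized predictive entropy of the averaged prediction as the uncertainty, and a weight of one minus that entropy. All function names and the toy probabilities are hypothetical.

```python
import math
from typing import List


def mean_prediction(mc_probs: List[List[float]]) -> List[float]:
    """Average class probabilities over several stochastic forward passes."""
    n_pass = len(mc_probs)
    n_class = len(mc_probs[0])
    return [sum(p[c] for p in mc_probs) / n_pass for c in range(n_class)]


def normalized_entropy(probs: List[float]) -> float:
    """Shannon entropy scaled to [0, 1] by the maximum entropy log(n_classes)."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


def sample_weight(mc_probs: List[List[float]]) -> float:
    """Assumed weighting rule: confident samples near 1, uncertain samples near 0."""
    return 1.0 - normalized_entropy(mean_prediction(mc_probs))


# Hypothetical outputs of two stochastic passes on a two-class problem.
consistent = [[1.0, 0.0], [1.0, 0.0]]    # passes agree, so the weight stays high
conflicting = [[1.0, 0.0], [0.0, 1.0]]   # passes disagree, so the weight drops
print(sample_weight(consistent), sample_weight(conflicting))
```

In a MixMatch-style step, such a weight could scale each sample's contribution to the loss, so that samples whose predictions fluctuate across passes influence training less.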
Affiliation(s)
- Bhalaji Nagarajan
  - Dept. de Matemàtiques i Informàtica, Universitat de Barcelona, Gran Via de les Corts Catalanes 585, 08007, Barcelona, Spain
- Ricardo Marques
  - Dept. de Matemàtiques i Informàtica, Universitat de Barcelona, Gran Via de les Corts Catalanes 585, 08007, Barcelona, Spain; Computer Vision Center, Cerdanyola (Barcelona), Spain
- Eduardo Aguilar
  - Dept. de Matemàtiques i Informàtica, Universitat de Barcelona, Gran Via de les Corts Catalanes 585, 08007, Barcelona, Spain; Dept. de Ingeniería de Sistemas y Computación, Universidad Católica del Norte, Avenida Angamos 0610, 1270709, Antofagasta, Chile; Computer Vision Center, Cerdanyola (Barcelona), Spain
- Petia Radeva
  - Dept. de Matemàtiques i Informàtica, Universitat de Barcelona, Gran Via de les Corts Catalanes 585, 08007, Barcelona, Spain; Computer Vision Center, Cerdanyola (Barcelona), Spain
|
50
|
Yuan M, Zhang C, Wang Z, Liu H, Pan G, Tang H. Trainable Spiking-YOLO for low-latency and high-performance object detection. Neural Netw 2024; 172:106092. [PMID: 38211460 DOI: 10.1016/j.neunet.2023.106092] [Received: 04/28/2023] [Revised: 12/06/2023] [Accepted: 12/26/2023] [Indexed: 01/13/2024]
Abstract
Spiking neural networks (SNNs) are considered an attractive option for edge-side applications due to their sparse, asynchronous and event-driven characteristics. However, the application of SNNs to object detection tasks faces challenges in achieving good detection accuracy and high detection speed. To overcome these challenges, we propose an end-to-end Trainable Spiking-YOLO (Tr-Spiking-YOLO) for low-latency and high-performance object detection. We evaluate our model on both the frame-based PASCAL VOC dataset and the event-based GEN1 Automotive Detection dataset, and investigate the impact of different decoding methods on detection performance. The experimental results show that our model achieves competitive or better performance in terms of accuracy, latency and energy consumption compared to similar artificial neural network (ANN) and conversion-based SNN object detection models. Furthermore, when deployed on an edge device, our model achieves a processing speed of approximately 14 to 39 FPS while maintaining a desirable mean Average Precision (mAP), making it capable of real-time detection on resource-constrained platforms.
Affiliation(s)
- Mengwen Yuan
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311100, China
- Chengjun Zhang
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311100, China
- Ziming Wang
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
- Huixiang Liu
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311100, China
- Gang Pan
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 310027, China; MOE Frontier Science Center for Brain Science and Brain-Machine Integration, Zhejiang University, Hangzhou 310027, China
- Huajin Tang
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 310027, China; MOE Frontier Science Center for Brain Science and Brain-Machine Integration, Zhejiang University, Hangzhou 310027, China
|