1
Chungnoy K, Tanantong T, Songmuang P. Missing value imputation on gene expression data using bee-based algorithm to improve classification performance. PLoS One 2024; 19:e0305492. [PMID: 39208345 PMCID: PMC11361674 DOI: 10.1371/journal.pone.0305492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 05/28/2024] [Indexed: 09/04/2024] [Open Access]
Abstract
Existing missing-value imputation methods focus on recovering the actual values so that the completed dataset can serve as input for machine learning tasks. This work instead proposes imputing missing values to improve classification accuracy. The proposed method is based on the bee algorithm and uses k-nearest neighbors with linear regression to guide the search toward an appropriate solution and limit randomness. Within the process, the Gini importance score is used to select values for imputation. The imputed values are therefore chosen to improve discriminative power in classification tasks rather than to replicate the actual values of the original dataset. We evaluated the proposed method against frequently used imputation methods such as k-nearest neighbors, principal component analysis, and nonlinear principal component analysis, comparing root mean square error and the accuracy obtained when the imputed datasets are used in a classification task. The experimental results indicated that the proposed method obtained the best accuracy on all datasets compared with the other methods. Compared with the original dataset, the classification model built from the imputed datasets yielded 15-25% higher accuracy in class prediction. Analysis showed that the feature ranking used in the classification process was affected, leading to a noticeable change in informativeness, as the data imputed by the proposed method boosted discriminative power.
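As a point of reference for the k-nearest-neighbor baseline named in the abstract, the following is a minimal sketch of plain k-NN imputation. It is not the authors' bee-based method; the function name `knn_impute`, the distance choice (root mean squared difference over columns observed in both rows), and the toy `expression` matrix are all illustrative assumptions.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows."""

    def distance(a, b):
        # Compare only columns observed in both rows, and average the squared
        # differences so rows sharing different numbers of columns stay comparable.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which column j is observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                completed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return completed


# Hypothetical expression matrix: rows are samples, columns are genes.
expression = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [0.9, 2.2, 2.9],
    [5.0, 6.0, 7.0],
]
filled = knn_impute(expression, k=2)
print(filled[1])
```

A baseline of this kind aims to reproduce the missing values themselves, which is the objective the abstract contrasts with imputing for classification accuracy.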
Affiliation(s)
- Kritanat Chungnoy
  - Department of Computer Science, Faculty of Science and Technology, Thammasat University (Rangsit Campus), Pathum Thani, Thailand
- Tanatorn Tanantong
  - Department of Computer Science, Faculty of Science and Technology, Thammasat University (Rangsit Campus), Pathum Thani, Thailand
  - Thammasat University Research Unit in Data Innovation and Artificial Intelligence, Thammasat University (Rangsit Campus), Pathum Thani, Thailand
- Pokpong Songmuang
  - Department of Computer Science, Faculty of Science and Technology, Thammasat University (Rangsit Campus), Pathum Thani, Thailand
  - Thammasat University Research Unit in Data Innovation and Artificial Intelligence, Thammasat University (Rangsit Campus), Pathum Thani, Thailand
2
Aqel K, Wang Z, Peng YB, Maia PD. Reconstructing rodent brain signals during euthanasia with eigensystem realization algorithm (ERA). Sci Rep 2024; 14:12261. [PMID: 38806534 PMCID: PMC11133335 DOI: 10.1038/s41598-024-61706-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 05/08/2024] [Indexed: 05/30/2024] [Open Access]
Abstract
We accurately reconstruct the Local Field Potential time series obtained from anesthetized and awake rats, both before and during CO2 euthanasia. We apply the Eigensystem Realization Algorithm to identify an underlying linear dynamical system capable of generating the observed data. Time series exhibiting more intricate dynamics typically lead to systems of higher dimensions, offering a means to assess the complexity of the brain throughout various phases of the experiment. Our results indicate that anesthetized brains possess complexity levels similar to awake brains before CO2 administration. This resemblance undergoes significant changes following euthanization, as signals from the awake brain display a more resilient complexity profile, implying a state of heightened neuronal activity or a last fight response during the euthanasia process. In contrast, anesthetized brains seem to enter a more subdued state early on. Our data-driven techniques can likely be applied to a broader range of electrophysiological recording modalities.
Affiliation(s)
- Khitam Aqel
  - Department of Mathematics, University of Texas at Arlington, 411 S Nedderman Dr, Arlington, TX, 76019, USA
- Zhen Wang
  - Department of Psychology, University of Texas at Arlington, 501 S Nedderman Dr, Arlington, TX, 76019, USA
- Yuan B Peng
  - Department of Psychology, University of Texas at Arlington, 501 S Nedderman Dr, Arlington, TX, 76019, USA
- Pedro D Maia
  - Department of Mathematics, University of Texas at Arlington, 411 S Nedderman Dr, Arlington, TX, 76019, USA
3
Fortunato C, Bennasar-Vázquez J, Park J, Chang JC, Miller LE, Dudman JT, Perich MG, Gallego JA. Nonlinear manifolds underlie neural population activity during behaviour. bioRxiv 2024:2023.07.18.549575. [PMID: 37503015 PMCID: PMC10370078 DOI: 10.1101/2023.07.18.549575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
There is rich variety in the activity of single neurons recorded during behaviour. Yet, these diverse single neuron responses can be well described by relatively few patterns of neural co-modulation. The study of such low-dimensional structure of neural population activity has provided important insights into how the brain generates behaviour. Virtually all of these studies have used linear dimensionality reduction techniques to estimate these population-wide co-modulation patterns, constraining them to a flat "neural manifold". Here, we hypothesised that since neurons have nonlinear responses and make thousands of distributed and recurrent connections that likely amplify such nonlinearities, neural manifolds should be intrinsically nonlinear. Combining neural population recordings from monkey, mouse, and human motor cortex, and mouse striatum, we show that: 1) neural manifolds are intrinsically nonlinear; 2) their nonlinearity becomes more evident during complex tasks that require more varied activity patterns; and 3) manifold nonlinearity varies across architecturally distinct brain regions. Simulations using recurrent neural network models confirmed the proposed relationship between circuit connectivity and manifold nonlinearity, including the differences across architecturally distinct regions. Thus, neural manifolds underlying the generation of behaviour are inherently nonlinear, and properly accounting for such nonlinearities will be critical as neuroscientists move towards studying numerous brain regions involved in increasingly complex and naturalistic behaviours.
Affiliation(s)
- Cátia Fortunato
  - Department of Bioengineering, Imperial College London, London UK
- Junchol Park
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn VA, USA
- Joanna C. Chang
  - Department of Bioengineering, Imperial College London, London UK
- Lee E. Miller
  - Department of Neurosciences, Northwestern University, Chicago IL, USA
  - Department of Biomedical Engineering, Northwestern University, Chicago IL, USA
  - Department of Physical Medicine and Rehabilitation, Northwestern University, Chicago IL, USA, and Shirley Ryan Ability Lab, Chicago, IL, USA
- Joshua T. Dudman
  - Janelia Research Campus, Howard Hughes Medical Institute, Ashburn VA, USA
- Matthew G. Perich
  - Department of Neurosciences, Faculté de médecine, Université de Montréal, Montréal, Québec, Canada
  - Québec Artificial Intelligence Institute (MILA), Montréal, Québec, Canada
- Juan A. Gallego
  - Department of Bioengineering, Imperial College London, London UK
4
De Jesús Morales-Acuña E, Aguíñiga-García S, Cervantes-Duarte R, Cortés MY, Escobedo-Urías D, Silverberg N. Evaluation of particulate organic carbon from MODIS-Aqua in a marine-coastal water body. Environmental Science and Pollution Research International 2024:10.1007/s11356-024-33297-8. [PMID: 38637481 DOI: 10.1007/s11356-024-33297-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 04/08/2024] [Indexed: 04/20/2024]
Abstract
La Paz Bay (LPB) in Mexico is one of the largest marine-coastal bodies of water in the Gulf of California (GC) and is ecologically important for the feeding, reproduction, and refuge of marine species. Although particulate organic carbon (POC) is an important reservoir of oceanic carbon and an indicator of productivity in the euphotic zone, studies in this region are scarce. This study evaluates the performance of satellite-derived POC in LPB from January 2003 to December 2020. The metrics obtained for POC (RMSE = 33.8 mg m⁻³; Pbias = 29.6%; r_P = 0.4 with p < 0.05), Chl-a (RMSE = 0.23 mg m⁻³; Pbias = -4.3%; r_P = 0.94 with p < 0.05), and SST (RMSE = 2.3 °C; Pbias = -2.2%; r_P = 0.92 with p < 0.05) establish that, although in some cases there was a slight over- or underestimation, the satellite estimates consistently represent the variability and average values measured in situ. In addition, the spatio-temporal analysis of POC allowed us to identify two seasons with their respective transition periods and five subregions in which POC shows its maximum variability; two of these coincide with the locations of the eddies reported for the winter and summer seasons in the LPB, while the other three are located one in the coastal zone and one in each of the two areas in which the LPB interacts with the GC. The associations, variability nodes, and multiple linear regression analysis suggest that POC fluctuations in the LPB respond mainly to biological processes and, to some extent, to the seasonality of SST and wind. Finally, our results justify the use of the MODIS-Aqua satellite POC for studies in marine-coastal water bodies with similar characteristics to the LPB and suggest that this water body can be considered a reservoir for the marine region of northwestern Mexico.
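The three validation metrics reported in the abstract can be computed as in the sketch below. The abstract does not state its formulas, so the definitions used here are assumptions: percent bias is taken as 100 × Σ(estimate − observation) / Σ(observation), and the correlation is taken to be Pearson's r. The `in_situ` and `satellite` values are made up for illustration and are not data from the study.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between paired observations and estimates."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)


def percent_bias(observed, estimated):
    """Percent bias (assumed definition); positive means the estimates run high."""
    return 100.0 * sum(e - o for o, e in zip(observed, estimated)) / sum(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical paired values in mg m^-3 (illustrative only).
in_situ = [100.0, 120.0, 90.0, 150.0]
satellite = [110.0, 125.0, 100.0, 160.0]

print(rmse(in_situ, satellite))
print(percent_bias(in_situ, satellite))
print(pearson_r(in_situ, satellite))
```

Under this sign convention a positive percent bias corresponds to satellite overestimation and a negative one to underestimation, which is how the abstract's "over/underestimation" remark would be read.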
Affiliation(s)
- Enrique De Jesús Morales-Acuña
  - Departamento de Medio Ambiente, Centro Interdisciplinario Para El Desarrollo Integral Regional, Unidad Sinaloa, Instituto Politécnico Nacional (IPN), Bulevar Juan de Dios Batíz Paredes 250, Colonia San Joachin, Guasave, Sinaloa, CP, 81101, México
- Sergio Aguíñiga-García
  - Centro Interdisciplinario de Ciencias Marinas, Instituto Politécnico Nacional, Av. IPN, Playa Palo de Santa Rita, La Paz, B.C.S, México
- Rafael Cervantes-Duarte
  - Centro Interdisciplinario de Ciencias Marinas, Instituto Politécnico Nacional, Av. IPN, Playa Palo de Santa Rita, La Paz, B.C.S, México
- Mara Yadira Cortés
  - Departamento Académico de Ciencias de La Tierra, Universidad Autónoma de Baja California Sur, Apartado Postal 19B, La Paz, C.P. 23080, México
- Diana Escobedo-Urías
  - Departamento de Medio Ambiente, Centro Interdisciplinario Para El Desarrollo Integral Regional, Unidad Sinaloa, Instituto Politécnico Nacional (IPN), Bulevar Juan de Dios Batíz Paredes 250, Colonia San Joachin, Guasave, Sinaloa, CP, 81101, México
- Norman Silverberg
  - Centro Interdisciplinario de Ciencias Marinas, Instituto Politécnico Nacional, Av. IPN, Playa Palo de Santa Rita, La Paz, B.C.S, México
5
Baima J, Goryaeva AM, Swinburne TD, Maillet JB, Nastar M, Marinica MC. Capabilities and limits of autoencoders for extracting collective variables in atomistic materials science. Phys Chem Chem Phys 2022; 24:23152-23163. [PMID: 36128869 DOI: 10.1039/d2cp01917e] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/21/2022]
Abstract
Free energy calculations in materials science are routinely hindered by the need to provide reaction coordinates that can meaningfully partition atomic configuration space, a prerequisite for most enhanced sampling approaches. Recent studies on molecular systems have highlighted the possibility of constructing appropriate collective variables directly from atomic motions through deep learning techniques. Here we extend this class of approaches to condensed matter problems, for which we encode the finite temperature collective variable by an iterative procedure starting from 0 K features of the energy landscape i.e. activation events or migration mechanisms given by a minimum - saddle point - minimum sequence. We employ the autoencoder neural networks in order to build a scalar collective variable for use with the adaptive biasing force method. Particular attention is given to design choices required for application to crystalline systems with defects, including the filtering of thermal motions which otherwise dominate the autoencoder input. The machine-learning workflow is tested on body-centered cubic iron and its common defects, such as small vacancy or self-interstitial clusters and screw dislocations. For localized defects, excellent collective variables as well as derivatives, necessary for free energy sampling, are systematically obtained. However, the approach has a limited accuracy when dealing with reaction coordinates that include atomic displacements of a magnitude comparable to thermal motions, e.g. the ones produced by the long-range elastic field of dislocations. We then combine the extraction of collective variables by autoencoders with an adaptive biasing force free energy method based on Bayesian inference. Using a vacancy migration as an example, we demonstrate the performance of coupling these two approaches for simultaneous discovery of reaction coordinates and free energy sampling in systems with localized defects.
Affiliation(s)
- Jacopo Baima
  - Université Paris-Saclay, CEA, Service de Recherches de Métallurgie Physique, Gif-sur-Yvette 91191, France
- Alexandra M Goryaeva
  - Université Paris-Saclay, CEA, Service de Recherches de Métallurgie Physique, Gif-sur-Yvette 91191, France
- Thomas D Swinburne
  - Aix-Marseille Université, CNRS, CINaM UMR 7325, Campus de Luminy, 13288 Marseille, France
- Maylise Nastar
  - Université Paris-Saclay, CEA, Service de Recherches de Métallurgie Physique, Gif-sur-Yvette 91191, France
- Mihai-Cosmin Marinica
  - Université Paris-Saclay, CEA, Service de Recherches de Métallurgie Physique, Gif-sur-Yvette 91191, France
6
Ramil M, Boudier C, Goryaeva AM, Marinica MC, Maillet JB. On Sampling Minimum Energy Path. J Chem Theory Comput 2022; 18:5864-5875. [PMID: 36073162 DOI: 10.1021/acs.jctc.2c00314] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Abstract
Sampling the minimum energy path (MEP) between two minima of a system is often hindered by the presence of an energy barrier separating the two metastable states. As a consequence, direct sampling based on molecular dynamics or Markov chain Monte Carlo methods becomes inefficient, since crossing the energy barrier is a rare event. Augmented sampling methods based on the definition of collective variables or reaction coordinates allow us to circumvent this limitation at the price of an arbitrary choice of the dimensionality reduction algorithm. We couple the statistical sampling techniques, namely, metadynamics and invertible neural networks, with autoencoders so as to gradually learn the MEP and the collective variable at the same time. Learning is achieved through a succession of two steps: statistical sampling of the most probable path between the two minima and redefinition of the collective variable from the updated data points. The prototypical Mueller potential with nearly orthogonal minima is employed to demonstrate the ability of such coupling to unravel a complex MEP.
Affiliation(s)
- Alexandra M Goryaeva
  - Service de Recherches de Métallurgie Physique, Université Paris-Saclay, CEA, Gif-sur-Yvette 91191, France
- Mihai-Cosmin Marinica
  - Service de Recherches de Métallurgie Physique, Université Paris-Saclay, CEA, Gif-sur-Yvette 91191, France
- Jean-Bernard Maillet
  - CEA-DAM, DIF, Arpajon Cedex F-91297, France
  - Université Paris-Saclay, CEA, LMCE, Bruyères-le-Châtel 91680, France
7
Zhao J, Huang S, Yousuf O, Gao Y, Hoskins BD, Adam GC. Gradient Decomposition Methods for Training Neural Networks With Non-ideal Synaptic Devices. Front Neurosci 2021; 15:749811. [PMID: 34880721 PMCID: PMC8645649 DOI: 10.3389/fnins.2021.749811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2021] [Accepted: 10/20/2021] [Indexed: 11/21/2022] [Open Access]
Abstract
While promising for high-capacity machine learning accelerators, memristor devices have non-idealities that prevent software-equivalent accuracies when used for online training. This work uses a combination of Mini-Batch Gradient Descent (MBGD) to average gradients, stochastic rounding to avoid vanishing weight updates, and decomposition methods to keep the memory overhead low during mini-batch training. Since the weight update has to be transferred to the memristor matrices efficiently, we also investigate the impact of reconstructing the gradient matrices both internally (rank-seq) and externally (rank-sum) to the memristor array. Our results show that streaming batch principal component analysis (streaming batch PCA) and non-negative matrix factorization (NMF) decomposition algorithms can achieve near-MBGD accuracy in a memristor-based multi-layer perceptron trained on the MNIST (Modified National Institute of Standards and Technology) database with only 3 to 10 ranks at significant memory savings. Moreover, NMF rank-seq outperforms streaming batch PCA rank-seq at low ranks, making it more suitable for hardware implementation in future memristor-based accelerators.
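The role of stochastic rounding mentioned in the abstract can be illustrated with a small sketch. This is a generic version of the technique, not the paper's implementation: the function name `stochastic_round`, the `step` resolution, and the example update size are hypothetical. The idea is that an update smaller than half the device resolution always rounds to zero deterministically, whereas rounding up with probability equal to the fractional part preserves the update on average.

```python
import math
import random


def stochastic_round(value, step, rng=random):
    """Round value to a multiple of step so that the expected result equals value."""
    scaled = value / step
    lower = math.floor(scaled)
    fraction = scaled - lower
    # Round up with probability equal to the fractional part, down otherwise.
    return (lower + (1 if rng.random() < fraction else 0)) * step


rng = random.Random(0)   # seeded generator so the sketch is repeatable
step = 0.1               # hypothetical weight resolution of the device
update = 0.03            # hypothetical weight update, below half the resolution

deterministic = round(update / step) * step          # always rounds to zero
samples = [stochastic_round(update, step, rng) for _ in range(10000)]
mean_update = sum(samples) / len(samples)            # close to the true update

print(deterministic, mean_update)
```

Each individual sample is either zero or one full step, so the small update is applied occasionally rather than never, which is what keeps it from vanishing over many training steps.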
Affiliation(s)
- Junyun Zhao
  - Department of Computer Science, George Washington University, Washington, DC, United States
- Siyuan Huang
  - Department of Computer Science, George Washington University, Washington, DC, United States
- Osama Yousuf
  - Department of Electrical and Computer Engineering, George Washington University, Washington, DC, United States
- Yutong Gao
  - Department of Computer Science, George Washington University, Washington, DC, United States
- Brian D Hoskins
  - Physical Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, United States
- Gina C Adam
  - Department of Electrical and Computer Engineering, George Washington University, Washington, DC, United States
8
Ocampo-Marulanda C, Cerón WL, Avila-Diaz A, Canchala T, Alfonso-Morales W, Kayano MT, Torres RR. Missing data estimation in extreme rainfall indices for the Metropolitan area of Cali - Colombia: An approach based on artificial neural networks. Data Brief 2021; 39:107592. [PMID: 34869806 PMCID: PMC8626650 DOI: 10.1016/j.dib.2021.107592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Revised: 11/10/2021] [Accepted: 11/15/2021] [Indexed: 11/29/2022] [Open Access]
Abstract
Changes observed in the current climate and projected for the future are of significant concern to researchers, decision-makers, and the general public. Climate indices of extreme rainfall events are a trend assessment tool for detecting climate variability and change signals, which have an average reliability at least in the short term and given climatic inertia. This paper presents 12 climate indices of extreme rainfall events at annual and seasonal scales for 12 climate stations between 1969 and 2019 in the Metropolitan area of Cali (southwestern Colombia). The construction of the indices starts from daily rainfall time series in which only 0.5% to 5.4% of the data are missing; even so, these gaps can affect the estimation of the indices. Here, we propose a methodology to complete missing data of the extreme event indices that model the peaks in the time series. This methodology uses an artificial neural network approach known as Non-Linear Principal Component Analysis (NLPCA). The approach reconstructs the time series by modulating the extreme values of the indices, a fundamental feature when evaluating extreme rainfall events in a region. The accuracy of the index estimation shows values close to 1 for Pearson's Correlation Coefficient and the Bi-weighting Correlation, and values close to 0 for the percent bias and the RMSE-observations standard deviation ratio. The database provided here is an essential input for future evaluation studies of extreme rainfall events in the Metropolitan area of Cali, the third most important urban conglomerate in Colombia with more than 3.9 million inhabitants.
Affiliation(s)
- Camilo Ocampo-Marulanda
  - Faculty of Natural Sciences and Engineering, Fundación Universitaria de San Gil, Unisangil, Km 2 via Matepantano, Yopal 850001, Colombia
  - Water Resources Engineering and Soil (IREHISA) Research Group, School of Natural Resources and Environmental Engineering, Universidad del Valle, Calle 13 # 100-00, Cali 25360, Colombia
- Wilmar L Cerón
  - Department of Geography, Faculty of Humanities, Universidad del Valle, Calle 13 # 100-00, Cali 25360, Colombia
- Alvaro Avila-Diaz
  - Universidad de Ciencias Aplicadas y Ambientales - UDCA, Bogota 111166, Colombia
  - Natural Resources Institute, Universidade Federal de Itajubá, Itajubá 36570-900, MG, Brazil
- Teresita Canchala
  - Water Resources Engineering and Soil (IREHISA) Research Group, School of Natural Resources and Environmental Engineering, Universidad del Valle, Calle 13 # 100-00, Cali 25360, Colombia
- Wilfredo Alfonso-Morales
  - Perception and Intelligent Systems (PSI) Research Group, School of Electrical and Electronics Engineering, Universidad del Valle, Calle 13 # 100-00, Cali 25360, Colombia
- Mary T Kayano
  - Coordenação Geral de Ciências da Terra, Instituto Nacional de Pesquisas Espaciais, Avenida dos Astronautas, 1758, São José dos Campos, SP 12227-010, Brazil
- Roger R Torres
  - Natural Resources Institute, Universidade Federal de Itajubá, Itajubá 36570-900, MG, Brazil
9
Dimension-reduction simplifies the analysis of signal crosstalk in a bacterial quorum sensing pathway. Sci Rep 2021; 11:19719. [PMID: 34611201 PMCID: PMC8492804 DOI: 10.1038/s41598-021-99169-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2021] [Accepted: 09/21/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
Many pheromone sensing bacteria produce and detect more than one chemically distinct signal, or autoinducer. The pathways that detect these signals are typically noisy and interlocked through crosstalk and feedback. As a result, the sensing response of individual cells is described by statistical distributions that change under different combinations of signal inputs. Here we examine how signal crosstalk reshapes this response. We measure how combinations of two homoserine lactone (HSL) input signals alter the statistical distributions of individual cell responses in the AinS/R- and LuxI/R-controlled branches of the Vibrio fischeri bioluminescence pathway. We find that, while the distributions of pathway activation in individual cells vary in complex fashion with environmental conditions, these changes have a low-dimensional representation. For both the AinS/R and LuxI/R branches, the distribution of individual cell responses to mixtures of the two HSLs is effectively one-dimensional, so that a single tuning parameter can capture the full range of variability in the distributions. Combinations of crosstalking HSL signals extend the range of responses for each branch of the circuit, so that signals in combination allow population-wide distributions that are not available under a single HSL input. Dimension reduction also simplifies the problem of identifying the HSL conditions to which the pathways and their outputs are most sensitive. A comparison of the maximum sensitivity HSL conditions to actual HSL levels measured during culture growth indicates that the AinS/R and LuxI/R branches lack sensitivity to population density except during the very earliest and latest stages of growth respectively.
10
Cai Y, Wu S, Fan X, Olson J, Evans L, Lollis S, Mirza SK, Paulsen KD, Ji S. A level-wise spine registration framework to account for large pose changes. Int J Comput Assist Radiol Surg 2021; 16:943-953. [PMID: 33973113 PMCID: PMC8358825 DOI: 10.1007/s11548-021-02395-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/04/2021] [Accepted: 04/29/2021] [Indexed: 11/27/2022]
Abstract
PURPOSES: Accurate and efficient spine registration is crucial to the success of spine image guidance. However, changes in spine pose cause intervertebral motion that can lead to significant registration errors. In this study, we develop a geometrical rectification technique via nonlinear principal component analysis (NLPCA) to achieve level-wise vertebral registration that is robust to large changes in spine pose.
METHODS: We used explanted porcine spines and live pigs to develop and test our technique. Each sample was scanned with preoperative CT (pCT) in an initial pose and rescanned with intraoperative stereovision (iSV) in a different surgical posture. Patient registration rectified arbitrary spinal postures in pCT and iSV into a common, neutral pose through a parameterized moving-frame approach. Topologically encoded depth projection 2D images were then generated to establish invertible point-to-pixel correspondences. Level-wise point correspondences between pCT and iSV vertebral surfaces were generated via 2D image registration. Finally, closed-form vertebral level-wise rigid registration was obtained by directly mapping 3D surface point pairs. Implanted mini-screws were used as fiducial markers to measure registration accuracy.
RESULTS: In seven explanted porcine spines and two live animal surgeries (maximum in-spine pose change of 87.5 mm and 32.7 degrees averaged from all spines), average target registration errors (TRE) of 1.70 ± 0.15 mm and 1.85 ± 0.16 mm were achieved, respectively. The automated spine rectification took 3-5 min, followed by an additional 30 s for depth image projection and level-wise registration.
CONCLUSIONS: The accuracy and efficiency of the proposed level-wise spine registration support its application in human open spine surgeries. The registration framework itself may also be applicable to other intraoperative imaging modalities such as ultrasound and MRI, which may expand the utility of the approach in spine registration in general.
Affiliation(s)
- Yunliang Cai
  - Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA, 01609, USA
- Shaoju Wu
  - Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA, 01609, USA
- Xiaoyao Fan
  - Dartmouth College Dartmouth-Hitchcock Medical Center, 1 Medical Center Dr, Lebanon, NH, 03766, USA
- Jonathan Olson
  - Dartmouth College Dartmouth-Hitchcock Medical Center, 1 Medical Center Dr, Lebanon, NH, 03766, USA
- Linton Evans
  - Dartmouth College Dartmouth-Hitchcock Medical Center, 1 Medical Center Dr, Lebanon, NH, 03766, USA
- Scott Lollis
  - University of Vermont Medical Center, Burlington, VT, 05401, USA
- Sohail K Mirza
  - Dartmouth College Dartmouth-Hitchcock Medical Center, 1 Medical Center Dr, Lebanon, NH, 03766, USA
- Keith D Paulsen
  - Dartmouth College Dartmouth-Hitchcock Medical Center, 1 Medical Center Dr, Lebanon, NH, 03766, USA
- Songbai Ji
  - Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA, 01609, USA
11
Rizzoglio F, Casadio M, De Santis D, Mussa-Ivaldi FA. Building an adaptive interface via unsupervised tracking of latent manifolds. Neural Netw 2021; 137:174-187. [PMID: 33636657 DOI: 10.1016/j.neunet.2021.01.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/03/2020] [Revised: 11/16/2020] [Accepted: 01/14/2021] [Indexed: 01/05/2023]
Abstract
In human-machine interfaces, decoder calibration is critical to enable an effective and seamless interaction with the machine. However, recalibration is often necessary as the decoder off-line predictive power does not generally imply ease-of-use, due to closed loop dynamics and user adaptation that cannot be accounted for during the calibration procedure. Here, we propose an adaptive interface that makes use of a non-linear autoencoder trained iteratively to perform online manifold identification and tracking, with the dual goal of reducing the need for interface recalibration and enhancing human-machine joint performance. Importantly, the proposed approach avoids interrupting the operation of the device and it neither relies on information about the state of the task, nor on the existence of a stable neural or movement manifold, allowing it to be applied in the earliest stages of interface operation, when the formation of new neural strategies is still on-going. In order to more directly test the performance of our algorithm, we defined the autoencoder latent space as the control space of a body-machine interface. After an initial offline parameter tuning, we evaluated the performance of the adaptive interface versus that of a static decoder in approximating the evolving low-dimensional manifold of users simultaneously learning to perform reaching movements within the latent space. Results show that the adaptive approach increased the representational efficiency of the interface decoder. Concurrently, it significantly improved users' task-related performance, indicating that the development of a more accurate internal model is encouraged by the online co-adaptation process.
Affiliation(s)
- Fabio Rizzoglio
  - Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, 16145 Genoa, Italy
  - Department of Physiology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Shirley Ryan Ability Lab, Chicago, IL, 60611, USA
- Maura Casadio
  - Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, 16145 Genoa, Italy
- Dalia De Santis
  - Department of Physiology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Shirley Ryan Ability Lab, Chicago, IL, 60611, USA
  - Department of Robotics, Brain and Cognitive Sciences, Istituto Italiano di Tecnologia, Via Enrico Melen 83, 16152, Genoa, Italy
- Ferdinando A Mussa-Ivaldi
  - Department of Physiology, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Shirley Ryan Ability Lab, Chicago, IL, 60611, USA
12
Yan Y, Goodman JM, Moore DD, Solla SA, Bensmaia SJ. Unexpected complexity of everyday manual behaviors. Nat Commun 2020; 11:3564. [PMID: 32678102 PMCID: PMC7367296 DOI: 10.1038/s41467-020-17404-0] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 07/29/2019] [Accepted: 06/15/2020] [Indexed: 12/13/2022] [Open Access]
Abstract
How does the brain control an effector as complex and versatile as the hand? One possibility is that neural control is simplified by limiting the space of hand movements. Indeed, hand kinematics can be largely described within 8 to 10 dimensions. This oft-replicated finding has been construed as evidence that hand postures are confined to this subspace. A prediction from this hypothesis is that dimensions outside of this subspace reflect noise. To address this question, we track the hand of human participants as they perform two tasks: grasping and signing in American Sign Language. We apply multiple dimension reduction techniques and replicate the finding that most postural variance falls within a reduced subspace. However, we show that dimensions outside of this subspace are highly structured and task dependent, suggesting they too are under volitional control. We propose that hand control occupies a higher dimensional space than previously considered.
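The abstract's claim that most postural variance falls within a reduced subspace can be illustrated with a minimal sketch of how one might count the principal components needed to reach a variance threshold. This is not the authors' analysis: the function name `n_components_for_variance`, the 95% threshold, and the synthetic 20-column "joint angle" matrix are all assumptions made for illustration.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.95):
    """Return (k, ratio): the number of principal components whose
    cumulative explained variance first reaches `threshold`, and the
    per-component explained-variance ratios."""
    Xc = X - X.mean(axis=0)                    # center each column
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    ratio = s**2 / np.sum(s**2)                # explained-variance ratios
    k = int(np.searchsorted(np.cumsum(ratio), threshold) + 1)
    return min(k, ratio.size), ratio

# Synthetic stand-in for hand kinematics (an assumption, not real data):
# 20 measured columns driven by 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 20))
angles = latent @ mixing + 0.01 * rng.normal(size=(500, 20))

k, ratio = n_components_for_variance(angles)
```

A count like `k` only says where the bulk of the variance lies. The abstract's further point, that the remaining low-variance dimensions are structured rather than noise, would need a separate test on those residual components.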
Affiliation(s)
- Yuke Yan
  - Committee on Computational Neuroscience, University of Chicago, Chicago, IL, USA
- James M Goodman
  - Committee on Computational Neuroscience, University of Chicago, Chicago, IL, USA
- Dalton D Moore
  - Committee on Computational Neuroscience, University of Chicago, Chicago, IL, USA
- Sara A Solla
  - Department of Physiology, Northwestern University, Chicago, IL, USA
- Sliman J Bensmaia
  - Committee on Computational Neuroscience, University of Chicago, Chicago, IL, USA
  - Department of Organismal Biology and Anatomy, University of Chicago, Chicago, IL, USA
  - Grossman Institute for Neuroscience, Quantitative Biology, and Human Behavior, University of Chicago, Chicago, IL, USA
|
13
|
Dorado-Moreno M, Navarin N, Gutiérrez P, Prieto L, Sperduti A, Salcedo-Sanz S, Hervás-Martínez C. Multi-task learning for the prediction of wind power ramp events with deep neural networks. Neural Netw 2020; 123:401-411. [DOI: 10.1016/j.neunet.2019.12.017] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 11/23/2018] [Revised: 10/27/2019] [Accepted: 12/20/2019] [Indexed: 11/17/2022]
|
14
|
Rocha M, Anzanello M, Caleffi F, Cybis H, Yamashita G. A multivariate-based variable selection framework for clustering traffic conflicts in a Brazilian freeway. ACCIDENT; ANALYSIS AND PREVENTION 2019; 132:105269. [PMID: 31445462 DOI: 10.1016/j.aap.2019.105269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2019] [Revised: 07/15/2019] [Accepted: 08/12/2019] [Indexed: 06/10/2023]
Abstract
More than one million people die or suffer non-fatal injuries annually due to road accidents around the world. Understanding the causes that give rise to different types of conflict events, as well as their characteristics, can help researchers and traffic authorities to draw up strategies aimed at mitigating collision risks. This paper proposes a framework for grouping traffic conflicts relying on similar profiles and factors that contribute to conflict occurrence using self-organizing maps (SOM). In order to improve the quality of the formed groups, we developed a novel variable importance index relying on the outputs of the nonlinear principal component analysis (NLPCA) that intends to identify the most informative variables for grouping collision events. Such index guides a backward variable selection procedure in which less relevant variables are removed one-by-one; after each removal, the clustering quality is assessed via the Davies-Bouldin (DB) index. The proposed framework was applied to a real-time dataset collected from a Brazilian highway aimed at allocating traffic conflicts into groups presenting similar profiles. The selected variables suggest that lower average speeds, which are typically verified during congestion events, contribute to conflict occurrence. Higher variability on speed (denoted by high standard deviation, and speed's coefficient of variation levels on that variable), which are also perceived in the assessed freeway near to congestion periods, also contribute to conflicts.
Affiliation(s)
- Miriam Rocha
  - Department of Industrial Engineering, Federal University of Rio Grande do Sul, Porto Alegre, RS 90035-180, Brazil
  - Center of Engineering, Federal Rural University of Semi-Arid, Mossoró, RN 59.625-900, Brazil
- Michel Anzanello
  - Department of Industrial Engineering, Federal University of Rio Grande do Sul, Porto Alegre, RS 90035-180, Brazil
- Felipe Caleffi
  - Laboratory of Transport Systems, Federal University of Rio Grande do Sul, Porto Alegre, RS, 90035-180, Brazil
- Helena Cybis
  - Laboratory of Transport Systems, Federal University of Rio Grande do Sul, Porto Alegre, RS, 90035-180, Brazil
- Gabrielli Yamashita
  - Department of Industrial Engineering, Federal University of Rio Grande do Sul, Porto Alegre, RS 90035-180, Brazil
|
15
|
van Gestel J, Ackermann M, Wagner A. Microbial life cycles link global modularity in regulation to mosaic evolution. Nat Ecol Evol 2019; 3:1184-1196. [PMID: 31332330 DOI: 10.1038/s41559-019-0939-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/03/2018] [Accepted: 06/03/2019] [Indexed: 11/09/2022]
Abstract
Microbes are exposed to changing environments, to which they can respond by adopting various lifestyles such as swimming, colony formation or dormancy. These lifestyles are often studied in isolation, thereby giving a fragmented view of the life cycle as a whole. Here, we study lifestyles in the context of this whole. We first use machine learning to reconstruct the expression changes underlying life cycle progression in the bacterium Bacillus subtilis, based on hundreds of previously acquired expression profiles. This yields a timeline that reveals the modular organization of the life cycle. By analysing over 380 Bacillales genomes, we then show that life cycle modularity gives rise to mosaic evolution in which life stages such as motility and sporulation are conserved and lost as discrete units. We postulate that this mosaic conservation pattern results from habitat changes that make these life stages obsolete or detrimental. Indeed, when evolving eight distinct Bacillales strains and species under laboratory conditions that favour colony growth, we observe rapid and parallel losses of the sporulation life stage across species, induced by mutations that affect the same global regulator. We conclude that a life cycle perspective is pivotal to understanding the causes and consequences of modularity in both regulation and evolution.
Affiliation(s)
- Jordi van Gestel
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland
  - Department of Environmental Microbiology, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, Switzerland
- Martin Ackermann
  - Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland
  - Department of Environmental Microbiology, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, Switzerland
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - The Santa Fe Institute, Santa Fe, NM, USA
|
16
|
Wang J, Ferguson AL. Recovery of Protein Folding Funnels from Single-Molecule Time Series by Delay Embeddings and Manifold Learning. J Phys Chem B 2018; 122:11931-11952. [PMID: 30428261 DOI: 10.1021/acs.jpcb.8b08800] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
The stability and folding of proteins is governed by the underlying single-molecule free energy surface (smFES) mapping the free energy of the molecule as a function of configurational state. Ascertaining the smFES is of great value in understanding and engineering protein structure and function. By integrating tools from dynamical systems theory and nonlinear manifold learning, we describe an approach to reconstruct the multidimensional smFES for a protein from a time series in a single experimentally measurable observable. We employ Takens' delay embeddings to project the time series into a high-dimensional space in which the projected dynamics are C¹-equivalent to the true system dynamics and employ diffusion maps to recover a low-dimensional reconstruction of the smFES that is equivalent to the true smFES up to a smooth and invertible transformation. We validate the approach in molecular dynamics simulations of Trp-cage, Villin, and BBA to demonstrate that landscapes recovered from univariate time series in the head-to-tail distance are topologically identical (they precisely preserve the metastable states and folding pathways) and topographically approximate (the free energy barrier heights and well depths are approximately preserved) to the true landscapes determined from complete knowledge of all atomic coordinates. We go on to show that the reconstructed landscapes reliably predict temperature denaturation and identify point mutations and groups of mutations critical to folding. These results demonstrate that protein folding funnels can be reconstructed from experimentally measurable time series and used to understand and engineer folding.
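The delay-embedding step named in this abstract can be sketched as follows. This is a minimal illustration of a Takens-style delay-coordinate map for a scalar time series, not the authors' code: the function name `delay_embed`, the embedding dimension `dim`, and the lag `tau` are assumptions, and the abstract does not state which values the paper uses. The diffusion-map step that follows in the paper is not shown.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Return the delay-coordinate matrix of a 1-D series.

    Row i is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    """
    x = np.asarray(x, dtype=float)
    n_rows = x.size - (dim - 1) * tau
    if n_rows <= 0:
        raise ValueError("series too short for the requested dim and tau")
    # Index grid: start positions down the rows, lags across the columns.
    idx = np.arange(n_rows)[:, None] + tau * np.arange(dim)[None, :]
    return x[idx]

# Usage: embed a sine wave (a stand-in for a measured observable such as
# a head-to-tail distance) in three delay coordinates.
t = np.linspace(0.0, 20.0, 400)
series = np.sin(t)
embedded = delay_embed(series, dim=3, tau=8)
```

Each row of `embedded` is one point in the reconstructed space. In practice `dim` and `tau` would have to be chosen for the system at hand, for example from autocorrelation or false-nearest-neighbor diagnostics.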
Affiliation(s)
- Jiang Wang
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, United States
- Andrew L Ferguson
  - Institute for Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
|
17
|
Chen W, Ferguson AL. Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration. J Comput Chem 2018; 39:2079-2102. [PMID: 30368832 DOI: 10.1002/jcc.25520] [Citation(s) in RCA: 121] [Impact Index Per Article: 17.3] [Received: 11/20/2017] [Accepted: 06/14/2018] [Indexed: 01/08/2023]
Abstract
Macromolecular and biomolecular folding landscapes typically contain high free energy barriers that impede efficient sampling of configurational space by standard molecular dynamics simulation. Biased sampling can artificially drive the simulation along prespecified collective variables (CVs), but success depends critically on the availability of good CVs associated with the important collective dynamical motions. Nonlinear machine learning techniques can identify such CVs but typically do not furnish an explicit relationship with the atomic coordinates necessary to perform biased sampling. In this work, we employ auto-associative artificial neural networks ("autoencoders") to learn nonlinear CVs that are explicit and differentiable functions of the atomic coordinates. Our approach offers substantial speedups in exploration of configurational space, and is distinguished from existing approaches by its capacity to simultaneously discover and directly accelerate along data-driven CVs. We demonstrate the approach in simulations of alanine dipeptide and Trp-cage, and have developed an open-source and freely available implementation within OpenMM. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Wei Chen
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801
- Andrew L Ferguson
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 W Green Street, Urbana, Illinois, 61801
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois, 61801
|
18
|
Venuto DD, Annese VF, Mezzina G, Scioscia F, Ruta M, Sciascio ED, Vincentelli AS. A Mobile Health System for Neurocognitive Impairment Evaluation Based on P300 Detection. ACM TRANSACTIONS ON CYBER-PHYSICAL SYSTEMS 2018. [DOI: 10.1145/3140236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
Abstract
A new mobile healthcare system for neuro-cognitive function monitoring and treatment is presented. The architecture of the system features sensors to measure the brain potential, localized data analysis and filtering, and in-cloud distribution to specialized medical personnel. As such, it presents tradeoffs typical of other cyber-physical systems, where hardware, algorithms, and software implementations have to come together in a coherent fashion. The system is based on spatio-temporal detection and characterization of a specific brain potential called P300. The diagnosis of cognitive deficit is achieved by analyzing the data collected by the system with a new algorithm called tuned-Residue Iteration Decomposition (t-RIDE). The system has been tested on 17 subjects (n = 12 healthy, n = 3 mildly cognitive impaired, and n = 2 with Alzheimer's disease) involved in three different cognitive tasks with increasing difficulty. The system allows fast diagnosis of cognitive deficit, including mild and heavy cognitive impairment: t-RIDE convergence is achieved in 79 iterations (i.e., 1.95 s), yielding an 80% accuracy in P300 amplitude evaluation with only 13 trials on a single EEG channel.
Affiliation(s)
- D. De Venuto
  - Dept. of Electrical and Information Engineering, Polytechnic of Bari, Bari, Italy
- V. F. Annese
  - Dept. of Electrical and Information Engineering, Polytechnic of Bari, Bari, Italy
- G. Mezzina
  - Dept. of Electrical and Information Engineering, Polytechnic of Bari, Bari, Italy
- F. Scioscia
  - Dept. of Electrical and Information Engineering, Polytechnic of Bari, Bari, Italy
- M. Ruta
  - Dept. of Electrical and Information Engineering, Polytechnic of Bari, Bari, Italy
- E. Di Sciascio
  - Dept. of Electrical and Information Engineering, Polytechnic of Bari, Bari, Italy
|
19
|
Chen W, Tan AR, Ferguson AL. Collective variable discovery and enhanced sampling using autoencoders: Innovations in network architecture and error function design. J Chem Phys 2018; 149:072312. [PMID: 30134681 DOI: 10.1063/1.5023804] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Indexed: 01/01/2023] Open
Abstract
Auto-associative neural networks ("autoencoders") present a powerful nonlinear dimensionality reduction technique to mine data-driven collective variables from molecular simulation trajectories. This technique furnishes explicit and differentiable expressions for the nonlinear collective variables, making it ideally suited for integration with enhanced sampling techniques for accelerated exploration of configurational space. In this work, we describe a number of sophistications of the neural network architectures to improve and generalize the process of interleaved collective variable discovery and enhanced sampling. We employ circular network nodes to accommodate periodicities in the collective variables, hierarchical network architectures to rank-order the collective variables, and generalized encoder-decoder architectures to support bespoke error functions for network training to incorporate prior knowledge. We demonstrate our approach in blind collective variable discovery and enhanced sampling of the configurational free energy landscapes of alanine dipeptide and Trp-cage using an open-source plugin developed for the OpenMM molecular simulation package.
Affiliation(s)
- Wei Chen
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
- Aik Rui Tan
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, Illinois 61801, USA
- Andrew L Ferguson
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
|
20
|
Ferguson AL. Machine learning and data science in soft materials engineering. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2018; 30:043002. [PMID: 29111979 DOI: 10.1088/1361-648x/aa98bd] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Indexed: 05/10/2023]
Abstract
In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.
Affiliation(s)
- Andrew L Ferguson
  - Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, IL 61801, United States of America
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, IL 61801, United States of America
  - Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, IL 61801, United States of America
  - Frederick Seitz Materials Research Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America
  - Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America
|
21
|
Howard P, Apley DW, Runger G. Distinct Variation Pattern Discovery Using Alternating Nonlinear Principal Component Analysis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:156-166. [PMID: 27810837 DOI: 10.1109/tnnls.2016.2616145] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
Autoassociative neural networks (ANNs) have been proposed as a nonlinear extension of principal component analysis (PCA), which is commonly used to identify linear variation patterns in high-dimensional data. While principal component scores represent uncorrelated features, standard backpropagation methods for training ANNs provide no guarantee of producing distinct features, which is important for interpretability and for discovering the nature of the variation patterns in the data. Here, we present an alternating nonlinear PCA method, which encourages learning of distinct features in ANNs. A new measure motivated by the condition of orthogonal loadings in PCA is proposed for measuring the extent to which the nonlinear principal components represent distinct variation patterns. We demonstrate the effectiveness of our method using a simulated point cloud data set as well as a subset of the MNIST handwritten digits data. The results show that standard ANNs consistently mix the true variation sources in the low-dimensional representation learned by the model, whereas our alternating method produces solutions where the patterns are better separated in the low-dimensional space.
|
22
|
Wang J, Ferguson AL. Nonlinear machine learning in simulations of soft and biological materials. MOLECULAR SIMULATION 2017. [DOI: 10.1080/08927022.2017.1400164] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022]
Affiliation(s)
- J. Wang
  - Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL, USA
- A. L. Ferguson
  - Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL, USA
  - Department of Materials Science and Engineering, University of Illinois Urbana-Champaign, Urbana, IL, USA
  - Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL, USA
|
23
|
Afraei S, Shahriar K, Madani SH. Statistical analysis of rock-burst events in underground mines and excavations to present reasonable data-driven predictors. J STAT COMPUT SIM 2017. [DOI: 10.1080/00949655.2017.1367000] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
Affiliation(s)
- Sajjad Afraei
  - Department of Mining and Metallurgical Engineering, Amirkabir University of Technology, Tehran, Iran
- Kourosh Shahriar
  - Department of Mining and Metallurgical Engineering, Amirkabir University of Technology, Tehran, Iran
- Sayyed Hasan Madani
  - Department of Mining and Metallurgical Engineering, Amirkabir University of Technology, Tehran, Iran
|
24
|
Abstract
This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves, instead of straight lines. Contrary to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA shows a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it permits understanding the identified features in the input domain where the data has physical meaning. Moreover, it allows evaluating the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows an easy computation of information theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet-Serret frames (local features) and of generalized curvatures at any point of the space. And fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, in both synthetic and real datasets from the UCI repository.
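The abstract's statement that PPA reduces to simple univariate regressions can be illustrated with a rough sketch of a single deflation step: project the data onto a leading direction, then regress what remains on that projection with a polynomial. This is written in the spirit of the description above and is not the authors' implementation. The function name `ppa_step`, the use of the leading PCA direction, the polynomial degree, and keeping the remainder in ambient coordinates are all assumptions.

```python
import numpy as np

def ppa_step(X, degree=2):
    """One illustrative deflation step.

    Returns (alpha, residual): the projection onto the leading principal
    direction, and what is left of the orthogonal part after removing a
    polynomial prediction from alpha.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    v = Vt[0]                                   # leading direction
    alpha = Xc @ v                              # 1-D projection, shape (n,)
    ortho = Xc - np.outer(alpha, v)             # part orthogonal to v
    # One univariate polynomial regression per column, all sharing alpha.
    coeffs = np.polyfit(alpha, ortho, degree)   # shape (degree + 1, d)
    design = np.vander(alpha, degree + 1)       # powers in polyfit's order
    residual = ortho - design @ coeffs
    return alpha, residual

# Usage: points on a parabola. A straight-line fit leaves the curvature
# in the residual, while a quadratic conditional mean absorbs it.
t = np.linspace(-1.0, 1.0, 200)
X = np.column_stack([t, 0.5 * t**2])
_, res_linear = ppa_step(X, degree=1)
_, res_quadratic = ppa_step(X, degree=2)
```

Because the step subtracts a function of `alpha` alone, it can be undone by adding the same polynomial prediction back, which is the intuition behind the invertibility property the abstract highlights.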
Affiliation(s)
- Valero Laparra
  - Image Processing Laboratory (IPL), Universitat de València, 46980 Paterna, València, Spain
|
|