1
Rao AP, Nawar N, Annesley CJ. Machine Learning-Assisted Determination of C6H14 Mole Fraction From Molecular Emissions of Laser-Induced Hexane-Air Plasmas. Appl Spectrosc 2024:37028241233309. [PMID: 38403921 DOI: 10.1177/00037028241233309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Laser-induced plasmas of materials containing hydrocarbons present strong carbon molecular emission features. Using these emissions to build models relating changes in spectral features to a physical parameter of the system, such as hydrocarbon content, can be difficult because of the dynamic complexity of the spectral features and temperature disequilibrium between molecular species. This study presents machine learning models trained to quantify the mole fraction of hexane in hexane-air plasmas from CN Violet and C2 Swan spectral features. Ensemble regression methods provide the most accurate predictions, with root mean squared error on the order of 10⁻². Artificial neural network regressions produce predictions with superlative sensitivity, exhibiting detection limits as low as 0.008. These foundational models can be further refined with more advanced data to quantify the presence of carbon species in complex plasma environments, such as high-speed reacting flows.
Affiliation(s)
- Ashwin P Rao
  - Space Vehicles Directorate, Air Force Research Laboratory, Kirtland Air Force Base, New Mexico, USA
- Noshin Nawar
  - Institute for Scientific Research, Boston College, Chestnut Hill, Massachusetts, USA
- Christopher J Annesley
  - Space Vehicles Directorate, Air Force Research Laboratory, Kirtland Air Force Base, New Mexico, USA
2
Hassanpour A, Geibel J, Simianer H, Pook T. Optimization of breeding program design through stochastic simulation with kernel regression. G3 (Bethesda) 2023; 13:jkad217. [PMID: 37742059 DOI: 10.1093/g3journal/jkad217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 07/29/2023] [Accepted: 09/02/2023] [Indexed: 09/25/2023]
Abstract
In recent years, breeding programs have increased significantly in size and complexity, with various highly interdependent parameters and many contrasting breeding goals. As a result, resource allocation in these programs has become more complex, and deriving an optimal breeding strategy has become increasingly challenging. To address this, a common practice is to reduce the optimization problem to a set of scenarios that differ only in a few parameters and can therefore be analyzed in detail. The goal of this article is to provide a framework for the numerical optimization of breeding programs that goes beyond the simple comparison of scenarios. For this, we first determine the space of potential breeding programs only limited by basic constraints like the budget and housing capacities. Subsequently, the goal is to identify the optimal breeding program by finding the parametrization that maximizes the target function by combining different breeding goals. To assess the value of the target function for a parametrization, we propose using stochastic simulations and the subsequent use of a kernel regression method to cope with the stochasticity of simulation outcomes. This procedure is performed iteratively to narrow down the most promising areas of the search space and perform more and more simulations in these areas of interest. In a simplified example applied to a dairy cattle program, our proposed framework has shown its ability to identify an optimal breeding strategy that aligns with a target function aiming at genetic gain and genetic diversity conservation limited by budget constraints.
Affiliation(s)
- Azadeh Hassanpour
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
- Johannes Geibel
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
  - Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, 31535 Neustadt, Germany
- Henner Simianer
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
- Torsten Pook
  - Department of Animal Sciences, Center for Integrated Breeding Research, Animal Breeding and Genetics Group, University of Goettingen, 37075 Goettingen, Germany
  - Wageningen University & Research, Animal Breeding and Genomics, 6700 AH Wageningen, Netherlands
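As a rough illustration of the smoothing step described in the abstract of entry 2, the sketch below applies a Nadaraya-Watson estimator (one common form of kernel regression) to noisy simulation outcomes and then picks the parametrization with the highest smoothed value. The one-dimensional search space, the `noisy_simulation` stand-in, the Gaussian kernel, and the bandwidth are assumptions made for illustration; none of them is taken from the article.

```python
import math
import random


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel."""
    return math.exp(-0.5 * u * u)


def kernel_regression(x_query, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x_query]."""
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


def noisy_simulation(x, rng):
    # Hypothetical stand-in for one stochastic simulation run:
    # the true target function peaks at x = 0.6, plus Gaussian noise.
    return -(x - 0.6) ** 2 + rng.gauss(0.0, 0.05)


rng = random.Random(0)
xs = [i / 200 for i in range(201)]           # simulated parametrizations
ys = [noisy_simulation(x, rng) for x in xs]  # one noisy outcome per run

grid = [i / 100 for i in range(101)]
smoothed = [kernel_regression(g, xs, ys, bandwidth=0.1) for g in grid]
best_x = grid[max(range(len(grid)), key=lambda i: smoothed[i])]
print("estimated optimum:", best_x)
```

In an iterative scheme like the one the abstract describes, the next batch of simulations would presumably be concentrated around `best_x`; that refinement loop is omitted here.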
3
Alonso-Pena M, Gijbels I, Crujeiras RM. Flexible joint modeling of mean and dispersion for the directional tuning of neuronal spike counts. Biometrics 2023; 79:3431-3444. [PMID: 37327387 DOI: 10.1111/biom.13882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Accepted: 05/18/2023] [Indexed: 06/18/2023]
Abstract
The study of how the number of spikes in a middle temporal visual area (MT/V5) neuron is tuned to the direction of a visual stimulus has attracted considerable attention over the years, but recent studies suggest that the variability of the number of spikes might also be influenced by the directional stimulus. This entails that Poisson regression models are not adequate for this type of data, as the observations usually present over/underdispersion (or both) with respect to the Poisson distribution. This paper makes use of the double exponential family and presents a flexible model to estimate, jointly, the mean and dispersion functions, accounting for the effect of a circular covariate. The empirical performance of the proposal is explored via simulations and an application to a neurological data set is shown.
Affiliation(s)
- María Alonso-Pena
  - ORSTAT, KU Leuven, Leuven, Belgium
  - CITMAga, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Irène Gijbels
  - Department of Mathematics and Leuven Statistics Research Center (LStat), KU Leuven, Leuven, Belgium
- Rosa M Crujeiras
  - CITMAga, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
4
Galetzka W, Kowall B, Jusi C, Huessler EM, Stang A. Distance-Metric Learning for Personalized Survival Analysis. Entropy (Basel) 2023; 25:1404. [PMID: 37895525 PMCID: PMC10606222 DOI: 10.3390/e25101404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 09/21/2023] [Accepted: 09/26/2023] [Indexed: 10/29/2023]
Abstract
Personalized time-to-event or survival prediction with right-censored outcomes is a pervasive challenge in healthcare research. Although various supervised machine learning methods, such as random survival forests or neural networks, have been adapted to handle such outcomes effectively, they do not provide explanations for their predictions, lacking interpretability. In this paper, an alternative method for survival prediction by weighted nearest neighbors is proposed. Fitting this model to data entails optimizing the weights by learning a metric. An individual prediction of this method can be explained by providing the user with the most influential data points for this prediction, i.e., the closest data points and their weights. The strengths and weaknesses in terms of predictive performance are highlighted on simulated data and an application of the method on two different real-world datasets of breast cancer patients shows its competitiveness with established methods.
Affiliation(s)
- Wolfgang Galetzka
  - Institute of Medical Informatics, Biometrics and Epidemiology, University Hospital Essen, 45130 Essen, Germany
- Bernd Kowall
  - Institute of Medical Informatics, Biometrics and Epidemiology, University Hospital Essen, 45130 Essen, Germany
- Cynthia Jusi
  - Nisso Chemical Europe GmbH, 40212 Düsseldorf, Germany
- Eva-Maria Huessler
  - Institute of Medical Informatics, Biometrics and Epidemiology, University Hospital Essen, 45130 Essen, Germany
- Andreas Stang
  - Institute of Medical Informatics, Biometrics and Epidemiology, University Hospital Essen, 45130 Essen, Germany
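To make the idea in the abstract of entry 4 more concrete, here is a minimal sketch of survival prediction by weighted nearest neighbors: a Kaplan-Meier estimate in which each training point counts in proportion to its closeness to the query, plus a helper that lists the most influential points. It assumes a fixed Euclidean distance with Gaussian weights, whereas the article learns the metric; the function names, the bandwidth, and the toy cohort are hypothetical.

```python
import math


def neighbor_weights(query, data, bandwidth=1.0):
    """Gaussian weights from squared Euclidean distance to the query.

    A fixed metric is assumed here; the article optimizes the weights
    by learning a metric instead.
    """
    weights = []
    for features, _time, _event in data:
        d2 = sum((a - b) ** 2 for a, b in zip(query, features))
        weights.append(math.exp(-d2 / (2.0 * bandwidth ** 2)))
    return weights


def weighted_km_survival(query, data, t, bandwidth=1.0):
    """Weighted Kaplan-Meier estimate of S(t | query).

    data holds (features, time, event) tuples, with event = 1 for an
    observed event and event = 0 for right-censoring.
    """
    weights = neighbor_weights(query, data, bandwidth)
    event_times = sorted({time for _f, time, event in data
                          if event == 1 and time <= t})
    survival = 1.0
    for s in event_times:
        at_risk = sum(w for w, (_f, time, _e) in zip(weights, data)
                      if time >= s)
        events = sum(w for w, (_f, time, e) in zip(weights, data)
                     if time == s and e == 1)
        survival *= 1.0 - events / at_risk
    return survival


def most_influential(query, data, k, bandwidth=1.0):
    """Return the k (weight, data point) pairs with the largest weights."""
    weights = neighbor_weights(query, data, bandwidth)
    ranked = sorted(zip(weights, data), key=lambda pair: pair[0], reverse=True)
    return ranked[:k]


# Toy cohort (made up): identical features, so all weights are equal
# and the estimate reduces to the ordinary Kaplan-Meier estimator.
toy_data = [((0.0,), 1.0, 1), ((0.0,), 2.0, 0),
            ((0.0,), 3.0, 1), ((0.0,), 4.0, 1)]
s3 = weighted_km_survival((0.0,), toy_data, t=3.0)
print("S(3) =", s3)
```

The output of `most_influential` is the kind of explanation the abstract refers to: the closest data points together with their weights.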
5
Little P, Hsu L, Sun W. Associating somatic mutation with clinical outcomes through kernel regression and optimal transport. Biometrics 2023; 79:2705-2718. [PMID: 36217816 PMCID: PMC10455040 DOI: 10.1111/biom.13769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2021] [Accepted: 09/16/2022] [Indexed: 11/30/2022]
Abstract
Somatic mutations in cancer patients are inherently sparse and potentially high dimensional. Cancer patients may share the same set of deregulated biological processes perturbed by different sets of somatically mutated genes. Therefore, when assessing the associations between somatic mutations and clinical outcomes, gene-by-gene analysis is often under-powered because it does not capture the complex disease mechanisms shared across cancer patients. Rather than testing genes one by one, an intuitive approach is to aggregate somatic mutation data of multiple genes to assess their joint association with clinical outcomes. The challenge is how to aggregate such information. Building on the optimal transport method, we propose a principled approach to estimate the similarity of somatic mutation profiles of multiple genes between tumor samples, while accounting for gene-gene similarities defined by gene annotations or empirical mutational patterns. Using such similarities, we can assess the associations between somatic mutations and clinical outcomes by kernel regression. We have applied our method to analyze somatic mutation data of 17 cancer types and identified at least five cancer types, where somatic mutations are associated with overall survival, progression-free interval, or cytolytic activity.
Affiliation(s)
- Paul Little
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
- Li Hsu
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
- Wei Sun
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
6
Iype E, Pillai U J, Kumar I, Gaastra-Nedea SV, Subramanian R, Saha RN, Dutta M. In silico and in vitro assays reveal potential inhibitors against 3CLpro main protease of SARS-CoV-2. J Biomol Struct Dyn 2022; 40:12800-12811. [PMID: 34550861 DOI: 10.1080/07391102.2021.1977181] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
Abstract
The COVID-19 pandemic, caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is not showing any sign of slowing down even after the ongoing vaccination efforts. The threat of new strains is concerning, as some of them are more infectious than the original one. A therapeutic against the disease is, therefore, urgently needed. Here, we use the DrugBank database to screen for potential inhibitors against the 3CLpro main protease of SARS-CoV-2. Instead of using the traditional approach of computational screening by docking, we developed a kernel ridge regressor (using a part of the docking data) to predict the binding energy of ligands. We used this model to screen the DrugBank database and shortlist two lead candidates (bromocriptine and avoralstat) for in vitro enzymatic study. Our results show that the 3CLpro enzyme activity in the presence of 100 μM bromocriptine and avoralstat is 9.9% and 15.9%, respectively. Remarkably, bromocriptine exhibited a submicromolar IC50 of 130 nM (0.13 μM). Avoralstat showed an IC50 of 2.16 μM. Further, the interactions of both drugs with 3CLpro were analyzed using molecular dynamics simulations of 100 ns. Results indicate that both ligands are stable in the binding pocket of the 3CLpro receptor. In addition, the MM-PBSA analysis revealed that bromocriptine (-29.37 kcal/mol) has a lower binding free energy than avoralstat (-6.91 kcal/mol). Further, hydrogen bond analysis also showed that bromocriptine interacts with the two catalytic residues, His41 and Cys145, more frequently than avoralstat. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Eldhose Iype
  - Department of Chemical Engineering, BITS Pilani, Dubai Campus, Dubai, United Arab Emirates
- Jisha Pillai U
  - Department of Biotechnology, BITS Pilani, Dubai Campus, Dubai, United Arab Emirates
- Indresh Kumar
  - Department of Chemistry, BITS Pilani, Pilani Campus, Pilani, India
- Silvia V Gaastra-Nedea
  - Department of Mechanical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Mainak Dutta
  - Department of Biotechnology, BITS Pilani, Dubai Campus, Dubai, United Arab Emirates
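The abstract of entry 6 mentions a kernel ridge regressor trained on part of the docking data to predict ligand binding energies. The sketch below shows the generic technique only: it solves (K + λI)α = y for an RBF kernel and predicts with a kernel-weighted sum. The one-dimensional descriptor, the toy energy values, and the hyperparameters `gamma` and `lam` are made up for illustration and do not come from the article.

```python
import math


def rbf(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_krr(X, y, gamma=1.0, lam=1e-3):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)


def predict_krr(x_new, X, alpha, gamma=1.0):
    """Predict as a kernel-weighted sum over the training points."""
    return sum(a * rbf(x_new, xi, gamma) for a, xi in zip(alpha, X))


# Toy data (hypothetical): one descriptor per ligand and a made-up
# docking energy for each; real descriptors would be high dimensional.
X = [(0.0,), (1.0,), (2.0,), (3.0,)]
y = [-5.0, -6.5, -7.2, -6.0]
alpha = fit_krr(X, y, gamma=1.0, lam=1e-6)
print("prediction at 1.5:", predict_krr((1.5,), X, alpha, gamma=1.0))
```

With a small `lam` the model nearly interpolates the training energies; a larger value trades training fit for smoothness, which usually matters when the docking scores themselves are noisy.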
7
Li Y, Qi Y, Wang Y, Wang Y, Xu K, Pan G. Robust neural decoding by kernel regression with Siamese representation learning. J Neural Eng 2021; 18. [PMID: 34663771 DOI: 10.1088/1741-2552/ac2c4e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Accepted: 10/01/2021] [Indexed: 11/12/2022]
Abstract
Objective. Brain-machine interfaces (BMIs) provide a direct pathway between the brain and external devices such as computer cursors and prosthetics, which have great potential in motor function restoration. One critical limitation of current BMI systems is the unstable performance, partly due to the variability of neural signals. Studies showed that neural activities exhibit trial-to-trial variability, and the preferred direction of neurons frequently changes under different conditions. Therefore, a fixed decoding function does not work well. Approach. To deal with the problems, we propose a novel kernel regression framework. The nonparametric kernel regression is used to fit diverse decoding functions by finding similar neural patterns to handle neural variations caused by varying tuning functions. Further, the representations of raw neural signals are learned by Siamese networks and constrained by kinematic parameters, which can alleviate neural variations caused by intrinsic noises and task-irrelevant information. The representations are jointly learned with the kernel regression framework in an end-to-end manner so that neural variations can be tackled effectively. Main results. Experiments on two datasets demonstrate that our approach outperforms most existing methods and significantly improves the robustness in challenging situations such as limited samples and missing channels. Significance. The proposed approach demonstrates robust performance with different conditions and provides a new and inspiring perspective toward robust BMI control.
Affiliation(s)
- Yangang Li
  - Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, People's Republic of China
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, People's Republic of China
- Yu Qi
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, People's Republic of China
  - MOE Frontier Science Center for Brain Science and Brain-machine Integration, Zhejiang University, Hangzhou, People's Republic of China
- Yiwen Wang
  - Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, People's Republic of China
  - Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, People's Republic of China
- Yueming Wang
  - Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, People's Republic of China
- Kedi Xu
  - Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, People's Republic of China
  - Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, People's Republic of China
  - Zhejiang Provincial Key Laboratory of Cardio-Cerebral Vascular Detection Technology and Medicinal Effectiveness Appraisal, Zhejiang University, Hangzhou, People's Republic of China
- Gang Pan
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, People's Republic of China
8
Liu J, Zhao M, Kong W. Sub-Graph Regularization on Kernel Regression for Robust Semi-Supervised Dimensionality Reduction. Entropy (Basel) 2019; 21:1125. [PMCID: PMC7514469 DOI: 10.3390/e21111125] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/07/2019] [Accepted: 11/07/2019] [Indexed: 06/17/2023]
Abstract
Dimensionality reduction has always been a major problem in handling high-dimensional datasets. Because they use labeled data, supervised dimensionality reduction methods such as Linear Discriminant Analysis tend to achieve better classification performance than unsupervised methods. However, supervised methods need sufficient labeled data in order to achieve satisfying results. Therefore, semi-supervised learning (SSL) methods can be a practical alternative to relying on labeled data alone. In this paper, we develop a novel SSL method by extending anchor graph regularization (AGR) for dimensionality reduction. In detail, AGR is an accelerated semi-supervised learning method that propagates the class labels to unlabeled data. However, it cannot handle new incoming samples. We therefore improve AGR by adding kernel regression to the basic objective function of AGR. As a result, the proposed method can not only estimate the class labels of unlabeled data but also achieve dimensionality reduction. Extensive simulations on several benchmark datasets are conducted, and the simulation results verify the effectiveness of the proposed work.
Affiliation(s)
- Jiao Liu
  - School of Management Studies, Shanghai University of Engineering Science, Shanghai 201600, China
- Mingbo Zhao
  - School of Information Science and Technology, Donghua University, Shanghai 201620, China
- Weijian Kong
  - School of Information Science and Technology, Donghua University, Shanghai 201620, China
9
Zhang G, Fan W, Meng T, Jiang X, Chen G. Microscopic evaluation of traffic safety at signal coordinated intersections: A before-after study. Traffic Inj Prev 2018; 19:867-873. [PMID: 30543476 DOI: 10.1080/15389588.2018.1525611] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/13/2018] [Revised: 09/14/2018] [Accepted: 09/14/2018] [Indexed: 06/09/2023]
Abstract
OBJECTIVE: This research aims to evaluate the safety impacts of signal coordination on signalized intersections and provide a scientific basis to design and improve signal control and management from a traffic safety perspective. METHODS: A kernel regression model is adopted to evaluate the safety performance of intersections before and after implementing the signal coordination strategy. By using this statistical method, the authors identify the nonlinear relationship between crash frequency and the crash's spatial location and examine the discrepancy of crash spatial distributions between the coordination and noncoordination conditions at disaggregated levels, such as time of day and crash type. A case study is presented with the use of Michigan crash data (2003-2007). RESULTS: The study finds that (1) the crash distribution on arterials tends to be spatially dispersed when signal coordination is in operation and (2) crash frequency at the approaches of intersections increases with the use of signal coordination under the following conditions: nonpeak hours, rear-end and sideswipe crashes, intersections with low speed limits, and both injury and property damage-only crashes. CONCLUSION: Signal coordination poses safety concerns in addition to its operational benefits for intersections.
Affiliation(s)
- Guopeng Zhang
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
- Wenbo Fan
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
  - National United Engineering Laboratory of Integrated and Intelligent Transportation, Chengdu, China
- Teng Meng
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
- Xinguo Jiang
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
  - National United Engineering Laboratory of Integrated and Intelligent Transportation, Chengdu, China
- Guangrong Chen
  - Chengdu Research Institute, Shenzhen Urban Transport Planning & Design Institute Co., Ltd., Chengdu, China
10
Chmiela S, Tkatchenko A, Sauceda HE, Poltavsky I, Schütt KT, Müller KR. Machine learning of accurate energy-conserving molecular force fields. Sci Adv 2017; 3:e1603015. [PMID: 28508076 PMCID: PMC5419702 DOI: 10.1126/sciadv.1603015] [Citation(s) in RCA: 441] [Impact Index Per Article: 63.0] [Received: 12/01/2016] [Accepted: 03/07/2017] [Indexed: 05/20/2023]
Abstract
Using conservation of energy, a fundamental property of closed classical and quantum mechanical systems, we develop an efficient gradient-domain machine learning (GDML) approach to construct accurate molecular force fields using a restricted number of samples from ab initio molecular dynamics (AIMD) trajectories. The GDML implementation is able to reproduce global potential energy surfaces of intermediate-sized molecules with an accuracy of 0.3 kcal mol⁻¹ for energies and 1 kcal mol⁻¹ Å⁻¹ for atomic forces using only 1000 conformational geometries for training. We demonstrate this accuracy for AIMD trajectories of molecules, including benzene, toluene, naphthalene, ethanol, uracil, and aspirin. The challenge of constructing conservative force fields is accomplished in our work by learning in a Hilbert space of vector-valued functions that obey the law of energy conservation. The GDML approach enables quantitative molecular dynamics simulations for molecules at a fraction of the cost of explicit AIMD calculations, thereby allowing the construction of efficient force fields with the accuracy and transferability of high-level ab initio methods.
Affiliation(s)
- Stefan Chmiela
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Alexandre Tkatchenko
  - Physics and Materials Science Research Unit, University of Luxembourg, L-1511 Luxembourg, Luxembourg
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, 14195 Berlin, Germany
- Huziel E. Sauceda
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, 14195 Berlin, Germany
- Igor Poltavsky
  - Physics and Materials Science Research Unit, University of Luxembourg, L-1511 Luxembourg, Luxembourg
- Kristof T. Schütt
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul 136-713, Korea
  - Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
11
Kim HJ, Smith BM, Adluru N, Dyer CR, Johnson SC, Singh V. Abundant Inverse Regression using Sufficient Reduction and its Applications. ACTA ACUST UNITED AC 2016; 9907:570-84. [PMID: 27796010 DOI: 10.1007/978-3-319-46487-9_35] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Statistical models such as linear regression drive numerous applications in computer vision and machine learning. The landscape of practical deployments of these formulations is dominated by forward regression models that estimate the parameters of a function mapping a set of p covariates, x, to a response variable, y. The lesser-known alternative, Inverse Regression, offers various benefits that are much less explored in vision problems. The goal of this paper is to show how Inverse Regression in the "abundant" feature setting (i.e., many subsets of features are associated with the target label or response, as is the case for images), together with a statistical construction called Sufficient Reduction, yields highly flexible models that are a natural fit for model estimation tasks in vision. Specifically, we obtain formulations that provide the relevance of individual covariates used in prediction at the level of specific examples/samples, in a sense explaining why a particular prediction was made. With no compromise in performance relative to other methods, the ability to interpret why a learning algorithm behaves in a specific way for each prediction adds significant value in numerous applications. We illustrate these properties and the benefits of Abundant Inverse Regression (AIR) on three distinct applications.
12
Wong E, Palande S, Wang B, Zielinski B, Anderson J, Fletcher PT. KERNEL PARTIAL LEAST SQUARES REGRESSION FOR RELATING FUNCTIONAL BRAIN NETWORK TOPOLOGY TO CLINICAL MEASURES OF BEHAVIOR. Proc IEEE Int Symp Biomed Imaging 2016; 2016:1303-1306. [PMID: 32742554 DOI: 10.1109/isbi.2016.7493506] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 11/10/2022]
Abstract
In this paper we present a novel method for analyzing the relationship between functional brain networks and behavioral phenotypes. Drawing from topological data analysis, we first extract topological features using persistent homology from functional brain networks that are derived from correlations in resting-state fMRI. Rather than fixing a discrete network topology by thresholding the connectivity matrix, these topological features capture the network organization across all continuous threshold values. We then propose to use a kernel partial least squares (kPLS) regression to statistically quantify the relationship between these topological features and behavior measures. The kPLS also provides an elegant way to combine multiple image features by using linear combinations of multiple kernels. In our experiments we test the ability of our proposed brain network analysis to predict autism severity from rs-fMRI. We show that combining correlations with topological features gives better prediction of autism severity than using correlations alone.
Affiliation(s)
- Eleanor Wong
  - Scientific Computing and Imaging Institute, University of Utah
  - School of Computing, University of Utah
- Bei Wang
  - Scientific Computing and Imaging Institute, University of Utah
- P Thomas Fletcher
  - Scientific Computing and Imaging Institute, University of Utah
  - School of Computing, University of Utah
13
Briley DA, Harden KP, Bates TC, Tucker-Drob EM. Nonparametric Estimates of Gene × Environment Interaction Using Local Structural Equation Modeling. Behav Genet 2015; 45:581-96. [PMID: 26318287 PMCID: PMC5374877 DOI: 10.1007/s10519-015-9732-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 09/11/2014] [Accepted: 07/29/2015] [Indexed: 10/23/2022]
Abstract
Gene × environment (G × E) interaction studies test the hypothesis that the strength of genetic influence varies across environmental contexts. Existing latent variable methods for estimating G × E interactions in twin and family data specify parametric (typically linear) functions for the interaction effect. An improper functional form may obscure the underlying shape of the interaction effect and may lead to failures to detect a significant interaction. In this article, we introduce a novel approach to the behavior genetic toolkit, local structural equation modeling (LOSEM). LOSEM is a highly flexible nonparametric approach for estimating latent interaction effects across the range of a measured moderator. This approach opens up the ability to detect and visualize new forms of G × E interaction. We illustrate the approach by using LOSEM to estimate gene × socioeconomic status interactions for six cognitive phenotypes. Rather than continuously and monotonically varying effects as has been assumed in conventional parametric approaches, LOSEM indicated substantial nonlinear shifts in genetic variance for several phenotypes. The operating characteristics of LOSEM were interrogated through simulation studies where the functional form of the interaction effect was known. LOSEM provides a conservative estimate of G × E interaction with sufficient power to detect statistically significant G × E signal with moderate sample size. We offer recommendations for the application of LOSEM and provide scripts for implementing these biometric models in Mplus and in OpenMx under R.
Affiliation(s)
- Daniel A Briley
  - Department of Psychology and Population Research Center, University of Texas at Austin, 108 E. Dean Keeton Stop A8000, Austin, TX, 78712-1043, USA
14
Abstract
We develop an empirical likelihood (EL) inference on parameters in generalized estimating equations with nonignorably missing response data. We consider an exponential tilting model for the nonignorably missing mechanism, and propose modified estimating equations by imputing missing data through a kernel regression method. We establish some asymptotic properties of the EL estimators of the unknown parameters under different scenarios. With the use of auxiliary information, the EL estimators are statistically more efficient. Simulation studies are used to assess the finite sample performance of our proposed EL estimators. We apply our EL estimators to investigate a data set on earnings obtained from the New York Social Indicators Survey.
Affiliation(s)
- Niansheng Tang
  - Department of Statistics, Yunnan University, Kunming 650091, China
- Puying Zhao
  - Department of Statistics, Yunnan University, Kunming 650091, China
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
15
Yin J, Geng Z, Li R, Wang H. NONPARAMETRIC COVARIANCE MODEL. Stat Sin 2010; 20:469-479. [PMID: 21170152 PMCID: PMC3002111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]
Abstract
There has been considerable attention in the literature to the estimation of the conditional variance function. We propose here a nonparametric model for the conditional covariance matrix. A kernel estimator is developed accordingly, its asymptotic bias and variance are derived, and its asymptotic normality is established. A real data example is used to illustrate the proposed estimation procedure.
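For readers unfamiliar with kernel estimation of a conditional covariance matrix, as in the abstract of entry 15, the sketch below shows one natural form: a kernel-weighted average of outer products of deviations from a kernel-weighted mean. Whether this matches the article's estimator in detail is an assumption; the Gaussian kernel, the bandwidth `h`, and the toy data are chosen purely for illustration.

```python
import math


def kernel_weights(u0, us, h):
    """Normalized Gaussian kernel weights centered at u0."""
    w = [math.exp(-0.5 * ((u - u0) / h) ** 2) for u in us]
    total = sum(w)
    return [wi / total for wi in w]


def conditional_mean(u0, us, ys, h):
    """Kernel-weighted estimate of E[Y | U = u0] for vector-valued Y."""
    w = kernel_weights(u0, us, h)
    dim = len(ys[0])
    return [sum(wi * y[j] for wi, y in zip(w, ys)) for j in range(dim)]


def conditional_covariance(u0, us, ys, h):
    """Kernel-weighted estimate of Cov(Y | U = u0) as a nested list."""
    w = kernel_weights(u0, us, h)
    m = conditional_mean(u0, us, ys, h)
    dim = len(m)
    return [[sum(wi * (y[a] - m[a]) * (y[b] - m[b]) for wi, y in zip(w, ys))
             for b in range(dim)] for a in range(dim)]


# Toy data (made up): four observations at the same covariate value,
# so every weight is 0.25 and the result is the plain sample covariance
# with divisor n.
us = [0.0, 0.0, 0.0, 0.0]
ys = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
cov = conditional_covariance(0.0, us, ys, h=1.0)
print(cov)
```

Observations far from `u0` relative to `h` contribute almost nothing, so the bandwidth controls the usual bias-variance trade-off; if every observation is extremely far away, the weights can underflow to zero and the normalization fails.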