1
Pan W, Shan Y, Li C, Huang S, Li T, Li Y, Zhu H. FPLS-DC: functional partial least squares through distance covariance for imaging genetics. Bioinformatics 2024; 40:btae173. [PMID: 38552322 PMCID: PMC11034987 DOI: 10.1093/bioinformatics/btae173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 02/28/2024] [Accepted: 03/27/2024] [Indexed: 04/24/2024] [Open access]
Abstract
MOTIVATION Imaging genetics integrates imaging and genetic techniques to examine how genetic variations influence the function and structure of organs like the brain or heart, providing insights into their impact on behavior and disease phenotypes. The use of organ-wide imaging endophenotypes has increasingly been used to identify potential genes associated with complex disorders. However, analyzing organ-wide imaging data alongside genetic data presents two significant challenges: high dimensionality and complex relationships. To address these challenges, we propose a novel, nonlinear inference framework designed to partially mitigate these issues. RESULTS We propose a functional partial least squares through distance covariance (FPLS-DC) framework for efficient genome wide analyses of imaging phenotypes. It consists of two components. The first component utilizes the FPLS-derived base functions to reduce image dimensionality while screening genetic markers. The second component maximizes the distance correlation between genetic markers and projected imaging data, which is a linear combination of the FPLS-basis functions, using simulated annealing algorithm. In addition, we proposed an iterative FPLS-DC method based on FPLS-DC framework, which effectively overcomes the influence of inter-gene correlation on inference analysis. We efficiently approximate the null distribution of test statistics using a gamma approximation. Compared to existing methods, FPLS-DC offers computational and statistical efficiency for handling large-scale imaging genetics. In real-world applications, our method successfully detected genetic variants associated with the hippocampus, demonstrating its value as a statistical toolbox for imaging genetic studies. AVAILABILITY AND IMPLEMENTATION The FPLS-DC method we propose opens up new research avenues and offers valuable insights for analyzing functional and high-dimensional data. 
In addition, it serves as a useful tool for scientific analysis in practical applications within the field of imaging genetics research. The R package FPLS-DC is available on GitHub: https://github.com/BIG-S2/FPLSDC.
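The abstract above relies on distance correlation as its dependence measure. As a rough illustration only, the following sketch computes the standard sample (V-statistic) distance correlation with NumPy. It is not taken from the FPLS-DC R package; the function names `_double_center` and `distance_correlation` are hypothetical, and the screening, simulated annealing, and gamma approximation steps described in the abstract are not reproduced here.

```python
import numpy as np


def _double_center(d):
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between two samples of equal length.

    x and y may be 1-D (n,) or 2-D (n, p) arrays; rows are observations.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    # Pairwise Euclidean distance matrices, double-centered.
    a = _double_center(np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1))
    b = _double_center(np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1))
    dcov2 = (a * b).mean()    # squared sample distance covariance
    dvar_x2 = (a * a).mean()  # squared sample distance variance of x
    dvar_y2 = (b * b).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar_x2 * dvar_y2)
    if denom == 0:
        return 0.0  # at least one sample is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 101)
    # Pearson correlation is essentially zero for y = x**2 on a symmetric grid,
    # while distance correlation remains clearly positive.
    print(np.corrcoef(x, x ** 2)[0, 1], distance_correlation(x, x ** 2))
```

Unlike Pearson correlation, this statistic also responds to nonlinear dependence, which is the property the abstract appeals to when it describes the framework as nonlinear.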
Affiliation(s)
- Wenliang Pan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Yue Shan
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Chuang Li
- Department of Statistical Science, School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
- Shuai Huang
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Tengfei Li
- Departments of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Yun Li
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Hongtu Zhu
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Departments of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
2
Wang Z, Bai Y, Härdle WK, Tian M. Smoothed quantile regression for partially functional linear models in high dimensions. Biom J 2023; 65:e2200060. [PMID: 37147793 DOI: 10.1002/bimj.202200060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2022] [Revised: 11/21/2022] [Accepted: 12/11/2022] [Indexed: 05/07/2023]
Abstract
Practitioners of current data analysis are regularly confronted with the situation where the heavy-tailed skewed response is related to both multiple functional predictors and high-dimensional scalar covariates. We propose a new class of partially functional penalized convolution-type smoothed quantile regression to characterize the conditional quantile level between a scalar response and predictors of both functional and scalar types. The new approach overcomes the lack of smoothness and severe convexity of the standard quantile empirical loss, considerably improving the computing efficiency of partially functional quantile regression. We investigate a folded concave penalized estimator for simultaneous variable selection and estimation by the modified local adaptive majorize-minimization (LAMM) algorithm. The functional predictors can be dense or sparse and are approximated by the principal component basis. Under mild conditions, the consistency and oracle properties of the resulting estimators are established. Simulation studies demonstrate a competitive performance against the partially functional standard penalized quantile regression. A real application using Alzheimer's Disease Neuroimaging Initiative data is utilized to illustrate the practicality of the proposed model.
Affiliation(s)
- Zhihao Wang
- Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, P. R. China
- School of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi, P. R. China
- Yongxin Bai
- School of Science, Beijing Information Science and Technology University, Beijing, P. R. China
- Wolfgang K Härdle
- School of Business and Economics, Humboldt-Universität zu Berlin, Berlin, Germany
- Department of Information Management and Finance, National Yang Ming Chiao Tung University (NYCU), Hsinchu City, Taiwan
- Maozai Tian
- Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, P. R. China
- School of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi, P. R. China
3
Su Z, Li B, Cook D. Envelope model for function-on-function linear regression. J Comput Graph Stat 2023. [DOI: 10.1080/10618600.2022.2163652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
Affiliation(s)
- Zhihua Su
- Department of Statistics, University of Florida
- Bing Li
- Department of Statistics, Pennsylvania State University
- Dennis Cook
- School of Statistics, University of Minnesota
4
Lundborg AR, Shah RD, Peters J. Conditional independence testing in Hilbert spaces with applications to functional data analysis. J R Stat Soc Series B Stat Methodol 2022. [DOI: 10.1111/rssb.12544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
5
Beyaztas U, Shang HL. A robust partial least squares approach for function-on-function regression. Braz J Probab Stat 2022. [DOI: 10.1214/21-bjps523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Ufuk Beyaztas
- Department of Statistics, Marmara University, 34722, Kadikoy-Istanbul, Turkey
- Han Lin Shang
- Department of Actuarial Studies and Business Analytics, Level 7, 4 Eastern Road, Macquarie University, Sydney, New South Wales 2109, Australia
6
Mutis M, Beyaztas U, Simsek GG, Shang HL. A robust scalar-on-function logistic regression for classification. Commun Stat-Theor M 2022. [DOI: 10.1080/03610926.2022.2065018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Muge Mutis
- Graduate School of Natural and Applied Sciences, Yildiz Technical University
- Han Lin Shang
- Department of Actuarial Studies and Business Analytics, Macquarie University
7
Fast implementation of partial least squares for function-on-function regression. J Multivariate Anal 2021. [DOI: 10.1016/j.jmva.2021.104769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
8
Yi M, Li Z, Tang Y. F-type testing in functional linear models. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Affiliation(s)
- Menghan Yi
- School of Statistics, East China Normal University, Shanghai, China
- Zaixing Li
- School of Science, China University of Mining and Technology (Beijing), Beijing, China
- Yanlin Tang
- School of Statistics, East China Normal University, Shanghai, China
9
Dimension reduction for functional regression with a binary response. Stat Pap (Berl) 2021. [DOI: 10.1007/s00362-019-01083-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
10
Aguilera-Morillo MC, Aguilera AM. Multi-class classification of biomechanical data: A functional LDA approach based on multi-class penalized functional PLS. Stat Model 2020. [DOI: 10.1177/1471082x19871157] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/15/2022]
Abstract
A functional linear discriminant analysis approach to classify a set of kinematic data (human movement curves of individuals performing different physical activities) is performed. Kinematic data, usually collected in linear acceleration or angular rotation format, can be identified with functions in a continuous domain (time, percentage of gait cycle, etc.). Since kinematic curves are measured in the same sample of individuals performing different activities, they are a clear example of functional data with repeated measures. On the other hand, the sample curves are observed with noise. Then, a roughness penalty might be necessary in order to provide a smooth estimation of the discriminant functions, which would make them more interpretable. Moreover, because of the infinite dimension of functional data, a reduction dimension technique should be considered. To solve these problems, we propose a multi-class approach for penalized functional partial least squares (FPLS) regression. Then linear discriminant analysis (LDA) will be performed on the estimated FPLS components. This methodology is motivated by two case studies. The first study considers the linear acceleration recorded every two seconds in 30 subjects, related to three different activities (walking, climbing stairs and down stairs). The second study works with the triaxial angular rotation, for each joint, in 51 children when they completed a cycle walking under three conditions (walking, carrying a backpack and pulling a trolley). A simulation study is also developed for comparing the performance of the proposed functional LDA with respect to the corresponding multivariate and non-penalized approaches.
Affiliation(s)
- M. Carmen Aguilera-Morillo
- Department of Statistics, Escuela Politécnica Superior and UC3M-BS Santander Big Data Institute, Universidad Carlos III de Madrid, Madrid, Spain
- Ana M. Aguilera
- Department of Statistics and O. R. and IEMath-GR, Facultad de Ciencias, Universidad de Granada, Granada, Spain
11
Palma M, Tavakoli S, Brettschneider J, Nichols TE. Quantifying uncertainty in brain-predicted age using scalar-on-image quantile regression. Neuroimage 2020; 219:116938. [PMID: 32502669 PMCID: PMC7443707 DOI: 10.1016/j.neuroimage.2020.116938] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/05/2019] [Revised: 05/07/2020] [Accepted: 05/08/2020] [Indexed: 12/12/2022] [Open access]
Abstract
Prediction of subject age from brain anatomical MRI has the potential to provide a sensitive summary of brain changes, indicative of different neurodegenerative diseases. However, existing studies typically neglect the uncertainty of these predictions. In this work we take into account this uncertainty by applying methods of functional data analysis. We propose a penalised functional quantile regression model of age on brain structure with cognitively normal (CN) subjects in the Alzheimer's Disease Neuroimaging Initiative (ADNI), and use it to predict brain age in Mild Cognitive Impairment (MCI) and Alzheimer's Disease (AD) subjects. Unlike the machine learning approaches available in the literature of brain age prediction, which provide only point predictions, the outcome of our model is a prediction interval for each subject.
Affiliation(s)
- Marco Palma
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, United Kingdom
- Shahin Tavakoli
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, United Kingdom
- Julia Brettschneider
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, United Kingdom; The Alan Turing Institute, London, NW1 2DB, United Kingdom
- Thomas E Nichols
- Department of Statistics, University of Warwick, Coventry, CV4 7AL, United Kingdom; Oxford Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, United Kingdom; Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, United Kingdom
12
Turek C, Wróbel S, Piwowar M. OmicsON - Integration of omics data with molecular networks and statistical procedures. PLoS One 2020; 15:e0235398. [PMID: 32726348 PMCID: PMC7390260 DOI: 10.1371/journal.pone.0235398] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/11/2019] [Accepted: 06/15/2020] [Indexed: 12/05/2022] [Open access]
Abstract
A huge amount of atomized biological data collected in various databases and the need for a description of their relation by theoretical methods causes the development of data integration methods. The omics data analysis by integration of biological knowledge with mathematical procedures implemented in the OmicsON R library is presented in the paper. OmicsON is a tool for the integration of two sets of data: transcriptomics and metabolomics. In the workflow of the library, the functional grouping and statistical analysis are applied. Subgroups among the transcriptomic and metabolomics sets are created based on the biological knowledge stored in Reactome and String databases. It gives the possibility to analyze such sets of data by multivariate statistical procedures like Canonical Correlation Analysis (CCA) or Partial Least Squares (PLS). The integration of metabolomic and transcriptomic data based on the methodology contained in OmicsON helps to easily obtain information on the connection of data from two different sets. This information can significantly help in assessing the relationship between gene expression and metabolite concentrations, which in turn facilitates the biological interpretation of the analyzed process.
Affiliation(s)
- Cezary Turek
- Department of Bioinformatics and Telemedicine, Jagiellonian University–Medical College, Krakow, Poland
- Sonia Wróbel
- Department of Medical Physics, Jagiellonian University, Marian Smoluchowski Institute of Physics, Krakow, Poland
- Monika Piwowar
- Department of Bioinformatics and Telemedicine, Jagiellonian University–Medical College, Krakow, Poland
13
Using HJ-CCD image and PLS algorithm to estimate the yield of field-grown winter wheat. Sci Rep 2020; 10:5173. [PMID: 32198471 PMCID: PMC7083868 DOI: 10.1038/s41598-020-62125-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/19/2019] [Accepted: 03/09/2020] [Indexed: 11/16/2022] [Open access]
Abstract
Remote sensing has been used as an important means of estimating crop production, especially for the estimation of crop yield in the middle and late growth period. In order to further improve the accuracy of estimating winter wheat yield through remote sensing, this study analyzed the quantitative relationship between satellite remote sensing variables obtained from HJ-CCD images and the winter wheat yield, and used the partial least square (PLS) algorithm to construct and validate the multivariate remote sensing models of estimating the yield. The research showed a close relationship between yield and most remote sensing variables. Significant multiple correlations were also recorded between most remote sensing variables. The optimal principal components numbers of PLS models used to estimate yield were 4. Green normalized difference vegetation index (GNDVI), optimized soil-adjusted vegetation index (OSAVI), normalized difference vegetation index (NDVI) and plant senescence reflectance index (PSRI) were sensitive variables for yield remote sensing estimation. Through model development and model validation evaluation, the yield estimation model’s coefficients of determination (R2) were 0.81 and 0.74 respectively. The root mean square error (RMSE) were 693.9 kg ha−1 and 786.5 kg ha−1. It showed that the PLS algorithm model estimates the yield better than the linear regression (LR) and principal components analysis (PCA) algorithms. The estimation accuracy was improved by more than 20% than the LR algorithm, and was 13% higher than the PCA algorithm. The results could provide an effective way to improve the estimation accuracy of winter wheat yield by remote sensing, and was conducive to large-area application and promotion.
14
Tan C, Zhou X, Zhang P, Wang Z, Wang D, Guo W, Yun F. Predicting grain protein content of field-grown winter wheat with satellite images and partial least square algorithm. PLoS One 2020; 15:e0228500. [PMID: 32160185 PMCID: PMC7065814 DOI: 10.1371/journal.pone.0228500] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/18/2019] [Accepted: 01/16/2020] [Indexed: 01/20/2023] [Open access]
Abstract
Remote sensing has been used as an important means of modern crop production monitoring, especially for wheat quality prediction in the middle and late growth period. In order to further improve the accuracy of estimating grain protein content (GPC) through remote sensing, this study analyzed the quantitative relationship between 14 remote sensing variables obtained from images of environment and disaster monitoring and forecasting small satellite constellation system equipped with wide-band CCD sensors (abbreviated as HJ-CCD) and field-grown winter wheat GPC. The 14 remote sensing variables were normalized difference vegetation index (NDVI), soil-adjusted vegetation index (SAVI), optimized soil-adjusted vegetation index (OSAVI), nitrogen reflectance index (NRI), green normalized difference vegetation index (GNDVI), structure intensive pigment index (SIPI), plant senescence reflectance index (PSRI), enhanced vegetation index (EVI), difference vegetation index (DVI), ratio vegetation index (RVI), Rblue (reflectance at blue band), Rgreen (reflectance at green band), Rred (reflectance at red band) and Rnir (reflectance at near infrared band). The partial least square (PLS) algorithm was used to construct and validate the multivariate remote sensing model of predicting wheat GPC. The research showed a close relationship between wheat GPC and 12 remote sensing variables other than Rblue and Rgreen of the spectral reflectance bands. Among them, except PSRI and Rblue, Rgreen and Rred, other remote sensing vegetation indexes had significant multiple correlations. The optimal principal components of PLS model used to predict wheat GPC were: NDVI, SIPI, PSRI and EVI. All these were sensitive variables to predict wheat GPC. Through modeling set and verification set evaluation, GPC prediction models' coefficients of determination (R2) were 0.84 and 0.8, respectively. The root mean square errors (RMSE) were 0.43% and 0.54%, respectively. 
It indicated that the PLS algorithm model predicted wheat GPC better than models for linear regression (LR) and principal components analysis (PCA) algorithms. The PLS algorithm model's prediction accuracies were above 90%. The improvement was by more than 20% than the model for LR algorithm and more than 15% higher than the model for PCA algorithm. The results could provide an effective way to improve the accuracy of remotely predicting winter wheat GPC through satellite images, and was conducive to large-area application and promotion.
Affiliation(s)
- Changwei Tan
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China, Yangzhou University, Yangzhou, China
- * E-mail: (C.T.); (F.Y.); (W.G.)
- Xinxing Zhou
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China, Yangzhou University, Yangzhou, China
- Pengpeng Zhang
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China, Yangzhou University, Yangzhou, China
- Zhixiang Wang
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China, Yangzhou University, Yangzhou, China
- Dunliang Wang
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China, Yangzhou University, Yangzhou, China
- Wenshan Guo
- Jiangsu Key Laboratory of Crop Genetics and Physiology/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China, Yangzhou University, Yangzhou, China
- * E-mail: (C.T.); (F.Y.); (W.G.)
- Fei Yun
- National Tobacco Cultivation and Physiology and Biochemistry Research Centre/Key Laboratory for Tobacco Cultivation of Tobacco Industry, Henan Agricultural University, Zhengzhou, China
- * E-mail: (C.T.); (F.Y.); (W.G.)
15
Integrating Imaging Spectrometer and Synthetic Aperture Radar Data for Estimating Wetland Vegetation Aboveground Biomass in Coastal Louisiana. Remote Sensing 2019. [DOI: 10.3390/rs11212533] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/16/2022]
Abstract
Aboveground biomass (AGB) plays a critical functional role in coastal wetland ecosystem stability, with high biomass vegetation contributing to organic matter production, sediment accretion potential, and the surface elevation’s ability to keep pace with relative sea level rise. Many remote sensing studies have employed either imaging spectrometer or synthetic aperture radar (SAR) for AGB estimation in various environments for assessing ecosystem health and carbon storage. This study leverages airborne data from NASA’s Airborne Visible/Infrared Imaging Spectrometer-Next Generation (AVIRIS-NG) and Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) to assess their unique capabilities in combination to estimate AGB in coastal deltaic wetlands. Here we develop AGB models for emergent herbaceous and forested wetland vegetation in coastal Louisiana. In addition to horizontally emitted, vertically received (HV) backscatter, SAR parameters are expressed by the Freeman–Durden polarimetric decomposition components representing volume and double-bounce scattering. The imaging spectrometer parameters include normalized difference vegetation index (NDVI), reflectance from 290 visible-shortwave infrared (VSWIR) bands, the first derivatives from those bands, or partial least squares (PLS) x-scores derived from those data. Model metrics and cross-validation indicate that the integrated models using the Freeman-Durden components and PLS x-scores improve AGB estimates for both wetland vegetation types. In our study domain over Louisiana’s Wax Lake Delta (WLD), we estimated a mean herbaceous wetland AGB of 3.58 Megagrams/hectare (Mg/ha) and a total of 3551.31 Mg over 9.92 km2, and a mean forested wetland AGB of 294.78 Mg/ha and a total of 27,499.14 Mg over 0.93 km2. While the addition of SAR-derived values to imaging spectrometer data provides a nominal error decrease for herbaceous wetland AGB, this combination significantly improves forested wetland AGB prediction. 
This integrative approach is particularly effective in forested wetlands as canopy-level biochemical characteristics are captured by the imaging spectrometer in addition to the variable structural information measured by the SAR.
16
Aguilera-Morillo MC, Aguilera AM. Multi-class classification of biomechanical data: A functional LDA approach based on multi-class penalized functional PLS. Stat Model 2019. [DOI: 10.1177/1471082x17871157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
A functional linear discriminant analysis approach to classify a set of kinematic data (human movement curves of individuals performing different physical activities) is performed. Kinematic data, usually collected in linear acceleration or angular rotation format, can be identified with functions in a continuous domain (time, percentage of gait cycle, etc.). Since kinematic curves are measured in the same sample of individuals performing different activities, they are a clear example of functional data with repeated measures. On the other hand, the sample curves are observed with noise. Then, a roughness penalty might be necessary in order to provide a smooth estimation of the discriminant functions, which would make them more interpretable. Moreover, because of the infinite dimension of functional data, a reduction dimension technique should be considered. To solve these problems, we propose a multi-class approach for penalized functional partial least squares (FPLS) regression. Then linear discriminant analysis (LDA) will be performed on the estimated FPLS components. This methodology is motivated by two case studies. The first study considers the linear acceleration recorded every two seconds in 30 subjects, related to three different activities (walking, climbing stairs and down stairs). The second study works with the triaxial angular rotation, for each joint, in 51 children when they completed a cycle walking under three conditions (walking, carrying a backpack and pulling a trolley). A simulation study is also developed for comparing the performance of the proposed functional LDA with respect to the corresponding multivariate and non-penalized approaches.
Affiliation(s)
- M. Carmen Aguilera-Morillo
- Department of Statistics, Escuela Politécnica Superior and UC3M-BS Santander Big Data Institute, Universidad Carlos III de Madrid, Madrid, Spain
- Ana M. Aguilera
- Department of Statistics and O. R. and IEMath-GR, Facultad de Ciencias, Universidad de Granada, Granada, Spain
17
Zhu H, Zhang R, Li H. Estimation on semi-functional linear errors-in-variables models. Commun Stat-Theor M 2019. [DOI: 10.1080/03610926.2018.1494836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Hanbing Zhu
- School of Statistics, East China Normal University, Shanghai, P. R. China
- Riquan Zhang
- School of Statistics, East China Normal University, Shanghai, P. R. China
- Huiying Li
- School of Statistics, East China Normal University, Shanghai, P. R. China
18
19
Sparse wavelet estimation in quantile regression with multiple functional predictors. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2018.12.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/21/2022]
20
21
Wang Y, Kong L, Jiang B, Zhou X, Yu S, Zhang L, Heo G. Wavelet-based LASSO in functional linear quantile regression. J Stat Comput Sim 2019. [DOI: 10.1080/00949655.2019.1583228] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/28/2023]
Affiliation(s)
- Yafei Wang
- College of Applied Sciences, Beijing University of Technology, Beijing, People's Republic of China
- Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada
- Linglong Kong
- Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada
- Bei Jiang
- Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada
- Xingcai Zhou
- Institute of Statistics and Data Science, Nanjing Audit University, Nanjing, People's Republic of China
- Shimei Yu
- Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada
- Li Zhang
- Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada
- Giseon Heo
- Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada
22
Zhu H, Zhang R, Yu Z, Lian H, Liu Y. Estimation and testing for partially functional linear errors-in-variables models. J Multivariate Anal 2019. [DOI: 10.1016/j.jmva.2018.11.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 10/27/2022]
|
23
|
Kraus D, Stefanucci M. Classification of functional fragments by regularized linear classifiers with domain selection. Biometrika 2018. [DOI: 10.1093/biomet/asy060] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022] Open
Affiliation(s)
- David Kraus
- Department of Mathematics and Statistics, Masaryk University, Kotlářská 2, Brno, Czech Republic
| | - Marco Stefanucci
- Department of Statistical Sciences, Sapienza University of Rome, Piazzale Aldo Moro 5, Roma, Italy
| |
|
24
|
Sun X, Du P, Wang X, Ma P. Optimal Penalized Function-on-Function Regression under a Reproducing Kernel Hilbert Space Framework. J Am Stat Assoc 2018; 113:1601-1611. [PMID: 30799886 DOI: 10.1080/01621459.2017.1356320] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 10/19/2022]
Abstract
Many scientific studies collect data where the response and predictor variables are both functions of time, location, or some other covariate. Understanding the relationship between these functional variables is a common goal in these studies. Motivated from two real-life examples, we present in this paper a function-on-function regression model that can be used to analyze such kind of functional data. Our estimator of the 2D coefficient function is the optimizer of a form of penalized least squares where the penalty enforces a certain level of smoothness on the estimator. Our first result is the Representer Theorem which states that the exact optimizer of the penalized least squares actually resides in a data-adaptive finite dimensional subspace although the optimization problem is defined on a function space of infinite dimensions. This theorem then allows us an easy incorporation of the Gaussian quadrature into the optimization of the penalized least squares, which can be carried out through standard numerical procedures. We also show that our estimator achieves the minimax convergence rate in mean prediction under the framework of function-on-function regression. Extensive simulation studies demonstrate the numerical advantages of our method over the existing ones, where a sparse functional data extension is also introduced. The proposed method is then applied to our motivating examples of the benchmark Canadian weather data and a histone regulation study.
Affiliation(s)
| | - Pang Du
- Department of Statistics, Virginia Tech
| | - Xiao Wang
- Department of Statistics, Purdue University
| | - Ping Ma
- Department of Statistics, University of Georgia
| |
|
25
|
Berrendero JR, Cuevas A, Torrecilla JL. On the Use of Reproducing Kernel Hilbert Spaces in Functional Classification. J Am Stat Assoc 2018. [DOI: 10.1080/01621459.2017.1320287] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 10/19/2022]
Affiliation(s)
- José R. Berrendero
- Departamento de Matemáticas, Universidad Autónoma de Madrid, Madrid, Spain
| | - Antonio Cuevas
- Departamento de Matemáticas, Universidad Autónoma de Madrid, Madrid, Spain
| | - José L. Torrecilla
- Departamento de Matemáticas, Universidad Autónoma de Madrid, Madrid, Spain
- Institute BS-UC3M of Financial Big Data, Universidad Carlos III de Madrid, Madrid, Spain
| |
|
26
|
Zhang X, Wang C, Wu Y. Functional envelope for model-free sufficient dimension reduction. J Multivar Anal 2018. [DOI: 10.1016/j.jmva.2017.09.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
|
27
|
|
28
|
|
29
|
|
30
|
Reiss PT, Goldsmith J, Shang HL, Ogden RT. Methods for scalar-on-function regression. Int Stat Rev 2017; 85:228-249. [PMID: 28919663 PMCID: PMC5598560 DOI: 10.1111/insr.12163] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Received: 06/08/2015] [Accepted: 12/28/2015] [Indexed: 01/16/2023]
Abstract
Recent years have seen an explosion of activity in the field of functional data analysis (FDA), in which curves, spectra, images, etc. are considered as basic functional data units. A central problem in FDA is how to fit regression models with scalar responses and functional data points as predictors. We review some of the main approaches to this problem, categorizing the basic model types as linear, nonlinear and nonparametric. We discuss publicly available software packages, and illustrate some of the procedures by application to a functional magnetic resonance imaging dataset.
Affiliation(s)
- Philip T. Reiss
- Department of Child and Adolescent Psychiatry and Department of Population Health, New York University School of Medicine
- Department of Statistics, University of Haifa
| | - Jeff Goldsmith
- Department of Biostatistics, Columbia University Mailman School of Public Health
| | - Han Lin Shang
- Research School of Finance, Actuarial Studies and Statistics, Australian National University
| | - R. Todd Ogden
- Department of Biostatistics, Columbia University Mailman School of Public Health
- New York State Psychiatric Institute
| |
|
31
|
Local optimization of black-box functions with high or infinite-dimensional inputs: application to nuclear safety. Comput Stat 2017. [DOI: 10.1007/s00180-017-0751-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
32
|
Affiliation(s)
- R. Dennis Cook
- School of Statistics; University of Minnesota; Minneapolis, MN 55455
| | - Liliana Forzani
- Researcher of CONICET; Facultad de Ingeniería Química; UNL, Santiago del Estero 2819, Santa Fe Argentina
| |
|
33
|
Affiliation(s)
- Ruiyan Luo
- Division of Epidemiology and Biostatistics, School of Public Health, Georgia State University, Atlanta, GA
| | - Xin Qi
- Department of Mathematics and Statistics, Georgia State University, Atlanta, GA
| |
|
34
|
Aloglu AK, Harrington PDB, Sahin S, Demir C. Prediction of total antioxidant activity of Prunella L. species by automatic partial least square regression applied to 2-way liquid chromatographic UV spectral images. Talanta 2016; 161:503-510. [DOI: 10.1016/j.talanta.2016.09.014] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 04/08/2016] [Revised: 08/30/2016] [Accepted: 09/03/2016] [Indexed: 12/13/2022]
|
35
|
|
36
|
Yu D, Kong L, Mizera I. Partial functional linear quantile regression for neuroimaging data analysis. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.08.116] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 11/27/2022]
|
37
|
Singer M, Krivobokova T, Munk A, de Groot B. Partial least squares for dependent data. Biometrika 2016; 103:351-362. [DOI: 10.1093/biomet/asw010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open
|
38
|
Montoya EL, Meiring W. An F-type test for detecting departure from monotonicity in a functional linear model. J Nonparametr Stat 2016. [DOI: 10.1080/10485252.2016.1163352] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]
|
39
|
Comments on: Probability enhanced effective dimension reduction for classifying sparse functional data. TEST 2016. [DOI: 10.1007/s11749-015-0475-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
40
|
Comments on: Probability enhanced effective dimension reduction for classifying sparse functional data. TEST 2016. [DOI: 10.1007/s11749-015-0472-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
41
|
|
42
|
Lian H. Functional sufficient dimension reduction: Convergence rates and multiple functional case. J Stat Plan Inference 2015. [DOI: 10.1016/j.jspi.2015.05.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
|
43
|
Febrero-Bande M, Galeano P, González-Manteiga W. Functional Principal Component Regression and Functional Partial Least-squares Regression: An Overview and a Comparative Study. Int Stat Rev 2015. [DOI: 10.1111/insr.12116] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Indexed: 11/30/2022]
|
44
|
|
45
|
Reiss PT, Huo L, Zhao Y, Kelly C, Ogden RT. Wavelet-domain regression and predictive inference in psychiatric neuroimaging. Ann Appl Stat 2015; 9:1076-1101. [PMID: 27330652 PMCID: PMC4912166 DOI: 10.1214/15-aoas829] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 11/19/2022]
Abstract
An increasingly important goal of psychiatry is the use of brain imaging data to develop predictive models. Here we present two contributions to statistical methodology for this purpose. First, we propose and compare a set of wavelet-domain procedures for fitting generalized linear models with scalar responses and image predictors: sparse variants of principal component regression and of partial least squares, and the elastic net. Second, we consider assessing the contribution of image predictors over and above available scalar predictors, in particular via permutation tests and an extension of the idea of confounding to the case of functional or image predictors. Using the proposed methods, we assess whether maps of a spontaneous brain activity measure, derived from functional magnetic resonance imaging, can meaningfully predict presence or absence of attention deficit/hyperactivity disorder (ADHD). Our results shed light on the role of confounding in the surprising outcome of the recent ADHD-200 Global Competition, which challenged researchers to develop algorithms for automated image-based diagnosis of the disorder.
|
46
|
Cuevas A. Different perspectives on object oriented data analysis. Biom J 2014; 56:754-7. [DOI: 10.1002/bimj.201300177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/03/2013] [Revised: 12/27/2013] [Accepted: 01/04/2014] [Indexed: 11/06/2022]
Affiliation(s)
- Antonio Cuevas
- Departamento de Matemáticas; Universidad Autónoma de Madrid; Madrid Spain
| |
|
47
|
Shang HL. Resampling Techniques for Estimating the Distribution of Descriptive Statistics of Functional Data. Commun Stat Simul Comput 2014. [DOI: 10.1080/03610918.2013.788703] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]
|
48
|
Ruiz-Medina MD, Espejo RM, Romano E. Spatial functional normal mixed effect approach for curve classification. Adv Data Anal Classif 2014. [DOI: 10.1007/s11634-014-0174-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
|
49
|
|
50
|
|