1
|
Matsui H, Mochida K. Functional data analysis-based yield modeling in year-round crop cultivation. Horticulture Research 2024; 11:uhae144. [PMID: 38988614 PMCID: PMC11234900 DOI: 10.1093/hr/uhae144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 05/16/2024] [Indexed: 07/12/2024]
Abstract
Crop yield prediction is essential for effective agricultural management. We introduce a methodology for modeling the relationship between environmental parameters and crop yield in longitudinal crop cultivation, exemplified by strawberry and tomato production based on year-round cultivation. Employing functional data analysis (FDA), we developed a model to assess the impact of these factors on crop yield, particularly in the face of environmental fluctuation. Specifically, we demonstrate that a varying-coefficient functional regression model (VCFRM) can be used to analyze time-series data, enabling visualization of seasonal shifts and of the dynamic interplay between environmental conditions, such as solar radiation and temperature, and crop yield. The interpretability of our FDA-based model yields insights for optimizing growth parameters, thereby augmenting resource efficiency and sustainability. Our results demonstrate the feasibility of VCFRM-based yield modeling, offering strategies for stable, efficient crop production, pivotal in addressing the challenges of climate adaptability in plant factory-based horticulture.
Affiliation(s)
- Hidetoshi Matsui
- Faculty of Data Science, Shiga University, Banba, Hikone, Shiga 522-8522, Japan
| | - Keiichi Mochida
- RIKEN Center for Sustainable Resource Science, Yokohama 230-0045, Japan
- Kihara Institute for Biological Research, Yokohama City University, Yokohama 244-0813, Japan
- School of Information and Data Sciences, Nagasaki University, Nagasaki 852-8521, Japan
| |
|
2
|
Yan X, Yu J, Ding W, Wang H, Zhao P. A novel two-way functional linear model with applications in human mortality data analysis. J Appl Stat 2023; 51:2025-2038. [PMID: 39071246 PMCID: PMC11271083 DOI: 10.1080/02664763.2023.2253379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 08/15/2023] [Indexed: 07/30/2024]
Abstract
Recently, two-way or longitudinal functional data analysis has attracted much attention in many fields. However, little is known about how to appropriately characterize the association between a two-way functional predictor and a scalar response. Motivated by a mortality study, in this paper we propose a novel two-way functional linear model, where the response is a scalar and the functional predictor is a two-way trajectory. The model is intuitive and interpretable, and naturally captures the relationship between each way of the two-way functional predictor and the scalar response. Further, we develop a new estimation method to estimate the regression functions in the framework of weak separability. The main technical tools for the construction of the regression functions are product functional principal component analysis and an iterative least squares procedure. The solid performance of our method is demonstrated in extensive simulation studies. We also analyze the mortality dataset to illustrate the usefulness of the proposed procedure.
Affiliation(s)
- Xingyu Yan
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| | - Jiaqian Yu
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| | - Weiyong Ding
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| | - Hao Wang
- School of Mathematics and Statistics, Anhui Normal University, Wuhu, People's Republic of China
| | - Peng Zhao
- School of Mathematics and Statistics and RIMS, Jiangsu Provincial Key Laboratory of Educational Big Data Science and Engineering, Jiangsu Normal University, Xuzhou, Jiangsu, People's Republic of China
| |
|
3
|
Ren R, Fang K, Zhang Q, Wang X. Multivariate functional data clustering using adaptive density peak detection. Stat Med 2023; 42:1565-1582. [PMID: 36825602 DOI: 10.1002/sim.9687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 02/01/2023] [Accepted: 02/06/2023] [Indexed: 02/25/2023]
Abstract
Clustering for multivariate functional data is a challenging problem since the data are represented by a set of curves and functions belonging to an infinite-dimensional space. In this article, we propose a novel clustering method for multivariate functional data using an adaptive density peak detection technique. It is a quick cluster center identification algorithm based on the two measures of each functional data observation: the functional density estimate and the distance to the closest observation with a higher functional density. We suggest two types of functional density estimators for multivariate functional data. The first one is a functional $k$-nearest neighbor density estimator based on (a) an L2 distance between raw functional curves, or (b) a semimetric of multivariate functional principal components. The second one is a $k$-nearest neighbor density estimator based on multivariate functional principal scores. Our clustering method is computationally fast since it does not need an iterative process. The flexibility and advantages of the method are examined by comparing it with other existing clustering methods in simulation studies. A user-friendly R package FADPclust is developed for public use. Finally, our method is applied to a real case study in lung cancer research.
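The two measures named in this abstract (a nearest-neighbor density estimate, and the distance to the closest observation with a higher density) can be illustrated with a minimal Python sketch. This is not the FADPclust implementation: the function names, the univariate curves on a shared grid, the inverse k-th-neighbor-distance density, the density-times-distance score for picking centers, and the nearest-center assignment are all simplifying assumptions made for illustration.

```python
import numpy as np

def density_peak_measures(curves, k=3):
    """Return (density, delta, dist) for curves of shape (n_curves, n_grid).

    Assumption: curves are sampled on a common grid, so the L2 distance is
    approximated by the Euclidean distance between sampled values.
    """
    diff = curves[:, None, :] - curves[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # k-nearest-neighbor density: inverse of the distance to the k-th
    # neighbor (column 0 of each sorted row is the zero self-distance).
    kth = np.sort(dist, axis=1)[:, k]
    density = 1.0 / (kth + 1e-12)
    # delta: distance to the closest observation with a higher density;
    # the densest observation gets its largest distance by convention.
    n = curves.shape[0]
    delta = np.empty(n)
    for i in range(n):
        higher = density > density[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return density, delta, dist

def cluster_by_density_peaks(curves, n_clusters=2, k=3):
    """Pick centers with the largest density * delta, then assign each
    curve to its nearest center (a simplification, no iteration needed)."""
    density, delta, dist = density_peak_measures(curves, k)
    centers = np.argsort(density * delta)[-n_clusters:]
    labels = np.argmin(dist[:, centers], axis=1)
    return centers, labels
```

Observations that are both dense and far from any denser observation stand out as candidate centers, which is why a single pass over the distance matrix suffices in this sketch.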
Affiliation(s)
- Rui Ren
- Department of Statistics and Data Science, Xiamen University, Xiamen, China
| | - Kuangnan Fang
- Department of Statistics and Data Science, Xiamen University, Xiamen, China
| | - Qingzhao Zhang
- Department of Statistics and Data Science, Xiamen University, Xiamen, China; The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China
| | - Xiaofeng Wang
- Department of Quantitative Health Sciences, Cleveland Clinic Lerner Research Institute, Cleveland, Ohio, USA
| |
|
4
|
Sang P, Kashlak AB, Kong L. A reproducing kernel Hilbert space framework for functional classification. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2138407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
Affiliation(s)
- Peijun Sang
- Department of Statistics and Actuarial Science, University of Waterloo
| | - Adam B Kashlak
- Department of Mathematical and Statistical Sciences, University of Alberta
| | - Linglong Kong
- Department of Mathematical and Statistical Sciences, University of Alberta
| |
|
5
|
Cui E, Li R, Crainiceanu CM, Xiao L. Fast Multilevel Functional Principal Component Analysis. J Comput Graph Stat 2022; 32:366-377. [PMID: 37313008 PMCID: PMC10260118 DOI: 10.1080/10618600.2022.2115500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2021] [Accepted: 08/06/2022] [Indexed: 10/15/2022]
Abstract
We introduce fast multilevel functional principal component analysis (fast MFPCA), which scales up to high-dimensional functional data measured at multiple visits. The new approach is orders of magnitude faster than the original MFPCA (Di et al., 2009) and achieves comparable estimation accuracy. Methods are motivated by the National Health and Nutrition Examination Survey (NHANES), which contains minute-level physical activity information of more than 10000 participants over multiple days and 1440 observations per day. While MFPCA takes more than five days to analyze these data, fast MFPCA takes less than five minutes. A theoretical study of the proposed method is also provided. The associated function mfpca.face() is available in the R package refund.
Affiliation(s)
- Erjia Cui
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street, Baltimore, MD 21205
| | - Ruonan Li
- Department of Statistics, North Carolina State University, 2311 Stinson Dr, Raleigh, NC 27607
| | - Ciprian M. Crainiceanu
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street, Baltimore, MD 21205
| | - Luo Xiao
- Department of Statistics, North Carolina State University, 2311 Stinson Dr, Raleigh, NC 27607
| |
|
6
|
Liu Y, Li Y, Carroll RJ, Wang N. Predictive Functional Linear Models with Diverging Number of Semiparametric Single-Index Interactions. Journal of Econometrics 2022; 230:221-239. [PMID: 36017081 PMCID: PMC9398183 DOI: 10.1016/j.jeconom.2021.03.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
When predicting crop yield using both functional and multivariate predictors, the prediction performances benefit from the inclusion of the interactions between the two sets of predictors. We assume the interaction depends on a nonparametric, single-index structure of the multivariate predictor and reduce each functional predictor's dimension using functional principal component analysis (FPCA). Allowing the number of FPCA scores to diverge to infinity, we consider a sequence of semiparametric working models with a diverging number of predictors, which are FPCA scores with estimation errors. We show that the parametric component of the model is root-n consistent and asymptotically normal, the overall prediction error is dominated by the estimation of the nonparametric interaction function, and justify a CV-based procedure to select the tuning parameters.
Affiliation(s)
- Yanghui Liu
- School of Economics and Statistics, Guangzhou University, China
| | - Yehua Li
- Department of Statistics, University of California, Riverside, CA, 92521, USA
| | - Raymond J Carroll
- Department of Statistics, Texas A&M University, College Station, TX 77843-3143, and School of Mathematical and Physical Sciences, University of Technology Sydney, Broadway NSW 2007, Australia
| | - Naisyin Wang
- Department of Statistics, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
7
|
Park Y, Li B, Li Y. Crop Yield Prediction Using Bayesian Spatially Varying Coefficient Models with Functional Predictors. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2123333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Yeonjoo Park
- Management Science and Statistics, University of Texas at San Antonio
| | - Bo Li
- Department of Statistics, University of Illinois at Urbana-Champaign
| | - Yehua Li
- Department of Statistics, University of California at Riverside
| |
|
8
|
Tang Q, Tu W, Kong L. Estimation for partial functional partially linear additive model. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
9
|
Feng S, Zhang X, Liang H, Pei L. Model selection for functional linear regression with hierarchical structure. Braz J Probab Stat 2022. [DOI: 10.1214/21-bjps525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Sanying Feng
- School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, China
| | - Xinyu Zhang
- School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, China
| | - Hui Liang
- School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, China
| | - Lifang Pei
- School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, China
| |
|
10
|
Zhong R, Liu S, Li H, Zhang J. Functional principal component analysis estimator for non-Gaussian data. J Stat Comput Sim 2022. [DOI: 10.1080/00949655.2022.2048302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Rou Zhong
- Center for Applied Statistics, School of Statistics, Renmin University of China, Haidian-qu, People's Republic of China
| | - Shishi Liu
- School of Economics, Hangzhou Dianzi University, Hangzhou, People's Republic of China
| | - Haocheng Li
- Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
| | - Jingxiao Zhang
- Center for Applied Statistics, School of Statistics, Renmin University of China, Haidian-qu, People's Republic of China
| |
|
11
|
Li Y, Qiu Y, Xu Y. From multivariate to functional data analysis: fundamentals, recent developments, and emerging areas. J Multivariate Anal 2022; 188:104806. [PMID: 39040141 PMCID: PMC11261241 DOI: 10.1016/j.jmva.2021.104806] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022]
Abstract
Functional data analysis (FDA), a branch of statistics concerned with modeling infinite-dimensional random vectors residing in functional spaces, has become a major research area for the Journal of Multivariate Analysis. We review some fundamental concepts of FDA, their origins in and connections to multivariate analysis, and some recent developments, including multi-level functional data analysis, high-dimensional functional regression, and dependent functional data analysis. We also discuss the impact of these new methodological developments on genetics, plant science, wearable device data analysis, image data analysis, and business analytics. Two real data examples are provided to motivate our discussions.
Affiliation(s)
- Yehua Li
- University of California - Riverside, Riverside, CA 92521, USA
| | - Yumou Qiu
- Iowa State University, Ames, IA 50011, USA
| | - Yuhang Xu
- Bowling Green State University, Bowling Green, OH 43403, USA
| |
|
12
|
Sass D, Li B, Clifton M, Harbison J, Xamplas C, Smith R. The Impact of Adulticide on Culex Abundance and Infection Rate in North Shore of Cook County, Illinois. Journal of the American Mosquito Control Association 2022; 38:46-58. [PMID: 35276731 DOI: 10.2987/21-7036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Mosquito surveillance is critical to reduce the risk of West Nile virus (WNV) transmission to humans. In response to surveillance indicators such as elevated mosquito abundance or increased WNV levels, many mosquito control programs will perform truck-mounted ultra-low volume (ULV) adulticide application to reduce the number of mosquitoes and associated virus transmission. Despite the common use of truck-based ULV adulticiding as a public health measure to reduce WNV prevalence, limited evidence exists to support a role in reducing viral transmission to humans. We use a generalized additive and fused ridge regression model to quantify the location-specific impact of truck-mounted ULV adulticide spray efforts from 2010 to 2018 in the North Shore Mosquito Abatement District (NSMAD) in metropolitan Chicago, IL, on commonly assessed risk factors from NSMAD surveillance gravid traps: Culex abundance, infection rate, and vector index. Our model also takes into account environmental variables commonly associated with WNV, including temperature, precipitation, wind speed, location, and week of year. Since it is unlikely that ULV adulticide spraying will have the same impact at each trap location, we use a spatially varying spray effect with a fused ridge penalty to determine how the effect varies by trap location. We found that ULV adulticide spraying produces an immediate, temporary reduction in abundance followed by an increase after 5 days. It is estimated that mosquito abundance increased more in sprayed areas than if left unsprayed in all but 3 trap locations. The impacts on infection rate and vector index were inconclusive due to the large error associated with estimating trap-specific infection rates.
|
13
|
Solea E, Dette H. Nonparametric and high-dimensional functional graphical models. Electron J Stat 2022. [DOI: 10.1214/22-ejs2087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Affiliation(s)
- Eftychia Solea
- CREST and ENSAI, Rennes, France, Ruhr-Universität Bochum, Germany
| | - Holger Dette
- CREST and ENSAI, Rennes, France, Ruhr-Universität Bochum, Germany
| |
|
14
|
|
15
|
Zhang H, Li Y. Unified Principal Component Analysis for Sparse and Dense Functional Data under Spatial Dependency. Journal of Business & Economic Statistics: A Publication of the American Statistical Association 2021; 40:1523-1537. [PMID: 36582252 PMCID: PMC9793858 DOI: 10.1080/07350015.2021.1938085] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/17/2023]
Abstract
We consider spatially dependent functional data collected under a geostatistics setting, where locations are sampled from a spatial point process. The functional response is the sum of a spatially dependent functional effect and a spatially independent functional nugget effect. Observations on each function are made on discrete time points and contaminated with measurement errors. Under the assumption of spatial stationarity and isotropy, we propose a tensor product spline estimator for the spatio-temporal covariance function. When a coregionalization covariance structure is further assumed, we propose a new functional principal component analysis method that borrows information from neighboring functions. The proposed method also generates nonparametric estimators for the spatial covariance functions, which can be used for functional kriging. Under a unified framework for sparse and dense functional data, infill and increasing domain asymptotic paradigms, we develop the asymptotic convergence rates for the proposed estimators. Advantages of the proposed approach are demonstrated through simulation studies and two real data applications representing sparse and dense functional data, respectively.
|
16
|
Qiu Z, Chen J, Zhang JT. Two-sample tests for multivariate functional data with applications. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
17
|
Moon H, Chen K. Interpoint-ranking sign covariance for the test of independence. Biometrika 2021. [DOI: 10.1093/biomet/asab011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Summary
We generalize the sign covariance introduced by Bergsma & Dassios (2014) to multivariate random variables and beyond. The new interpoint-ranking sign covariance is applicable to general types of random objects as long as a meaningful similarity measure can be defined, and it is shown to be zero if and only if the two random variables are independent. The test statistic is a $U$-statistic, whose large-sample behaviour guarantees that the proposed test is consistent against general types of alternatives. Numerical experiments and data analyses demonstrate the superior empirical performance of the proposed method.
Affiliation(s)
- Haeun Moon
- Department of Statistics, University of Pittsburgh, 230 S Bouquet Street, Pittsburgh, Pennsylvania 15260, U.S.A
| | - Kehui Chen
- Department of Statistics, University of Pittsburgh, 230 S Bouquet Street, Pittsburgh, Pennsylvania 15260, U.S.A
| |
|
18
|
Wang J, Wong RKW, Zhang X. Low-Rank Covariance Function Estimation for Multidimensional Functional Data. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1820344] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Affiliation(s)
- Jiayi Wang
- Department of Statistics, Texas A&M University, College Station, TX
| | | | - Xiaoke Zhang
- Department of Statistics, George Washington University, Washington, DC
| |
|
19
|
Zhu Y, Huang X, Li L. Dynamic prediction of time to a clinical event with sparse and irregularly measured longitudinal biomarkers. Biom J 2020; 62:1371-1393. [PMID: 32196728 PMCID: PMC7502505 DOI: 10.1002/bimj.201900112] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/12/2019] [Revised: 12/13/2019] [Accepted: 12/18/2019] [Indexed: 12/21/2022]
Abstract
In clinical research and practice, landmark models are commonly used to predict the risk of an adverse future event, using patients' longitudinal biomarker data as predictors. However, these data are often observable only at intermittent visits, making their measurement times irregularly spaced and unsynchronized across different subjects. This poses challenges to conducting dynamic prediction at any post-baseline time. A simple solution is the last-value-carry-forward method, but this may result in bias for the risk model estimation and prediction. Another option is to jointly model the longitudinal and survival processes with a shared random effects model. However, when dealing with multiple biomarkers, this approach often results in high-dimensional integrals without a closed-form solution, and thus the computational burden limits its software development and practical use. In this article, we propose to process the longitudinal data by functional principal component analysis techniques, and then use the processed information as predictors in a class of flexible linear transformation models to predict the distribution of residual time-to-event occurrence. The measurement schemes for multiple biomarkers are allowed to be different within subject and across subjects. Dynamic prediction can be performed in a real-time fashion. The advantages of our proposed method are demonstrated by simulation studies. We apply our approach to the African American Study of Kidney Disease and Hypertension, predicting patients' risk of kidney failure or death by using four important longitudinal biomarkers for renal functions.
Affiliation(s)
- Yayuan Zhu
- The Department of Epidemiology and Biostatistics, University of Western Ontario, London, ON, Canada
| | - Xuelin Huang
- The Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
| | - Liang Li
- The Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
| |
|
20
|
|
21
|
Abstract
Covariance estimation is essential yet underdeveloped for analyzing multivariate functional data. We propose a fast covariance estimation method for multivariate sparse functional data using bivariate penalized splines. The tensor-product B-spline formulation of the proposed method enables a simple spectral decomposition of the associated covariance operator and explicit expressions of the resulting eigenfunctions as linear combinations of B-spline bases, thereby dramatically facilitating subsequent principal component analysis. We derive a fast algorithm for selecting the smoothing parameters in covariance smoothing using leave-one-subject-out cross-validation. The method is evaluated with extensive numerical studies and applied to an Alzheimer's disease study with multiple longitudinal outcomes.
Affiliation(s)
- Cai Li
- Department of Statistics, North Carolina State University, NC, USA
| | - Luo Xiao
- Department of Statistics, North Carolina State University, NC, USA
| | - Sheng Luo
- Department of Biostatistics and Bioinformatics, Duke University, NC, USA
| |
|