1
Anwar MB, Hanif M, Shahzad U, Emam W, Anas MM, Ali N, Shahzadi S. Incorporating the neutrosophic framework into kernel regression for predictive mean estimation. Heliyon 2024; 10:e25471. [PMID: 38322963 PMCID: PMC10845908 DOI: 10.1016/j.heliyon.2024.e25471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Revised: 01/02/2024] [Accepted: 01/28/2024] [Indexed: 02/08/2024] Open
Abstract
In traditional statistics, all research endeavors revolve around utilizing precise, crisp data for the predictive estimation of population mean in survey sampling, when the supplementary information is accessible. However, these types of estimates often suffer from bias. The major aim is to uncover the most accurate estimates for the unknown value of the population mean while minimizing the mean square error (MSE). We have employed the neutrosophic approach, which is the extension of classical statistics that deals with the uncertain, vague, and indeterminate information, and proposed a neutrosophic predictive estimator of finite population mean using the kernel regression. The proposed estimator does not yield a single numerical value but instead provides an interval range within which the population parameter is likely to exist. This approach enhances the efficiency of the estimators by offering an estimated interval that encompasses the unknown value of the population mean with the least possible mean squared error (MSE). The simulation-based efficiency of the proposed estimator is discussed using the Sine, Bump and real-time temperature data set of Islamabad by using symmetric (Gaussian) kernel. The proposed non-parametric neutrosophic estimator has shown more effective results under the various bandwidth selectors than the adapted neutrosophic estimators.
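As a rough illustration of the idea of an interval-valued kernel regression estimate, the following minimal sketch applies a Nadaraya-Watson smoother with a Gaussian kernel separately to the lower and upper bounds of an interval-valued response. It is not the estimator proposed in the article; the function names, the toy data, and the choice to smooth the two bounds independently are all assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys evaluated at the point x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


def interval_estimate(x0, xs, ys_lower, ys_upper, bandwidth):
    """Hypothetical helper: smooth the lower and upper bounds separately.

    Because both bounds share the same non-negative weights, the smoothed
    lower bound never exceeds the smoothed upper bound.
    """
    lo = nadaraya_watson(x0, xs, ys_lower, bandwidth)
    hi = nadaraya_watson(x0, xs, ys_upper, bandwidth)
    return lo, hi


# Toy interval-valued data (invented for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys_lower = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_upper = [1.5, 2.5, 3.5, 4.5, 5.5]

lo, hi = interval_estimate(2.0, xs, ys_lower, ys_upper, bandwidth=1.0)
print(f"estimated interval at x = 2.0: [{lo:.3f}, {hi:.3f}]")
```

The bandwidth is fixed by hand here; the abstract compares several bandwidth selectors, which this sketch does not attempt to reproduce.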
Affiliation(s)
- Muhammad Bilal Anwar
  - Department of Mathematics and Statistics - PMAS-Arid Agriculture University, Rawalpindi, 46300, Pakistan
- Muhammad Hanif
  - Department of Mathematics and Statistics - PMAS-Arid Agriculture University, Rawalpindi, 46300, Pakistan
- Usman Shahzad
  - Department of Mathematics and Statistics - PMAS-Arid Agriculture University, Rawalpindi, 46300, Pakistan
- Walid Emam
  - Department of Statistics and Operations Research, Faculty of Science, King Saud University, P.O. Box 2455, Riyadh, 11451, Saudi Arabia
- Malik Muhammad Anas
  - Department of Economics and Statistics, University of Salerno, Fisciano, Salerno, 84084, Italy
- Nasir Ali
  - Department of Mathematics and Statistics - PMAS-Arid Agriculture University, Rawalpindi, 46300, Pakistan
- Shabnam Shahzadi
  - Department of Mathematics and Big Data, Anhui University of Science and Technology, Huainan, 232001, China
2
Safari MJS, Rahimzadeh Arashloo S, Vaheddoost B. Multiple kernel fusion: A novel approach for lake water depth modeling. Environ Res 2023; 217:114856. [PMID: 36410463 DOI: 10.1016/j.envres.2022.114856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Revised: 10/14/2022] [Accepted: 11/17/2022] [Indexed: 06/16/2023]
Abstract
Multiple kernel fusion (MKF) refers to the task of combining multiple sources of information in the Hilbert space for improved performance. Very often the combined kernel is formed as a linear composition of multiple base kernels where the combination weights are learned from the data. As the first application of an MKF approach in hydrological modeling, lake water depth, one of the pivotal factors in reservoir analysis, is simulated by considering different hydro-meteorological variables. The role of each individual input parameter is initially investigated by applying a kernel regression approach. We then illustrate the utility of an MKF formalism which learns the kernel combination weights to yield an optimal composition for kernel regression. A 40-year data set collected from 27 groundwater and streamflow stations and 7 meteorological stations for precipitation and evaporation parameters in the vicinity of Lake Urmia is used for model development. Both visual and quantitative statistical performance criteria illustrate a superior performance for the MKF approach compared to kernel ridge regression (KRR), support vector regression (SVR), back propagation neural network (BPNN) and auto regressive (AR) models. More specifically, while each individual input parameter fails to provide an accurate prediction for lake water depth modeling, an optimal combination of all input parameters incorporating the groundwater level, streamflow, precipitation and evaporation via a multiple kernel learning approach enhances the predictive accuracy of the model across multiple scenarios. The promising results (RMSE = 0.098 m; R² = 0.987; NSE = 0.986) may motivate the application of an MKF approach towards solving alternative and complex hydrological problems.
Affiliation(s)
- Babak Vaheddoost
  - Department of Civil Engineering, Bursa Technical University, Bursa, Turkey.
3
Wang H, Li Q, Liu Y. Adaptive Supervised Learning on Data Streams in Reproducing Kernel Hilbert Spaces with Data Sparsity Constraint. Stat (Int Stat Inst) 2023; 12:e514. [PMID: 38037648 PMCID: PMC10688597 DOI: 10.1002/sta4.514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 10/07/2022] [Indexed: 01/04/2023]
Abstract
Data are generated at an unprecedented rate and scale these days across many disciplines. The field of streaming data analysis has emerged as a result of new data collection and storage technologies in various areas, such as air pollution monitoring, detection of traffic congestion, disease surveillance, and recommendation systems. In this paper, we consider the problem of model estimation for data streams in reproducing kernel Hilbert spaces. We propose an adaptive supervised learning method with a data sparsity constraint that uses limited storage spaces and can handle non-stationary models. We demonstrate the competitive performance of the proposed method using simulations and analysis of the bike sharing dataset.
Affiliation(s)
- Haodong Wang
  - Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, North Carolina, USA
- Quefeng Li
  - Department of Biostatistics, The University of North Carolina at Chapel Hill, North Carolina, USA
- Yufeng Liu
  - Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, North Carolina, USA
  - Department of Biostatistics, The University of North Carolina at Chapel Hill, North Carolina, USA
  - Department of Genetics, The University of North Carolina at Chapel Hill, North Carolina, USA
  - Carolina Center for Genome Sciences, The University of North Carolina at Chapel Hill, North Carolina, USA
  - Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, North Carolina, USA
4
Man J, Zielinski MD, Das D, Wutthisirisart P, Pasupathy KS. Improving Non-invasive Hemoglobin Measurement Accuracy Using Nonparametric Models. J Biomed Inform 2021; 126:103975. [PMID: 34906736 DOI: 10.1016/j.jbi.2021.103975] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/20/2021] [Revised: 12/03/2021] [Accepted: 12/06/2021] [Indexed: 11/19/2022]
Abstract
Uncontrolled hemorrhage is a leading cause of preventable death among patients with trauma. Early recognition of hemorrhage can aid in the decision to administer blood transfusion and improve patient outcomes. To provide real-time measurement and continuous monitoring of hemoglobin concentration, the non-invasive and continuous hemoglobin (SpHb) measurement device has drawn extensive attention in clinical practice. However, the accuracy of such a device varies in different scenarios, so the use is not yet widely accepted. This article focuses on using statistical nonparametric models to improve the accuracy of SpHb measurement device by considering measurement bias among instantaneous measurements and individual evolution trends. In the proposed method, the robust locally estimated scatterplot smoothing (LOESS) method and the Kernel regression model are considered to address those issues. Overall performance of the proposed method was evaluated by cross-validation, which showed a substantial improvement in accuracy with an 11.3% reduction of standard deviation, 23.7% reduction of mean absolute error, and 28% reduction of mean absolute percentage error compared to the original measurements. The effects of patient demographics and initial medical condition were analyzed and deemed to not have a significant effect on accuracy. Because of its high accuracy, the proposed method is highly promising to be considered to support transfusion decision-making and continuous monitoring of hemoglobin concentration. The method also has promise for similar advancement of other diagnostic devices in healthcare.
Affiliation(s)
- Jianing Man
  - Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, USA.
- Devashish Das
  - Department of Industrial and Management Systems Engineering, University of South Florida, Tampa, FL, USA
- Kalyan S Pasupathy
  - Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, USA; Biomedical & Health Information Sciences, University of Illinois at Chicago, Chicago, IL, USA.
5
Shi J, Boehnke M, Lee S. Trans-ethnic meta-analysis of rare variants in sequencing association studies. Biostatistics 2021; 22:706-722. [PMID: 31883325 DOI: 10.1093/biostatistics/kxz061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2018] [Revised: 11/06/2019] [Accepted: 12/02/2019] [Indexed: 11/15/2022] Open
Abstract
Trans-ethnic meta-analysis is a powerful tool for detecting novel loci in genetic association studies. However, in the presence of heterogeneity among different populations, existing gene-/region-based rare variants meta-analysis methods may be unsatisfactory because they do not consider genetic similarity or dissimilarity among different populations. In response, we propose a score test under the modified random effects model for gene-/region-based rare variants associations. We adapt the kernel regression framework to construct the model and incorporate genetic similarities across populations into modeling the heterogeneity structure of the genetic effect coefficients. We use a resampling-based copula method to approximate asymptotic distribution of the test statistic, enabling efficient estimation of p-values. Simulation studies show that our proposed method controls type I error rates and increases power over existing approaches in the presence of heterogeneity. We illustrate our method by analyzing T2D-GENES consortium exome sequence data to explore rare variant associations with several traits.
Affiliation(s)
- Jingchunzi Shi
  - Thomas Francis, Jr. School of Public Health II, 1420 Washington Heights, Ann Arbor, MI 48109, USA
- Michael Boehnke
  - Thomas Francis, Jr. School of Public Health II, 1420 Washington Heights, Ann Arbor, MI 48109, USA
- Seunggeun Lee
  - Thomas Francis, Jr. School of Public Health II, 1420 Washington Heights, Ann Arbor, MI 48109, USA
6
Abstract
Learning in the Reproducing Kernel Hilbert Space (RKHS) has been widely used in many scientific disciplines. Because a RKHS can be very flexible, it is common to impose a regularization term in the optimization to prevent overfitting. Standard RKHS learning employs the squared norm penalty of the learning function. Despite its success, many challenges remain. In particular, one cannot directly use the squared norm penalty for variable selection or data extraction. Therefore, when there exist noise predictors, or the underlying function has a sparse representation in the dual space, the performance of standard RKHS learning can be suboptimal. In the literature, work has been proposed on how to perform variable selection in RKHS learning, and a data sparsity constraint was considered for data extraction. However, how to learn in a RKHS with both variable selection and data extraction simultaneously remains unclear. In this paper, we propose a unified RKHS learning method, namely, DOuble Sparsity Kernel (DOSK) learning, to overcome this challenge. An efficient algorithm is provided to solve the corresponding optimization problem. We prove that under certain conditions, our new method can asymptotically achieve variable selection consistency. Simulated and real data results demonstrate that DOSK is highly competitive among existing approaches for RKHS learning.
Affiliation(s)
- Jingxiang Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill
- Chong Zhang
  - Department of Statistics and Actuarial Science, University of Waterloo
- Michael R. Kosorok
  - Department of Biostatistics, University of North Carolina at Chapel Hill
- Yufeng Liu
  - Department of Statistics and Operations Research, University of North Carolina at Chapel Hill
7
McIntyre J, Johnson BA, Rappaport SM. Monte Carlo methods for nonparametric regression with heteroscedastic measurement error. Biometrics 2017; 74:498-505. [PMID: 28914966 DOI: 10.1111/biom.12765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2016] [Revised: 07/01/2017] [Accepted: 07/01/2017] [Indexed: 12/01/2022]
Abstract
Nonparametric regression is a fundamental problem in statistics but challenging when the independent variable is measured with error. Among the first approaches was an extension of deconvoluting kernel density estimators for homoscedastic measurement error. The main contribution of this article is to propose a new simulation-based nonparametric regression estimator for the heteroscedastic measurement error case. Similar to some earlier proposals, our estimator is built on principles underlying deconvoluting kernel density estimators. However, the proposed estimation procedure uses Monte Carlo methods for estimating nonlinear functions of a normal mean, which is different from any previous estimator. We show that the estimator has desirable operating characteristics in both large and small samples and apply the method to a study of benzene exposure in Chinese factory workers.
Affiliation(s)
- Julie McIntyre
  - Department of Mathematics and Statistics, University of Alaska Fairbanks, Fairbanks, Alaska 99775, U.S.A
- Brent A Johnson
  - Department of Biostatistics and Computational Biology, University of Rochester, Rochester, New York 14642, U.S.A
- Stephen M Rappaport
  - Department of Environmental Health Sciences, University of California, Berkeley, California 94720, U.S.A
8
Abstract
Typical cerebral cortical analyses rely on spatial normalization and are sensitive to misregistration arising from partial homologies between subject brains and local optima in nonlinear registration. In contrast, we use a descriptor of the 3D cortical sheet (jointly modeling folding and thickness) that is robust to misregistration. Our histogram-based descriptor lies on a Riemannian manifold. We propose new regularized nonlinear methods for (i) detecting group differences, using a Mercer kernel with an implicit lifting map to a reproducing kernel Hilbert space, and (ii) regression against clinical variables, using kernel density estimation. For both methods, we employ kernels that exploit the Riemannian structure. Results on simulated and clinical data show the improved accuracy and stability of our approach in cortical-sheet analysis.
9
Welchowski T, Schmid M. A framework for parameter estimation and model selection in kernel deep stacking networks. Artif Intell Med 2016; 70:31-40. [PMID: 27431035 DOI: 10.1016/j.artmed.2016.04.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 11/09/2015] [Revised: 03/09/2016] [Accepted: 04/21/2016] [Indexed: 10/21/2022]
Abstract
BACKGROUND AND OBJECTIVES Kernel deep stacking networks (KDSNs) are a novel method for supervised learning in biomedical research. Belonging to the class of deep learning techniques, KDSNs are based on artificial neural network architectures that involve multiple nonlinear transformations of the input data. Unlike traditional artificial neural networks, KDSNs do not rely on backpropagation algorithms but on an efficient fitting procedure that is based on a series of kernel ridge regression models with closed-form solutions. Although being computationally advantageous, KDSN modeling remains a challenging task, as it requires the specification of a large number of tuning parameters. METHODS AND MATERIAL We propose a new data-driven framework for parameter estimation, hyperparameter tuning, and model selection in KDSNs. The proposed methodology is based on a combination of model-based optimization and hill climbing approaches that do not require the pre-specification of any of the KDSN tuning parameters. We demonstrate the performance of KDSNs by analyzing three medical data sets on hospital readmission of diabetes patients, coronary artery disease, and hospital costs. RESULTS Our numerical studies show that the run-time of the proposed KDSN methodology is significantly shorter than the respective run-time of grid search strategies for hyperparameter tuning. They also show that KDSN modeling is competitive in terms of prediction accuracy with other state-of-the-art techniques for statistical learning. CONCLUSIONS KDSNs are a computationally efficient approximation of backpropagation-based artificial neural network techniques. Application of the proposed methodology results in a fast tuning procedure that generates KDSN fits having a similar prediction accuracy as other techniques in the field of deep learning.
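The closed-form solution that makes kernel ridge regression attractive as a building block can be sketched as follows. This is a minimal single-layer illustration, not the KDSN fitting procedure from the article: the RBF kernel, the parameter names `gamma` and `lam`, the hand-written linear solver, and the toy data are all assumptions. The coefficients solve (K + lam I) alpha = y, and a prediction at a new point is the kernel-weighted sum of those coefficients.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-gamma * (a - b) ** 2)


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def krr_fit(xs, ys, gamma, lam):
    """Closed-form coefficients alpha = (K + lam * I)^{-1} y."""
    n = len(xs)
    K = [[rbf_kernel(xs[i], xs[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(K, ys)


def krr_predict(x0, xs, alpha, gamma):
    """Prediction f(x0) = sum_i alpha_i * k(x0, x_i)."""
    return sum(a * rbf_kernel(x0, x, gamma) for a, x in zip(alpha, xs))


# Toy data (invented for this sketch).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]
alpha = krr_fit(xs, ys, gamma=1.0, lam=1e-8)
print(krr_predict(1.5, xs, alpha, gamma=1.0))
```

No iterative optimization is involved: fitting is a single linear solve, which is the property the abstract contrasts with backpropagation.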
Affiliation(s)
- Thomas Welchowski
  - Department of Medical Biometry, Informatics and Epidemiology, Rheinische Friedrich-Wilhelms-Universität Bonn, Sigmund-Freud-Str. 25, 53127 Bonn, Germany.
- Matthias Schmid
  - Department of Medical Biometry, Informatics and Epidemiology, Rheinische Friedrich-Wilhelms-Universität Bonn, Sigmund-Freud-Str. 25, 53127 Bonn, Germany.
10
Shi J, Lee S. A novel random effect model for GWAS meta-analysis and its application to trans-ethnic meta-analysis. Biometrics 2016; 72:945-54. [PMID: 26916671 DOI: 10.1111/biom.12481] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/01/2014] [Revised: 11/01/2015] [Accepted: 11/01/2015] [Indexed: 11/28/2022]
Abstract
Meta-analysis of trans-ethnic genome-wide association studies (GWAS) has proven to be a practical and profitable approach for identifying loci that contribute to the risk of complex diseases. However, the expected genetic effect heterogeneity cannot easily be accommodated through existing fixed-effects and random-effects methods. In response, we propose a novel random effect model for trans-ethnic meta-analysis with flexible modeling of the expected genetic effect heterogeneity across diverse populations. Specifically, we adopt a modified random effect model from the kernel regression framework, in which genetic effect coefficients are random variables whose correlation structure reflects the genetic distances across ancestry groups. In addition, we use the adaptive variance component test to achieve robust power regardless of the degree of genetic effect heterogeneity. Simulation studies show that our proposed method has well-calibrated type I error rates at very stringent significance levels and can improve power over the traditional meta-analysis methods. We reanalyzed the published type 2 diabetes GWAS meta-analysis (Consortium et al., 2014) and successfully identified one additional SNP that clearly exhibits genetic effect heterogeneity across different ancestry groups. Furthermore, our proposed method provides scalable computing time for genome-wide datasets, in which an analysis of one million SNPs would require less than 3 hours.
Affiliation(s)
- Jingchunzi Shi
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A.
- Seunggeun Lee
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A.
11
Hu Y, Gibson E, Ahmed HU, Moore CM, Emberton M, Barratt DC. Population-based prediction of subject-specific prostate deformation for MR-to-ultrasound image registration. Med Image Anal 2015; 26:332-44. [PMID: 26606458 PMCID: PMC4686007 DOI: 10.1016/j.media.2015.10.006] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 03/25/2015] [Revised: 10/21/2015] [Accepted: 10/24/2015] [Indexed: 11/24/2022]
Abstract
Statistical shape models of soft-tissue organ motion provide a useful means of imposing physical constraints on the displacements allowed during non-rigid image registration, and can be especially useful when registering sparse and/or noisy image data. In this paper, we describe a method for generating a subject-specific statistical shape model that captures prostate deformation for a new subject given independent population data on organ shape and deformation obtained from magnetic resonance (MR) images and biomechanical modelling of tissue deformation due to transrectal ultrasound (TRUS) probe pressure. The characteristics of the models generated using this method are compared with corresponding models based on training data generated directly from subject-specific biomechanical simulations using a leave-one-out cross validation. The accuracy of registering MR and TRUS images of the prostate using the new prostate models was then estimated and compared with published results obtained in our earlier research. No statistically significant difference was found between the specificity and generalisation ability of prostate shape models generated using the two approaches. Furthermore, no statistically significant difference was found between the landmark-based target registration errors (TREs) following registration using different models, with a median (95th percentile) TRE of 2.40 (6.19) mm versus 2.42 (7.15) mm using models generated with the new method versus a model built directly from patient-specific biomechanical simulation data, respectively (N = 800; 8 patient datasets; 100 registrations per patient). We conclude that the proposed method provides a computationally efficient and clinically practical alternative to existing complex methods for modelling and predicting subject-specific prostate deformation, such as biomechanical simulations, for new subjects. 
The method may also prove useful for generating shape models for other organs, for example, where only limited shape training data from dynamic imaging is available.
Affiliation(s)
- Yipeng Hu
  - Centre for Medical Image Computing, University College London, London, UK.
- Eli Gibson
  - Centre for Medical Image Computing, University College London, London, UK; Diagnostic Image Analysis group, Radboud University Medical Centre, Nijmegen, The Netherlands
- Hashim Uddin Ahmed
  - Division of Surgery and Interventional Science, University College London, London, UK
- Caroline M Moore
  - Division of Surgery and Interventional Science, University College London, London, UK
- Mark Emberton
  - Division of Surgery and Interventional Science, University College London, London, UK
- Dean C Barratt
  - Centre for Medical Image Computing, University College London, London, UK
12
Kropat G, Bochud F, Jaboyedoff M, Laedermann JP, Murith C, Palacios Gruson M, Baechler S. Predictive analysis and mapping of indoor radon concentrations in a complex environment using kernel estimation: an application to Switzerland. Sci Total Environ 2015; 505:137-48. [PMID: 25314691 DOI: 10.1016/j.scitotenv.2014.09.064] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/29/2014] [Revised: 09/10/2014] [Accepted: 09/22/2014] [Indexed: 05/10/2023]
Abstract
PURPOSE The aim of this study was to develop models based on kernel regression and probability estimation in order to predict and map indoor radon concentrations (IRC) in Switzerland by taking into account all of the following: architectural factors, spatial relationships between the measurements, as well as geological information. METHODS We looked at about 240,000 IRC measurements carried out in about 150,000 houses. As predictor variables we included building type, foundation type, year of construction, detector type, geographical coordinates, altitude, temperature and lithology in the kernel estimation models. We developed predictive maps as well as a map of the local probability to exceed 300 Bq/m³. Additionally, we developed a map of a confidence index in order to estimate the reliability of the probability map. RESULTS Our models were able to explain 28% of the variations of IRC data. All variables added information to the model. The model estimation revealed a bandwidth for each variable, making it possible to characterize the influence of each variable on the IRC estimation. Furthermore, we assessed the mapping characteristics of kernel estimation overall as well as by municipality. Overall, our model reproduces spatial IRC patterns which were already obtained earlier. On the municipal level, we could show that our model accounts well for IRC trends within municipal boundaries. Finally, we found that different building characteristics result in different IRC maps. Maps corresponding to detached houses with concrete foundations indicate systematically smaller IRC than maps corresponding to farms with earth foundation. CONCLUSIONS IRC mapping based on kernel estimation is a powerful tool to predict and analyze IRC on a large scale as well as on a local level. This approach makes it possible to develop tailor-made maps for different architectural elements and measurement conditions and to account at the same time for geological information and spatial relations between IRC measurements.
Affiliation(s)
- Georg Kropat
  - Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne, Switzerland.
- Francois Bochud
  - Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne, Switzerland
- Michel Jaboyedoff
  - Faculty of Geosciences and Environment, University of Lausanne, GEOPOLIS - 3793, 1015 Lausanne, Switzerland
- Jean-Pascal Laedermann
  - Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne, Switzerland
- Christophe Murith
  - Swiss Federal Office of Public Health, Schwarzenburgstrasse 165, 3003 Berne, Switzerland
- Martha Palacios Gruson
  - Swiss Federal Office of Public Health, Schwarzenburgstrasse 165, 3003 Berne, Switzerland
- Sébastien Baechler
  - Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne, Switzerland; Swiss Federal Office of Public Health, Schwarzenburgstrasse 165, 3003 Berne, Switzerland
13
Kachouie NN, Lin X, Schwartzman A. FDR control of detected regions by multiscale matched filtering. COMMUN STAT-SIMUL C 2014; 46:127-144. [PMID: 31501637 PMCID: PMC6733272 DOI: 10.1080/03610918.2014.957842] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/27/2014] [Accepted: 08/15/2014] [Indexed: 10/24/2022]
Abstract
Feature extraction from observed noisy samples is a common important problem in statistics and engineering. This paper presents a novel general statistical approach to the region detection problem in long data sequences. The proposed technique is a multi-scale kernel regression in conjunction with statistical multiple testing for region detection while controlling the false discovery rate (FDR) and maximizing the signal to noise ratio (SNR) via matched filtering. This is achieved by considering a one-dimensional (1D) region detection problem as its equivalent 0D (zero dimensional) peak detection problem. The detection method does not require a priori knowledge of the shape of the non-zero regions. However, if the shape of the non-zero regions is known a priori, e.g. rectangular pulse, the signal regions can also be reconstructed from the detected peaks, seen as their topological point representatives. Simulations show that the method can effectively perform signal detection and reconstruction in the simulated data under high noise conditions, while controlling the FDR of detected regions and their reconstructed length.
Affiliation(s)
- Nezamoddin N Kachouie
  - Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, FL, USA
- Xihong Lin
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Armin Schwartzman
  - Department of Statistics, North Carolina State University, Raleigh, NC, USA
14
Abstract
We consider heteroscedastic regression models where the mean function is a partially linear single index model and the variance function depends upon a generalized partially linear single index model. We do not insist that the variance function depend only upon the mean function, as happens in the classical generalized partially linear single index model. We develop efficient and practical estimation methods for the variance function and for the mean function. Asymptotic theory for the parametric and nonparametric parts of the model is developed. Simulations illustrate the results. An empirical example involving ozone levels is used to further illustrate the results, and is shown to be a case where the variance function does not depend upon the mean function.
Affiliation(s)
- Heng Lian
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore.
- Hua Liang
  - Department of Statistics, George Washington University, Washington, D.C. 20052, U.S.A
- Raymond J Carroll
  - Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA
15
|
Pérez IA, Sánchez ML, García MÁ, Ozores M, Pardo N. Analysis of carbon dioxide concentration skewness at a rural site. Sci Total Environ 2014; 476-477:158-164. [PMID: 24463252 DOI: 10.1016/j.scitotenv.2014.01.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/15/2013] [Revised: 12/18/2013] [Accepted: 01/06/2014] [Indexed: 06/03/2023]
Abstract
This paper provides evidence that symmetry of CO2 concentration distribution may indicate sources or dispersive processes. Skewness was calculated by different procedures with CO2 measured at a rural site using a Picarro G1301 analyser over a two-year period. The usual skewness coefficient was considered together with fourteen robust estimators. A noticeable contrast was obtained between day and night, and skewness decreased linearly with the logarithm of the height. One coefficient was selected from its satisfactory relationship with the median concentration in daily evolution. Three analyses based on the kernel smoothing method were conducted with this coefficient to investigate its response to yearly and daily evolutions, wind direction, and wind speed. Left-skewed distributions were linked to thermal turbulence during midday, especially in spring-summer, or with high wind speeds. Almost symmetric distributions were associated with sources, such as the Valladolid City plume reinforced with spring emissions and the lack of emissions in summer in the remaining directions. Finally, right-skewed distributions were related to low wind speeds and stable stratification at night, furthered by strong emissions in spring. Skewness intervals were proposed and their average median concentrations were calculated such that the relationship between skewness and concentration depends on the analysis performed. Since some skewness coefficients may also be negative, they provide better information about sources or dispersive processes than concentration.
Affiliation(s)
- Isidro A Pérez
- Department of Applied Physics, Faculty of Sciences, University of Valladolid, Paseo de Belén, 7, 47011 Valladolid, Spain.
- M Luisa Sánchez
- Department of Applied Physics, Faculty of Sciences, University of Valladolid, Paseo de Belén, 7, 47011 Valladolid, Spain
- M Ángeles García
- Department of Applied Physics, Faculty of Sciences, University of Valladolid, Paseo de Belén, 7, 47011 Valladolid, Spain
- Marta Ozores
- Department of Applied Physics, Faculty of Sciences, University of Valladolid, Paseo de Belén, 7, 47011 Valladolid, Spain
- Nuria Pardo
- Department of Applied Physics, Faculty of Sciences, University of Valladolid, Paseo de Belén, 7, 47011 Valladolid, Spain
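
The abstract above contrasts the usual skewness coefficient with robust estimators but does not list which fourteen were used. As a rough illustration only, the sketch below compares the moment-based coefficient with Bowley's quartile skewness, one well-known robust measure chosen here as an assumption. The function names and sample values are hypothetical and are not CO2 data from the study.

```python
import numpy as np

def moment_skewness(x):
    """Usual (moment-based) skewness: third central moment over sd cubed."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5

def bowley_skewness(x):
    """Bowley's quartile skewness, bounded in [-1, 1] and robust to outliers.

    Assumption: used as a stand-in for the unnamed robust estimators.
    """
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)

# Hypothetical right-skewed sample (long tail toward high values).
sample = [1.0, 2.0, 3.0, 5.0, 9.0]
print(moment_skewness(sample), bowley_skewness(sample))
```

Both coefficients are positive for a right-skewed sample, negative for a left-skewed one, and zero for a symmetric one. The quartile version ignores the outer quarters of the data, so a single extreme reading affects it far less than it affects the moment coefficient.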
16
Hong Y, Davis B, Marron JS, Kwitt R, Singh N, Kimbell JS, Pitkin E, Superfine R, Davis SD, Zdanski CJ, Niethammer M. Statistical atlas construction via weighted functional boxplots. Med Image Anal 2014; 18:684-98. [PMID: 24747271 DOI: 10.1016/j.media.2014.03.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 09/10/2013] [Revised: 01/21/2014] [Accepted: 03/11/2014] [Indexed: 12/01/2022]
Abstract
Atlas-building from population data is widely used in medical imaging. However, the emphasis of atlas-building approaches is typically to estimate a spatial alignment to compute a mean/median shape or image based on population data. In this work, we focus on the statistical characterization of the population data, once spatial alignment has been achieved. We introduce and propose the use of the weighted functional boxplot. This allows the generalization of concepts such as the median, percentiles, or outliers to spaces where the data objects are functions, shapes, or images, and allows spatio-temporal atlas-building based on kernel regression. In our experiments, we demonstrate the utility of the approach to construct statistical atlases for pediatric upper airways and corpora callosa revealing their growth patterns. We also define a score system based on the pediatric airway atlas to quantitatively measure the severity of subglottic stenosis (SGS) in the airway. This scoring allows the classification of pre- and post-surgery SGS subjects and radiographically normal controls. Experimental results show the utility of atlas information to assess the effect of airway surgery in children.
Affiliation(s)
- Yi Hong
- University of North Carolina (UNC) at Chapel Hill, NC, USA.
- J S Marron
- University of North Carolina (UNC) at Chapel Hill, NC, USA
- Roland Kwitt
- Department of Computer Science, University of Salzburg, Austria
- Nikhil Singh
- University of North Carolina (UNC) at Chapel Hill, NC, USA
- Marc Niethammer
- University of North Carolina (UNC) at Chapel Hill, NC, USA; Biomedical Research Imaging Center, UNC-Chapel Hill, NC, USA
17
Abstract
When functional data are not homogeneous, e.g., when the dataset contains multiple classes of functional curves, traditional estimation methods may fail. In this paper, we propose a new estimation procedure for the Mixture of Gaussian Processes that incorporates both the functional and the inhomogeneous properties of the data. Our method can be viewed as a natural extension of high-dimensional normal mixtures. The key difference, however, is that smoothed structures are imposed on both the mean and covariance functions. The model is shown to be identifiable, and it can be estimated efficiently by combining ideas from the EM algorithm, kernel regression, and functional principal component analysis. Our methodology is empirically justified by Monte Carlo simulations and illustrated by an analysis of a supermarket dataset.
Affiliation(s)
- Mian Huang
- School of Statistics and Management and Key Laboratory of Mathematical Economics at SHUFE, Ministry of Education, Shanghai University of Finance and Economics (SHUFE), Shanghai, 200433, P. R. China
- Runze Li
- Department of Statistics and The Methodology Center, The Pennsylvania State University, University Park, PA 16802-2111
- Hansheng Wang
- Department of Business Statistics and Econometrics, Guanghua School of Management, Peking University, Beijing, 100871, P. R. China
- Weixin Yao
- Department of Statistics, Kansas State University, Manhattan, Kansas 66506
18
Abstract
The continuum regression technique provides an appealing regression framework connecting ordinary least squares, partial least squares and principal component regression in one family. It offers some insight on the underlying regression model for a given application. Moreover, it helps to provide deep understanding of various regression techniques. Despite the useful framework, however, the current development on continuum regression is only for linear regression. In many applications, nonlinear regression is necessary. The extension of continuum regression from linear models to nonlinear models using kernel learning is considered. The proposed kernel continuum regression technique is quite general and can handle very flexible regression model estimation. An efficient algorithm is developed for fast implementation. Numerical examples have demonstrated the usefulness of the proposed technique.
Affiliation(s)
- Myung Hee Lee
- Department of Statistics, Colorado State University, Fort Collins, CO 80525, U.S.A
19
Abstract
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability of the proposed models and develop an estimation procedure that employs kernel regression. We further systematically study the sampling properties of the proposed estimators and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite-sample performance of the proposed estimation procedure. The proposed methodology is illustrated with an empirical analysis of the US house price index data.
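
Kernel regression is the shared building block of the estimation procedures in this listing. The abstract does not specify the kernel, the bandwidth, or the type of local fit, so the following is only a minimal sketch of the simplest variant, the Nadaraya-Watson (local-constant) estimator with a Gaussian kernel. The function name, bandwidth value, and data are hypothetical.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Local-constant kernel regression with a Gaussian kernel.

    Assumption: Gaussian kernel and a single fixed bandwidth; the abstract
    does not state which kernel or bandwidth selector the authors use.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Scaled distances between every evaluation point and every training point.
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    # Unnormalized Gaussian weights; the constant cancels in the ratio below.
    w = np.exp(-0.5 * u**2)
    # Weighted average of responses. Evaluation points very far from all
    # training points can underflow to a zero denominator.
    return (w @ y_train) / w.sum(axis=1)

# Hypothetical noisy sine data.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x) + 0.3 * rng.standard_normal(x.size)
print(nadaraya_watson(x, y, [np.pi / 2, np.pi], bandwidth=0.3))
```

A smaller bandwidth follows the data more closely at the cost of higher variance, while a larger one smooths more aggressively, which is why bandwidth selection matters in practice.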