1
Zaoui A. Variance function estimation in regression model via aggregation procedures. J Nonparametr Stat 2022. DOI: 10.1080/10485252.2022.2155960
Affiliation(s)
- Ahmed Zaoui
- LAMA, UMR-CNRS 8050, Université Gustave Eiffel, Marne la Vallee, France
2
Mary-Huard T, Perduca V, Martin-Magniette ML, Blanchard G. Error rate control for classification rules in multiclass mixture models. Int J Biostat 2022; 18:381-396. PMID: 34845884. DOI: 10.1515/ijb-2020-0105
Abstract
In the context of finite mixture models, one considers the problem of classifying as many observations as possible into the classes of interest while controlling the classification error rate in those same classes. As in statistical test theory, different type I and type II-like classification error rates can be defined, along with their associated optimal rules, where optimality means minimizing the type II error rate while controlling the type I error rate at some nominal level. It is first shown that finding an optimal classification rule boils down to searching for an optimal region of the observation space in which to apply the classical Maximum A Posteriori (MAP) rule. Depending on the misclassification rate to be controlled, the shape of the optimal region is provided, along with a heuristic to compute the optimal classification rule in practice. In particular, a multiclass FDR-like optimal rule is defined and compared to the thresholded MAP rule that is used in most applications. It is shown on both simulated and real datasets that the FDR-like optimal rule may be significantly less conservative than the thresholded MAP rule.
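The thresholded MAP baseline that the abstract contrasts against can be sketched in a few lines. The function name, threshold value, and toy posteriors below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def thresholded_map(posteriors, threshold=0.9):
    """Assign each observation to its Maximum A Posteriori class only when
    the top posterior probability clears `threshold`; otherwise abstain.

    posteriors: (n, K) array of class-membership probabilities.
    Returns a list of class indices, with None marking abstentions.
    """
    labels = []
    for p in posteriors:
        k = int(np.argmax(p))
        labels.append(k if p[k] >= threshold else None)
    return labels

# Toy posteriors from a 3-component mixture model
post = np.array([
    [0.95, 0.03, 0.02],   # confident     -> classified as 0
    [0.50, 0.30, 0.20],   # ambiguous     -> abstain
    [0.10, 0.85, 0.05],   # below cutoff  -> abstain
])
print(thresholded_map(post))  # [0, None, None]
```

The abstract's point is that controlling a global, FDR-like error rate allows a larger acceptance region than applying one fixed per-observation threshold like this.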
Affiliation(s)
- Tristan Mary-Huard
- MIA-Paris, INRAE, AgroParisTech, Université Paris-Saclay, Paris, 75005, France; Université Paris-Saclay, CNRS, INRAE, Université d'Evry, Institute of Plant Sciences Paris-Saclay (IPS2), Orsay, France
- Vittorio Perduca
- Laboratoire MAP5 (UMR CNRS 8145), Université Paris Descartes, Paris, France
- Marie-Laure Martin-Magniette
- MIA-Paris, INRAE, AgroParisTech, Université Paris-Saclay, Paris, 75005, France; Université Paris-Saclay, CNRS, INRAE, Université d'Evry, Institute of Plant Sciences Paris-Saclay (IPS2), Orsay, France
- Gilles Blanchard
- Laboratoire de Mathématiques d'Orsay, Université Paris-Sud, Saint-Aubin, Île-de-France, France
3
Wang W, Qiao X. Set-Valued Support Vector Machine with Bounded Error Rates. J Am Stat Assoc 2022. DOI: 10.1080/01621459.2022.2089573
Affiliation(s)
- Wenbo Wang
- Department of Mathematical Sciences at Binghamton University, State University of New York, Binghamton, New York, 13902
- Xingye Qiao
- Department of Mathematical Sciences at Binghamton University, State University of New York, Binghamton, New York, 13902
4
Guan L, Tibshirani R. Prediction and outlier detection in classification problems. J R Stat Soc Series B Stat Methodol 2022; 84:524-546. PMID: 35910400. PMCID: PMC9305480. DOI: 10.1111/rssb.12443
Abstract
We consider the multi-class classification problem when the training data and the out-of-sample test data may have different distributions and propose a method called BCOPS (balanced and conformal optimized prediction sets). BCOPS constructs a prediction set C(x) as a subset of class labels, possibly empty. It tries to optimize out-of-sample performance, aiming to include the correct class and to detect outliers x as often as possible. BCOPS returns no prediction (corresponding to C(x) equal to the empty set) if it infers x to be an outlier. The proposed method combines supervised learning algorithms with conformal prediction to minimize a misclassification loss averaged over the out-of-sample distribution. The constructed prediction sets have a finite-sample coverage guarantee without distributional assumptions. We also propose a method to estimate the outlier detection rate of a given procedure. We prove asymptotic consistency and optimality of our proposals under suitable assumptions and illustrate our methods on real data examples.
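A minimal sketch of the set-valued idea: include class k in C(x) whenever x scores at least as high as the lower alpha-quantile of class-k calibration scores, so an empty set flags an outlier. A toy per-class Gaussian score stands in for the learned scores BCOPS actually optimizes; all names and the data layout are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_class_scores(X, y, classes):
    # Per-class Gaussian typicality score (stand-in for BCOPS's learned scores)
    params = {}
    for k in classes:
        Xk = X[y == k]
        params[k] = (Xk.mean(axis=0), Xk.std(axis=0) + 1e-9)
    return params

def score(x, mu, sd):
    # Higher = more typical of the class
    return -np.sum(((x - mu) / sd) ** 2)

def prediction_set(x, params, calib, alpha=0.1):
    """Conformal-style set: keep class k iff score(x) beats the lower
    alpha-quantile of class-k calibration scores. Empty set = outlier."""
    C = []
    for k, (mu, sd) in params.items():
        cal_scores = np.array([score(c, mu, sd) for c in calib[k]])
        if score(x, mu, sd) >= np.quantile(cal_scores, alpha):
            C.append(k)
    return C

# Two well-separated classes; train/calibration split per class
X0, X1 = rng.normal(0, 1, (100, 2)), rng.normal(8, 1, (100, 2))
X = np.vstack([X0, X1]); y = np.array([0] * 100 + [1] * 100)
params = fit_class_scores(X[::2], y[::2], [0, 1])   # training half
calib = {0: X0[1::2], 1: X1[1::2]}                  # calibration half

print(prediction_set(np.zeros(2), params, calib))      # near class 0 -> [0]
print(prediction_set(np.full(2, 50.0), params, calib)) # far from both -> []
```

The quantile step is what gives the finite-sample, per-class coverage flavor; BCOPS additionally chooses the score functions to make the resulting sets as small as possible under the test distribution.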
5
Chzhen E, Denis C, Hebiri M. Minimax semi-supervised set-valued approach to multi-class classification. Bernoulli 2021. DOI: 10.3150/20-bej1313
Affiliation(s)
- Evgenii Chzhen
- CNRS, Inria, Laboratoire de Mathématiques d’Orsay, Université Paris-Saclay, 91405 Orsay, France
- Christophe Denis
- Laboratoire d’Analyse et de Mathématiques Appliquées, Université Gustave Eiffel, 77454 Marne-la-Vallée cedex 2, France
- Mohamed Hebiri
- Laboratoire d’Analyse et de Mathématiques Appliquées, Université Gustave Eiffel, 77454 Marne-la-Vallée cedex 2, France
6
Kocak MA, Ramirez D, Erkip E, Shasha DE. SafePredict: A Meta-Algorithm for Machine Learning That Uses Refusals to Guarantee Correctness. IEEE Trans Pattern Anal Mach Intell 2021; 43:663-678. PMID: 31380747. DOI: 10.1109/tpami.2019.2932415
Abstract
SafePredict is a novel meta-algorithm that works with any base prediction algorithm for online data to guarantee an arbitrarily chosen correctness rate, 1-ϵ, by allowing refusals. Allowing refusals means that the meta-algorithm may refuse to emit a prediction produced by the base algorithm, so that the error rate on non-refused predictions does not exceed ϵ. The SafePredict error bound does not rely on any assumptions about the data distribution or the base predictor. When the base predictor happens not to exceed the target error rate ϵ, SafePredict refuses only a finite number of times. When the error rate of the base predictor changes through time, SafePredict makes use of a weight-shifting heuristic that adapts to these changes without knowing when they occur, yet still maintains the correctness guarantee. Empirical results show that (i) SafePredict compares favorably with state-of-the-art confidence-based refusal mechanisms, which fail to offer robust error guarantees; and (ii) combining SafePredict with such refusal mechanisms can in many cases further reduce the number of refusals. Our software is included in the supplementary material, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPAMI.2019.2932415.
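The refusal idea can be illustrated with a deliberately naive wrapper. This is not SafePredict's weight-shifting update and carries none of its guarantees; it is just a sketch in which predictions are emitted only while the base predictor's observed error rate over all rounds stays at or below eps (feedback is observed every round, as in the online setting):

```python
def refusal_stream(preds, truths, eps=0.2):
    """Toy refusal wrapper in the spirit of SafePredict (not its actual
    algorithm): emit the base prediction only while the base predictor's
    running error rate over past rounds is <= eps; otherwise refuse.
    Returns (emitted, errors_on_emitted, refusals)."""
    base_errors = rounds = 0
    emitted = errors = refusals = 0
    for pred, truth in zip(preds, truths):
        past_rate = base_errors / rounds if rounds else 0.0
        if past_rate <= eps:
            emitted += 1
            errors += pred != truth
        else:
            refusals += 1
        rounds += 1
        base_errors += pred != truth  # feedback observed even on refusals
    return emitted, errors, refusals

# Base predictor that is perfect for 10 rounds, then always wrong
preds  = [1] * 10 + [0] * 10
truths = [1] * 20
print(refusal_stream(preds, truths, eps=0.2))  # (13, 3, 7)
```

Note how the wrapper keeps emitting for a few rounds after the distribution shift before the running error rate trips the threshold; the paper's weight-shifting heuristic exists precisely to adapt to such changes while still keeping the 1-ϵ correctness guarantee.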
7
Abstract
Machine learning based systems and products are reaching society at large in many aspects of everyday life, including financial lending, online advertising, pretrial and immigration detention, child maltreatment screening, health care, social services, and education. This phenomenon has been accompanied by growing concern about the ethical issues that may arise from the adoption of these technologies. In response to this concern, a new area of machine learning has recently emerged that studies how to address disparate treatment caused by algorithmic errors and bias in the data. The central question is how to ensure that the learned model does not treat subgroups in the population unfairly. While the design of solutions to this issue requires an interdisciplinary effort, fundamental progress can only be achieved through a radical change in the machine learning paradigm. In this work, we describe the state of the art on algorithmic fairness using statistical learning theory, machine learning, and deep learning approaches that are able to learn fair models and data representations.
Affiliation(s)
- Luca Oneto
- DIBRIS, University of Genoa, 16145, Genova, Italy
- ZenaByte s.r.l., www.zenabyte.com
8
Guo FR, Richardson TS. On testing marginal versus conditional independence. Biometrika 2020. DOI: 10.1093/biomet/asaa040
Abstract
We consider testing marginal independence versus conditional independence in a trivariate Gaussian setting. The two models are nonnested, and their intersection is a union of two marginal independences. We consider two sequences of such models, one from each type of independence, that are closest to each other in the Kullback–Leibler sense as they approach the intersection. They become indistinguishable if the signal strength, as measured by the product of two correlation parameters, decreases faster than the standard parametric rate. Under local alternatives at such a rate, we show that the asymptotic distribution of the likelihood ratio depends on where and how the local alternatives approach the intersection. To deal with this nonuniformity, we study a class of envelope distributions by taking pointwise suprema over asymptotic cumulative distribution functions. We show that these envelope distributions are well behaved and lead to model selection procedures with rate-free uniform error guarantees and near-optimal power. To control the error even when the two models are indistinguishable, rather than insist on a dichotomous choice, the proposed procedure will choose either or both models.
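The two hypotheses being compared can be illustrated with standard Fisher z-tests of the marginal and partial correlations in a trivariate Gaussian sample. This is the naive pointwise approach whose nonuniformity the paper's envelope construction addresses; the helper names and simulated structure (X and Y independent given Z) are illustrative:

```python
import numpy as np

def fisher_z(r, n, k=0):
    """Fisher z-transform statistic for a (partial) correlation r from
    n samples, controlling for k variables; ~N(0,1) under independence."""
    z = 0.5 * np.log((1 + r) / (1 - r))
    return z * np.sqrt(n - k - 3)

def partial_corr(C, i, j, k):
    # Correlation of variables i and j after adjusting for variable k
    return (C[i, j] - C[i, k] * C[j, k]) / np.sqrt(
        (1 - C[i, k] ** 2) * (1 - C[j, k] ** 2))

rng = np.random.default_rng(1)
n = 5000
# Conditional-independence model: X <- Z -> Y, so X ⟂ Y | Z but not X ⟂ Y
Z = rng.normal(size=n)
X = 0.8 * Z + 0.6 * rng.normal(size=n)
Y = 0.8 * Z + 0.6 * rng.normal(size=n)
C = np.corrcoef(np.vstack([X, Y, Z]))

marginal = fisher_z(C[0, 1], n)                            # tests X ⟂ Y
conditional = fisher_z(partial_corr(C, 0, 1, 2), n, k=1)   # tests X ⟂ Y | Z
print(marginal, conditional)  # |marginal| large, |conditional| near zero
```

The paper's regime of interest is exactly where this dichotomy breaks down: when the product of the two correlation parameters shrinks toward zero, the marginal and conditional models become indistinguishable and pointwise tests like these lose uniform validity.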
Affiliation(s)
- F Richard Guo
- Department of Statistics, University of Washington, Box 354322, Seattle, Washington 98195, U.S.A
- Thomas S Richardson
- Department of Statistics, University of Washington, Box 354322, Seattle, Washington 98195, U.S.A
9
Coscrato V, Izbicki R, Stern RB. Agnostic tests can control the type I and type II errors simultaneously. Braz J Probab Stat 2020. DOI: 10.1214/19-bjps431
10
11
Denis C, Hebiri M. Consistency of plug-in confidence sets for classification in semi-supervised learning. J Nonparametr Stat 2019. DOI: 10.1080/10485252.2019.1689241
12
Abstract
Conformal prediction is a general method that converts almost any point predictor to a prediction set. The resulting set retains the good statistical properties of the original estimator under standard assumptions, and guarantees valid average coverage even when the model is mis-specified. A main challenge in applying conformal prediction in modern applications is efficient computation, as it generally requires an exhaustive search over the entire output space. In this paper we develop an exact and computationally efficient conformalization of the lasso and elastic net. The method makes use of a novel piecewise linear homotopy of the lasso solution under perturbation of a single input sample point. As a by-product, we provide a simpler and better-justified online lasso algorithm, which may be of independent interest. Our derivation also reveals an interesting accuracy-stability trade-off in conformal inference, which is analogous to the bias-variance trade-off in traditional parameter estimation. The practical performance of the new algorithm is demonstrated in both synthetic and real data examples.
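For contrast with the exact homotopy method described above, the simpler split-conformal shortcut can be sketched in a few lines: fit on half the data, take the conformal quantile of held-out absolute residuals, and pad the prediction by that margin. A closed-form ridge fit stands in for the lasso/elastic net base estimator here, and all names and constants are illustrative assumptions:

```python
import numpy as np

def split_conformal(X, y, x_new, alpha=0.1):
    """Split-conformal interval around a ridge base predictor.
    Returns (lo, hi) with marginal coverage >= 1 - alpha, at the cost of
    using only half the data for fitting (the exact full-conformal method
    in the paper avoids this split via a lasso homotopy)."""
    n = len(y) // 2
    Xf, yf, Xc, yc = X[:n], y[:n], X[n:], y[n:]
    lam = 1e-3  # small ridge penalty, for a stable closed-form fit
    beta = np.linalg.solve(Xf.T @ Xf + lam * np.eye(X.shape[1]), Xf.T @ yf)
    resid = np.abs(yc - Xc @ beta)                    # calibration residuals
    k = int(np.ceil((len(resid) + 1) * (1 - alpha)))  # conformal rank
    q = np.sort(resid)[min(k, len(resid)) - 1]
    pred = x_new @ beta
    return pred - q, pred + q

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + rng.normal(scale=0.5, size=200)

lo, hi = split_conformal(X, y, X[0], alpha=0.1)
print(lo, hi)
```

The accuracy-stability trade-off mentioned in the abstract shows up even here: a noisier base fit inflates the calibration residuals and hence the interval width, while the coverage guarantee itself is unaffected.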
Affiliation(s)
- J Lei
- Department of Statistics and Data Science, Carnegie Mellon University, 132 Baker Hall, Pittsburgh, Pennsylvania, U.S.A
13
Liu Y, Lin L. Classification with minimum ambiguity under distribution heterogeneity. J Stat Comput Simul 2019. DOI: 10.1080/00949655.2019.1615063
Affiliation(s)
- Yongxin Liu
- Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, People's Republic of China
- Lu Lin
- Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, People's Republic of China
- School of Statistics, Qufu Normal University, Qufu, People's Republic of China
14
Abstract
Classification has applications in a wide range of fields, including medicine, engineering, computer science, and the social sciences, among others. In statistical terms, classification is inference about unknown parameters, namely the true classes of future objects. Hence, various standard statistical approaches can be used, such as point estimators, confidence sets, and decision-theoretic approaches. For example, a classifier that assigns a future object to only one of several known classes is a point estimator. The purpose of this paper is to propose a confidence-set-based classifier that classifies a future object into a single class only when there is enough evidence to warrant this, and into several classes otherwise. By allowing an object to be classified into possibly more than one class, this classifier guarantees a pre-specified proportion of correct classifications among all future objects. An example is provided to illustrate the method, and a simulation study highlights its desirable features.
15
Sadinle M, Lei J, Wasserman L. Least Ambiguous Set-Valued Classifiers With Bounded Error Levels. J Am Stat Assoc 2018. DOI: 10.1080/01621459.2017.1395341
Affiliation(s)
- Mauricio Sadinle
- Department of Biostatistics, University of Washington, Seattle, WA
- Jing Lei
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA
- Larry Wasserman
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA
16
Lei J, G’Sell M, Rinaldo A, Tibshirani RJ, Wasserman L. Distribution-Free Predictive Inference for Regression. J Am Stat Assoc 2018. DOI: 10.1080/01621459.2017.1307116
Affiliation(s)
- Jing Lei
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA
- Max G’Sell
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA
- Larry Wasserman
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA
17
Sun W, Wei Z. Hierarchical recognition of sparse patterns in large-scale simultaneous inference. Biometrika 2015. DOI: 10.1093/biomet/asv012