51
Liang F, Xue J, Jia B. Markov Neighborhood Regression for High-Dimensional Inference. J Am Stat Assoc 2021; 117:1200-1214. [DOI: 10.1080/01621459.2020.1841646] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Affiliation(s)
- Faming Liang
- Department of Statistics, Purdue University, West Lafayette, IN
- Bochao Jia
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN
52
Dai R, Kolar M. Inference for high-dimensional varying-coefficient quantile regression. Electron J Stat 2021. [DOI: 10.1214/21-ejs1919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Ran Dai
- Department of Biostatistics, University of Nebraska Medical Center, USA
- Mladen Kolar
- Booth School of Business, University of Chicago, USA
53
Luo K, Aimuzi R, Wang Y, Nian M, Zhang J. Urinary organophosphate esters metabolites, glucose homeostasis and prediabetes in adolescents. Environmental Pollution (Barking, Essex: 1987) 2020; 267:115607. [PMID: 33254666 DOI: 10.1016/j.envpol.2020.115607] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/11/2020] [Revised: 07/17/2020] [Accepted: 09/03/2020] [Indexed: 06/12/2023]
Abstract
BACKGROUND Emerging experimental evidence indicates that organophosphate esters (OPEs) can trigger glucose metabolic disorders. However, human evidence, especially in adolescents, is unavailable. OBJECTIVES We utilized data from the National Health and Nutrition Examination Survey 2011-2014 to evaluate whether urinary OPEs metabolites were associated with prediabetes and glucose homeostasis. METHODS A total of 349 adolescents (12-19 years old) who provided at least 8 h fasting blood samples and had urinary OPEs metabolites detected were included. Prediabetes was defined according to the levels of fasting plasma glucose (FPG), 2-h post oral plasma glucose (2 h-OGTT) and glycated hemoglobin A1c (HbA1c). The homeostatic model assessment (HOMA-IR) and the Single Point Insulin Sensitivity Estimator (SPISE) were used to assess insulin resistance and sensitivity, respectively. Multiple binary logistic and linear regressions were used to evaluate the associations with prediabetes and indices of glucose homeostasis. The least absolute shrinkage and selection operator (LASSO) regression was used to assess the associations in a multi-pollutant context. RESULTS After adjusting for covariates, certain urinary OPEs metabolites were associated with prediabetes and indices of glucose homeostasis in all adolescents. Stratified analyses by sex revealed that such associations were largely sex-dependent. In females, the multiple pollutant models showed that bis(1,3-dichloro-2-propyl) phosphate (BDCIPP) was positively associated with prediabetes [odds ratio (OR) = 2.51, 95%CI: 1.29, 4.89, for one scaled unit increase in exposure] and 2 h-OGTT (β = 0.07, 95%CI: 0.01, 0.12); bis(2-chloroethyl) phosphate (BCEP) was negatively associated with fasting insulin (β = -0.10, 95%CI: -0.19, -0.01) and HOMA-IR (β = -0.10, 95%CI: -0.19, -0.003); and detectable bis(1-chloro-2-propyl) phosphate (BCIPP) (>LOD vs <LOD) was inversely associated with 2 h-OGTT (β = -0.11, 95%CI: -0.21, -0.02). In males, consistent inverse associations were found for detectable di-n-butyl phosphate (DNBP) with prediabetes, FPG, 2 h-OGTT, fasting insulin and HOMA-IR. CONCLUSION Urinary OPEs metabolites were associated with prediabetes and indices of glucose homeostasis in adolescents, but such associations varied by sex. Future studies with multiple measurements of OPEs exposure are needed to confirm our findings.
Affiliation(s)
- Kai Luo
- Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China; School of Public Health, Shanghai Jiao Tong University, Shanghai, 200025, China
- Ruxianguli Aimuzi
- Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China; School of Public Health, Shanghai Jiao Tong University, Shanghai, 200025, China
- Yuqing Wang
- Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China
- Min Nian
- Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China; School of Public Health, Shanghai Jiao Tong University, Shanghai, 200025, China
- Jun Zhang
- Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, China; School of Public Health, Shanghai Jiao Tong University, Shanghai, 200025, China
54
Liu L, Gu H, Van Limbergen J, Kenney T. SuRF: A new method for sparse variable selection, with application in microbiome data analysis. Stat Med 2020; 40:897-919. [PMID: 33219557 DOI: 10.1002/sim.8809] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/23/2019] [Revised: 10/25/2020] [Accepted: 10/27/2020] [Indexed: 01/16/2023]
Abstract
In this article, we present a new variable selection method for regression and classification purposes, particularly for microbiome analysis. Our method, called subsampling ranking forward selection (SuRF), is based on LASSO penalized regression, subsampling and forward-selection methods. SuRF offers major advantages over existing variable selection methods in terms of both sparsity of selected models and model inference. We provide an R package that can implement our method for generalized linear models. We apply our method to classification problems from microbiome data, using a novel agglomeration approach to deal with the special tree-like correlation structure of the variables. Existing methods arbitrarily choose a taxonomic level a priori before performing the analysis, whereas by combining SuRF with these aggregated variables, we are able to identify the key biomarkers at the appropriate taxonomic level, as suggested by the data. We present simulations in multiple sparse settings to demonstrate that our approach performs better than several other popularly used existing approaches in recovering the true variables. We apply SuRF to two microbiome datasets: one about prediction of pouchitis and another for identifying samples from two healthy individuals. We find that SuRF can provide a better or comparable prediction with other methods while controlling the false positive rate of variable selection.
Affiliation(s)
- Lihui Liu
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Hong Gu
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Johan Van Limbergen
- Department of Pediatrics, Dalhousie University, Halifax, Nova Scotia, Canada
- Toby Kenney
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
55
McCusker MR, Bazinet RP, Metherel AH, Klein RY, Kundra A, Haibe-Kains B, Li M. Nonesterified Fatty Acids and Depression in Cancer Patients and Caregivers. Curr Dev Nutr 2020; 4:nzaa156. [PMID: 33447694 PMCID: PMC7792569 DOI: 10.1093/cdn/nzaa156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2020] [Revised: 09/23/2020] [Accepted: 10/06/2020] [Indexed: 11/13/2022] [Open Access]
Abstract
BACKGROUND Nonesterified fatty acids (NEFAs) are known to have inflammatory effects. The inflammatory hypothesis of depression suggests that omega-3 (ω-3) and omega-6 (ω-6) fatty acids might be negatively and positively correlated with depression, respectively. OBJECTIVE An exploratory study was conducted to determine the association between dietary free fatty acids and depressive symptoms in cancer patients and caregivers. METHODS Associations between depression and the NEFA pool were investigated in 56 cancer patients and 23 caregivers using a combination of nonparametric tests and regularized regression. Plasma NEFAs were measured using thin layer and gas chromatography with flame ionization detection. Depression was characterized both as a continuous severity score using the GRID-Hamilton Depression Rating Scale (GRID Ham-D), and as a categorical diagnosis of major depression by structured clinical interview. RESULTS Initial hypotheses regarding the relation between depression and omega-3 or omega-6 fatty acids were not well supported. However, elaidic acid, a trans-unsaturated fatty acid found in hydrogenated vegetable oils, was found to be negatively correlated with continuous depression scores in cancer patients. No significant associations were found in caregivers. CONCLUSIONS An unexpected negative association between elaidic acid and depression was identified, supporting recent literature on the role of G protein-coupled receptors in depression. Further research is needed to confirm this result and to evaluate the potential role of G protein agonists as therapeutic agents for depression in cancer patients.
Affiliation(s)
- Megan R McCusker
- Department of Supportive Care, Princess Margaret Cancer Centre, Toronto, Canada
- Richard P Bazinet
- Department of Nutritional Sciences, University of Toronto, Toronto, Canada
- Adam H Metherel
- Department of Nutritional Sciences, University of Toronto, Toronto, Canada
- Roberta Yael Klein
- Department of Supportive Care, Princess Margaret Cancer Centre, Toronto, Canada
- Arjun Kundra
- Department of Medicine, Queen's University, Kingston, Canada
- Benjamin Haibe-Kains
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Department of Computer Science, University of Toronto, Toronto, Canada
- Ontario Institute of Cancer Research, Toronto, Canada
- Vector Institute for Artificial Intelligence, Toronto, Canada
- Madeline Li
- Department of Supportive Care, Princess Margaret Cancer Centre, Toronto, Canada
- Department of Psychiatry, University of Toronto, Toronto, Canada
56
Development of new media formulations for cell culture operations based on regression models. Bioprocess Biosyst Eng 2020; 44:453-472. [PMID: 33111178 DOI: 10.1007/s00449-020-02456-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2020] [Accepted: 09/25/2020] [Indexed: 10/23/2022]
Abstract
The paper discusses modelling and optimization of multi-component cell culture medium. The specific productivity (Qp) was considered a function of the medium components and possible interactions described by linear factors, two-way interactions and squared terms, which results in a high-dimensional problem where the number of variables p (represented by the medium components and their interactions) is much larger than the number of observations n. Principal Components Regression (PCR), Partial Least Squares (PLS), Lasso and Elastic Net regressions were compared as modelling tools to deal with a high-dimensional (p ≫ n) problem. PCR and PLS regression models resulted in better prediction results and were used for robust optimization of the medium composition by a nonlinear optimization. The case studies show that it is possible to formulate new media that result in higher Qp than the ones provided by the initial media experiments available. Also, the multivariate statistical approach permitted us to select media that are most informative about the optimum, thus permitting modelling and optimization with a reduced set of initial experiments.
57
Lin L, Drton M, Shojaie A. Statistical significance in high-dimensional linear mixed models. FODS '20: Proceedings of the 2020 ACM-IMS Foundations of Data Science Conference, October 19-20, 2020, Virtual Event, USA. ACM-IMS Foundations of Data Science Conference (2020: Online) 2020; 2020:171-181. [PMID: 35497571 PMCID: PMC9053448 DOI: 10.1145/3412815.3416883] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
This paper concerns the development of an inferential framework for high-dimensional linear mixed effect models. These are suitable models, for instance, when we have n repeated measurements for M subjects. We consider a scenario where the number of fixed effects p is large (and may be larger than M), but the number of random effects q is small. Our framework is inspired by a recent line of work that proposes de-biasing penalized estimators to perform inference for high-dimensional linear models with fixed effects only. In particular, we demonstrate how to correct a 'naive' ridge estimator in extension of work by Bühlmann (2013) to build asymptotically valid confidence intervals for mixed effect models. We validate our theoretical results with numerical experiments, in which we show our method outperforms those that fail to account for correlation induced by the random effects. For a practical demonstration we consider a riboflavin production dataset that exhibits group structure, and show that conclusions drawn using our method are consistent with those obtained on a similar dataset without group structure.
Affiliation(s)
- Lina Lin
- Department of Statistics, University of Washington
- Mathias Drton
- Department of Mathematics, Technical University of Munich
- Ali Shojaie
- Department of Biostatistics, University of Washington
58
Ehrmann C, Reinhardt JD, Joseph C, Hasnan N, Perrouin-Verbe B, Tederko P, Zampolini M, Stucki G. Describing Functioning in People Living With Spinal Cord Injury Across 22 Countries: A Graphical Modeling Approach. Arch Phys Med Rehabil 2020; 101:2112-2143. [PMID: 32980339 DOI: 10.1016/j.apmr.2020.09.374] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/19/2019] [Revised: 09/01/2020] [Accepted: 09/17/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE To provide prevalence estimates for problems in functioning of community-dwelling persons with spinal cord injury (SCI) and to examine associations between various areas of functioning with the purpose of supporting countries in identifying targets for interventions. DESIGN Cross-sectional survey. SETTING Community, 22 countries including all World Health Organization regions. PARTICIPANTS Persons (N=12,591) with traumatic or nontraumatic SCI aged 18 years or older. INTERVENTIONS Not applicable. MAIN OUTCOME MEASURES We estimated the prevalence of problems in 53 areas of functioning from the Brief International Classification of Functioning, Disability and Health (ICF) core set for SCI, long-term context, or ICF rehabilitation set covering 4 domains: impairments in body functions, impairments in mental functions, independence in performing activities, and restrictions in participation. Associations between areas of functioning were identified and visualized using conditional independence graphs. RESULTS Participants had a median age of 52 years, 73% were male, and 63% had paraplegia. Feeling tired, bowel dysfunction, sexual functions, spasticity, pain, carrying out daily routine, doing housework, getting up off the floor from lying on the back, pushing open a heavy door, and standing unsupported had the highest prevalence of problems (>70%). Clustering of associations within the 4 functioning domains was found, with the highest numbers of associations within impairments in mental functions. For the whole International Spinal Cord Injury sample, areas with the highest numbers of associations were circulatory problems, transferring bed-wheelchair, and toileting, while for the World Health Organization European and Western Pacific regions, these were dressing upper body, transferring bed-wheelchair, handling stress, feeling downhearted and depressed, and feeling happy. 
CONCLUSIONS In each domain of functioning, high prevalence of problems and high connectivity of areas of functioning were identified. The understanding of problems and the identification of potential targets for intervention can inform decision makers at all levels of the health system aiming to improve the situation of people living with SCI.
Affiliation(s)
- Cristina Ehrmann
- Swiss Paraplegic Research, Guido A. Zäch Institute, Nottwil, Switzerland; Department of Health Sciences and Medicine, University of Lucerne, Lucerne, Switzerland
- Jan D Reinhardt
- Swiss Paraplegic Research, Guido A. Zäch Institute, Nottwil, Switzerland; Department of Health Sciences and Medicine, University of Lucerne, Lucerne, Switzerland; Institute for Disaster Management and Reconstruction, Sichuan University and Hong Kong Polytechnic University, Chengdu, China
- Conran Joseph
- Division of Physiotherapy, Stellenbosch University, Stellenbosch, South Africa; Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Solna, Sweden
- Nazirah Hasnan
- University Hospital of Nantes, St Jacques Hospital, Nantes, France
- Piotr Tederko
- Department of Rehabilitation, 1st Faculty of Medicine, Medical University of Warsaw, Warsaw, Poland
- Mauro Zampolini
- Department of Rehabilitation, San Giovanni Battista Hospital, Foligno, Perugia, Italy
- Gerold Stucki
- Swiss Paraplegic Research, Guido A. Zäch Institute, Nottwil, Switzerland; Department of Health Sciences and Medicine, University of Lucerne, Lucerne, Switzerland; Center for Rehabilitation in Global Health Systems, WHO Collaborating Center, Department of Health Sciences and Medicine, University of Lucerne, Lucerne, Switzerland
59
Feinhandler I, Cilento B, Beauvais B, Harrop J, Fulton L. Predictors of Death Rate during the COVID-19 Pandemic. Healthcare (Basel) 2020; 8:E339. [PMID: 32937804 PMCID: PMC7551935 DOI: 10.3390/healthcare8030339] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/17/2020] [Revised: 09/07/2020] [Accepted: 09/09/2020] [Indexed: 11/16/2022] [Open Access]
Abstract
Coronavirus (COVID-19) is a potentially fatal viral infection. This study investigates geography, demography, socioeconomics, health conditions, hospital characteristics, and politics as potential explanatory variables for death rates at the state and county levels. Data from the Centers for Disease Control and Prevention, the Census Bureau, Centers for Medicare and Medicaid, Definitive Healthcare, and USAfacts.org were used to evaluate regression models. Yearly pneumonia and flu death rates (state level, 2014-2018) were evaluated as a function of the governors' political party using a repeated measures analysis. At the state and county level, spatial regression models were evaluated. At the county level, we discovered a statistically significant model that included geography, population density, racial and ethnic status, three health status variables along with a political factor. A state level analysis identified health status, minority status, and the interaction between governors' parties and health status as important variables. The political factor, however, did not appear in a subsequent analysis of 2014-2018 pneumonia and flu death rates. The pathogenesis of COVID-19 has a greater and disproportionate effect within racial and ethnic minority groups, and the political influence on the reporting of COVID-19 mortality was statistically relevant at the county level and as an interaction term only at the state level.
Affiliation(s)
- Ian Feinhandler
- Department of Geography and Political Sciences, Front Range Community College, Longmont, CO 80501, USA
- Brad Beauvais
- School of Health Administration, Texas State University, San Marcos, TX 78666, USA
- Lawrence Fulton
- School of Health Administration, Texas State University, San Marcos, TX 78666, USA
60
Yuan R, Chen S, Wang Y. Computational Prediction of Drug Responses in Cancer Cell Lines From Cancer Omics and Detection of Drug Effectiveness Related Methylation Sites. Front Genet 2020; 11:917. [PMID: 32849855 PMCID: PMC7426400 DOI: 10.3389/fgene.2020.00917] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/18/2020] [Accepted: 07/23/2020] [Indexed: 12/13/2022] [Open Access]
Abstract
Accurately predicting the response of a cancer patient to a therapeutic agent remains an important challenge in precision medicine. With the rise of data science, researchers have applied computational models to study the drug inhibition effects on cancers based on cancer genomics and transcriptomics. Moreover, a common epigenetic modification, DNA methylation, has been related to the occurrence and development of cancer, as well as drug effectiveness. Exploring the relationship between DNA methylation and drug effectiveness can therefore help improve drug response prediction. Here, we proposed a computational model to predict drug responses in cancers through integration of cancer genomics, transcriptomics, epigenomics, and compound chemical properties. Meanwhile, we applied a regularized regression model (Least Absolute Shrinkage and Selection Operator, lasso) to detect the methylation sites that were closely related to drug effectiveness. The prediction models were trained on a well-known pharmacogenomics data resource, Genomics of Drug Sensitivity in Cancer (GDSC). The cross-validation indicates that the performance of the prediction model using DNA methylation is comparable to that of models using other cancer omics, including oncogene mutation and gene expression data. This indicates the important role of DNA methylation in prediction of drug responses. Encyclopedia of DNA Elements (ENCODE) and Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining (TRRUST2) database analyses suggest that the methylation sites associated with drug effectiveness are mainly located in transcription factor (TF) binding regions. Therefore, we hypothesized that the sensitivity of cancer cells to drugs could be regulated by changing the methylation modification of TF binding regions. In conclusion, we confirmed the important role of DNA methylation in prediction of drug responses, and provided some methylation sites that are closely related to drug effectiveness, which may be valuable regulatory targets for improving drug treatment effects in cancer patients.
Affiliation(s)
- Rui Yuan
- Key Laboratory of Plateau Biological Adaptation and Evolution, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining, China; University of Chinese Academy of Sciences, Beijing, China
- Shilong Chen
- Key Laboratory of Plateau Biological Adaptation and Evolution, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining, China; Institute of Sanjiangyuan National Park, Chinese Academy of Sciences, Xining, China
- Yongcui Wang
- Key Laboratory of Plateau Biological Adaptation and Evolution, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining, China; Qinghai Provincial Key Laboratory of Crop Molecular Breeding, Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xining, China
61
Lauber C, Correia N, Trumpp A, Rieger MA, Dolnik A, Bullinger L, Roeder I, Seifert M. Survival differences and associated molecular signatures of DNMT3A-mutant acute myeloid leukemia patients. Sci Rep 2020; 10:12761. [PMID: 32728112 PMCID: PMC7391693 DOI: 10.1038/s41598-020-69691-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/27/2019] [Accepted: 07/13/2020] [Indexed: 12/17/2022] [Open Access]
Abstract
Acute myeloid leukemia (AML) is a very heterogeneous and highly malignant blood cancer. Mutations of the DNA methyltransferase DNMT3A are among the most frequent recurrent genetic lesions in AML. The majority of DNMT3A-mutant AML patients shows fast relapse and poor survival, but also patients with long survival or long-term remission have been reported. Underlying molecular signatures and mechanisms that contribute to these survival differences are only poorly understood and have not been studied in detail so far. We applied hierarchical clustering to somatic gene mutation profiles of 51 DNMT3A-mutant patients from The Cancer Genome Atlas (TCGA) AML cohort revealing two robust patient subgroups with profound differences in survival. We further determined molecular signatures that distinguish both subgroups. Our results suggest that FLT3 and/or NPM1 mutations contribute to survival differences of DNMT3A-mutant patients. We observed an upregulation of genes of the p53, VEGF and DNA replication pathway and a downregulation of genes of the PI3K-Akt pathway in short- compared to long-lived patients. We identified that the majority of measured miRNAs was downregulated in the short-lived group and we found differentially expressed microRNAs between both subgroups that have not been reported for AML so far (miR-153-2, miR-3065, miR-95, miR-6718) suggesting that miRNAs could be important for prognosis. In addition, we learned gene regulatory networks to predict potential major regulators and found several genes and miRNAs with known roles in AML pathogenesis, but also interesting novel candidates involved in the regulation of hematopoiesis, cell cycle, cell differentiation, and immunity that may contribute to the observed survival differences of both subgroups and could therefore be important for prognosis. 
Moreover, the characteristic gene mutation and expression signatures that distinguished short- from long-lived patients were also predictive for independent DNMT3A-mutant AML patients from other cohorts and could also contribute to further improve the European LeukemiaNet (ELN) prognostic scoring system. Our study represents the first in-depth computational approach to identify molecular factors associated with survival differences of DNMT3A-mutant AML patients and could trigger additional studies to develop robust molecular markers for a better stratification of AML patients with DNMT3A mutations.
Affiliation(s)
- Chris Lauber
- Institute for Medical Informatics and Biometry (IMB), Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Nádia Correia
- Division of Stem Cells and Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Andreas Trumpp
- Division of Stem Cells and Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Michael A Rieger
- Department of Medicine, Hematology/Oncology, Goethe University Hospital Frankfurt, Frankfurt, Germany
- Anna Dolnik
- Department of Hematology, Oncology and Tumorimmunology, Charité University Medicine Berlin, Campus Virchow Klinikum, Berlin, Germany
- Lars Bullinger
- Department of Hematology, Oncology and Tumorimmunology, Charité University Medicine Berlin, Campus Virchow Klinikum, Berlin, Germany
- Ingo Roeder
- Institute for Medical Informatics and Biometry (IMB), Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany; National Center for Tumor Diseases (NCT), Dresden, Germany
- Michael Seifert
- Institute for Medical Informatics and Biometry (IMB), Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany; National Center for Tumor Diseases (NCT), Dresden, Germany
62
Lees JA, Mai TT, Galardini M, Wheeler NE, Horsfield ST, Parkhill J, Corander J. Improved Prediction of Bacterial Genotype-Phenotype Associations Using Interpretable Pangenome-Spanning Regressions. mBio 2020; 11:e01344-20. [PMID: 32636251 PMCID: PMC7343994 DOI: 10.1128/mbio.01344-20] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 05/22/2020] [Accepted: 06/05/2020] [Indexed: 12/19/2022] [Open Access]
Abstract
Discovery of genetic variants underlying bacterial phenotypes and the prediction of phenotypes such as antibiotic resistance are fundamental tasks in bacterial genomics. Genome-wide association study (GWAS) methods have been applied to study these relations, but the plastic nature of bacterial genomes and the clonal structure of bacterial populations create challenges. We introduce an alignment-free method which finds sets of loci associated with bacterial phenotypes, quantifies the total effect of genetics on the phenotype, and allows accurate phenotype prediction, all within a single computationally scalable joint modeling framework. Genetic variants covering the entire pangenome are compactly represented by extended DNA sequence words known as unitigs, and model fitting is achieved using elastic net penalization, an extension of standard multiple regression. Using an extensive set of state-of-the-art bacterial population genomic data sets, we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. Compared to those of previous approaches, which test each genotype-phenotype association separately for each variant and apply a significance threshold, the variants selected by our joint modeling approach overlap substantially.
IMPORTANCE Being able to identify the genetic variants responsible for specific bacterial phenotypes has been the goal of bacterial genetics since its inception and is fundamental to our current level of understanding of bacteria. This identification has been based primarily on painstaking experimentation, but the availability of large data sets of whole genomes with associated phenotype metadata promises to revolutionize this approach, not least for important clinical phenotypes that are not amenable to laboratory analysis. These models of phenotype-genotype association can in the future be used for rapid prediction of clinically important phenotypes such as antibiotic resistance and virulence by rapid-turnaround or point-of-care tests. However, despite much effort being put into adapting genome-wide association study (GWAS) approaches to cope with bacterium-specific problems, such as strong population structure and horizontal gene exchange, current approaches are not yet optimal. We describe a method that advances methodology for both association and generation of portable prediction models.
Affiliation(s)
- John A Lees
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
| | - T Tien Mai
- Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
| | - Marco Galardini
- Biological Design Center, Boston University, Boston, Massachusetts, USA
| | - Nicole E Wheeler
- Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| | - Samuel T Horsfield
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
| | - Julian Parkhill
- Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Jukka Corander
- Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
- Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Helsinki Institute of Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
|
63
|
Wisler AA, Fletcher AR, McAuliffe MJ. Predicting Montreal Cognitive Assessment Scores From Measures of Speech and Language. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:1752-1761. [PMID: 32459131 DOI: 10.1044/2020_jslhr-19-00183] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Purpose This study examined the relationship between measurements derived from spontaneous speech and participants' scores on the Montreal Cognitive Assessment. Method Participants (N = 521) aged between 64 and 97 years completed the cognitive assessment and were prompted to describe an early childhood memory. A range of acoustic and linguistic measures was extracted from the resulting speech sample. A least absolute shrinkage and selection operator approach was used to model the relationship between acoustic, lexical, and demographic information and participants' scores on the cognitive assessment. Results Using the covariance test statistic, four important variables were identified, which, together, explained 16.52% of the variance in participants' cognitive scores. Conclusions The degree to which cognition can be accurately predicted through spontaneously produced speech samples is limited. Statistically significant relationships were found between specific measurements of lexical variation, participants' speaking rate, and their scores on the Montreal Cognitive Assessment.
Affiliation(s)
- Alan A Wisler
- New Zealand Institute of Language, Brain and Behaviour, Christchurch, New Zealand
| | - Annalise R Fletcher
- Department of Audiology and Speech-Language Pathology, University of North Texas, Denton
| | - Megan J McAuliffe
- Department of Communication Disorders, University of Canterbury, Christchurch, New Zealand
|
64
|
Prefrontal cortical activation during working memory task anticipation contributes to discrimination between bipolar and unipolar depression. Neuropsychopharmacology 2020; 45:956-963. [PMID: 32069475 PMCID: PMC7162920 DOI: 10.1038/s41386-020-0638-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/17/2019] [Revised: 02/03/2020] [Accepted: 02/10/2020] [Indexed: 01/10/2023]
Abstract
Distinguishing bipolar disorder (BD) from major depressive disorder (MDD) is clinically challenging, especially during depressive episodes. While both groups are characterized by aberrant working memory and anticipatory processing, the role of these processes in discriminating BD from MDD remains unexplored. In this study, we examine how brain activation corresponding to anticipation of and performance on easy vs. difficult working memory tasks with emotional stimuli contributes to discrimination among BD, MDD, and healthy controls (HC). Depressed individuals with BD (n = 18), MDD (n = 23), and HC (n = 23) were scanned while performing a working memory task in which they had to first anticipate performance on 1-back (easy) or 2-back (difficult) tasks with happy, fearful, or neutral faces, and then perform the task. Anticipation-related and task-related brain activation was measured in the whole brain using functional magnetic resonance imaging. We used an elastic-net regression for variable selection, and a random forest classifier for BD vs. MDD classification. The former selected the activation differences (1-back minus 2-back) in the lateral and medial prefrontal cortices (PFC) during task anticipation and performance on the working memory tasks with fearful and neutral faces as variables relevant for BD vs. MDD classification. BD vs. MDD were classified with 70.7% accuracy (p < 0.01) based on the neuroimaging measures alone, with 80.5% accuracy (p = 0.001) based on clinical measures alone, and with 85.4% accuracy (p < 0.001) based on clinical and neuroimaging measures together. These findings suggest that PFC activation during working memory task anticipation and performance may be an important biological marker distinguishing BD from MDD.
|
65
|
Williams DR, Rast P. Back to the basics: Rethinking partial correlation network methodology. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2020; 73:187-212. [PMID: 31206621 PMCID: PMC8572131 DOI: 10.1111/bmsp.12173] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 03/23/2018] [Revised: 03/02/2019] [Indexed: 05/08/2023]
Abstract
The Gaussian graphical model (GGM) is an increasingly popular technique used in psychology to characterize relationships among observed variables. These relationships are represented as elements in the precision matrix. Standardizing the precision matrix and reversing the sign yields corresponding partial correlations that imply pairwise dependencies in which the effects of all other variables have been controlled for. The graphical lasso (glasso) has emerged as the default estimation method, which uses ℓ1 -based regularization. The glasso was developed and optimized for high-dimensional settings where the number of variables (p) exceeds the number of observations (n), which is uncommon in psychological applications. Here we propose to go 'back to the basics', wherein the precision matrix is first estimated with non-regularized maximum likelihood and then Fisher Z transformed confidence intervals are used to determine non-zero relationships. We first show the exact correspondence between the confidence level and specificity, which is due to 1 minus specificity denoting the false positive rate (i.e., α). With simulations in low-dimensional settings (p ≪ n), we then demonstrate superior performance compared to the glasso for detecting the non-zero effects. Further, our results indicate that the glasso is inconsistent for the purpose of model selection and does not control the false discovery rate, whereas the proposed method converges on the true model and directly controls error rates. We end by discussing implications for estimating GGMs in psychology.
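As an illustrative sketch of the procedure this abstract describes (not the authors' code), the Python below inverts the maximum likelihood covariance estimate, standardizes the precision matrix and reverses its sign to obtain partial correlations, and then flags an edge when the Fisher z confidence interval excludes zero. The function names, the `alpha` argument, and the use of NumPy are assumptions made for illustration; the standard error uses the usual 1/sqrt(n - (p - 2) - 3) form for a partial correlation with p - 2 conditioning variables.

```python
import numpy as np
from statistics import NormalDist


def partial_correlations(data):
    """Partial correlations from the non-regularized ML precision matrix.

    `data` is an (n, p) array with n > p (hypothetical helper).
    """
    cov = np.cov(data, rowvar=False, bias=True)  # ML covariance estimate
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    # Standardize the precision matrix and reverse the sign.
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def fisher_z_edges(pcor, n, alpha=0.05):
    """Boolean adjacency: True where the Fisher z interval excludes zero."""
    p = pcor.shape[0]
    # Clip so the unit diagonal does not produce infinities.
    z = np.arctanh(np.clip(pcor, -0.999999, 0.999999))
    se = 1.0 / np.sqrt(n - (p - 2) - 3)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = z - crit * se, z + crit * se
    edges = (lower > 0) | (upper < 0)
    np.fill_diagonal(edges, False)
    return edges
```

Under these assumptions, `alpha` plays the role of the false positive rate mentioned in the abstract: a 95% interval corresponds to `alpha=0.05`.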
|
66
|
Mitchelmore J, Grinberg NF, Wallace C, Spivakov M. Functional effects of variation in transcription factor binding highlight long-range gene regulation by epromoters. Nucleic Acids Res 2020; 48:2866-2879. [PMID: 32112106 PMCID: PMC7102942 DOI: 10.1093/nar/gkaa123] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/28/2019] [Revised: 02/14/2020] [Accepted: 02/17/2020] [Indexed: 02/06/2023] Open
Abstract
Identifying DNA cis-regulatory modules (CRMs) that control the expression of specific genes is crucial for deciphering the logic of transcriptional control. Natural genetic variation can point to the possible gene regulatory function of specific sequences through their allelic associations with gene expression. However, comprehensive identification of causal regulatory sequences in brute-force association testing without incorporating prior knowledge is challenging due to limited statistical power and effects of linkage disequilibrium. Sequence variants affecting transcription factor (TF) binding at CRMs have a strong potential to influence gene regulatory function, which provides a motivation for prioritizing such variants in association testing. Here, we generate an atlas of CRMs showing predicted allelic variation in TF binding affinity in human lymphoblastoid cell lines and test their association with the expression of their putative target genes inferred from Promoter Capture Hi-C and immediate linear proximity. We reveal >1300 CRM TF-binding variants associated with target gene expression, the majority of them undetected with standard association testing. A large proportion of CRMs showing associations with the expression of genes they contact in 3D localize to the promoter regions of other genes, supporting the notion of 'epromoters': dual-action CRMs with promoter and distal enhancer activity.
Affiliation(s)
- Joanna Mitchelmore
- Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
| | - Nastasiya F Grinberg
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
| | - Chris Wallace
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
- MRC Biostatistics Unit, University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0SR, UK
| | - Mikhail Spivakov
- Nuclear Dynamics Programme, Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College, Du Cane Road, London W12 0NN, UK
|
67
|
Dondelinger F, Mukherjee S. The joint lasso: high-dimensional regression for group structured data. Biostatistics 2020; 21:219-235. [PMID: 30192903 PMCID: PMC7868060 DOI: 10.1093/biostatistics/kxy035] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/22/2017] [Revised: 05/11/2018] [Accepted: 06/02/2018] [Indexed: 11/24/2022] Open
Abstract
We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where subsets of samples, representing for example disease subtypes, may differ with respect to underlying regression models. In the high-dimensional setting, estimating a different model for each subgroup is challenging due to limited sample sizes. Focusing on the case in which subgroup-specific models may be expected to be similar but not necessarily identical, we treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an $\ell_1$ term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer's disease, amyotrophic lateral sclerosis, and cancer datasets. These examples demonstrate the gains joint estimation can offer in prediction as well as in providing subgroup-specific sparsity patterns.
Affiliation(s)
- Frank Dondelinger
- Lancaster Medical School, Lancaster University, Furness College, Bailrigg, Lancaster, UK
| | - Sach Mukherjee
- Statistics and Machine Learning, German Center for Neurodegenerative Diseases (DZNE), Sigmund-Freud-Straße 27, Bonn, Germany
|
68
|
Kawaguchi ES, Suchard MA, Liu Z, Li G. A surrogate ℓ0 sparse Cox's regression with applications to sparse high-dimensional massive sample size time-to-event data. Stat Med 2020; 39:675-686. [PMID: 31814146 DOI: 10.1002/sim.8438] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/10/2019] [Revised: 09/30/2019] [Accepted: 11/02/2019] [Indexed: 11/11/2022]
Abstract
Sparse high-dimensional massive sample size (sHDMSS) time-to-event data present multiple challenges to quantitative researchers, as most current sparse survival regression methods and software will grind to a halt and become practically inoperable. This paper develops a scalable ℓ0-based sparse Cox regression tool for right-censored time-to-event data that easily takes advantage of existing high-performance implementations of ℓ2-penalized regression methods for sHDMSS time-to-event data. Specifically, we extend the ℓ0-based broken adaptive ridge (BAR) methodology to the Cox model, which involves repeatedly performing reweighted ℓ2-penalized regression. We rigorously show that the resulting estimator for the Cox model is selection consistent, oracle for parameter estimation, and has a grouping property for highly correlated covariates. Furthermore, we implement our BAR method in an R package for sHDMSS time-to-event data by leveraging existing efficient algorithms for massive ℓ2-penalized Cox regression. We evaluate the BAR Cox regression method by extensive simulations and illustrate its application on sHDMSS time-to-event data from the National Trauma Data Bank with hundreds of thousands of observations and tens of thousands of sparsely represented covariates.
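The "repeatedly performing reweighted ℓ2-penalized regression" step can be illustrated with a minimal sketch. Note the swap: the code below applies the broken adaptive ridge idea to ordinary least squares, not to the Cox partial likelihood the paper targets, and it is not the authors' R package. The function name and the parameters `lam`, `xi`, and `n_iter` are hypothetical. Each iteration solves a ridge problem whose penalty on coefficient j is weighted by the inverse square of its previous value, so small coefficients are driven toward zero.

```python
import numpy as np


def broken_adaptive_ridge(X, y, lam=5.0, xi=1.0, n_iter=50):
    """Iteratively reweighted ridge for least squares (illustrative only).

    Each update solves
        beta_new = argmin ||y - X b||^2 + lam * sum_j b_j^2 / beta_j^2,
    written in the algebraically equivalent, numerically stable form
        beta_new = G (G X'X G + lam I)^{-1} G X'y,  G = diag(|beta|),
    which avoids dividing by coefficients that have shrunk to zero.
    """
    p = X.shape[1]
    gram = X.T @ X
    xty = X.T @ y
    # Start from an ordinary ridge estimate with penalty xi.
    beta = np.linalg.solve(gram + xi * np.eye(p), xty)
    for _ in range(n_iter):
        g = np.abs(beta)
        system = g[:, None] * gram * g[None, :] + lam * np.eye(p)
        beta = g * np.linalg.solve(system, g * xty)
    return beta
```

Because every step is a ridge solve, the same loop could in principle reuse any fast ℓ2-penalized solver, which is the scalability argument the abstract makes.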
Affiliation(s)
- Eric S Kawaguchi
- Department of Preventive Medicine, University of Southern California, Los Angeles, California
| | - Marc A Suchard
- Department of Preventive Medicine, University of Southern California, Los Angeles, California; Department of Biomathematics, University of California, Los Angeles, California; Department of Human Genetics, University of California, Los Angeles, California
| | - Zhenqiu Liu
- Department of Public Health Sciences, Penn State Cancer Institute, Hershey, Pennsylvania
| | - Gang Li
- Department of Preventive Medicine, University of Southern California, Los Angeles, California; Department of Biomathematics, University of California, Los Angeles, California
|
69
|
Lu J, Kolar M, Liu H. Kernel Meets Sieve: Post-Regularization Confidence Bands for Sparse Additive Model. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2019.1689984] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/25/2022]
Affiliation(s)
- Junwei Lu
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA
| | - Mladen Kolar
- Booth School of Business, The University of Chicago, Chicago, IL
| | - Han Liu
- Department of Computer Science, Northwestern University, Evanston, IL
|
70
|
Using Class-Specific Feature Selection for Cancer Detection with Gene Expression Profile Data of Platelets. SENSORS 2020; 20:s20051528. [PMID: 32164283 PMCID: PMC7085688 DOI: 10.3390/s20051528] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/27/2020] [Revised: 03/04/2020] [Accepted: 03/07/2020] [Indexed: 12/16/2022]
Abstract
A novel multi-classification method, which integrates the elastic net and the probabilistic support vector machine, was proposed for cancer detection with gene expression profile data of platelets, a multi-class classification problem characterized by high dimension, small samples, and collinear data. The one-against-all (OVA) strategy was employed to decompose the multi-classification problem into a series of binary classification problems. The elastic net was used to select class-specific features for the binary classification problems, and the probabilistic support vector machine was used to make the outputs of the binary classifiers with class-specific features comparable. Simulation data and gene expression profile data were used to verify the effectiveness of the proposed method. Results indicate that the proposed method can automatically select class-specific features and achieves better classification performance than conventional multi-class classification methods, which are mainly based on global feature selection. This study indicates that the proposed method is suitable for general multi-classification problems with high dimension, small samples, and collinear data.
|
71
|
Rejoinder on: Hierarchical inference for genome-wide association studies: a view on methodology with software. Comput Stat 2020. [DOI: 10.1007/s00180-019-00948-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
|
72
|
Marini S, Davis KA, Soare TW, Zhu Y, Suderman MJ, Simpkin AJ, Smith ADAC, Wolf EJ, Relton CL, Dunn EC. Adversity exposure during sensitive periods predicts accelerated epigenetic aging in children. Psychoneuroendocrinology 2020; 113:104484. [PMID: 31918390 PMCID: PMC7832214 DOI: 10.1016/j.psyneuen.2019.104484] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 05/22/2019] [Revised: 10/11/2019] [Accepted: 10/11/2019] [Indexed: 12/30/2022]
Abstract
OBJECTIVES Exposure to adversity has been linked to accelerated biological aging, which in turn has been shown to predict numerous physical and mental health problems. In recent years, measures of DNA methylation-based epigenetic age--known as "epigenetic clocks"--have been used to estimate accelerated epigenetic aging. Although a small number of studies have found an effect of adversity exposure on epigenetic age in children, none have investigated if there are "sensitive periods" when adversity is most impactful. METHODS Using data from the Avon Longitudinal Study of Parents and Children (ALSPAC; n = 973), we tested the prospective association between repeated measures of childhood exposure to seven types of adversity on epigenetic age assessed at age 7.5 using the Horvath and Hannum epigenetic clocks. With a Least Angle Regression variable selection procedure, we evaluated potential sensitive period effects. RESULTS We found that exposure to abuse, financial hardship, or neighborhood disadvantage during sensitive periods in early and middle childhood best explained variability in the deviation of Hannum-based epigenetic age from chronological age, even after considering the role of adversity accumulation and recency. Secondary sex-stratified analyses identified particularly strong sensitive period effects. These effects were undetected in analyses comparing children "exposed" versus "unexposed" to adversity. We did not identify any associations between adversity and epigenetic age using the Horvath epigenetic clock. CONCLUSIONS Our results suggest that adversity may alter methylation processes in ways that either directly or indirectly perturb normal cellular aging and that these effects may be heightened during specific life stages.
Affiliation(s)
- Sandro Marini
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Kathryn A Davis
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Thomas W Soare
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA; Stanley Center for Psychiatric Research, The Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, 02115, USA
| | - Yiwen Zhu
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Matthew J Suderman
- MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol, BSB 1TH, UK
| | - Andrew J Simpkin
- MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol, BSB 1TH, UK; School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway, H91TK33, Ireland
| | - Andrew D A C Smith
- Applied Statistics Group, University of the West of England, Bristol, BS16 1QY, UK
| | - Erika J Wolf
- National Center for PTSD at VA Boston Healthcare System, Boston, MA, 02130, USA; Boston University School of Medicine, Department of Psychiatry, Boston, MA, 02118, USA
| | - Caroline L Relton
- MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol, BSB 1TH, UK; Institute of Genetic Medicine, University of Newcastle, Newcastle upon Tyne, NE1 3BZ, UK
| | - Erin C Dunn
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA; Stanley Center for Psychiatric Research, The Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, 02115, USA; McCance Center for Brain Health at Massachusetts General Hospital, Boston, MA, 02114, USA.
|
73
|
Di Nanni N, Bersanelli M, Milanesi L, Mosca E. Network Diffusion Promotes the Integrative Analysis of Multiple Omics. Front Genet 2020; 11:106. [PMID: 32180795 PMCID: PMC7057719 DOI: 10.3389/fgene.2020.00106] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/31/2019] [Accepted: 01/29/2020] [Indexed: 02/01/2023] Open
Abstract
The development of integrative methods is one of the main challenges in bioinformatics. Network-based methods for the analysis of multiple gene-centered datasets take into account known and/or inferred relations between genes. In the last decades, the mathematical machinery of network diffusion—also referred to as network propagation—has been exploited in several network-based pipelines, thanks to its ability of amplifying association between genes that lie in network proximity. Indeed, network diffusion provides a quantitative estimation of network proximity between genes associated with one or more different data types, from simple binary vectors to real vectors. Therefore, this powerful data transformation method has also been increasingly used in integrative analyses of multiple collections of biological scores and/or one or more interaction networks. We present an overview of the state of the art of bioinformatics pipelines that use network diffusion processes for the integrative analysis of omics data. We discuss the fundamental ways in which network diffusion is exploited, open issues and potential developments in the field. Current trends suggest that network diffusion is a tool of broad utility in omics data analysis. It is reasonable to think that it will continue to be used and further refined as new data types arise (e.g. single cell datasets) and the identification of system-level patterns will be considered more and more important in omics data analysis.
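To make the idea of network diffusion concrete, here is a minimal sketch of one common formulation, random walk with restart on a symmetrically normalized adjacency matrix. This particular formulation, the function name, and the `alpha` parameter are assumptions chosen for illustration; the review covers several variants and does not prescribe this one. The steady state solves F = alpha * W * F + (1 - alpha) * F0, so scores placed on seed genes spread to their network neighbors and decay with distance.

```python
import numpy as np


def network_diffusion(adjacency, initial_scores, alpha=0.5):
    """Steady-state diffusion of gene scores over an undirected network.

    adjacency: (n, n) symmetric array of non-negative edge weights.
    initial_scores: length-n vector (binary or real-valued).
    alpha: fraction of signal propagated at each step, with 0 < alpha < 1.
    """
    A = np.asarray(adjacency, dtype=float)
    f0 = np.asarray(initial_scores, dtype=float)
    degree = A.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    # Symmetric normalization: W = D^{-1/2} A D^{-1/2}
    W = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    n = A.shape[0]
    # Closed form of the fixed point: (1 - alpha) (I - alpha W)^{-1} f0
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * W, f0)
```

For example, seeding only the first node of a three-node path graph yields scores that decrease with distance from the seed, which is the "network proximity" behavior the abstract refers to.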
Affiliation(s)
- Noemi Di Nanni
- Institute of Biomedical Technologies, National Research Council, Milan, Italy; Department of Industrial and Information Engineering, University of Pavia, Pavia, Italy
| | - Matteo Bersanelli
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy; National Institute of Nuclear Physics (INFN), Bologna, Italy
| | - Luciano Milanesi
- Institute of Biomedical Technologies, National Research Council, Milan, Italy
| | - Ettore Mosca
- Institute of Biomedical Technologies, National Research Council, Milan, Italy
|
74
|
Tardivel PJC, Servien R, Concordet D. Simple expressions of the LASSO and SLOPE estimators in low-dimension. STATISTICS-ABINGDON 2020. [DOI: 10.1080/02331888.2020.1720019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
Affiliation(s)
| | - Rémi Servien
- INTHERES, Université de Toulouse, INRA, ENVT, Toulouse, France
|
75
|
Solution paths for the generalized lasso with applications to spatially varying coefficients regression. Comput Stat Data Anal 2020. [DOI: 10.1016/j.csda.2019.106821] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/21/2022]
|
76
|
Belzak WCM, Bauer DJ. Improving the assessment of measurement invariance: Using regularization to select anchor items and identify differential item functioning. Psychol Methods 2020; 25:673-690. [PMID: 31916799 DOI: 10.1037/met0000253] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/22/2022]
Abstract
A common challenge in the behavioral sciences is evaluating measurement invariance, or whether the measurement properties of a scale are consistent for individuals from different groups. Measurement invariance fails when differential item functioning (DIF) exists, that is, when item responses relate to the latent variable differently across groups. To identify DIF in a scale, many data-driven procedures iteratively test for DIF one item at a time while assuming other items have no DIF. The DIF-free items are used to anchor the scale of the latent variable across groups, identifying the model. A major drawback to these iterative testing procedures is that they can fail to select the correct anchor items and identify true DIF, particularly when DIF is present in many items. We propose an alternative method for selecting anchors and identifying DIF. Namely, we use regularization, a machine learning technique that imposes a penalty function during estimation to remove parameters that have little impact on the fit of the model. We focus specifically here on a lasso penalty for group differences in the item parameters within the two-parameter logistic item response theory model. We compare lasso regularization with the more commonly used likelihood ratio test method in a 2-group DIF analysis. Simulation and empirical results show that when large amounts of DIF are present and sample sizes are large, lasso regularization has far better control of Type I error than the likelihood ratio test method with little decrement in power. This provides strong evidence that lasso regularization is a promising alternative for testing DIF and selecting anchors. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
Affiliation(s)
- William C M Belzak
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill
| | - Daniel J Bauer
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill
|
77
|
|
78
|
Penalized Variable Selection for Lipid-Environment Interactions in a Longitudinal Lipidomics Study. Genes (Basel) 2019; 10:genes10121002. [PMID: 31816972 PMCID: PMC6947406 DOI: 10.3390/genes10121002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/07/2019] [Accepted: 11/26/2019] [Indexed: 12/20/2022] Open
Abstract
Lipid species are critical components of eukaryotic membranes. They play key roles in many biological processes such as signal transduction, cell homeostasis, and energy storage. Investigations of lipid-environment interactions, in addition to the lipid and environment main effects, have important implications in understanding the lipid metabolism and related changes in phenotype. In this study, we developed a novel penalized variable selection method to identify important lipid-environment interactions in a longitudinal lipidomics study. An efficient Newton-Raphson based algorithm was proposed within the generalized estimating equation (GEE) framework. We conducted extensive simulation studies to demonstrate the superior performance of our method over alternatives, in terms of both identification accuracy and prediction performance. As weight control via dietary calorie restriction and exercise has been demonstrated to prevent cancer in a variety of studies, analysis of the high-dimensional lipid datasets collected using 60 mice from the skin cancer prevention study identified meaningful markers that provide fresh insight into the underlying mechanism of cancer preventive effects.
|
79
|
Rinaldo A, Wasserman L, G’Sell M. Bootstrapping and sample splitting for high-dimensional, assumption-lean inference. Ann Stat 2019. [DOI: 10.1214/18-aos1784] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 11/19/2022]
|
80
|
Kuismin M, Saatoglu D, Niskanen AK, Jensen H, Sillanpää MJ. Genetic assignment of individuals to source populations using network estimation tools. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13323] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Affiliation(s)
- Markku Kuismin
- Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
- Biocenter Oulu, University of Oulu, Oulu, Finland
| | - Dilan Saatoglu
- Centre for Biodiversity Dynamics, Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
| | - Alina K. Niskanen
- Centre for Biodiversity Dynamics, Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
- Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
| | - Henrik Jensen
- Centre for Biodiversity Dynamics, Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
| | - Mikko J. Sillanpää
- Research Unit of Mathematical Sciences, University of Oulu, Oulu, Finland
- Biocenter Oulu, University of Oulu, Oulu, Finland
- Infotech Oulu, University of Oulu, Oulu, Finland
|
81
|
Kadir S, Kaza C, Weissbart H, Reichenbach T. Modulation of Speech-in-Noise Comprehension Through Transcranial Current Stimulation With the Phase-Shifted Speech Envelope. IEEE Trans Neural Syst Rehabil Eng 2019; 28:23-31. [PMID: 31751277 PMCID: PMC7001147 DOI: 10.1109/tnsre.2019.2939671] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/03/2022]
Abstract
Neural activity tracks the envelope of a speech signal at latencies from 50 ms to 300 ms. Modulating this neural tracking through transcranial alternating current stimulation influences speech comprehension. Two important variables that can affect this modulation are the latency and the phase of the stimulation with respect to the sound. While previous studies have found an influence of both variables on speech comprehension, the interaction between both has not yet been measured. We presented 17 subjects with speech in noise coupled with simultaneous transcranial alternating current stimulation. The currents were based on the envelope of the target speech but shifted by different phases, as well as by two temporal delays of 100 ms and 250 ms. We also employed various control stimulations, and assessed the signal-to-noise ratio at which the subject understood half of the speech. We found that, at both latencies, speech comprehension is modulated by the phase of the current stimulation. However, the form of the modulation differed between the two latencies. Phase and latency of neurostimulation have accordingly distinct influences on speech comprehension. The different effects at the latencies of 100 ms and 250 ms hint at distinct neural processes for speech processing.
|
82
|
Chen Y, Fan J, Ma C, Yan Y. Inference and uncertainty quantification for noisy matrix completion. Proc Natl Acad Sci U S A 2019; 116:22931-22937. [PMID: 31666329 PMCID: PMC6859358 DOI: 10.1073/pnas.1910053116] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 01/10/2023] Open
Abstract
Noisy matrix completion aims at estimating a low-rank matrix given only partial and corrupted entries. Despite remarkable progress in designing efficient estimation algorithms, it remains largely unclear how to assess the uncertainty of the obtained estimates and how to perform efficient statistical inference on the unknown matrix (e.g., constructing a valid and short confidence interval for an unseen entry). This paper takes a substantial step toward addressing such tasks. We develop a simple procedure to compensate for the bias of the widely used convex and nonconvex estimators. The resulting debiased estimators admit nearly precise nonasymptotic distributional characterizations, which in turn enable optimal construction of confidence intervals/regions for, say, the missing entries and the low-rank factors. Our inferential procedures do not require sample splitting, thus avoiding unnecessary loss of data efficiency. As a byproduct, we obtain a sharp characterization of the estimation accuracy of our debiased estimators in both rate and constant. Our debiased estimators are tractable algorithms that provably achieve full statistical efficiency.
Affiliation(s)
- Yuxin Chen
- Department of Electrical Engineering, Princeton University, Princeton, NJ 08544
- Jianqing Fan
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544
- Cong Ma
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544
- Yuling Yan
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544
|
83
|
Seifert M, Peitzsch C, Gorodetska I, Börner C, Klink B, Dubrovska A. Network-based analysis of prostate cancer cell lines reveals novel marker gene candidates associated with radioresistance and patient relapse. PLoS Comput Biol 2019; 15:e1007460. [PMID: 31682594 PMCID: PMC6855562 DOI: 10.1371/journal.pcbi.1007460] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/13/2019] [Revised: 11/14/2019] [Accepted: 10/05/2019] [Indexed: 12/20/2022] Open
Abstract
Radiation therapy is an important and effective treatment option for prostate cancer, but high-risk patients are prone to relapse due to radioresistance of cancer cells. Molecular mechanisms that contribute to radioresistance are not fully understood. Novel computational strategies are needed to identify radioresistance driver genes from hundreds of gene copy number alterations. We developed a network-based approach based on lasso regression in combination with network propagation for the analysis of prostate cancer cell lines with acquired radioresistance to identify clinically relevant marker genes associated with radioresistance in prostate cancer patients. We analyzed established radioresistant cell lines of the prostate cancer cell lines DU145 and LNCaP and compared their gene copy number and expression profiles to their radiosensitive parental cells. We found that radioresistant DU145 showed many more gene copy number alterations than LNCaP and their gene expression profiles were highly cell line specific. We learned a genome-wide prostate cancer-specific gene regulatory network and quantified impacts of differentially expressed genes with directly underlying copy number alterations on known radioresistance marker genes. This revealed several potential driver candidates involved in the regulation of cancer-relevant processes. Importantly, we found that ten driver candidates from DU145 (ADAMTS9, AKR1B10, CXXC5, FST, FOXL1, GRPR, ITGA2, SOX17, STARD4, VGF) and four from LNCaP (FHL5, LYPLAL1, PAK7, TDRD6) were able to distinguish irradiated prostate cancer patients into early and late relapse groups. Moreover, in-depth in vitro validations for VGF (Neurosecretory protein VGF) showed that siRNA-mediated gene silencing increased the radiosensitivity of DU145 and LNCaP cells. Our computational approach enabled the prediction of novel radioresistance driver gene candidates. Additional preclinical and clinical studies are required to further validate the role of VGF and other candidate genes as potential biomarkers for the prediction of radiotherapy responses and as potential targets for radiosensitization of prostate cancer.
Prostate cancer cell lines represent an important model system to characterize molecular alterations that contribute to radioresistance, but irradiation can cause deletions and amplifications of DNA segments that affect hundreds of genes. This, in combination with the small number of cell lines that are usually considered, does not allow a straightforward identification of driver genes by standard statistical methods. Therefore, we developed a network-based approach to analyze gene copy number and expression profiles of such cell lines, enabling the identification of potential driver genes associated with radioresistance of prostate cancer. We used lasso regression in combination with a significance test for lasso to learn a genome-wide prostate cancer-specific gene regulatory network. We used this network for network flow computations to determine impacts of gene copy number alterations on known radioresistance marker genes. Mapping to prostate cancer samples and additional filtering allowed us to identify 14 driver gene candidates that distinguished irradiated prostate cancer patients into early and late relapse groups. In-depth literature analysis and wet-lab validations suggest that our method can predict novel radioresistance driver genes. Additional preclinical and clinical studies are required to further validate these genes for the prediction of radiotherapy responses and as potential targets to radiosensitize prostate cancer.
Affiliation(s)
- Michael Seifert
- Institute for Medical Informatics and Biometry (IMB), Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- National Center for Tumor Diseases (NCT), Partner Site Dresden, Germany
- Claudia Peitzsch
- National Center for Tumor Diseases (NCT), Partner Site Dresden, Germany
- OncoRay - National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- Ielizaveta Gorodetska
- OncoRay - National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- Caroline Börner
- OncoRay - National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- Barbara Klink
- Institute for Clinical Genetics, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany
- Anna Dubrovska
- OncoRay - National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, Technische Universität Dresden, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
- Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Institute of Radiooncology-OncoRay, Dresden, Germany
- German Cancer Consortium (DKTK) Partner Site Dresden, Germany, and German Cancer Research Center (DKFZ), Heidelberg, Germany
|
84
|
Wang M, Xu S. A coordinate descent approach for sparse Bayesian learning in high dimensional QTL mapping and genome-wide association studies. Bioinformatics 2019; 35:4327-4335. [PMID: 31081037 DOI: 10.1093/bioinformatics/btz244] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/02/2018] [Revised: 02/14/2019] [Accepted: 04/05/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Genomic scanning approaches that detect one locus at a time are subject to many problems in genome-wide association studies and quantitative trait locus mapping. The problems include large matrix inversion, over-conservativeness for tests after Bonferroni correction and difficulty in evaluation of the total genetic contribution to a trait's variance. Targeting these problems, we take a further step and investigate a multiple locus model that detects all markers simultaneously in a single model. RESULTS We developed a sparse Bayesian learning (SBL) method for quantitative trait locus mapping and genome-wide association studies. This new method adopts a coordinate descent algorithm to estimate parameters (marker effects) by updating one parameter at a time conditional on current values of all other parameters. It uses an L2 type of penalty that allows the method to handle extremely large sample sizes (>100 000). Simulation studies show that SBL often has higher statistical power and the simulated true loci are often detected with extremely small P-values, indicating that SBL is insensitive to stringent thresholds in significance testing. AVAILABILITY AND IMPLEMENTATION An R package (sbl) is available on the Comprehensive R Archive Network (CRAN) and https://github.com/MeiyueComputBio/sbl/tree/master/R%20packge. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
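The one-parameter-at-a-time update described in this abstract can be illustrated with a minimal sketch. It is not the SBL algorithm or the `sbl` package: it is plain coordinate descent for a ridge (L2-penalized) least-squares problem with a fixed penalty, and the function name, the penalty value `lam`, and the toy data are all assumptions made for illustration.

```python
def ridge_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize 0.5*||y - X b||^2 + 0.5*lam*||b||^2 one coefficient at a time."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - X b; b starts at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Inner product of column j with the partial residual that
            # excludes the current contribution of b[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_ss[j] * b[j]
            # Closed-form minimizer for b[j] with all other coefficients fixed.
            new_bj = rho / (col_ss[j] + lam)
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


# Hypothetical toy data: 4 observations, 2 correlated markers.
X = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 3.0, 2.0, 3.0]
beta = ridge_coordinate_descent(X, y, lam=1.0)
```

Each update needs only one column of `X` and the running residual, so no matrix inversion is required; that property is what makes this style of update attractive when the number of markers is large.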
Affiliation(s)
- Meiyue Wang
- Department of Botany and Plant Sciences, University of California, Riverside, CA, USA
- Shizhong Xu
- Department of Botany and Plant Sciences, University of California, Riverside, CA, USA
|
85
|
Vijayabaskar MS, Goode DK, Obier N, Lichtinger M, Emmett AML, Abidin FNZ, Shar N, Hannah R, Assi SA, Lie-A-Ling M, Gottgens B, Lacaud G, Kouskoff V, Bonifer C, Westhead DR. Identification of gene specific cis-regulatory elements during differentiation of mouse embryonic stem cells: An integrative approach using high-throughput datasets. PLoS Comput Biol 2019; 15:e1007337. [PMID: 31682597 PMCID: PMC6855567 DOI: 10.1371/journal.pcbi.1007337] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/11/2017] [Revised: 11/14/2019] [Accepted: 08/15/2019] [Indexed: 01/22/2023] Open
Abstract
Gene expression governs cell fate, and is regulated via a complex interplay of transcription factors and molecules that change chromatin structure. Advances in sequencing-based assays have enabled investigation of these processes genome-wide, leading to large datasets that combine information on the dynamics of gene expression, transcription factor binding and chromatin structure as cells differentiate. While numerous studies focus on the effects of these features on broader gene regulation, less work has been done on the mechanisms of gene-specific transcriptional control. In this study, we have focussed on the latter by integrating gene expression data for the in vitro differentiation of murine ES cells to macrophages and cardiomyocytes, with dynamic data on chromatin structure, epigenetics and transcription factor binding. Combining a novel strategy to identify communities of related control elements with a penalized regression approach, we developed individual models to identify the potential control elements predictive of the expression of each gene. Our models were compared to an existing method and evaluated using the existing literature and new experimental data from embryonic stem cell differentiation reporter assays. Our method is able to identify transcriptional control elements in a gene specific manner that reflect known regulatory relationships and to generate useful hypotheses for further testing.
Affiliation(s)
- M. S. Vijayabaskar
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
- Debbie K. Goode
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Nadine Obier
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
- Monika Lichtinger
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
- Amber M. L. Emmett
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
- Fatin N. Zainul Abidin
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
- Nisar Shar
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
- Rebecca Hannah
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Salam A. Assi
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
- Michael Lie-A-Ling
- CRUK Manchester Institute, University of Manchester, Manchester, United Kingdom
- Berthold Gottgens
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
- Georges Lacaud
- CRUK Manchester Institute, University of Manchester, Manchester, United Kingdom
- Valerie Kouskoff
- Division of Developmental Biology and Medicine, The University of Manchester, Manchester, United Kingdom
- Constanze Bonifer
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
- David R. Westhead
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
|
86
|
Arroyo Relión JD, Kessler D, Levina E, Taylor SF. NETWORK CLASSIFICATION WITH APPLICATIONS TO BRAIN CONNECTOMICS. Ann Appl Stat 2019; 13:1648-1677. [PMID: 33408802 DOI: 10.1214/19-aoas1252] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/25/2022]
Abstract
While statistical analysis of a single network has received a lot of attention in recent years, with a focus on social networks, analysis of a sample of networks presents its own challenges which require a different set of analytic tools. Here we study the problem of classification of networks with labeled nodes, motivated by applications in neuroimaging. Brain networks are constructed from imaging data to represent functional connectivity between regions of the brain, and previous work has shown the potential of such networks to distinguish between various brain disorders, giving rise to a network classification problem. Existing approaches tend to either treat all edge weights as a long vector, ignoring the network structure, or focus on graph topology as represented by summary measures while ignoring the edge weights. Our goal is to design a classification method that uses both the individual edge information and the network structure of the data in a computationally efficient way, and that can produce a parsimonious and interpretable representation of differences in brain connectivity patterns between classes. We propose a graph classification method that uses edge weights as predictors but incorporates the network nature of the data via penalties that promote sparsity in the number of nodes, in addition to the usual sparsity penalties that encourage selection of edges. We implement the method via efficient convex optimization and provide a detailed analysis of data from two fMRI studies of schizophrenia.
|
87
|
|
88
|
Huang TJ, McKeague IW, Qian M. Marginal screening for high-dimensional predictors of survival outcomes. Stat Sin 2019; 29:2105-2139. [PMID: 31938013 PMCID: PMC6959482 DOI: 10.5705/ss.202017.0298] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/10/2023]
Abstract
This study develops a marginal screening test to detect the presence of significant predictors for a right-censored time-to-event outcome under a high-dimensional accelerated failure time (AFT) model. Establishing a rigorous screening test in this setting is challenging, because of the right censoring and the post-selection inference. In the latter case, an implicit variable selection step needs to be included to avoid inflating the Type-I error. A prior study solved this problem by constructing an adaptive resampling test under an ordinary linear regression. To accommodate right censoring, we develop a new approach based on a maximally selected Koul-Susarla-Van Ryzin estimator from a marginal AFT working model. A regularized bootstrap method is used to calibrate the test. Our test is more powerful and less conservative than both a Bonferroni correction of the marginal tests and other competing methods. The proposed method is evaluated in simulation studies and applied to two real data sets.
Affiliation(s)
- Min Qian
- Department of Biostatistics, Columbia University
|
89
|
Shi C, Song R, Chen Z, Li R. LINEAR HYPOTHESIS TESTING FOR HIGH DIMENSIONAL GENERALIZED LINEAR MODELS. Ann Stat 2019; 47:2671-2703. [PMID: 31534282 DOI: 10.1214/18-aos1761] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/19/2022]
Abstract
This paper is concerned with testing linear hypotheses in high-dimensional generalized linear models. To deal with linear hypotheses, we first propose a constrained partial regularization method and study its statistical properties. We further introduce an algorithm for solving regularization problems with folded-concave penalty functions and linear constraints. To test linear hypotheses, we propose a partial penalized likelihood ratio test, a partial penalized score test and a partial penalized Wald test. We show that the limiting null distributions of these three test statistics are χ² distributions with the same degrees of freedom, and under local alternatives, they asymptotically follow non-central χ² distributions with the same degrees of freedom and noncentrality parameter, provided the number of parameters involved in the test hypothesis grows to ∞ at a certain rate. Simulation studies are conducted to examine the finite sample performance of the proposed tests. Empirical analysis of a real data example is used to illustrate the proposed testing procedures.
Affiliation(s)
- Chengchun Shi
- Department of Statistics, North Carolina State University, Raleigh, NC 27695-8203, USA
- Rui Song
- Department of Statistics, North Carolina State University, Raleigh, NC 27695-8203, USA
- Zhao Chen
- Department of Statistics, and The Methodology Center, the Pennsylvania State University, University Park, PA 16802-2111, USA
- Runze Li
- Department of Statistics, and The Methodology Center, the Pennsylvania State University, University Park, PA 16802-2111, USA
|
90
|
Association of breast milk gamma-linolenic acid with infant anthropometric outcomes in urban, low-income Bangladeshi families: a prospective, birth cohort study. Eur J Clin Nutr 2019; 74:698-707. [PMID: 31501475 PMCID: PMC7214250 DOI: 10.1038/s41430-019-0498-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/03/2018] [Revised: 07/17/2019] [Accepted: 08/23/2019] [Indexed: 02/08/2023]
Abstract
Background/Objectives Infant linear-growth faltering remains a major public health issue in low- and middle-income countries and suboptimal breast milk composition may be a local, population-specific risk factor. The relationship between early post-natal breast milk fatty acid (FA) composition and infant growth at 1 and 2 years of age was investigated prospectively in 563 families in Dhaka, Bangladesh. Subjects/Methods A maternal breast milk sample drawn before infant age 6 weeks was analyzed for percentage composition of 26 FAs, and infant length for age Z score (LAZ) was measured longitudinally to infant age 2 years. Individual FAs were tested as predictors of the infant growth outcomes. Results Of 26 tested FAs, %gamma-linolenic acid (%GLA) was most significantly associated with increase in LAZ from 6 to 52 weeks (ΔLAZ(52−6w)), and also to 104 weeks. The association was consistent over all breast milk stages with estimated effect size of +0.05 ΔLAZ(52−6w) per 20% change in %GLA (p value = 3 × 10⁻⁶), and stronger for ΔLAZ(104−6w) at +0.06 (p value = 8 × 10⁻⁷), explaining 1% of the outcome variance. Infant serum zinc measurements at 6 and 18 weeks of age were included in adjusted analyses, suggesting at least partial independence of infant zinc levels. The association was strongest in 417/563 (74.1%) families with %GLA <0.2%. Breast milk arachidonic acid fraction was within normal range with weaker evidence of association in early breast milk stages. Conclusions This study found that %GLA in breast milk was independently associated with infant linear growth, albeit with small effect size, in a predominantly slum-dwelling, low-income, Bangladeshi cohort.
|
91
|
Antonelli J, Parmigiani G, Dominici F. High-Dimensional Confounding Adjustment Using Continuous Spike and Slab Priors. BAYESIAN ANALYSIS 2019; 14:805-828. [PMID: 32431779 PMCID: PMC7236769 DOI: 10.1214/18-ba1131] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 06/11/2023]
Abstract
In observational studies, estimation of a causal effect of a treatment on an outcome relies on proper adjustment for confounding. If the number of the potential confounders (p) is larger than the number of observations (n), then direct control for all potential confounders is infeasible. Existing approaches for dimension reduction and penalization are generally aimed at predicting the outcome, and are less suited for estimation of causal effects. Under standard penalization approaches (e.g. Lasso), if a variable X_j is strongly associated with the treatment T but weakly with the outcome Y, the coefficient β_j will be shrunk towards zero, thus leading to confounding bias. Under the assumption of a linear model for the outcome and sparsity, we propose continuous spike and slab priors on the regression coefficients β_j corresponding to the potential confounders X_j. Specifically, we introduce a prior distribution that does not heavily shrink to zero the coefficients β_j of the X_j that are strongly associated with T but weakly associated with Y. We compare our proposed approach to several state-of-the-art methods proposed in the literature. Our proposed approach has the following features: 1) it reduces confounding bias in high dimensional settings; 2) it shrinks towards zero coefficients of instrumental variables; and 3) it achieves good coverage even in small sample sizes. We apply our approach to the National Health and Nutrition Examination Survey (NHANES) data to estimate the causal effects of persistent pesticide exposure on triglyceride levels.
Affiliation(s)
- Joseph Antonelli
- Department of Statistics, University of Florida, 102 Griffin-Floyd Hall, P.O. Box 118545, Gainesville, FL, 32611, USA
- Giovanni Parmigiani
- Department of Biostatistics and Computational Biology, CLS 11007, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA, 02215, USA
- Francesca Dominici
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
|
92
|
Breheny PJ. Marginal false discovery rates for penalized regression models. Biostatistics 2019; 20:299-314. [PMID: 29420686 DOI: 10.1093/biostatistics/kxy004] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/25/2017] [Accepted: 01/14/2018] [Indexed: 11/14/2022] Open
Abstract
Penalized regression methods are an attractive tool for high-dimensional data analysis, but their widespread adoption has been hampered by the difficulty of applying inferential tools. In particular, the question "How reliable is the selection of those features?" has proved difficult to address. In part, this difficulty arises from defining false discoveries in the classical, fully conditional sense, which is possible in low dimensions but does not scale well to high-dimensional settings. Here, we consider the analysis of marginal false discovery rates (mFDRs) for penalized regression methods. Restricting attention to the mFDR permits straightforward estimation of the number of selections that would likely have occurred by chance alone, and therefore provides a useful summary of selection reliability. Theoretical analysis and simulation studies demonstrate that this approach is quite accurate when the correlation among predictors is mild, and only slightly conservative when the correlation is stronger. Finally, the practical utility of the proposed method and its considerable advantages over other approaches are illustrated using gene expression data from The Cancer Genome Atlas and genome-wide association study data from the Myocardial Applied Genomics Network.
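As a rough illustration of estimating "the number of selections that would likely have occurred by chance alone", the sketch below assumes a lasso fit to a linear model with standardized predictors and the loss scaled by 1/(2n). Under those assumptions a pure-noise feature enters the model at penalty `lam` with probability of about 2Φ(−λ√n/σ). That formula, the function names, and the example numbers are assumptions made here for illustration; none of them is stated in the abstract.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def estimated_mfdr(n, p, lam, sigma, n_selected):
    """Expected number of chance selections divided by the number selected."""
    if n_selected == 0:
        return 0.0
    # Treating all p features as potential noise gives an upper bound on the
    # expected number of false selections at this penalty level.
    expected_false = 2.0 * p * normal_cdf(-lam * sqrt(n) / sigma)
    return min(1.0, expected_false / n_selected)


# Hypothetical numbers: 200 observations, 1000 features, 12 features selected.
rate = estimated_mfdr(n=200, p=1000, lam=0.25, sigma=1.0, n_selected=12)
```

Because the bound counts every feature as noise, it tends to be conservative, which is in line with the abstract's remark that the approach is slightly conservative under stronger correlation.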
|
93
|
Umezu Y, Takeuchi I. Selective inference via marginal screening for high dimensional classification. JAPANESE JOURNAL OF STATISTICS AND DATA SCIENCE 2019. [DOI: 10.1007/s42081-019-00058-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
|
94
|
Engebretsen S, Bohlin J. Statistical predictions with glmnet. Clin Epigenetics 2019; 11:123. [PMID: 31443682 PMCID: PMC6708235 DOI: 10.1186/s13148-019-0730-1] [Citation(s) in RCA: 422] [Impact Index Per Article: 84.4] [Received: 05/30/2019] [Accepted: 08/16/2019] [Indexed: 11/28/2022] Open
Abstract
Elastic net type regression methods have become very popular for prediction of certain outcomes in epigenome-wide association studies (EWAS). The methods considered accept biased coefficient estimates in return for lower variance thus obtaining improved prediction accuracy. We provide guidelines on how to obtain parsimonious models with low mean squared error and include easy to follow walk-through examples for each step in R.
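For orientation, the elastic net criterion for a Gaussian response, in the form given in the glmnet documentation, is written out below. The symbols (N observations, intercept β₀, penalty weight λ, mixing parameter α) are notation introduced here and do not appear in the abstract.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^{2}
\;+\;\lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
\;+\;\alpha\,\lVert\beta\rVert_1\Bigr],
\qquad 0\le\alpha\le 1 .
```

Setting α = 1 gives the lasso and α = 0 gives ridge regression. Any λ > 0 shrinks the coefficients towards zero, which is the source of the bias that the abstract describes as traded for lower variance.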
Affiliation(s)
- Solveig Engebretsen
- Division for Infection Control and Environmental Health, Department of Infectious Disease Epidemiology and Modelling, Norwegian Institute of Public Health, Oslo, Norway
- Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
- Jon Bohlin
- Division for Infection Control and Environmental Health, Department of Infectious Disease Epidemiology and Modelling, Norwegian Institute of Public Health, Oslo, Norway
- Centre for Fertility and Health (CEFH), Norwegian Institute of Public Health, Oslo, Norway
- Faculty of Veterinary Science, Department of Production Animals, Norwegian University of Life Science, Ås, Norway
|
95
|
Dukes O, Vansteelandt S. How to obtain valid tests and confidence intervals after propensity score variable selection? Stat Methods Med Res 2019; 29:677-694. [DOI: 10.1177/0962280219862005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
The problem of how to best select variables for confounding adjustment forms one of the key challenges in the evaluation of exposure or treatment effects in observational studies. Routine practice is often based on stepwise selection procedures that use hypothesis testing, change-in-estimate assessments or the lasso, which have all been criticised for – amongst other things – not giving sufficient priority to the selection of confounders. This has prompted vigorous recent activity in developing procedures that prioritise the selection of confounders, while preventing the selection of so-called instrumental variables that are associated with exposure, but not outcome (after adjustment for the exposure). A major drawback of all these procedures is that there is no finite sample size at which they are guaranteed to deliver treatment effect estimators and associated confidence intervals with adequate performance. This is the result of the estimator jumping back and forth between different selected models, and standard confidence intervals ignoring the resulting model selection uncertainty. In this paper, we will develop insight into this by evaluating the finite-sample distribution of the exposure effect estimator in linear regression, under a number of the aforementioned confounder selection procedures. We will show that by making clever use of propensity scores, a simple and generic solution is obtained in the context of generalized linear models, which overcomes this concern (under weaker conditions than competing proposals). Specifically, we propose to use separate regularized regressions for the outcome and propensity score models in order to construct a doubly robust ‘g-estimator’; when these models are sufficiently sparse and correctly specified, standard confidence intervals for the g-estimator implicitly incorporate the uncertainty induced by the variable selection procedure.
Affiliation(s)
- Oliver Dukes
- Department of Applied Mathematics, Computer Sciences and Statistics, Ghent University, Belgium
- Stijn Vansteelandt
- Department of Applied Mathematics, Computer Sciences and Statistics, Ghent University, Belgium
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, UK
|
96
|
Stelzer AS, Maccioni L, Gerhold-Ay A, Smedby KE, Schumacher M, Nieters A, Binder H. A multivariable approach for risk markers from pooled molecular data with only partial overlap. BMC MEDICAL GENETICS 2019; 20:128. [PMID: 31324155 PMCID: PMC6642584 DOI: 10.1186/s12881-019-0849-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2018] [Accepted: 06/19/2019] [Indexed: 11/29/2022]
Abstract
Background: Increasingly, molecular measurements from multiple studies are pooled to identify risk scores, with only partial overlap of measurements available from different studies. Univariate analyses of such markers have routinely been performed in such settings using meta-analysis techniques in genome-wide association studies for identifying genetic risk scores. In contrast, multivariable techniques such as regularized regression, which might potentially be more powerful, are hampered by only partial overlap of available markers even when the pooling of individual level data is feasible for analysis. This cannot easily be addressed at a preprocessing level, as quality criteria in the different studies may result in differential availability of markers, even after imputation. Methods: Motivated by data from the InterLymph Consortium on risk factors for non-Hodgkin lymphoma, which exhibits these challenges, we adapted a regularized regression approach, componentwise boosting, for dealing with partial overlap in SNPs. This synthesis regression approach is combined with resampling to determine stable sets of single nucleotide polymorphisms, which could feed into a genetic risk score. The proposed approach is contrasted with univariate analyses, an application of the lasso, and with an analysis that discards studies causing the partial overlap. The question of statistical significance is addressed with an approach called stability selection. Results: Using an excerpt of the data from the InterLymph Consortium on two specific subtypes of non-Hodgkin lymphoma, it is shown that componentwise boosting can take into account all applicable information from different SNPs, irrespective of whether they are covered by all investigated studies and for all individuals in the single studies. The results indicate increased power, even when studies that would be discarded in a complete case analysis only comprise a small proportion of individuals.
Conclusions: Given the observed gains in power, the proposed approach can be recommended more generally whenever there is only partial overlap of molecular measurements obtained from pooled studies and/or missing data in single studies. A corresponding software implementation is available upon request. Trial registration: All involved studies have provided signed GWAS data submission certifications to the U.S. National Institute of Health and have been retrospectively registered. Electronic supplementary material: The online version of this article (10.1186/s12881-019-0849-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Anne-Sophie Stelzer
- Forest Research Institute Baden-Württemberg (FVA), Wonnhaldestraße 4, Freiburg, 79100, Germany; Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Straße 26, Freiburg, 79104, Germany; Freiburg Center for Data Analysis and Modeling, University of Freiburg, Eckerstraße 1, Freiburg, 79104, Germany; Center for Chronic Immunodeficiency, Faculty of Medicine and Medical Center - University of Freiburg, Breisacher Straße 115, Freiburg, 79106, Germany
- Livia Maccioni
- Center for Chronic Immunodeficiency, Faculty of Medicine and Medical Center - University of Freiburg, Breisacher Straße 115, Freiburg, 79106, Germany
- Aslihan Gerhold-Ay
- Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center Johannes Gutenberg University Mainz, Obere Zahlbacher Straße 69, Mainz, 55131, Germany
- Karin E Smedby
- Department of Medicine, Solna (MedS), Eugeniahemmet, T2, Karolinska Universitetssjukhuset, Solna, Stockholm, 17176, Sweden
- Martin Schumacher
- Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Straße 26, Freiburg, 79104, Germany
- Alexandra Nieters
- Center for Chronic Immunodeficiency, Faculty of Medicine and Medical Center - University of Freiburg, Breisacher Straße 115, Freiburg, 79106, Germany
- Harald Binder
- Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Straße 26, Freiburg, 79104, Germany
|
97
|
Li Z, Liu S, Yang Q. Incoherent Inputs Enhance the Robustness of Biological Oscillators. Cell Syst 2017; 5:72-81.e4. [PMID: 28750200 DOI: 10.1016/j.cels.2017.06.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/19/2016] [Revised: 03/30/2017] [Accepted: 06/22/2017] [Indexed: 11/25/2022]
Abstract
Robust biological oscillators retain the critical ability to function in the presence of environmental perturbations. Although central architectures that support robust oscillations have been extensively studied, networks containing the same core vary drastically in their potential to oscillate, and it remains elusive what peripheral modifications to the core contribute to this functional variation. Here, we have generated a complete atlas of two- and three-node oscillators computationally, then systematically analyzed the association between network structure and robustness. We found that, while certain core topologies are essential for producing a robust oscillator, local structures can substantially modulate the robustness of oscillations. Notably, local nodes receiving incoherent or coherent inputs respectively promote or attenuate the overall network robustness in an additive manner. We validated these relationships in larger-scale networks reflective of real biological oscillators. Our findings provide an explanation for why auxiliary structures not required for oscillation are evolutionarily conserved and suggest simple ways to evolve or design robust oscillators.
Affiliation(s)
- Zhengda Li
- Department of Biophysics, University of Michigan, Ann Arbor, MI, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Shixuan Liu
- Cell Biology Program, The Hospital for Sick Children, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Qiong Yang
- Department of Biophysics, University of Michigan, Ann Arbor, MI, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA
|
98
|
Contreras‐Cristán A, Lockhart RA, Stephens MA, Sun SZ. On the use of priors in goodness‐of‐fit tests. Can J Stat 2019. [DOI: 10.1002/cjs.11512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
Affiliation(s)
- Alberto Contreras‐Cristán
- Department of Probability and Statistics, IIMAS, Universidad Nacional Autónoma de México, Ciudad de México, México
- Richard A. Lockhart
- Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Blvd., Burnaby, British Columbia, Canada
- Michael A. Stephens
- Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Blvd., Burnaby, British Columbia, Canada
- Shaun Z. Sun
- Department of Mathematics and Statistics, University of the Fraser Valley, 33844 King Rd., Abbotsford, British Columbia, Canada
|
99
|
Lasso Regression for the Prediction of Intermediate Outcomes Related to Cardiovascular Disease Prevention Using the TRANSIT Quality Indicators. Med Care 2019; 57:63-72. [PMID: 30439793 DOI: 10.1097/mlr.0000000000001014] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 01/03/2023]
Abstract
BACKGROUND Cardiovascular disease morbidity and mortality are largely influenced by poor control of hypertension, dyslipidemia, and diabetes. Process indicators are essential to monitor the effectiveness of quality improvement strategies. However, process indicators should be validated by demonstrating their ability to predict desirable outcomes. The objective of this study is to identify an effective method for building prediction models and to assess the predictive validity of the TRANSIT indicators. METHODS On the basis of blood pressure readings and laboratory test results at baseline, the TRANSIT study population was divided into 3 overlapping subpopulations: uncontrolled hypertension, uncontrolled dyslipidemia, and uncontrolled diabetes. A classic statistical method, a sparse machine learning technique, and a hybrid method combining both were used to build prediction models for whether a patient reached therapeutic targets for hypertension, dyslipidemia, and diabetes. The final models' performance for predicting these intermediate outcomes was established using the cross-validated area under the curve (cvAUC). RESULTS At baseline, 320, 247, and 303 patients were uncontrolled for hypertension, dyslipidemia, and diabetes, respectively. Among the 3 techniques used to predict reaching therapeutic targets, the hybrid method had a better discriminative capacity (cvAUCs=0.73 for hypertension, 0.64 for dyslipidemia, and 0.79 for diabetes) and succeeded in identifying indicators with a better capacity for predicting intermediate outcomes related to cardiovascular disease prevention. CONCLUSIONS Even though this study was conducted in a complex population of patients, a set of 5 process indicators was found to have good predictive validity based on the hybrid method.
|
100
|
|