1
|
Mallick H, Porwal A, Saha S, Basak P, Svetnik V, Paul E. An integrated Bayesian framework for multi-omics prediction and classification. Stat Med 2024; 43:983-1002. [PMID: 38146838 DOI: 10.1002/sim.9953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 10/06/2023] [Accepted: 10/24/2023] [Indexed: 12/27/2023]
Abstract
With the growing commonality of multi-omics datasets, there is now increasing evidence that integrated omics profiles lead to more efficient discovery of clinically actionable biomarkers that enable better disease outcome prediction and patient stratification. Several methods exist to perform host phenotype prediction from cross-sectional, single-omics data modalities but decentralized frameworks that jointly analyze multiple time-dependent omics data to highlight the integrative and dynamic impact of repeatedly measured biomarkers are currently limited. In this article, we propose a novel Bayesian ensemble method to consolidate prediction by combining information across several longitudinal and cross-sectional omics data layers. Unlike existing frequentist paradigms, our approach enables uncertainty quantification in prediction as well as interval estimation for a variety of quantities of interest based on posterior summaries. We apply our method to four published multi-omics datasets and demonstrate that it recapitulates known biology in addition to providing novel insights while also outperforming existing methods in estimation, prediction, and uncertainty quantification. Our open-source software is publicly available at https://github.com/himelmallick/IntegratedLearner.
Affiliation(s)
- Himel Mallick
- Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, 10065, New York, USA
- Department of Statistics and Data Science, Cornell University, Ithaca, New York, USA
| | - Anupreet Porwal
- Department of Statistics, University of Washington, Seattle, Washington, USA
| | - Satabdi Saha
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Piyali Basak
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
| | - Vladimir Svetnik
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
| | - Erina Paul
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
|
2
|
Yu L, Zhang Z, Yi H, Wang J, Li J, Wang X, Bai H, Ge H, Zheng X, Ni J, Qi H, Guan Y, Xu W, Zhu Z, Xing L, Dekker A, Wee L, Traverso A, Ye Z, Yuan Z. A PET/CT radiomics model for predicting distant metastasis in early-stage non-small cell lung cancer patients treated with stereotactic body radiotherapy: a multicentric study. Radiat Oncol 2024; 19:10. [PMID: 38254106 PMCID: PMC10802016 DOI: 10.1186/s13014-024-02402-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Accepted: 01/10/2024] [Indexed: 01/24/2024] Open
Abstract
OBJECTIVES Stereotactic body radiotherapy (SBRT) is a treatment option for patients with early-stage non-small cell lung cancer (NSCLC) who are unfit for surgery. Some patients may experience distant metastasis. This study aimed to develop and validate a radiomics model for predicting distant metastasis in patients with early-stage NSCLC treated with SBRT. METHODS Patients at five institutions were enrolled in this study. Radiomics features were extracted based on the PET/CT images. After feature selection in the training set (from Tianjin), CT-based and PET-based radiomics signatures were built. Models based on CT and PET signatures were built and validated using external datasets (from Zhejiang, Zhengzhou, Shandong, and Shanghai). An integrated model that included CT and PET radiomic signatures was developed. The performance of the proposed model was evaluated in terms of its discrimination, calibration, and clinical utility. Multivariate logistic regression was used to calculate the probability of distant metastases. The cutoff value was obtained using the receiver operator characteristic curve (ROC), and the patients were divided into high- and low-risk groups. Kaplan-Meier analysis was used to evaluate the distant metastasis-free survival (DMFS) of different risk groups. RESULTS In total, 228 patients were enrolled. The median follow-up time was 31.4 (2.0-111.4) months. The model based on CT radiomics signatures had an area under the curve (AUC) of 0.819 in the training set (n = 139) and 0.786 in the external dataset (n = 89). The PET radiomics model had an AUC of 0.763 for the training set and 0.804 for the external dataset. The model combining CT and PET radiomics had an AUC of 0.835 for the training set and 0.819 for the external dataset. The combined model showed a moderate calibration and a positive net benefit. When the probability of distant metastasis was greater than 0.19, the patient was considered to be at high risk. 
The DMFS of patients with high- and low-risk was significantly stratified (P < 0.001). CONCLUSIONS The proposed PET/CT radiomics model can be used to predict distant metastasis in patients with early-stage NSCLC treated with SBRT and provide a reference for clinical decision-making. In this study, the model was established by combining CT and PET radiomics signatures in a moderate-quantity training cohort of early-stage NSCLC patients treated with SBRT and was successfully validated in independent cohorts. Physicians could use this easy-to-use model to assess the risk of distant metastasis after SBRT. Identifying subgroups of patients with different risk factors for distant metastasis is useful for guiding personalized treatment approaches.
Affiliation(s)
- Lu Yu
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
- Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
| | - Zhen Zhang
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang, 310022, China
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - HeQing Yi
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang, 310022, China
| | - Jin Wang
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang, 310022, China
| | - Junyi Li
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
| | - Xiaofeng Wang
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
| | - Hui Bai
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
| | - Hong Ge
- The Affiliated Cancer Hospital of Zhengzhou University, Zhengzhou, China
| | - Xiaoli Zheng
- The Affiliated Cancer Hospital of Zhengzhou University, Zhengzhou, China
| | - Jianjiao Ni
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Haoran Qi
- Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University, Shandong Academy of Medical Science, Jinan, Shandong, China
| | - Yong Guan
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
| | - Wengui Xu
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
| | - Zhengfei Zhu
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Ligang Xing
- Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University, Shandong Academy of Medical Science, Jinan, Shandong, China
| | - Andre Dekker
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Leonard Wee
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Alberto Traverso
- Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre, Maastricht, The Netherlands
| | - Zhaoxiang Ye
- Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China.
| | - Zhiyong Yuan
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China.
|
3
|
Lin Y, Teixeira-Pinto A, Craig JC, Opdam H, Chapman JC, Pleass H, Carter A, Rogers NM, Davies CE, McDonald S, Yang J, Lim WH, Wong G. Trajectories of systolic blood pressure decline in kidney transplant donors prior to circulatory death and delayed graft function. Clin Kidney J 2023; 16:1170-1179. [PMID: 37398694 PMCID: PMC10310517 DOI: 10.1093/ckj/sfad047] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/21/2022] [Indexed: 07/28/2024] Open
Abstract
BACKGROUND Kidneys donated after circulatory death suffer a period of functional warm ischaemia before death, which may lead to early ischaemic injury. The effects of haemodynamic trajectories during the agonal phase on delayed graft function (DGF) are unknown. We aimed to predict the risk of DGF using patterns of trajectories of systolic blood pressure (SBP) decline in Maastricht category 3 kidney donors. METHODS We conducted a cohort study of all kidney transplant recipients in Australia who received kidneys from donation after circulatory death donors, divided into a derivation cohort (transplants between 9 April 2014 and 2 January 2018 [462 donors]) and a validation cohort (transplants between 6 January 2018 and 24 December 2019 [324 donors]). Patterns of SBP decline using latent class models were evaluated against the odds of DGF using a two-stage linear mixed effects model. RESULTS In the derivation cohort, 462 donors were included in the latent class analyses and 379 donors in the mixed effects model. Of the 696 eligible transplant recipients, 380 (54.6%) experienced DGF. Ten different trajectories with distinct patterns of SBP decline were identified. Compared with recipients from donors with the slowest decline in SBP after withdrawal of cardiorespiratory support, the adjusted odds ratio (aOR) for DGF was 5.5 [95% confidence interval (CI) 1.38-28.0] for recipients from donors with a steeper decline and lowest SBP [mean 49.5 mmHg (standard deviation 12.5)] at the time of withdrawal. For every 1 mmHg/min reduction in the rate of decline of SBP, the respective aORs for DGF were 0.95 (95% CI 0.91-0.99) and 0.98 (95% CI 0.93-1.0) in the random forest and least absolute shrinkage and selection operator models. In the validation cohort, the respective aORs were 0.95 (95% CI 0.91-1.0) and 0.99 (95% CI 0.94-1.0). CONCLUSION Trajectories of SBP decline and their determinants are predictive of DGF.
These results support a trajectory-based assessment of haemodynamic changes in donors after circulatory death during the agonal phase for donor suitability and post-transplant outcomes.
Affiliation(s)
- Yingxin Lin
- Sydney School of Public Health, University of Sydney, Sydney, NSW, Australia
- Faculty of Science, School of Mathematics and Science, University of Sydney, Sydney, NSW, Australia
| | - Armando Teixeira-Pinto
- Sydney School of Public Health, University of Sydney, Sydney, NSW, Australia
- Centre for Kidney Research, Kids Research Institute, Children's Hospital at Westmead, Sydney, NSW, Australia
| | - Jonathan C Craig
- College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
| | - Helen Opdam
- Department of Surgery, DonateLife, Organ and Tissue Authority, Canberra, ACT, Australia
| | - Jeremy C Chapman
- Centre for Transplant and Renal Research, Westmead Hospital, Sydney, NSW, Australia
| | - Henry Pleass
- Centre for Transplant and Renal Research, Westmead Hospital, Sydney, NSW, Australia
- Specialty of Surgery, University of Sydney, Sydney, NSW, Australia
| | - Angus Carter
- Intensive Care Unit, Cairns Hospital, Cairns, QLD, Australia
| | - Natasha M Rogers
- Centre for Transplant and Renal Research, Westmead Hospital, Sydney, NSW, Australia
| | - Christopher E Davies
- Department of Renal Medicine, Australia and New Zealand Dialysis and Transplant (ANZDATA) Registry, South Australian Health and Medical Research Institute, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Stephen McDonald
- Department of Renal Medicine, Australia and New Zealand Dialysis and Transplant (ANZDATA) Registry, South Australian Health and Medical Research Institute, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Jean Yang
- Faculty of Science, School of Mathematics and Science, University of Sydney, Sydney, NSW, Australia
| | - Wai H Lim
- Faculty of Health and Medical Science, University of Western Australia, Perth, WA, Australia
- Department of Renal Medicine, Sir Charles Gairdner Hospital, Perth, WA, Australia
| | - Germaine Wong
- Sydney School of Public Health, University of Sydney, Sydney, NSW, Australia
- Centre for Kidney Research, Kids Research Institute, Children's Hospital at Westmead, Sydney, NSW, Australia
- Centre for Transplant and Renal Research, Westmead Hospital, Sydney, NSW, Australia
|
4
|
Penalized polygram regression. J Korean Stat Soc 2022. [DOI: 10.1007/s42952-022-00181-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
5
|
Sparse Index Tracking Portfolio with Sector Neutrality. Mathematics 2022. [DOI: 10.3390/math10152645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
As a popular passive investment strategy, a sparse index tracking strategy has advantages over a full index replication strategy because of higher liquidity and lower transaction costs. Sparsity and nonnegativity constraints are usually assumed in the construction of portfolios in sparse index tracking. However, none of the existing studies has considered the sector risk exposure of such portfolios, namely that prices of stocks in one sector may fall at the same time because of sudden policy changes or unexpected events that affect the whole sector. Therefore, sector neutrality appears to be critical when building a sparse index tracking portfolio without full replication. The statistical approach to sparse index tracking is a constrained variable selection problem. However, the constrained variable selection procedure using Lasso fails to produce a sparse portfolio under sector neutrality constraints. In this paper, we propose a high-dimensional constrained variable selection method using TLP for building index tracking portfolios under sparsity, sector neutrality and nonnegativity constraints. Selection consistency is established for the proposed method, and the asymptotic distribution is obtained for the sparse portfolio weights estimator. We also develop an efficient iteration algorithm for the weight estimation. We examine the performance of the proposed methodology through simulations and an application to the CSI 300 index of China. The results demonstrate the validity and advantages of our methodology.
|
6
|
Weighted sparse simplex representation: a unified framework for subspace clustering, constrained clustering, and active learning. Data Min Knowl Discov 2022. [DOI: 10.1007/s10618-022-00820-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Spectral-based subspace clustering methods have proved successful in many challenging applications such as gene sequencing, image recognition, and motion segmentation. In this work, we first propose a novel spectral-based subspace clustering algorithm that seeks to represent each point as a sparse convex combination of a few nearby points. We then extend the algorithm to a constrained clustering and active learning framework. Our motivation for developing such a framework stems from the fact that typically either a small amount of labelled data is available in advance, or it is possible to label some points at a cost. The latter scenario is typically encountered in the process of validating a cluster assignment. Extensive experiments on simulated and real datasets show that the proposed approach is effective and competitive with state-of-the-art methods.
|
7
|
Lim WH, Ooi E, Pilmore HL, Johnson DW, McDonald SP, Clayton P, Hawley C, Mulley WR, Francis R, Collins MG, Jaques B, Larkins NG, Davies CE, Wyburn K, Chadban SJ, Wong G. Interactions Between Donor Age and 12-Month Estimated Glomerular Filtration Rate on Allograft and Patient Outcomes After Kidney Transplantation. Transpl Int 2022; 35:10199. [PMID: 35185379 PMCID: PMC8842263 DOI: 10.3389/ti.2022.10199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Accepted: 01/12/2022] [Indexed: 11/25/2022]
Abstract
Reduced estimated glomerular filtration rate (eGFR) at 12 months after kidney transplantation is associated with increased risk of allograft loss, but it is uncertain whether donor age and types modify this relationship. Using Australia and New Zealand registry data, multivariable Cox proportional hazards modelling was used to examine the interactive effects between donor age, types and 12-month eGFR on overall allograft loss. We included 11,095 recipients (4,423 received live-donor kidneys). Recipients with the lowest 12-month eGFR (<30 ml/min/1.73 m²) experienced the greatest risk of allograft loss, with an adjusted HR (95% CI) of 2.65 (2.38–2.95) compared to eGFR of 30–60 ml/min/1.73 m², whereas the adjusted HR for the highest eGFR (>60 ml/min/1.73 m²) was 0.67 (0.62–0.74). The association of 12-month eGFR and allograft loss was modified by donor age (but not donor types), with the higher risk of allograft loss in recipients with lower compared with higher 12-month eGFR being most pronounced in the younger donor age groups (p < 0.01). Recipients with eGFR <30 ml/min/1.73 m² 12 months after transplantation experienced ≥2.5-fold increased risk of overall allograft loss compared to those with eGFR of >60 ml/min/1.73 m², and the magnitude of the increased risk is most marked among recipients with younger donors. Careful deliberation of other factors including donor age when considering eGFR as a surrogate for clinical endpoints is warranted.
Affiliation(s)
- Wai H. Lim
- Department of Renal Medicine, Sir Charles Gairdner Hospital, Perth, WA, Australia
- Medical School, University of Western Australia, Perth, WA, Australia
- *Correspondence: Wai H. Lim,
| | - Esther Ooi
- Medical School, University of Western Australia, Perth, WA, Australia
- School of Biomedical Sciences, University of Western Australia, Perth, WA, Australia
| | - Helen L. Pilmore
- Department of Renal Medicine, Auckland City Hospital, Auckland, New Zealand
- Department of Medicine, University of Auckland, Auckland, New Zealand
| | - David W. Johnson
- Metro South Integrated Nephrology and Transplant Services, Princess Alexandra Hospital, Woolloongabba, QLD, Australia
- Faculty of Medicine, University of Queensland, St Lucia, QLD, Australia
- Translational Research Institute, Brisbane, QLD, Australia
| | - Stephen P. McDonald
- Australia and New Zealand Dialysis and Transplant Registry, Adelaide, SA, Australia
- Central and Northern Adelaide Renal and Transplantation Services, Adelaide, SA, Australia
- South Australia Health and Medical Research Institute, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Philip Clayton
- Australia and New Zealand Dialysis and Transplant Registry, Adelaide, SA, Australia
- Central and Northern Adelaide Renal and Transplantation Services, Adelaide, SA, Australia
- South Australia Health and Medical Research Institute, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Carmel Hawley
- Metro South Integrated Nephrology and Transplant Services, Princess Alexandra Hospital, Woolloongabba, QLD, Australia
- Faculty of Medicine, University of Queensland, St Lucia, QLD, Australia
- Translational Research Institute, Brisbane, QLD, Australia
| | - William R. Mulley
- Department of Nephrology, Monash Medical Centre, Melbourne, VIC, Australia
- Department of Medicine, Monash University, Melbourne, VIC, Australia
| | - Ross Francis
- Metro South Integrated Nephrology and Transplant Services, Princess Alexandra Hospital, Woolloongabba, QLD, Australia
- Faculty of Medicine, University of Queensland, St Lucia, QLD, Australia
| | - Michael G. Collins
- School of Biomedical Sciences, University of Western Australia, Perth, WA, Australia
- Department of Renal Medicine, Auckland City Hospital, Auckland, New Zealand
| | - Bryon Jaques
- Western Australia Liver and Kidney Transplant Service, Sir Charles Gairdner Hospital, Perth, WA, Australia
| | - Nicholas G. Larkins
- Medical School, University of Western Australia, Perth, WA, Australia
- Department of Nephrology, Perth Children’s Hospital, Perth, WA, Australia
| | - Christopher E. Davies
- Australia and New Zealand Dialysis and Transplant Registry, Adelaide, SA, Australia
- South Australia Health and Medical Research Institute, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Kate Wyburn
- Faculty of Medicine and Health, University of Sydney, Sydney, NSW, Australia
- Renal Medicine, Royal Prince Alfred Hospital, Sydney, NSW, Australia
| | - Steve J. Chadban
- Faculty of Medicine and Health, University of Sydney, Sydney, NSW, Australia
- Renal Medicine, Royal Prince Alfred Hospital, Sydney, NSW, Australia
| | - Germaine Wong
- Faculty of Medicine and Health, University of Sydney, Sydney, NSW, Australia
- Centre for Kidney Research, The Children’s Hospital at Westmead, Sydney, NSW, Australia
- Department of Renal Medicine and National Pancreas Transplant Unit, Westmead Hospital, Sydney, NSW, Australia
|
8
|
|
9
|
Benítez-Peña S, Carrizosa E, Guerrero V, Jiménez-Gamero MD, Martín-Barragán B, Molero-Río C, Ramírez-Cobo P, Romero Morales D, Sillero-Denamiel MR. On sparse ensemble methods: An application to short-term predictions of the evolution of COVID-19. European Journal of Operational Research 2021; 295:648-663. [PMID: 36569384 PMCID: PMC9759092 DOI: 10.1016/j.ejor.2021.04.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/27/2020] [Accepted: 04/07/2021] [Indexed: 06/02/2023]
Abstract
Since the seminal paper by Bates and Granger in 1969, a vast number of ensemble methods that combine different base regressors to generate a unique one have been proposed in the literature. The so-obtained regressor method may have better accuracy than its components, but at the same time it may overfit, it may be distorted by base regressors with low accuracy, and it may be too complex to understand and explain. This paper proposes and studies a novel Mathematical Optimization model to build a sparse ensemble, which trades off the accuracy of the ensemble and the number of base regressors used. The latter is controlled by means of a regularization term that penalizes regressors with a poor individual performance. Our approach is flexible to incorporate desirable properties one may have on the ensemble, such as controlling the performance of the ensemble in critical groups of records, or the costs associated with the base regressors involved in the ensemble. We illustrate our approach with real data sets arising in the COVID-19 context.
Affiliation(s)
- Sandra Benítez-Peña
- Instituto de Matemáticas de la Universidad de Sevilla, Seville, Spain
- Departamento de Estadística e Investigación Operativa, Universidad de Sevilla, Seville, Spain
| | - Emilio Carrizosa
- Instituto de Matemáticas de la Universidad de Sevilla, Seville, Spain
- Departamento de Estadística e Investigación Operativa, Universidad de Sevilla, Seville, Spain
| | - Vanesa Guerrero
- Departamento de Estadística, Universidad Carlos III de Madrid, Getafe, Spain
| | - M Dolores Jiménez-Gamero
- Instituto de Matemáticas de la Universidad de Sevilla, Seville, Spain
- Departamento de Estadística e Investigación Operativa, Universidad de Sevilla, Seville, Spain
| | | | - Cristina Molero-Río
- Instituto de Matemáticas de la Universidad de Sevilla, Seville, Spain
- Departamento de Estadística e Investigación Operativa, Universidad de Sevilla, Seville, Spain
| | - Pepa Ramírez-Cobo
- Departamento de Estadística e Investigación Operativa, Universidad de Cádiz, Cadiz, Spain
- Instituto de Matemáticas de la Universidad de Sevilla, Seville, Spain
| | | | - M Remedios Sillero-Denamiel
- Instituto de Matemáticas de la Universidad de Sevilla, Seville, Spain
- Departamento de Estadística e Investigación Operativa, Universidad de Sevilla, Seville, Spain
|
10
|
Ray A, Chakraborty T, Ghosh D. Optimized ensemble deep learning framework for scalable forecasting of dynamics containing extreme events. Chaos (Woodbury, N.Y.) 2021; 31:111105. [PMID: 34881612 DOI: 10.1063/5.0074213] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/07/2021] [Accepted: 11/02/2021] [Indexed: 06/13/2023]
Abstract
The remarkable flexibility and adaptability of both deep learning models and ensemble methods have led to a proliferation of their applications in understanding many physical phenomena. Traditionally, these two techniques have largely been treated as independent methodologies in practical applications. This study develops an optimized ensemble deep learning framework wherein these two machine learning techniques are jointly used to achieve synergistic improvements in model accuracy, stability, scalability, and reproducibility, prompting a new wave of applications in the forecasting of dynamics. Unpredictability is considered one of the key features of chaotic dynamics; therefore, forecasting such dynamics of nonlinear systems is a relevant issue in the scientific community. It becomes more challenging when the prediction of extreme events is the focus. In this circumstance, the proposed optimized ensemble deep learning (OEDL) model, based on a best convex combination of feed-forward neural networks, reservoir computing, and long short-term memory, can play a key role in advancing predictions of dynamics consisting of extreme events. The combined framework achieves better out-of-sample performance than the individual deep learners and a standard ensemble framework for both numerically simulated and real-world data sets. We exhibit the outstanding performance of the OEDL framework for forecasting extreme events generated from a Liénard-type system, prediction of COVID-19 cases in Brazil, dengue cases in San Juan, and sea surface temperature in the Niño 3.4 region.
Affiliation(s)
- Arnob Ray
- Physics and Applied Mathematics Unit, Indian Statistical Institute, Kolkata 700108, India
| | - Tanujit Chakraborty
- Department of Science and Engineering, Sorbonne University Abu Dhabi, Abu Dhabi, UAE
| | - Dibakar Ghosh
- Physics and Applied Mathematics Unit, Indian Statistical Institute, Kolkata 700108, India
|
11
|
Raharinirina NA, Peppert F, von Kleist M, Schütte C, Sunkara V. Inferring gene regulatory networks from single-cell RNA-seq temporal snapshot data requires higher-order moments. Patterns 2021; 2:100332. [PMID: 34553172 PMCID: PMC8441581 DOI: 10.1016/j.patter.2021.100332] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/03/2021] [Revised: 02/23/2021] [Accepted: 07/22/2021] [Indexed: 11/30/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has become ubiquitous in biology. Recently, there has been a push for using scRNA-seq snapshot data to infer the underlying gene regulatory networks (GRNs) steering cellular function. To date, this aspiration remains unrealized due to technical and computational challenges. In this work we focus on the latter, which is under-represented in the literature. We took a systemic approach by subdividing the GRN inference into three fundamental components: data pre-processing, feature extraction, and inference. We observed that the regulatory signature is captured in the statistical moments of scRNA-seq data and requires computationally intensive minimization solvers to extract it. Furthermore, current data pre-processing might not conserve these statistical moments. Although our moment-based approach is a didactic tool for understanding the different compartments of GRN inference, this line of thinking (finding computationally feasible multi-dimensional statistics of data) is imperative for designing GRN inference methods.
- Single-cell RNA-seq temporal snapshot data for detecting regulation
- Challenges in data pre-processing, feature extraction, and network inference for GRNs
- Encoding of regulatory information in higher-order raw moments
- Non-linear least-squares inference for temporal scRNA-seq snapshot data
Single-cell RNA sequencing (scRNA-seq) has become ubiquitous in biology. Recently, there has been a push for using scRNA-seq snapshot data to infer the underlying gene regulatory networks (GRNs) steering cellular function. A recent benchmark of 12 GRN methods demonstrated that the algorithms struggled to predict the ground-truth GRNs and speculated that the low performance was due to the insufficient resolution in the scRNA-seq data. Rather than proposing another method, this paper focuses on how to decompose a GRN problem into three subproblems (pre-processing, feature extraction, and inference), so that the gene regulatory information is preserved in each step. Subsequently, we discuss how to best approach each of the three subproblems.
Affiliation(s)
| | - Felix Peppert
- Explainable A.I. for Biology, Zuse Institute Berlin, 14195 Berlin, Germany
| | - Max von Kleist
- MF1 Bioinformatics, Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany
| | - Christof Schütte
- Mathematics of Complex Systems, Zuse Institute Berlin, 14195 Berlin, Germany.,Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
| | - Vikram Sunkara
- Mathematics of Complex Systems, Zuse Institute Berlin, 14195 Berlin, Germany.,Explainable A.I. for Biology, Zuse Institute Berlin, 14195 Berlin, Germany
| |
Collapse
|
12
|
Bien J, Yan X, Simpson L, Müller CL. Tree-aggregated predictive modeling of microbiome data. Sci Rep 2021; 11:14505. [PMID: 34267244 PMCID: PMC8282688 DOI: 10.1038/s41598-021-93645-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/15/2020] [Accepted: 06/22/2021] [Indexed: 01/05/2023] Open
Abstract
Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. At the most granular level, the primary data consist of sparse counts of amplicon sequence variants or operational taxonomic units that are associated with taxonomic and phylogenetic group information. In this contribution, we leverage the hierarchical structure of amplicon data and propose a data-driven and scalable tree-guided aggregation framework to associate microbial subcompositions with response variables of interest. The excess number of zero or low count measurements at the read level forces traditional microbiome data analysis workflows to remove rare sequencing variants or group them by a fixed taxonomic rank, such as genus or phylum, or by phylogenetic similarity. By contrast, our framework, which we call trac (tree-aggregation of compositional data), learns data-adaptive taxon aggregation levels for predictive modeling, greatly reducing the need for user-defined aggregation in preprocessing while simultaneously integrating seamlessly into the compositional data analysis framework. We illustrate the versatility of our framework in the context of large-scale regression problems in human gut, soil, and marine microbial ecosystems. We posit that the inferred aggregation levels provide highly interpretable taxon groupings that can help microbiome researchers gain insights into the structure and functioning of the underlying ecosystem of interest.
Affiliation(s)
- Jacob Bien
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA, USA
- Léo Simpson
- Technische Universität München, Munich, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Christian L Müller
- Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Department of Statistics, Ludwig-Maximilians-Universität München, Munich, Germany
- Center for Computational Mathematics, Flatiron Institute, Simons Foundation, New York, NY, USA
13
14
Kawaguchi ES, Darst BF, Wang K, Conti DV. Sign-based Shrinkage Based on an Asymmetric LASSO Penalty. Journal of Data Science: JDS 2021; 19:429-449. [PMID: 35222618 PMCID: PMC8880910 DOI: 10.6339/21-jds1015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Penalized regression provides an automated approach to perform simultaneous variable selection and parameter estimation and is a popular method to analyze high-dimensional data. Since the conception of the LASSO in the mid-to-late 1990s, extensive research has been done to improve penalized regression. The LASSO, and several of its variations, performs penalization symmetrically around zero. Thus, variables with the same magnitude are shrunk the same regardless of the direction of effect. To the best of our knowledge, sign-based shrinkage, that is, preferential shrinkage based on the sign of the coefficients, has yet to be explored under the LASSO framework. We propose a generalization of the LASSO, the asymmetric LASSO, that performs sign-based shrinkage. Our method is motivated by placing an asymmetric Laplace prior on the regression coefficients, rather than a symmetric Laplace prior. This corresponds to an asymmetric ℓ1 penalty under the penalized regression framework. In doing so, preferential shrinkage can be performed through an auxiliary tuning parameter that controls the degree of asymmetry. Our numerical studies indicate that the asymmetric LASSO performs better than the LASSO when effect sizes are sign skewed. Furthermore, in the presence of positively-skewed effects, the asymmetric LASSO is comparable to the non-negative LASSO without the need to place an a priori constraint on the effect estimates and outperforms the non-negative LASSO when negative effects are also present in the model. A real data example using the breast cancer gene expression data from The Cancer Genome Atlas is also provided, where the asymmetric LASSO identifies two potentially novel gene expressions that are associated with BRCA1 with a minor improvement in prediction performance over the LASSO and non-negative LASSO.
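To make the idea of sign-based shrinkage concrete, here is a minimal sketch of an asymmetric ℓ1 penalty and its coordinate-wise thresholding operator. The parametrization is an assumption made for illustration (positive coefficients are penalized at rate lam * tau, negative ones at rate lam * (1 - tau)); the paper's exact parametrization may differ, and the function names are hypothetical.

```python
def asymmetric_l1_penalty(beta, lam, tau):
    """Penalty lam * sum(tau * max(b, 0) + (1 - tau) * max(-b, 0)).

    Assumed parametrization: tau in (0, 1) controls the asymmetry and
    tau = 0.5 gives a symmetric penalty, i.e. the LASSO with weight lam / 2.
    """
    return lam * sum(
        tau * max(b, 0.0) + (1.0 - tau) * max(-b, 0.0) for b in beta
    )


def asymmetric_soft_threshold(v, lam, tau):
    """Proximal operator of the penalty above for a single coefficient.

    Positive inputs are shrunk by lam * tau, negative inputs by
    lam * (1 - tau), and anything inside the interval is set to zero.
    """
    upper = lam * tau
    lower = lam * (1.0 - tau)
    if v > upper:
        return v - upper
    if v < -lower:
        return v + lower
    return 0.0


# With tau = 0.25, a positive effect is shrunk less than a negative one
# of the same magnitude.
shrunk_positive = asymmetric_soft_threshold(1.0, lam=1.0, tau=0.25)
shrunk_negative = asymmetric_soft_threshold(-1.0, lam=1.0, tau=0.25)
```

Under this assumed form, letting tau approach 0 makes negative coefficients increasingly expensive relative to positive ones, which is consistent with the abstract's comparison to the non-negative LASSO.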
Affiliation(s)
- Eric S. Kawaguchi
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Burcu F. Darst
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
- Kan Wang
- Google, Mountain View, California, USA
- David V. Conti
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
15
Cortico-Spinal Neural Interface to Restore Hindlimb Movements in Spinally-Injured Rabbits. NEUROPHYSIOLOGY+ 2021. [DOI: 10.1007/s11062-021-09894-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
16
Wu X, Liang R, Yang H. Penalized and constrained LAD estimation in fixed and high dimension. Stat Pap (Berl) 2021; 63:53-95. [PMID: 33814727 PMCID: PMC8009762 DOI: 10.1007/s00362-021-01229-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2020] [Revised: 02/27/2021] [Indexed: 11/26/2022]
Abstract
Recently, many studies have shown that prior information and structure in many application fields can be formulated as constraints on regression coefficients. Following this work, we propose an L1-penalized LAD estimator with linear constraints. Unlike the constrained lasso, our estimator performs well when heavy-tailed errors or outliers are present in the response. In theory, we show that the proposed estimator enjoys the oracle property with adjusted normal variance when the dimension p of the estimated coefficients is fixed. When p is much greater than the sample size n, the error bound of the proposed estimator is sharper than log(p)/n. It is worth noting that the result holds for a wide range of noise distributions, even for the Cauchy distribution. Algorithmically, we not only consider a typical linear program to compute the proposed estimator in fixed dimension, but also present a nested alternating direction method of multipliers (ADMM) in high dimension. Simulations and an application to real data confirm that the proposed estimator is an effective alternative when the constrained lasso is unreliable.
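As a rough illustration of the kind of problem described, the sketch below evaluates an L1-penalized least absolute deviation (LAD) objective and checks a set of linear inequality constraints. The inequality form C·beta <= d, the function names, and the toy data are assumptions for illustration; the abstract only says "some linear constraints", and this is not the authors' linear programming or ADMM solver.

```python
def dot(row, beta):
    """Inner product of one design (or constraint) row with beta."""
    return sum(r * b for r, b in zip(row, beta))


def lad_lasso_objective(X, y, beta, lam):
    """Sum of absolute residuals plus lam times the L1 norm of beta."""
    lad_loss = sum(abs(yi - dot(xi, beta)) for xi, yi in zip(X, y))
    l1_penalty = lam * sum(abs(b) for b in beta)
    return lad_loss + l1_penalty


def satisfies_constraints(C, d, beta, tol=1e-9):
    """Check the assumed inequality constraints C @ beta <= d row by row."""
    return all(dot(ci, beta) <= di + tol for ci, di in zip(C, d))


# Toy data (made up): the third response is a gross outlier.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 103.0]
beta = [1.0, 2.0]

objective = lad_lasso_objective(X, y, beta, lam=0.5)
feasible = satisfies_constraints([[1.0, 1.0]], [3.0], beta)
```

In the toy data the outlier contributes 100 to the absolute-error loss, whereas a squared-error loss would add 10000, which illustrates why an LAD loss is less sensitive to outliers in the response.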
Affiliation(s)
- Xiaofei Wu
- College of Mathematics and Statistics, Chongqing University, Chongqing, 401331 People’s Republic of China
- Rongmei Liang
- College of Mathematics and Statistics, Chongqing University, Chongqing, 401331 People’s Republic of China
- Hu Yang
- College of Mathematics and Statistics, Chongqing University, Chongqing, 401331 People’s Republic of China
17
Maric D, Jahanipour J, Li XR, Singh A, Mobiny A, Van Nguyen H, Sedlock A, Grama K, Roysam B. Whole-brain tissue mapping toolkit using large-scale highly multiplexed immunofluorescence imaging and deep neural networks. Nat Commun 2021; 12:1550. [PMID: 33692351 PMCID: PMC7946933 DOI: 10.1038/s41467-021-21735-x] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 06/29/2020] [Accepted: 02/09/2021] [Indexed: 12/17/2022] Open
Abstract
Mapping biological processes in brain tissues requires piecing together numerous histological observations of multiple tissue samples. We present a direct method that generates readouts for a comprehensive panel of biomarkers from serial whole-brain slices, characterizing all major brain cell types, at scales ranging from subcellular compartments, individual cells, local multi-cellular niches, to whole-brain regions from each slice. We use iterative cycles of optimized 10-plex immunostaining with 10-color epifluorescence imaging to accumulate highly enriched image datasets from individual whole-brain slices, from which seamless signal-corrected mosaics are reconstructed. Specific fluorescent signals of interest are isolated computationally, rejecting autofluorescence, imaging noise, cross-channel bleed-through, and cross-labeling. Reliable large-scale cell detection and segmentation are achieved using deep neural networks. Cell phenotyping is performed by analyzing unique biomarker combinations over appropriate subcellular compartments. This approach can accelerate pre-clinical drug evaluation and system-level brain histology studies by simultaneously profiling multiple biological processes in their native anatomical context.
Affiliation(s)
- Dragan Maric
- National Institute of Neurological Disorders and Stroke, Bethesda, MD, 20892, USA
- Jahandar Jahanipour
- National Institute of Neurological Disorders and Stroke, Bethesda, MD, 20892, USA
- Cullen College of Engineering, University of Houston, Houston, TX, 77204, USA
- Xiaoyang Rebecca Li
- Cullen College of Engineering, University of Houston, Houston, TX, 77204, USA
- Aditi Singh
- Cullen College of Engineering, University of Houston, Houston, TX, 77204, USA
- Aryan Mobiny
- Cullen College of Engineering, University of Houston, Houston, TX, 77204, USA
- Hien Van Nguyen
- Cullen College of Engineering, University of Houston, Houston, TX, 77204, USA
- Andrea Sedlock
- National Institute of Neurological Disorders and Stroke, Bethesda, MD, 20892, USA
- Kedar Grama
- Cullen College of Engineering, University of Houston, Houston, TX, 77204, USA
- Badrinath Roysam
- Cullen College of Engineering, University of Houston, Houston, TX, 77204, USA
18
Jeon JJ, Kim Y, Won S, Choi H. Primal path algorithm for compositional data analysis. Comput Stat Data Anal 2020. [DOI: 10.1016/j.csda.2020.106958] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
19
Heravi MAY, Maghooli K, Nowshiravan Rahatabad F, Rezaee R. Application of a neural interface for restoration of leg movements: Intra-spinal stimulation using the brain electrical activity in spinally injured rabbits. J Appl Biomed 2020; 18:33-40. [PMID: 34907723 DOI: 10.32725/jab.2020.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/01/2020] [Accepted: 06/12/2020] [Indexed: 11/05/2022] Open
Abstract
This study aimed to design a neural interface that extracts movement commands from the brain to generate appropriate intra-spinal stimulation to restore leg movement. This study comprised four steps: (1) Recording electrocorticographic (ECoG) signals and corresponding leg movements in different trials. (2) Partial laminectomy to induce spinal cord injury (SCI) and detect motor modules in the spinal cord. (3) Delivering appropriate intra-spinal stimulation to the motor modules to restore the movements to those documented before SCI. (4) Development of a neural interface based on a sparse linear regression (SLiR) model to detect movement commands transmitted from the brain to the modules. The correlation coefficient (CC) and normalized root mean square (NRMS) error were calculated to evaluate the effectiveness of the neural interface. It was found that, when the detected spinal cord modules were stimulated, the joint angle evaluated before SCI was not significantly different from that of post-SCI (P > 0.05). Based on the results of the SLiR model, overall CC and NRMS values were 0.63 ± 0.14 and 0.34 ± 0.16 (mean ± SD), respectively. These results indicated that ECoG data contained information about intra-spinal stimulations and that the developed neural interface could produce intra-spinal stimulation based on ECoG data for restoration of leg movements after SCI.
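The two evaluation metrics named in the abstract can be sketched as follows. The abstract does not say how the root mean square error is normalized, so dividing by the range of the reference signal is an assumption here (dividing by the standard deviation or the mean are other common choices). The function names and the toy signals are hypothetical.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # Undefined (division by zero) if either signal is constant.
    return cov / (norm_x * norm_y)


def nrms_error(reference, predicted):
    """Root mean square error divided by the range of the reference.

    Range normalization is an assumption; the source does not specify it.
    """
    n = len(reference)
    rmse = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)
    return rmse / (max(reference) - min(reference))


# Toy signals (made up): a measured joint angle and a decoded estimate.
measured = [0.0, 2.0]
decoded = [1.0, 1.0]
error = nrms_error(measured, decoded)
```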
Affiliation(s)
- Keivan Maghooli
- Islamic Azad University, Science and Research Branch, Department of Biomedical Engineering, Tehran, Iran
- Ramin Rezaee
- Mashhad University of Medical Sciences, Faculty of Medicine, Clinical Research Unit, Mashhad, Iran; Mashhad University of Medical Sciences, Neurogenic Inflammation Research Center, Mashhad, Iran
20
Blanquero R, Carrizosa E, Ramírez-Cobo P, Sillero-Denamiel MR. A cost-sensitive constrained Lasso. Adv Data Anal Classif 2020. [DOI: 10.1007/s11634-020-00389-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/12/2023]
21
Solution paths for the generalized lasso with applications to spatially varying coefficients regression. Comput Stat Data Anal 2020. [DOI: 10.1016/j.csda.2019.106821] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/21/2022]
22
James GM, Paulson C, Rusmevichientong P. Penalized and Constrained Optimization: An Application to High-Dimensional Website Advertising. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2019.1609970] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
Affiliation(s)
- Gareth M. James
- Marshall School of Business, University of Southern California, Los Angeles, CA
- Courtney Paulson
- Smith School of Business, University of Maryland, College Park, MD
23
Li P, Taylor JM, Kong S, Jolly S, Schipper MJ. A utility approach to individualized optimal dose selection using biomarkers. Biom J 2019; 62:386-397. [DOI: 10.1002/bimj.201900030] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/14/2019] [Revised: 09/02/2019] [Accepted: 09/08/2019] [Indexed: 11/07/2022]
Affiliation(s)
- Pin Li
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Spring Kong
- Department of Radiation Oncology, Case Western Reserve University, Cleveland, OH, USA
- Shruti Jolly
- Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, USA
- Matthew J. Schipper
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, USA
24
Shi Y, Ng CT, Feng Z, Yiu KFC. A descent algorithm for constrained LAD-Lasso estimation with applications in portfolio selection. J Appl Stat 2019. [DOI: 10.1080/02664763.2019.1575952] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Affiliation(s)
- Yue Shi
- School of Mathematics and Systems Science, Beihang University, Beijing, People's Republic of China
- Chi Tim Ng
- Department of Statistics, Chonnam National University, Gwangju, South Korea
- Zhiguo Feng
- Department of Mathematics and Computer Science, Guangdong Ocean University, Guangdong, People's Republic of China
- Ka-Fai Cedric Yiu
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Kowloon, Hong Kong