Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Number

Cited by Other Article(s)

101

Zahid FM, Faisal S, Heumann C. Multiple imputation with compatibility for high-dimensional data. PLoS One 2021;16:e0254112. [PMID: 34237092 PMCID: PMC8266107 DOI: 10.1371/journal.pone.0254112] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/02/2020] [Accepted: 06/20/2021] [Indexed: 11/18/2022]

102

Mirzaei M, Furxhi I, Murphy F, Mullins M. A Machine Learning Tool to Predict the Antibacterial Capacity of Nanoparticles. NANOMATERIALS (BASEL, SWITZERLAND) 2021;11:1774. [PMID: 34361160 PMCID: PMC8308172 DOI: 10.3390/nano11071774] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/20/2021] [Revised: 06/13/2021] [Accepted: 07/06/2021] [Indexed: 12/22/2022]

Abstract

The emergence and rapid spread of multidrug-resistant bacterial strains are a public health concern. This emergence is caused by the overuse and misuse of antibiotics, which drives the evolution of antibiotic-resistant strains. Nanoparticles (NPs) are objects with all three external dimensions in the nanoscale, which ranges from 1 to 100 nm. Research on NPs with enhanced antimicrobial activity as alternatives to antibiotics has grown due to the increased incidence of nosocomial and community-acquired infections caused by pathogens. Machine learning (ML) tools have been used in the field of nanoinformatics with promising results. As a consequence of evident achievements on a wide range of predictive tasks, ML techniques are attracting significant interest across a variety of stakeholders. In this article, we present an ML tool that successfully predicts the antibacterial capacity of NPs, and the model's validation demonstrates encouraging results (R² = 0.78). The data were compiled after a literature review of 60 articles and consist of key physico-chemical (p-chem) properties and experimental conditions (exposure variables and bacterial clustering) from in vitro studies. Following data homogenization and pre-processing, we trained various regression algorithms and validated them using diverse performance metrics. Finally, an attribute importance evaluation, which ranks the attributes that are most important in predicting the outcome, was performed. The attribute importance revealed that NP core size, the exposure dose, and the species of bacterium are key variables in predicting the antibacterial effect of NPs. This tool assists various stakeholders and scientists in predicting the antibacterial effects of NPs based on their p-chem properties and diverse exposure settings. This concept also aids the safe-by-design paradigm by incorporating functionality tools.
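The validation figure quoted above (R² = 0.78) is a coefficient of determination. As a minimal sketch of how that metric is computed, assuming plain Python lists of observed and predicted values (the function name `r_squared` and the sample values are illustrative and do not come from the article):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not data from the study).
observed = [1.0, 2.0, 3.0]
print(r_squared(observed, [1.0, 2.0, 3.0]))  # perfect predictions
print(r_squared(observed, [2.0, 2.0, 2.0]))  # no better than the mean
```

A value of 1 means the predictions match the observations exactly, and a value of 0 means the model does no better than predicting the mean of the observations.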

103

Köhler C, Robitzsch A, Fährmann K, von Davier M, Hartig J. A semiparametric approach for item response function estimation to detect item misfit. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2021;74 Suppl 1:157-175. [PMID: 33332585 DOI: 10.1111/bmsp.12224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2019] [Revised: 09/17/2020] [Indexed: 06/12/2023]

104

Kim Y, Lee S, Jang JY, Lee S, Park T. Identifying miRNA-mRNA Integration Set Associated With Survival Time. Front Genet 2021;12:634922. [PMID: 34267778 PMCID: PMC8276759 DOI: 10.3389/fgene.2021.634922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2020] [Accepted: 04/06/2021] [Indexed: 11/26/2022]

105

Li W, Chekouo T. Bayesian group selection with non-local priors. Comput Stat 2021. [DOI: 10.1007/s00180-021-01115-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]

106

Chen D, Cremona MA, Qi Z, Mitra RD, Chiaromonte F, Makova KD. Human L1 Transposition Dynamics Unraveled with Functional Data Analysis. Mol Biol Evol 2021;37:3576-3600. [PMID: 32722770 DOI: 10.1093/molbev/msaa194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]

107

Oh B, Hwangbo S, Jung T, Min K, Lee C, Apio C, Lee H, Lee S, Moon MK, Kim SW, Park T. Prediction Models for the Clinical Severity of Patients With COVID-19 in Korea: Retrospective Multicenter Cohort Study. J Med Internet Res 2021;23:e25852. [PMID: 33822738 PMCID: PMC8054775 DOI: 10.2196/25852] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/18/2020] [Revised: 02/04/2021] [Accepted: 03/18/2021] [Indexed: 12/26/2022]

108

Gao W, Shu T, Liu Q, Ling S, Guan Y, Liu S, Zhou L. Predictive Modeling of Lignin Content for the Screening of Suitable Poplar Genotypes Based on Fourier Transform-Raman Spectrometry. ACS OMEGA 2021;6:8578-8587. [PMID: 33817518 PMCID: PMC8015071 DOI: 10.1021/acsomega.1c00400] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/22/2021] [Accepted: 03/03/2021] [Indexed: 05/26/2023]

109

Huang J, Jiao Y, Kang L, Liu J, Liu Y, Lu X. GSDAR: a fast Newton algorithm for $\ell_0$ regularized generalized linear models with statistical guarantee. Comput Stat 2021. [DOI: 10.1007/s00180-021-01098-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/21/2022]

110

Lu Z, Lou W. Bayesian approaches to variable selection: a comparative study from practical perspectives. Int J Biostat 2021;18:83-108. [PMID: 33761580 DOI: 10.1515/ijb-2020-0130] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/11/2020] [Accepted: 02/27/2021] [Indexed: 11/15/2022]

111

Guo G. Taylor quasi-likelihood for limited generalized linear models. J Appl Stat 2021;48:669-692. [DOI: 10.1080/02664763.2020.1743650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]

112

Wang C, Gonzalez Y, Shen C, Hrycushko B, Jia X. Simultaneous needle catheter selection and dwell time optimization for preplanning of high-dose-rate brachytherapy of prostate cancer. Phys Med Biol 2021;66:055028. [PMID: 33264753 DOI: 10.1088/1361-6560/abd00e] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]

Abstract

PURPOSE

Needle catheter positions critically affect the quality of treatment plans in prostate cancer high-dose-rate (HDR) brachytherapy. The current standard needle positioning approach is based on human intuition, which cannot guarantee a high-quality plan. This study proposed a method to simultaneously select needle catheter positions and determine dwell time for preplanning of HDR brachytherapy of prostate cancer.

METHODS

We formulated the needle catheter selection problem and inverse dwell time optimization problem in a unified framework. In addition to the dose objectives of the planning target volume (PTV) and organs at risk (OARs), the objective function incorporated a group-sparsity term with a needle-specific adaptive weighting scheme to generate high-quality plans with the minimal number of needle catheters. The optimization problem was solved by a fast-iterative shrinkage-thresholding algorithm. For validation purposes, we tested the proposed algorithm on 10 patient cases previously treated at our institution and compared the resulting plans with plans generated using needle catheters selected manually.
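The abstract does not give the exact objective function, so the following is only a sketch of the generic building block that a shrinkage-thresholding method uses for a weighted group-sparsity term: the group soft-thresholding (proximal) step. It assumes that the dwell times of one needle form one group. The names `group_soft_threshold` and `prox_group_sparsity`, the weights, and the numbers are hypothetical.

```python
import math


def group_soft_threshold(group, threshold):
    """Shrink a whole group toward zero; zero it out if its norm is small."""
    norm = math.sqrt(sum(x * x for x in group))
    if norm <= threshold:
        # The entire group is removed (here: the needle is not selected).
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * x for x in group]


def prox_group_sparsity(groups, step, weights):
    """Apply the proximal step group by group with per-group weights."""
    return [
        group_soft_threshold(g, step * w)
        for g, w in zip(groups, weights)
    ]


# Hypothetical dwell times for two needles (assumed values, not study data).
dwell_times = [[3.0, 4.0], [0.3, 0.4]]
needle_weights = [1.0, 1.0]  # stands in for the needle-specific adaptive weights
print(prox_group_sparsity(dwell_times, step=1.0, weights=needle_weights))
```

In this toy call the second needle's dwell times have norm 0.5, which is below the threshold of 1.0, so the whole group is set to zero; this is the mechanism by which a group-sparsity penalty reduces the number of needles in use.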

RESULTS

Compared to the plan with manually selected needle catheters, when normalizing both plans to the same PTV coverage V_100% = 95%, the plans generated by the proposed algorithm reduced median V_125% from 65% to 64%, but increased median V_150% from 35% to 38%, and V_200% from 14% to 16%. All planning objectives were met. All clinically important dosimetric parameters of OARs were reduced. D_1cc of bladder and rectum were reduced from 8.57 Gy to 8.50 Gy and from 7.24 Gy to 6.80 Gy, respectively. D_max of urethra was reduced from 15.85 Gy to 15.77 Gy. The median number of selected needle catheters was reduced by two. The computational time for solving the proposed optimization problem was ∼90 s using MATLAB.

CONCLUSION

The proposed algorithm was able to generate plans for prostate cancer HDR brachytherapy preplanning with increased median conformity index (from 0.73 to 0.77) and slightly lower median homogeneity index (from 0.64 to 0.62), with the number of selected needles reduced by two compared to the manual needle selection approach.

113

Frias M, Moyano JM, Rivero-Juarez A, Luna JM, Camacho Á, Fardoun HM, Machuca I, Al-Twijri M, Rivero A, Ventura S. Classification Accuracy of Hepatitis C Virus Infection Outcome: Data Mining Approach. J Med Internet Res 2021;23:e18766. [PMID: 33624609 PMCID: PMC7946589 DOI: 10.2196/18766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2020] [Revised: 11/02/2020] [Accepted: 12/17/2020] [Indexed: 11/30/2022]

114

Li Y, Tang C, Lu J, Wu J, Chang EF. Human cortical encoding of pitch in tonal and non-tonal languages. Nat Commun 2021;12:1161. [PMID: 33608548 PMCID: PMC7896081 DOI: 10.1038/s41467-021-21430-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 09/10/2020] [Accepted: 01/26/2021] [Indexed: 11/09/2022]

115

Sperger J, Shah KS, Lu M, Zhang X, Ungaro RC, Brenner EJ, Agrawal M, Colombel JF, Kappelman MD, Kosorok MR. Development and validation of multivariable prediction models for adverse COVID-19 outcomes in IBD patients. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2021. [PMID: 33501455 PMCID: PMC7836127 DOI: 10.1101/2021.01.15.21249889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

Abstract

Importance

Risk calculators can facilitate shared medical decision-making¹. Demographics, comorbidities, medication use, geographic region, and other factors may increase the risk for COVID-19-related complications among patients with IBD²,³.

Objectives

Develop an individualized prognostic risk prediction tool for predicting the probability of adverse COVID-19 outcomes in patients with IBD.

Design, Setting, and Participants

This study developed and validated prognostic penalized logistic regression models⁴ using reports to Surveillance Epidemiology of Coronavirus Under Research Exclusion for Inflammatory Bowel Disease (SECURE-IBD) from March–October 2020. Model development was done using a training data set (85% of cases reported March 13 – September 15, 2020), and model validation was conducted using a test data set (the remaining 15% of cases plus all cases reported September 16–October 20, 2020).

Main Outcomes and Measures

COVID-19 related:

Hospitalization+: composite outcome of hospitalization, ICU admission, mechanical ventilation, or death

ICU+: composite outcome of ICU admission, mechanical ventilation, or death

Death

We assessed the resulting models’ discrimination using the area under the curve (AUC) of the receiver-operator characteristic (ROC) curves and reported the corresponding 95% confidence intervals (CIs).
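The discrimination measure named above can be illustrated with a minimal sketch. It relies on the standard interpretation of ROC AUC as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative and are not taken from the study; confidence intervals are not computed here.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: 1 = adverse outcome, 0 = no adverse outcome (made-up values).
labels = [0, 0, 1, 1]
predicted_risk = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, predicted_risk))  # 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; rank-based implementations are preferable for large data sets.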

Results

We included 2709 cases from 59 countries (mean age 41.2 years [s.d. 18], 50.2% male). A total of 633 (24%) were hospitalized, 137 (5%) were admitted to the ICU or intubated, and 69 (3%) died. 2009 patients comprised the training set and 700 the test set.

The models demonstrated excellent discrimination, with a test set AUC (95% CI) of 0.79 (0.75, 0.83) for Hospitalization+, 0.88 (0.82, 0.95) for ICU+, and 0.94 (0.89, 0.99) for Death. Age, comorbidities, corticosteroid use, and male gender were associated with higher risk of death, while use of biologic therapies was associated with a lower risk.

Conclusions and Relevance

Prognostic models can effectively predict who is at higher risk for COVID-19-related adverse outcomes in a population of IBD patients. A free online risk calculator (https://covidibd.org/covid-19-risk-calculator/) is available for healthcare providers to facilitate discussion of risks due to COVID-19 with IBD patients. The tool numerically and visually summarizes the patient’s probabilities of adverse outcomes and associated CIs. Helping physicians identify their highest-risk patients will be important in the coming months as cases rise in the US and worldwide. This tool can also serve as a model for risk stratification in other chronic diseases.

116

A panel of two miRNAs correlated to systolic blood pressure is a good diagnostic indicator for stroke. Biosci Rep 2021;41:227391. [PMID: 33345284 PMCID: PMC7805026 DOI: 10.1042/bsr20203458] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/03/2020] [Revised: 12/07/2020] [Accepted: 12/10/2020] [Indexed: 12/21/2022]

117

Dang X, Huang S, Qian X. Risk Factor Identification in Heterogeneous Disease Progression with L1-Regularized Multi-state Models. JOURNAL OF HEALTHCARE INFORMATICS RESEARCH 2021;5:20-53. [DOI: 10.1007/s41666-020-00085-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2020] [Revised: 10/13/2020] [Accepted: 11/26/2020] [Indexed: 10/22/2022]

118

Kenney A, Chiaromonte F, Felici G. MIP-BOOST: Efficient and Effective L0 Feature Selection for Linear Regression. J Comput Graph Stat 2021;30:566-577. [DOI: 10.1080/10618600.2020.1845184] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 10/23/2022]

119

Zhu G, Zhao T. Deep-gKnock: Nonlinear group-feature selection with deep neural networks. Neural Netw 2021;135:139-147. [PMID: 33385830 DOI: 10.1016/j.neunet.2020.12.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/08/2020] [Revised: 11/26/2020] [Accepted: 12/02/2020] [Indexed: 01/21/2023]

120

An improved lasso regression model for evaluating the efficiency of intervention actions in a system reliability analysis. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05537-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 10/22/2022]

121

Hybrid safe–strong rules for efficient optimization in lasso-type problems. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107063] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]

122

Li M, Kong L, Su Z. Double fused Lasso regularized regression with both matrix and vector valued predictors. Electron J Stat 2021. [DOI: 10.1214/21-ejs1829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]

123

MOSS-Multi-Modal Best Subset Modeling in Smart Manufacturing. SENSORS 2021;21:s21010243. [PMID: 33401493 PMCID: PMC7796348 DOI: 10.3390/s21010243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2020] [Revised: 12/28/2020] [Accepted: 12/28/2020] [Indexed: 11/23/2022]

124

Conditional score matching for high-dimensional partial graphical models. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]

125

Qiu H, Li Y, Cheng S, Li J, He C, Li J. A Prognostic Microenvironment-Related Immune Signature via ESTIMATE (PROMISE Model) Predicts Overall Survival of Patients With Glioma. Front Oncol 2020;10:580263. [PMID: 33425732 PMCID: PMC7793983 DOI: 10.3389/fonc.2020.580263] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 07/05/2020] [Accepted: 10/22/2020] [Indexed: 12/13/2022]

Abstract

Objective

In the development of immunotherapies in gliomas, the tumor microenvironment (TME) needs to be investigated. We aimed to construct a prognostic microenvironment-related immune signature via ESTIMATE (PROMISE model) for glioma.

Methods

Stromal score (SS) and immune score (IS) were calculated via ESTIMATE for each glioma sample in The Cancer Genome Atlas (TCGA), and differentially expressed genes (DEGs) were identified between high-score and low-score groups. Prognostic DEGs were selected via univariate Cox regression analysis. Using the lower-grade glioma (LGG) data set in TCGA, we performed LASSO regression based on the prognostic DEGs and constructed a PROMISE model for glioma. The model was validated with survival analysis and the receiver operating characteristic (ROC) in TCGA glioma data sets (LGG, glioblastoma multiforme [GBM] and LGG+GBM) and the Chinese Glioma Genome Atlas (CGGA). A nomogram was developed to predict individual survival chances. Further, we explored the underlying mechanisms using gene set enrichment analysis (GSEA) and Cibersort analysis of tumor-infiltrating immune cells between risk groups as defined by the PROMISE model.

Results

We obtained 220 upregulated DEGs and 42 downregulated DEGs in both high-IS and high-SS groups. The Cox regression highlighted 155 prognostic DEGs, out of which we selected 4 genes (CD86, ANXA1, C5AR1, and CD5) to construct a PROMISE model. The model stratifies glioma patients in TCGA as well as in CGGA with distinct survival outcomes (P<0.05, hazard ratio [HR]>1) and acceptable predictive accuracy (AUCs>0.6). With the nomogram, an individualized survival chance could be predicted intuitively from age, tumor grade, isocitrate dehydrogenase (IDH) status, and the PROMISE risk score. ROC analysis showed significant discrimination, with areas under the curve (AUCs) of 0.917 and 0.817 in TCGA and CGGA, respectively. In GSEA, the risk groups in both data sets were significantly enriched in multiple immune-related pathways. The Cibersort analysis highlighted four immune cell types: CD8 T cells, neutrophils, follicular helper T (Tfh) cells, and natural killer (NK) cells.

Conclusions

The PROMISE model can further stratify both LGG and GBM patients with distinct survival outcomes. These findings may help further our understanding of TME in gliomas and shed light on immunotherapies.

126

Regression and subgroup detection for heterogeneous samples. Comput Stat 2020. [DOI: 10.1007/s00180-020-00965-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]

127

Huang A, Wu F. Two-stage adaptive integration of multi-source heterogeneous data based on an improved random subspace and prediction of default risk of microcredit. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-05489-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]

128

Bastien B, Boukhobza T, Dumond H, Gégout-Petit A, Muller-Gueudin A, Thiébaut C. A statistical methodology to select covariates in high-dimensional data under dependence. Application to the classification of genetic profiles in oncology. J Appl Stat 2020;49:764-781. [DOI: 10.1080/02664763.2020.1837083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]

129

AlJame M, Ahmad I, Imtiaz A, Mohammed A. Ensemble learning model for diagnosing COVID-19 from routine blood tests. INFORMATICS IN MEDICINE UNLOCKED 2020;21:100449. [PMID: 33102686 PMCID: PMC7572278 DOI: 10.1016/j.imu.2020.100449] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 08/11/2020] [Revised: 09/28/2020] [Accepted: 10/07/2020] [Indexed: 12/19/2022]

Abstract

BACKGROUND AND OBJECTIVES

The pandemic of novel coronavirus disease 2019 (COVID-19) has severely impacted human society with a massive death toll worldwide. There is an urgent need for early and reliable screening of COVID-19 patients to provide better and timely patient care and to combat the spread of the disease. In this context, recent studies have reported some key advantages of using routine blood tests for initial screening of COVID-19 patients. In this article, first we present a review of the emerging techniques for COVID-19 diagnosis using routine laboratory and/or clinical data. Then, we propose ERLX which is an ensemble learning model for COVID-19 diagnosis from routine blood tests.

METHOD

The proposed model uses three well-known diverse classifiers at the first level (extra trees, random forest, and logistic regression), which have different architectures and learning characteristics, and then combines their predictions using a second-level extreme gradient boosting (XGBoost) classifier to achieve better performance. For data preparation, the proposed methodology employs a KNNImputer algorithm to handle null values in the dataset, isolation forest (iForest) to remove outlier data, and a synthetic minority oversampling technique (SMOTE) to balance data distribution. For model interpretability, feature importances are reported using the SHapley Additive exPlanations (SHAP) technique.
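Of the preparation steps listed above, SMOTE is the easiest to show in a few lines. The sketch below illustrates only the core interpolation idea (a synthetic minority sample is placed on the segment between a minority point and one of its nearest minority neighbours). It is not the implementation used in the article; the function name `smote_sample`, the parameter defaults, and the toy points are assumptions.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and a near neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to `base`.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Toy minority class with two features (made-up values).
minority_points = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
print(smote_sample(minority_points))
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays within the region already occupied by real minority data.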

RESULTS

The proposed model was trained and evaluated by using a publicly available data set from Albert Einstein Hospital in Brazil, which consisted of 5644 data samples with 559 confirmed COVID-19 cases. The ensemble model achieved outstanding performance with an overall accuracy of 99.88% [95% CI: 99.6-100], AUC of 99.38% [95% CI: 97.5-100], a sensitivity of 98.72% [95% CI: 94.6-100] and a specificity of 99.99% [95% CI: 99.99-100].

DISCUSSION

The proposed model showed better performance than existing state-of-the-art studies (Banerjee et al., 2020; de Freitas Barbosa et al., 2020; de Moraes Batista et al., 2020; Soares et al., 2020) [3,22,56,71] on the same sets of features those studies employed. Compared with the average accuracy of 95.159% of the best-performing Bayes Net model (de Freitas Barbosa et al., 2020) [22], ERLX achieved an average accuracy of 99.94%. Compared with the AUC of 85% reported for the SVM model (de Moraes Batista et al., 2020) [56], ERLX obtained an AUC of 99.77%, in addition to improvements in sensitivity and specificity. Compared with the average sensitivity of 70.25% and specificity of 85.98% of the ER-COV model (Soares et al., 2020) [71], the ERLX model achieved a sensitivity of 99.47% and a specificity of 99.99%. The ERLX model also obtained considerably higher scores than the ANN model (Banerjee et al., 2020) [3] in all performance metrics. Therefore, the model presented is robust and can be deployed for reliable early and rapid screening of COVID-19 patients.

130

Zhou J, Viles WD, Lu B, Li Z, Madan JC, Karagas MR, Gui J, Hoen AG. Identification of microbial interaction network: zero-inflated latent Ising model based approach. BioData Min 2020;13:16. [PMID: 33042226 PMCID: PMC7542390 DOI: 10.1186/s13040-020-00226-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/02/2020] [Accepted: 09/22/2020] [Indexed: 12/12/2022]

Abstract

BACKGROUND

Throughout their lifespans, humans continually interact with the microbial world, including those organisms which live in and on the human body. Research in this domain has revealed the extensive links between the human-associated microbiota and health. In particular, the microbiota of the human gut plays essential roles in digestion, nutrient metabolism, immune maturation and homeostasis, neurological signaling, and endocrine regulation. Microbial interaction networks are frequently estimated from data and are an indispensable tool for representing and understanding the conditional correlation between the microbes. In this high-dimensional setting, zero-inflation and unit-sum constraint for relative abundance data pose challenges to the reliable estimation of microbial interaction networks.

METHODS AND RESULTS

To identify the microbial interaction network, the zero-inflated latent Ising (ZILI) model is proposed, which assumes that the distribution of relative abundance relies only on finite latent states and provides a novel way to solve issues induced by the unit-sum and zero-inflation constraints. A two-step algorithm is proposed for the model selection of ZILI. ZILI is evaluated through simulated data and subsequently applied to an infant gut microbiota dataset from the New Hampshire Birth Cohort Study. The results are compared with results from the Gaussian graphical model (GGM) and the dichotomous Ising model (DIS). Provided ZILI is the true data-generating model, the simulation studies show that the two-step algorithm can identify the graphical structure effectively and is robust to a range of parameter settings. For the infant gut microbiota dataset, the final estimated networks from GGM and ZILI have significant overlap, with ZILI tending to select a sparser network than GGM. From the shared subnetwork, a hub taxon, Lachnospiraceae, is identified whose involvement in human disease development has recently been reported in the literature.

CONCLUSIONS

Constraints induced by the relative abundance of microbiota, such as zero inflation and unit sum, render the conditional correlation analysis unreliable for conventional methods such as GGM. The proposed optimal categoricalization based ZILI model provides an alternative yet elegant way to deal with these difficulties. The results from ZILI have reasonable biological interpretation. This model can also be used to study the microbial interaction in other body parts.

131

Culos A, Tsai AS, Stanley N, Becker M, Ghaemi MS, McIlwain DR, Fallahzadeh R, Tanada A, Nassar H, Espinosa C, Xenochristou M, Ganio E, Peterson L, Han X, Stelzer IA, Ando K, Gaudilliere D, Phongpreecha T, Marić I, Chang AL, Shaw GM, Stevenson DK, Bendall S, Davis KL, Fantl W, Nolan GP, Hastie T, Tibshirani R, Angst MS, Gaudilliere B, Aghaeepour N. Integration of mechanistic immunological knowledge into a machine learning pipeline improves predictions. NAT MACH INTELL 2020;2:619-628. [PMID: 33294774 PMCID: PMC7720904 DOI: 10.1038/s42256-020-00232-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 01/13/2020] [Accepted: 08/26/2020] [Indexed: 12/17/2022]

Affiliation(s)

Anthony Culos Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA These authors contributed equally: Anthony Culos, Amy S. Tsai
Amy S Tsai Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA These authors contributed equally: Anthony Culos, Amy S. Tsai
Natalie Stanley Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Martin Becker Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Mohammad S Ghaemi Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA Digital Technologies Research Centre, National Research Council Canada, Toronto, Ontario, Canada
David R McIlwain Department of Microbiology and Immunology, Baxter Laboratory in Stem Cell Biology, Stanford University School of Medicine, Stanford, CA, USA
Ramin Fallahzadeh Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Athena Tanada Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Huda Nassar Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Camilo Espinosa Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Maria Xenochristou Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Edward Ganio Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA
Laura Peterson Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Pediatrics, Division of Neonatal and Developmental Medicine, Stanford University School of Medicine, Stanford, CA, USA
Xiaoyuan Han Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA
Ina A Stelzer Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA
Kazuo Ando Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA
Dyani Gaudilliere Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA
Thanaphong Phongpreecha Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
Ivana Marić Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Pediatrics, Division of Neonatal and Developmental Medicine, Stanford University School of Medicine, Stanford, CA, USA
Alan L Chang Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA
Gary M Shaw Department of Pediatrics, Division of Neonatal and Developmental Medicine, Stanford University School of Medicine, Stanford, CA, USA
David K Stevenson Department of Pediatrics, Division of Neonatal and Developmental Medicine, Stanford University School of Medicine, Stanford, CA, USA
Sean Bendall Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
Kara L Davis Department of Pediatrics, Division of Neonatal and Developmental Medicine, Stanford University School of Medicine, Stanford, CA, USA
Wendy Fantl Department of Microbiology and Immunology, Baxter Laboratory in Stem Cell Biology, Stanford University School of Medicine, Stanford, CA, USA Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA, USA Department of Urology, Stanford University School of Medicine, Stanford, CA, USA
Garry P Nolan Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
Trevor Hastie Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA Department of Statistics, Stanford University, Stanford, CA, USA
Robert Tibshirani Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA Department of Statistics, Stanford University, Stanford, CA, USA
Martin S Angst Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA These authors jointly supervised this work: Martin S. Angst, Brice Gaudilliere, Nima Aghaeepour
Brice Gaudilliere Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Pediatrics, Division of Neonatal and Developmental Medicine, Stanford University School of Medicine, Stanford, CA, USA These authors jointly supervised this work: Martin S. Angst, Brice Gaudilliere, Nima Aghaeepour
Nima Aghaeepour Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA Department of Pediatrics, Division of Neonatal and Developmental Medicine, Stanford University School of Medicine, Stanford, CA, USA These authors jointly supervised this work: Martin S. Angst, Brice Gaudilliere, Nima Aghaeepour


132

Davagdorj K, Pham VH, Theera-Umpon N, Ryu KH. XGBoost-Based Framework for Smoking-Induced Noncommunicable Disease Prediction. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020;17:ijerph17186513. [PMID: 32906777 PMCID: PMC7558165 DOI: 10.3390/ijerph17186513] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/22/2020] [Revised: 08/28/2020] [Accepted: 09/05/2020] [Indexed: 12/23/2022]

133

Detmer FJ, Hadad S, Chung BJ, Mut F, Slawski M, Juchler N, Kurtcuoglu V, Hirsch S, Bijlenga P, Uchiyama Y, Fujimura S, Yamamoto M, Murayama Y, Takao H, Koivisto T, Frösen J, Cebral JR. Extending statistical learning for aneurysm rupture assessment to Finnish and Japanese populations using morphology, hemodynamics, and patient characteristics. Neurosurg Focus 2020;47:E16. [PMID: 31261120 DOI: 10.3171/2019.4.focus19145] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Accepted: 04/09/2019] [Indexed: 11/06/2022]

Abstract

OBJECTIVE

Incidental aneurysms pose a challenge for physicians, who need to weigh the rupture risk against the risks associated with treatment and its complications. A statistical model could potentially support such treatment decisions. A recently developed aneurysm rupture probability model performed well in the US data used for model training and in data from two European cohorts for external validation. Because Japanese and Finnish patients are known to have a higher aneurysm rupture risk, the authors' goals in the present study were to evaluate this model using data from Japanese and Finnish patients and to compare it with new models trained with Finnish and Japanese data.

METHODS

Patient and image data on 2129 aneurysms in 1472 patients were used. Of these aneurysm cases, 1631 had been collected mainly from US hospitals, 249 from European (other than Finnish) hospitals, 147 from Japanese hospitals, and 102 from Finnish hospitals. Computational fluid dynamics simulations and shape analyses were conducted to quantitatively characterize each aneurysm's shape and hemodynamics. Next, the previously developed model's discrimination was evaluated using the Finnish and Japanese data in terms of the area under the receiver operating characteristic curve (AUC). Models with and without interaction terms between patient population and aneurysm characteristics were then trained and evaluated on data from all four cohorts, using repeated random splits of the data into training and test sets.

RESULTS

The US model's AUC was reduced to 0.70 and 0.72, respectively, in the Finnish and Japanese data compared to 0.82 and 0.86 in the European and US data. When training the model with Japanese and Finnish data, the average AUC increased only slightly for the Finnish sample (to 0.76 ± 0.16) and Finnish and Japanese cases combined (from 0.74 to 0.75 ± 0.14) and decreased for the Japanese data (to 0.66 ± 0.33). In models including interaction terms, the AUC in the Finnish and Japanese data combined increased significantly to 0.83 ± 0.10.

CONCLUSIONS

Developing an aneurysm rupture prediction model that applies to Japanese and Finnish aneurysms requires including data from these two cohorts for model training, as well as interaction terms between patient population and the other variables in the model. When including this information, the performance of such a model with Japanese and Finnish data is close to its performance with US or European data. These results suggest that population-specific differences determine how hemodynamics and shape associate with rupture risk in intracranial aneurysms.
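
The evaluation scheme described in this abstract (test AUC averaged over repeated random train/test splits, for models with and without a population interaction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data, the sign-only "model", and all function names are hypothetical and are not the authors' pipeline or data. The toy cohort is built so that the feature's effect flips between two populations, which is an extreme case of the population-specific differences the abstract reports.

```python
import random
import statistics


def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive case scores higher; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def slope_sign(rows):
    """Direction in which the feature separates the classes (+1.0 or -1.0)."""
    pos = [x for (x, _), y in rows if y == 1]
    neg = [x for (x, _), y in rows if y == 0]
    if not pos or not neg:
        return 1.0  # arbitrary fallback when a class is missing
    return 1.0 if statistics.mean(pos) >= statistics.mean(neg) else -1.0


def fit_pooled(train):
    """No interaction: one direction shared by all populations."""
    sign = slope_sign(train)
    return lambda features: sign * features[0]


def fit_with_interaction(train):
    """Population x feature interaction: one direction per population."""
    signs = {}
    for group in {g for (_, g), _ in train}:
        signs[group] = slope_sign([r for r in train if r[0][1] == group])
    return lambda features: signs.get(features[1], 1.0) * features[0]


def repeated_split_auc(rows, fit, n_repeats=20, test_fraction=0.3, seed=0):
    """Mean and standard deviation of test AUC over repeated random splits."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test, train = shuffled[:n_test], shuffled[n_test:]
        score = fit(train)
        aucs.append(auc([score(f) for f, _ in test], [y for _, y in test]))
    return statistics.mean(aucs), statistics.stdev(aucs)


def make_toy_data(n_per_group=100, seed=1):
    """Synthetic cohort (hypothetical) in which the feature's effect flips
    between the two populations."""
    rng = random.Random(seed)
    rows = []
    for group in (0, 1):
        for _ in range(n_per_group):
            x = rng.uniform(-1.0, 1.0)
            y = int(x > 0) if group == 0 else int(x < 0)
            rows.append(((x, group), y))
    return rows


rows = make_toy_data()
for name, fit in [("pooled", fit_pooled), ("interaction", fit_with_interaction)]:
    mean_auc, sd_auc = repeated_split_auc(rows, fit)
    print(f"{name}: AUC = {mean_auc:.2f} ± {sd_auc:.2f}")
```

By construction, the interaction variant separates the toy classes perfectly while the pooled variant cannot; real cohorts would show far smaller gaps, and a real analysis would fit a proper regression model rather than a sign rule.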


Affiliation(s)

Felicitas J Detmer 1 Bioengineering Department and
Sara Hadad 1 Bioengineering Department and
Bong Jae Chung 2 Department of Mathematical Sciences, Montclair State University, Montclair, New Jersey
Fernando Mut 1 Bioengineering Department and
Martin Slawski 3 Statistics Department, George Mason University, Fairfax, Virginia
Norman Juchler 4 Institute of Applied Simulation, ZHAW University of Applied Sciences, Wädenswil, Switzerland; 5 The Interface Group, Institute of Physiology, University of Zürich, Switzerland
Vartan Kurtcuoglu 5 The Interface Group, Institute of Physiology, University of Zürich, Switzerland
Sven Hirsch 4 Institute of Applied Simulation, ZHAW University of Applied Sciences, Wädenswil, Switzerland
Philippe Bijlenga 6 Clinical Neurosciences Department, University of Geneva, Switzerland
Yuya Uchiyama 7 Graduate School of Mechanical Engineering, Tokyo University of Science, Tokyo, Japan; Departments of 8 Innovation for Medical Information Technology and
Soichiro Fujimura 7 Graduate School of Mechanical Engineering, Tokyo University of Science, Tokyo, Japan; Departments of 8 Innovation for Medical Information Technology and
Makoto Yamamoto 9 Department of Mechanical Engineering, Tokyo University of Science, Tokyo, Japan; and
Yuichi Murayama 10 Neurosurgery, The Jikei University of Medicine, Tokyo, Japan
Hiroyuki Takao 7 Graduate School of Mechanical Engineering, Tokyo University of Science, Tokyo, Japan; Departments of 8 Innovation for Medical Information Technology and; 10 Neurosurgery, The Jikei University of Medicine, Tokyo, Japan
Timo Koivisto 11 Hemorrhagic Brain Pathology Research Group, Department of Neurosurgery, Kuopio University Hospital, Kuopio, Finland
Juhana Frösen 11 Hemorrhagic Brain Pathology Research Group, Department of Neurosurgery, Kuopio University Hospital, Kuopio, Finland
Juan R Cebral 1 Bioengineering Department and


134

Yao Y, Luo XL. Improving vertical positioning accuracy with the weighted multinomial logistic regression classifier. SN APPLIED SCIENCES 2020. [DOI: 10.1007/s42452-020-03240-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022] Open

135

Stieger M, Eck M, Rüegger D, Kowatsch T, Flückiger C, Allemand M. Who wants to become more conscientious, more extraverted, or less neurotic with the help of a digital intervention? JOURNAL OF RESEARCH IN PERSONALITY 2020. [DOI: 10.1016/j.jrp.2020.103983] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]

136

Alhamzawi R, Taha Mohammad Ali H. A new Gibbs sampler for Bayesian lasso. COMMUN STAT-SIMUL C 2020. [DOI: 10.1080/03610918.2018.1508699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

137

Jiang J, Wang C, Wu J, Qin W, Xu M, Yin E. Temporal Combination Pattern Optimization Based on Feature Selection Method for Motor Imagery BCIs. Front Hum Neurosci 2020;14:231. [PMID: 32714167 PMCID: PMC7344307 DOI: 10.3389/fnhum.2020.00231] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2020] [Accepted: 05/25/2020] [Indexed: 11/19/2022] Open

138

An B, Zhang B. Logistic regression with image covariates via the combination of L1 and Sobolev regularizations. PLoS One 2020;15:e0234975. [PMID: 32589677 PMCID: PMC7319310 DOI: 10.1371/journal.pone.0234975] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2019] [Accepted: 06/06/2020] [Indexed: 11/19/2022] Open

139

Stanciu A, Banciu M, Sadighi A, Marshall KA, Holland NR, Abedi V, Zand R. A predictive analytics model for differentiating between transient ischemic attacks (TIA) and its mimics. BMC Med Inform Decis Mak 2020;20:112. [PMID: 32552700 PMCID: PMC7302339 DOI: 10.1186/s12911-020-01154-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2020] [Accepted: 06/12/2020] [Indexed: 12/22/2022] Open

140

Jiang L, Greenwood CMT, Yao W, Li L. Bayesian Hyper-LASSO Classification for Feature Selection with Application to Endometrial Cancer RNA-seq Data. Sci Rep 2020;10:9747. [PMID: 32546735 PMCID: PMC7297975 DOI: 10.1038/s41598-020-66466-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2019] [Accepted: 04/29/2020] [Indexed: 11/30/2022] Open

141

Yang Z, Chen Z, Wang C. An accelerated stochastic variance-reduced method for machine learning problems. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

142

Chu BB, Keys KL, German CA, Zhou H, Zhou JJ, Sobel EM, Sinsheimer JS, Lange K. Iterative hard thresholding in genome-wide association studies: Generalized linear models, prior weights, and double sparsity. Gigascience 2020;9:giaa044. [PMID: 32491161 PMCID: PMC7268817 DOI: 10.1093/gigascience/giaa044] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2019] [Revised: 02/27/2020] [Accepted: 04/14/2020] [Indexed: 11/17/2022] Open

143

Zhang C, Yu Z, Fu H, Zhu P, Chen L, Hu Q. Hybrid Noise-Oriented Multilabel Learning. IEEE TRANSACTIONS ON CYBERNETICS 2020;50:2837-2850. [PMID: 30762579 DOI: 10.1109/tcyb.2019.2894985] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

144

Xia Y. Correlation and association analyses in microbiome study integrating multiomics in health and disease. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2020;171:309-491. [PMID: 32475527 DOI: 10.1016/bs.pmbts.2020.04.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]

Abstract

Correlation and association analyses are among the most widely used statistical methods in research fields, including microbiome and integrative multiomics studies. Correlation and association have two implications: dependence and co-occurrence. Microbiome data are structured as a phylogenetic tree and have several unique characteristics, including high dimensionality, compositionality, sparsity with excess zeros, and heterogeneity. These unique characteristics cause several statistical issues when analyzing microbiome data and integrating multiomics data, such as large p and small n, dependency, overdispersion, and zero-inflation. In microbiome research, on the one hand, classic correlation and association methods are still applied in real studies and used for the development of new methods; on the other hand, new methods have been developed to target statistical issues arising from the unique characteristics of microbiome data. Here, we first provide a comprehensive view of classic and newly developed univariate correlation and association-based methods. We discuss the appropriateness and limitations of using classic methods and demonstrate how the newly developed methods mitigate the issues of microbiome data. Second, we emphasize that concepts of correlation and association analyses have been shifted by introducing network analysis, microbe-metabolite interactions, functional analysis, etc. Third, we introduce multivariate correlation and association-based methods, which are organized by the categories of exploratory, interpretive, and discriminatory analyses and classification methods. Fourth, we focus on the hypothesis testing of univariate and multivariate regression-based association methods, including alpha and beta diversities-based, count-based, and relative abundance (or compositional)-based association analyses. We demonstrate the characteristics and limitations of each approach. Fifth, we introduce two specific microbiome-based methods: phylogenetic tree-based association analysis and testing for survival outcomes. Sixth, we provide an overall view of longitudinal methods in the analysis of microbiome and omics data, which cover standard, static, regression-based time series methods, principal trend analysis, and newly developed univariate overdispersed and zero-inflated as well as multivariate distance/kernel-based longitudinal models. Finally, we comment on current association analysis and the future direction of association analysis in microbiome and multiomics studies.


145

Lee K, Cao X. Bayesian group selection in logistic regression with application to MRI data analysis. Biometrics 2020;77:391-400. [PMID: 32365231 DOI: 10.1111/biom.13290] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2019] [Revised: 04/24/2020] [Accepted: 04/27/2020] [Indexed: 12/22/2022]

146

A Comparative Analysis of Machine Learning Methods for Class Imbalance in a Smoking Cessation Intervention. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10093307] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Abstract

Smoking is a major public health issue and contributes significantly to premature death. In recent years, numerous decision support systems have been developed to deal with smoking cessation based on machine learning methods. However, the inevitable class imbalance is considered a major challenge in deploying such systems. In this paper, we present an empirical comparison of machine learning techniques for dealing with the class imbalance problem in the prediction of smoking cessation intervention among the Korean population. To address class imbalance, the objective of this paper is to improve prediction performance using two synthetic oversampling techniques: the synthetic minority over-sampling technique (SMOTE) and adaptive synthetic sampling (ADASYN). The experimental design comprises three components. First, the best representative features are selected in two phases: the lasso method and multicollinearity analysis. Second, newly balanced data are generated using the SMOTE and ADASYN techniques. Third, machine learning classifiers are applied to construct the prediction models among all subjects and each gender. To assess the effectiveness of the prediction models, the f-score, type I error, type II error, balanced accuracy and geometric mean indices are used. Comprehensive analysis demonstrates that Gradient Boosting Trees (GBT), Random Forest (RF) and multilayer perceptron neural network (MLP) classifiers achieved the best performances in all subjects and each gender when SMOTE and ADASYN were utilized. The SMOTE with GBT and RF models also provide feature importance scores that enhance the interpretability of the decision-support system. In addition, the results show that the presented synthetic oversampling techniques with machine learning models outperformed baseline models in smoking cessation prediction.
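
The core idea behind SMOTE, as named in this abstract, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified, standard-library-only illustration under stated assumptions: the function name `smote`, its parameters, and the random choice of base sample are illustrative choices, not the study's implementation. In practice one would more likely rely on a maintained library such as imbalanced-learn, which provides `SMOTE` and `ADASYN` classes.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature minority class with four observed samples.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_points, n_new=6, k=2, seed=1)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point is a convex combination of two observed minority points, it always stays inside the range spanned by the original minority samples. Oversampling should be applied to training data only, so that test-set metrics are computed on real observations.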

147

Arayeshgari M, Tapak L, Roshanaei G, Poorolajal J, Ghaleiha A. Application of group smoothly clipped absolute deviation method in identifying correlates of psychiatric distress among college students. BMC Psychiatry 2020;20:198. [PMID: 32366242 PMCID: PMC7199302 DOI: 10.1186/s12888-020-02591-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/14/2019] [Accepted: 04/07/2020] [Indexed: 11/21/2022] Open

Abstract

BACKGROUND

College students are at an increased risk of psychiatric distress. Identifying its important correlates using more reliable statistical models, instead of inefficient traditional variable selection methods such as stepwise regression, is therefore of great importance. The objective of this study was to investigate correlates of psychiatric distress among college students in Iran using the group smoothly clipped absolute deviation (SCAD) method.

METHODS

A total of 1259 college students volunteered to participate in this cross-sectional study (Jan-May 2016) at Hamadan University of Medical Sciences, Iran. The data were collected using a self-administered questionnaire consisting of demographic information, a behavioral risk factors checklist and the GHQ-28 questionnaire (with a cut-off of 23 to measure psychiatric distress, recommended by the Iranian version of the questionnaire). Penalized logistic regression with a group-SCAD regularization method was used to analyze the data (α = 0.05).

RESULTS

The majority of students were aged 18-25 (87.61%), and 60.76% of them were female. About 41% of students had psychiatric distress. Significant correlates of psychiatric distress among college students selected by group-SCAD included the average grade, educational level, being optimistic about the future, having a boy/girlfriend, having had an emotional breakup, the average daily number of cigarettes, substance abuse during the previous month, and ever having had suicidal thoughts (P < 0.05).

CONCLUSIONS

Penalized logistic regression methods such as group-SCAD and group-Adaptive-LASSO should be considered as plausible alternatives to stepwise regression for identifying correlates of a binary response. Several behavioral variables were associated with psychological distress, which highlights the need for interventional programs that address multiple factors and behavioral changes.
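
For readers unfamiliar with the penalty named in this abstract, the following sketch evaluates the SCAD penalty of Fan and Li (2001) and a group version that applies it to the Euclidean norm of each coefficient group. It assumes the unweighted form (no scaling by group size) and the conventional default a = 3.7; the function names and the example grouping are hypothetical, and the exact variant used in the study is not stated in the abstract. The sketch covers only the penalty term that would be added to the logistic loss, not the fitting algorithm.

```python
import math


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for a coefficient magnitude theta (lam >= 0, a > 2).

    Linear (lasso-like) near zero, quadratic in between, and constant for
    large magnitudes, so large coefficients are not shrunk further.
    """
    if lam < 0 or a <= 2:
        raise ValueError("requires lam >= 0 and a > 2")
    theta = abs(theta)
    if theta <= lam:
        return lam * theta
    if theta <= a * lam:
        return (2 * a * lam * theta - theta ** 2 - lam ** 2) / (2 * (a - 1))
    return lam ** 2 * (a + 1) / 2


def group_scad_penalty(beta, groups, lam, a=3.7):
    """Sum of SCAD penalties applied to the L2 norm of each group.

    `groups` maps a (hypothetical) group name to the indices of its
    coefficients, e.g. the dummy variables of one categorical predictor.
    """
    total = 0.0
    for indices in groups.values():
        norm = math.sqrt(sum(beta[i] ** 2 for i in indices))
        total += scad_penalty(norm, lam, a)
    return total


# Hypothetical coefficients: a strong group (norm 5.0) and a weak one (norm 0.5).
beta = [3.0, 4.0, 0.0, 0.5]
groups = {"g1": [0, 1], "g2": [2, 3]}
print(group_scad_penalty(beta, groups, lam=1.0))
```

Penalizing the group norm means that all coefficients of a categorical predictor enter or leave the model together, which is the usual motivation for group penalties over coefficient-wise ones.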


148

Jiang H, Fan X. A consistent variable screening procedure with family-wise error control. J STAT COMPUT SIM 2020. [DOI: 10.1080/00949655.2020.1724291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

149

Bhatnagar SR, Yang Y, Lu T, Schurr E, Loredo-Osti JC, Forest M, Oualkacha K, Greenwood CMT. Simultaneous SNP selection and adjustment for population structure in high dimensional prediction models. PLoS Genet 2020;16:e1008766. [PMID: 32365090 PMCID: PMC7224575 DOI: 10.1371/journal.pgen.1008766] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2019] [Revised: 05/14/2020] [Accepted: 04/08/2020] [Indexed: 12/23/2022] Open

150

Detmer FJ, Cebral J, Slawski M. A note on coding and standardization of categorical variables in (sparse) group lasso regression. J Stat Plan Inference 2020. [DOI: 10.1016/j.jspi.2019.08.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]