Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Speiser JL, Miller ME, Tooze J, Ip E. A Comparison of Random Forest Variable Selection Methods for Classification Prediction Modeling. Expert Syst Appl 2019;134:93-101. [PMID: 32968335 PMCID: PMC7508310 DOI: 10.1016/j.eswa.2019.05.028] [Citation(s) in RCA: 228] [Impact Index Per Article: 45.6] [Indexed: 05/17/2023]

Number

Cited by Other Article(s)

201

Hong W, Lu Y, Zhou X, Jin S, Pan J, Lin Q, Yang S, Basharat Z, Zippi M, Goyal H. Usefulness of Random Forest Algorithm in Predicting Severe Acute Pancreatitis. Front Cell Infect Microbiol 2022;12:893294. [PMID: 35755843 PMCID: PMC9226542 DOI: 10.3389/fcimb.2022.893294] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/10/2022] [Accepted: 04/29/2022] [Indexed: 02/05/2023] Open

Abstract

BACKGROUND AND AIMS

This study aimed to develop an interpretable random forest model for predicting severe acute pancreatitis (SAP).

METHODS

Clinical and laboratory data of 648 patients with acute pancreatitis were retrospectively reviewed and randomly assigned to the training set and test set in a 3:1 ratio. Univariate analysis was used to select candidate predictors for SAP. Random forest (RF) and logistic regression (LR) models were developed on the training sample. The prediction models were then applied to the test sample. The performance of the risk models was measured by calculating the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision-recall curve. We provided visualized interpretation by using local interpretable model-agnostic explanations (LIME).

RESULTS

The LR model was developed to predict SAP as the following function: -1.10 - 0.13 × albumin (g/L) + 0.016 × serum creatinine (μmol/L) + 0.14 × glucose (mmol/L) + 1.63 × pleural effusion (0 = no, 1 = yes). The coefficients of this formula were used to build a nomogram. The RF model consisted of 16 variables identified by univariate analysis. It was developed and validated by tenfold cross-validation on the training sample. Variable importance analysis suggested that blood urea nitrogen, serum creatinine, albumin, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, calcium, and glucose were the seven most important predictors of SAP. The AUCs of the RF model in tenfold cross-validation of the training set and in the test set were 0.89 and 0.96, respectively. Both the area under the precision-recall curve and the diagnostic accuracy of the RF model were higher than those of the LR model and the BISAP score. LIME plots were used to explain individualized predictions of the RF model.
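As a rough illustration of how the LR function above could be evaluated, the following Python sketch computes the linear score and converts it to a probability. Treating the printed expression as a log-odds (logit) linear predictor is an assumption of this sketch, as are the function names and the example patient values, none of which come from the abstract.

```python
import math

# Coefficients as printed in the abstract. Interpreting the expression as a
# log-odds (logit) linear predictor is an assumption of this sketch.
INTERCEPT = -1.10
COEF_ALBUMIN = -0.13          # per g/L
COEF_CREATININE = 0.016       # per umol/L
COEF_GLUCOSE = 0.14           # per mmol/L
COEF_PLEURAL_EFFUSION = 1.63  # 0 = no, 1 = yes


def sap_linear_predictor(albumin_g_l, creatinine_umol_l, glucose_mmol_l, pleural_effusion):
    """Return the linear score from the published coefficients."""
    return (
        INTERCEPT
        + COEF_ALBUMIN * albumin_g_l
        + COEF_CREATININE * creatinine_umol_l
        + COEF_GLUCOSE * glucose_mmol_l
        + COEF_PLEURAL_EFFUSION * pleural_effusion
    )


def sap_probability(albumin_g_l, creatinine_umol_l, glucose_mmol_l, pleural_effusion):
    """Map the linear score to a probability with the logistic function (assumed link)."""
    z = sap_linear_predictor(albumin_g_l, creatinine_umol_l, glucose_mmol_l, pleural_effusion)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical patient values, chosen only to exercise the functions.
score = sap_linear_predictor(40.0, 80.0, 6.0, 0)
risk_without_effusion = sap_probability(40.0, 80.0, 6.0, 0)
risk_with_effusion = sap_probability(40.0, 80.0, 6.0, 1)
```

Under this reading, the presence of pleural effusion raises the score by 1.63 with the other inputs held fixed, so the estimated probability increases accordingly.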

CONCLUSIONS

An interpretable RF model exhibited the highest discriminatory performance in predicting SAP. Interpretation with LIME plots could be useful for individualized prediction in a clinical setting. A nomogram consisting of albumin, serum creatinine, glucose, and pleural effusion was useful for prediction of SAP.

202

Buenafe RJ, Rathnam A, Añonuevo JJ, Sundar S, Sreenivasulu N. Application of classification models in screening superior rice grain quality in male sterile and pollen parents. J Food Compost Anal 2021. [DOI: 10.1016/j.jfca.2021.104137] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/01/2022]

203

Using a Random Forest Model to Predict the Location of Potential Damage on Asphalt Pavement. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app112110396] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/16/2022]

204

Çakıroğlu MA, Kaplan AN, Süzen AA. Experimental and DBN-Based neural network extraction of radiation attenuation coefficient of dry mixture shotcrete produced using different additives. Radiat Phys Chem Oxf Engl 1993 2021. [DOI: 10.1016/j.radphyschem.2021.109636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]

205

Potentials and Limitations of WorldView-3 Data for the Detection of Invasive Lupinus polyphyllus Lindl. in Semi-Natural Grasslands. REMOTE SENSING 2021. [DOI: 10.3390/rs13214333] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/06/2023]

206

Guo JN, Chen D, Deng SH, Huang JR, Song JX, Li XY, Cui BB, Liu YL. Identification and quantification of immune infiltration landscape on therapy and prognosis in left- and right-sided colon cancer. Cancer Immunol Immunother 2021;71:1313-1330. [PMID: 34657172 PMCID: PMC9122887 DOI: 10.1007/s00262-021-03076-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/05/2021] [Accepted: 09/30/2021] [Indexed: 01/22/2023]

Abstract

Background

Left-sided and right-sided colon cancers (LCCs and RCCs, respectively) have unique molecular features and clinical heterogeneity. This study aimed to identify the characteristics of immune cell infiltration (ICI) subtypes for evaluating prognosis and therapeutic benefits.

Methods

Independent gene datasets and the corresponding somatic mutation and clinical information were collected from The Cancer Genome Atlas and Gene Expression Omnibus. The ICI contents were evaluated by "ESTIMATE" and "CIBERSORT." We applied two computational algorithms to identify the ICI landscape related to prognosis and found unique infiltration characteristics. Next, principal component analysis was conducted to construct an ICI score based on three ICI patterns. We analyzed the correlation between ICI score and tumor mutation burden (TMB), and stratified patients into prognosis-related high- and low-ICI score groups (HSG and LSG, respectively). The role of ICI scores in the prediction of therapeutic benefits was investigated by "pRRophetic" and verified by Immunophenoscores (IPS) (TCIA database) and an independent immunotherapy cohort (IMvigor210). The key genes were preliminarily screened by weighted gene co-expression network analysis based on ICI scores. They were then further identified at various levels, including single cell, protein and immunotherapy response. The predictive ability of the ICI score for prognosis was also verified in the IMvigor210 cohort.

Results

The ICI features with a better prognosis were marked by high plasma cells, dendritic cells and mast cells, low memory CD4⁺ T cells, M0 macrophages, M1 macrophages, as well as M2 macrophages. A high ICI score was characterized by an increased TMB and genomic instability related signaling pathways. The prognosis, sensitivities of targeted inhibitors and immunotherapy, IPS and expression of immune checkpoints were significantly different in HSG and LSG. The genes identified by ICI scores and various levels included CA2 and TSPAN1.

Conclusion

The identification of ICI subtypes and ICI scores will help gain insights into the heterogeneity in LCC and RCC, and identify patients probably benefiting from treatments. ICI scores and the key genes could serve as an effective biomarker to predict prognosis and the sensitivity of immunotherapy.

Supplementary Information

The online version contains supplementary material available at 10.1007/s00262-021-03076-2.

207

Feng W, Quan Y, Dauphin G, Li Q, Gao L, Huang W, Xia J, Zhu W, Xing M. Semi-supervised rotation forest based on ensemble margin theory for the classification of hyperspectral image with limited training data. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.06.059] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/17/2022]

208

López-Castro T, Zhao Y, Fitzpatrick S, Ruglass LM, Hien DA. Seeing the forest for the trees: Predicting attendance in trials for co-occurring PTSD and substance use disorders with a machine learning approach. J Consult Clin Psychol 2021;89:869-884. [PMID: 34807661 PMCID: PMC9426719 DOI: 10.1037/ccp0000688] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]

Abstract

Objective: High dropout rates are common in randomized clinical trials (RCTs) for comorbid posttraumatic stress disorder and substance use disorders (PTSD + SUD). Optimizing attendance is a priority for PTSD + SUD treatment development, yet research has found few consistent associations to guide responsive strategies. In this study, we employed a data-driven pipeline for identifying salient and reliable predictors of attendance. Method: In a novel application of the iterative Random Forest algorithm (iRF), we investigated the association of individual-level characteristics and session attendance in a completed RCT for PTSD + SUD (n = 70; women = 22 [31.4%]). iRF identified a group of potential predictor candidates for the total trial sessions attended; then, a Poisson regression model assessed the association between the iRF-identified factors and attendance. As a validation set, a parallel regression of significant predictors was conducted on a second, independent RCT for PTSD + SUD (n = 60; women = 48 [80%]). Results: Two testable hypotheses were derived from iRF's variable importance measures. Faster within-treatment improvement of PTSD symptoms was associated with greater session attendance, with age moderating this relationship (p = .01): faster PTSD symptom improvement predicted fewer sessions attended among younger patients and more sessions among older patients. Full-time employment was also associated with fewer sessions attended (p = .02). In the validation set, the interaction between age and speed of PTSD improvement was significant (p = .05) and the employment association was not. Conclusions: Results demonstrate the potential of data-driven methods to identify meaningful predictors as well as the dynamic contribution of symptom change during treatment to understanding RCT attendance. (PsycInfo Database Record (c) 2021 APA, all rights reserved).

209

Forecasting Solar Radiation. JOURNAL OF CASES ON INFORMATION TECHNOLOGY 2021. [DOI: 10.4018/jcit.296263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

210

de Abreu Fontes J, Anzanello MJ, Brito JBG, Bucco GB, Fogliatto FS, Puglia FDP. Combining wavelength importance ranking to the random forest classifier to analyze multiclass spectral data. Forensic Sci Int 2021;328:110998. [PMID: 34551367 DOI: 10.1016/j.forsciint.2021.110998] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/01/2020] [Revised: 09/04/2021] [Accepted: 09/09/2021] [Indexed: 10/20/2022]

211

Zhou X, Lin Q, Gui Y, Wang Z, Liu M, Lu H. Multimodal MR Images-Based Diagnosis of Early Adolescent Attention-Deficit/Hyperactivity Disorder Using Multiple Kernel Learning. Front Neurosci 2021;15:710133. [PMID: 34594183 PMCID: PMC8477011 DOI: 10.3389/fnins.2021.710133] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 05/15/2021] [Accepted: 07/30/2021] [Indexed: 11/13/2022] Open

Abstract

Attention-deficit/hyperactivity disorder (ADHD) is one of the most common brain diseases among children. The current criteria of ADHD diagnosis mainly depend on behavior analysis, which is subjective and inconsistent, especially for children. The development of neuroimaging technologies, such as magnetic resonance imaging (MRI), drives the discovery of brain abnormalities in structure and function by analyzing multimodal neuroimages for computer-aided diagnosis of brain diseases. This paper proposes a multimodal machine learning framework that combines the Boruta based feature selection and Multiple Kernel Learning (MKL) to integrate the multimodal features of structural and functional MRIs and Diffusion Tensor Images (DTI) for the diagnosis of early adolescent ADHD. The rich and complementary information of the macrostructural features, microstructural properties, and functional connectivities are integrated at the kernel level, followed by a support vector machine classifier for discriminating ADHD from healthy children. Our experiments were conducted on the comorbidity-free ADHD subjects and covariable-matched healthy children aged 9-10 chosen from the Adolescent Brain and Cognitive Development (ABCD) study. This paper is the first work to combine structural and functional MRIs with DTI for early adolescents of the ABCD study. The results indicate that the kernel-level fusion of multimodal features achieves 0.698 of AUC (area under the receiver operating characteristic curves) and 64.3% of classification accuracy for ADHD diagnosis, showing a significant improvement over the early feature fusion and unimodal features. 
The abnormal functional connectivity predictors, involving default mode network, attention network, auditory network, and sensorimotor mouth network, thalamus, and cerebellum, as well as the anatomical regions in basal ganglia, are found to encode the most discriminative information, which collaborates with macrostructure and diffusion alterations to boost the performances of disorder diagnosis.

212

Huang Y, Wei L, Hu Y, Shao N, Lin Y, He S, Shi H, Zhang X, Lin Y. Multi-Parametric MRI-Based Radiomics Models for Predicting Molecular Subtype and Androgen Receptor Expression in Breast Cancer. Front Oncol 2021;11:706733. [PMID: 34490107 PMCID: PMC8416497 DOI: 10.3389/fonc.2021.706733] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 05/07/2021] [Accepted: 07/28/2021] [Indexed: 12/30/2022] Open

Abstract

Objective

To investigate whether radiomics features extracted from multi-parametric MRI, combined with a machine learning approach, can predict molecular subtype and androgen receptor (AR) expression of breast cancer in a non-invasive way.

Materials and Methods

Patients diagnosed with clinical T2–4 stage breast cancer from March 2016 to July 2020 were retrospectively enrolled. The molecular subtypes and AR expression in pre-treatment biopsy specimens were assessed. A total of 4,198 radiomics features were extracted from the pre-biopsy multi-parametric MRI (including dynamic contrast-enhanced T1-weighted images, fat-suppressed T2-weighted images, and apparent diffusion coefficient map) of each patient. We applied several feature selection strategies, including the least absolute shrinkage and selection operator (LASSO), recursive feature elimination (RFE), maximum relevance minimum redundancy (mRMR), Boruta, and Pearson correlation analysis, to select the optimal features. We then built 120 diagnostic models using distinct classification algorithms and feature sets divided by MRI sequences and selection strategies to predict molecular subtype and AR expression of breast cancer in the testing dataset of leave-one-out cross-validation (LOOCV). The performance of the binary classification models was assessed via the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The performance of the multiclass classification models was assessed via AUC, overall accuracy, precision, recall rate, and F1-score.
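For readers unfamiliar with the binary classification measures listed above, the sketch below shows how accuracy, sensitivity, specificity, PPV, and NPV follow from the four cells of a confusion matrix. The function name and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. A metric whose denominator is zero is returned as NaN.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall for the positive class
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),           # positive predictive value (precision)
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

AUC is not included here because it depends on the ranking of predicted scores rather than on a single thresholded confusion matrix.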

Results

A total of 162 patients (mean age, 46.91 ± 10.08 years) were enrolled in this study; 30 had low AR expression and 132 had high AR expression. HR+/HER2− cancers were diagnosed in 56 cases (34.6%), HER2+ cancers in 81 cases (50.0%), and TNBC in 25 cases (15.4%). There was no significant difference in clinicopathologic characteristics between the low-AR and high-AR groups (P > 0.05), except for menopausal status, ER, PR, HER2, and Ki-67 index (P = 0.043, <0.001, <0.001, 0.015, and 0.006, respectively). No significant difference in clinicopathologic characteristics was observed among the three molecular subtypes except for AR status and Ki-67 (P < 0.001 and P = 0.012, respectively). The Multilayer Perceptron (MLP) showed the best performance in discriminating AR expression, with an AUC of 0.907 and an accuracy of 85.8% in the testing dataset. The highest performances were also obtained with MLP for discriminating TNBC vs. non-TNBC (AUC: 0.965, accuracy: 92.6%), HER2+ vs. HER2− (AUC: 0.840, accuracy: 79.0%), and HR+/HER2− vs. others (AUC: 0.860, accuracy: 82.1%). The micro-AUC of the MLP multiclass classification model was 0.896, and the overall accuracy was 0.735.

Conclusions

Multi-parametric MRI-based radiomics combined with machine learning approaches provides a promising method to predict the molecular subtype and AR expression of breast cancer non-invasively.

213

Gu W, Kim M, Wang L, Yang Z, Nakajima T, Tsushima Y. Multi-omics Analysis of Ferroptosis Regulation Patterns and Characterization of Tumor Microenvironment in Patients with Oral Squamous Cell Carcinoma. Int J Biol Sci 2021;17:3476-3492. [PMID: 34512160 PMCID: PMC8416738 DOI: 10.7150/ijbs.61441] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 04/10/2021] [Accepted: 07/20/2021] [Indexed: 02/06/2023] Open

214

Narváez-Villa P, Arenas-Ramírez B, Mira J, Aparicio-Izquierdo F. Analysis and Prediction of Vehicle Kilometers Traveled: A Case Study in Spain. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021;18:ijerph18168327. [PMID: 34444076 PMCID: PMC8391987 DOI: 10.3390/ijerph18168327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/18/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 11/16/2022]

215

Chen Q, Zhao Y, Liu Y, Sun Y, Yang C, Li P, Zhang L, Gao C. MSLPNet: multi-scale location perception network for dental panoramic X-ray image segmentation. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-05790-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]

216

Lu M, Parel JM, Miller D. Interactions between staphylococcal enterotoxins A and D and superantigen-like proteins 1 and 5 for predicting methicillin and multidrug resistance profiles among Staphylococcus aureus ocular isolates. PLoS One 2021;16:e0254519. [PMID: 34320020 PMCID: PMC8318242 DOI: 10.1371/journal.pone.0254519] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/31/2021] [Accepted: 06/29/2021] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

Methicillin-resistant Staphylococcus aureus (MRSA) and multidrug-resistant (MDR) S. aureus strains are well recognized as posing substantial problems in treating ocular infections. S. aureus has a vast array of virulence factors, including superantigens and enterotoxins. Their interactions and ability to signal antibiotic resistance have not been explored.

OBJECTIVES

To predict the relationship between superantigens and methicillin and multidrug resistance among S. aureus ocular isolates.

METHODS

We used a DNA microarray to characterize the enterotoxin and superantigen gene profiles of 98 S. aureus isolates collected from common ocular sources. The outcomes contained phenotypic and genotypic expressions of MRSA. We also included the MDR status as an outcome, categorized as resistance to three or more drugs, including oxacillin, penicillin, erythromycin, clindamycin, moxifloxacin, tetracycline, trimethoprim-sulfamethoxazole and gentamicin. We identified gene profiles that predicted each outcome through a classification analysis utilizing Random Forest machine learning techniques.

FINDINGS

Our machine learning models predicted the outcomes accurately utilizing 67 enterotoxin and superantigen genes. Strong correlates predicting the genotypic expression of MRSA were enterotoxins A, D, J and R and superantigen-like proteins 1, 3, 7 and 10. Among these virulence factors, enterotoxin D and superantigen-like proteins 1, 5 and 10 were also significantly informative for predicting both MDR and MRSA in terms of phenotypic expression. Strong interactions were identified, including enterotoxin A (entA) interacting with superantigen-like protein 1 (set6-var1_11), and enterotoxin D (entD) interacting with superantigen-like protein 5 (ssl05/set3_probe 1): MRSA and MDR S. aureus are associated with the presence of both entA and set6-var1_11, or both entD and ssl05/set3_probe 1, while the absence of these genes in pairs indicates non-multidrug-resistant and methicillin-susceptible S. aureus.

CONCLUSIONS

MRSA and MDR S. aureus show a different spectrum of ocular pathology than their non-resistant counterparts. When assessing the role of enterotoxins in predicting antibiotic resistance, it is critical to consider both main effects and interactions.

217

Biological knowledge-slanted random forest approach for the classification of calcified aortic valve stenosis. BioData Min 2021;14:35. [PMID: 34301292 PMCID: PMC8305490 DOI: 10.1186/s13040-021-00269-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/02/2021] [Accepted: 07/18/2021] [Indexed: 11/29/2022] Open

218

Chun HJ, Coutavas E, Pine AB, Lee AI, Yu VL, Shallow MK, Giovacchini CX, Mathews AM, Stephenson B, Que LG, Lee PJ, Kraft BD. Immunofibrotic drivers of impaired lung function in postacute sequelae of SARS-CoV-2 infection. JCI Insight 2021;6:148476. [PMID: 34111030 PMCID: PMC8410030 DOI: 10.1172/jci.insight.148476] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 02/08/2021] [Accepted: 06/09/2021] [Indexed: 11/17/2022] Open

Abstract

BACKGROUND

Individuals recovering from COVID-19 frequently experience persistent respiratory ailments, which are key elements of postacute sequelae of SARS-CoV-2 infection (PASC); however, little is known about the underlying biological factors that may direct lung recovery and the extent to which these are affected by COVID-19 severity.

METHODS

We performed a prospective cohort study of individuals with persistent symptoms after acute COVID-19, collecting clinical data, pulmonary function tests, and plasma samples used for multiplex profiling of inflammatory, metabolic, angiogenic, and fibrotic factors.

RESULTS

Sixty-one participants were enrolled across 2 academic medical centers at a median of 9 weeks (interquartile range, 6-10 weeks) after COVID-19 illness: n = 13 participants (21%) had mild COVID-19 and were not hospitalized, n = 30 participants (49%) were hospitalized but were considered noncritical, and n = 18 participants (30%) were hospitalized and in the intensive care unit (ICU). Fifty-three participants (85%) had lingering symptoms, most commonly dyspnea (69%) and cough (58%). Forced vital capacity (FVC), forced expiratory volume in 1 second (FEV1), and diffusing capacity for carbon monoxide (DLCO) declined as COVID-19 severity increased (P < 0.05) but these values did not correlate with respiratory symptoms. Partial least-squares discriminant analysis of plasma biomarker profiles clustered participants by past COVID-19 severity. Lipocalin-2 (LCN2), MMP-7, and HGF identified by our analysis were significantly higher in the ICU group (P < 0.05), inversely correlated with FVC and DLCO (P < 0.05), and were confirmed in a separate validation cohort (n = 53).

CONCLUSION

Subjective respiratory symptoms are common after acute COVID-19 illness but do not correlate with COVID-19 severity or pulmonary function. Host response profiles reflecting neutrophil activation (LCN2), fibrosis signaling (MMP-7), and alveolar repair (HGF) track with lung impairment and may be novel therapeutic or prognostic targets.

FUNDING

National Heart, Lung, and Blood Institute (K08HL130557 and R01HL142818), American Heart Association (Transformational Project Award), the DeLuca Foundation Award, a donation from Jack Levin to the Benign Hematology Program at Yale University, and Duke University.

Affiliation(s)

Hyung J. Chun: Yale Cardiovascular Research Center, Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, USA
Elias Coutavas: Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA
Alexander B. Pine: Section of Hematology, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, USA
Alfred I. Lee: Section of Hematology, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, USA
Vanessa L. Yu: Yale Cardiovascular Research Center, Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, USA
Marcus K. Shallow: Yale Cardiovascular Research Center, Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, USA
Coral X. Giovacchini: Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA
Anne M. Mathews: Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA
Brian Stephenson: Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA
Loretta G. Que: Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA
Patty J. Lee: Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA
Bryan D. Kraft: Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA

219

Applying random forest in a health administrative data context: a conceptual guide. HEALTH SERVICES AND OUTCOMES RESEARCH METHODOLOGY 2021. [DOI: 10.1007/s10742-021-00255-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]

220

Morita-Sherman M, Li M, Joseph B, Yasuda C, Vegh D, De Campos BM, Alvim MKM, Louis S, Bingaman W, Najm I, Jones S, Wang X, Blümcke I, Brinkmann BH, Worrell G, Cendes F, Jehi L. Incorporation of quantitative MRI in a model to predict temporal lobe epilepsy surgery outcome. Brain Commun 2021;3:fcab164. [PMID: 34396113 PMCID: PMC8361423 DOI: 10.1093/braincomms/fcab164] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Accepted: 06/01/2021] [Indexed: 11/23/2022] Open

Abstract

Quantitative volumetric brain MRI measurement is important in research applications, but translating it into patient care is challenging. We explore the incorporation of clinical automated quantitative MRI measurements in statistical models predicting outcomes of surgery for temporal lobe epilepsy. Four hundred and thirty-five patients with drug-resistant epilepsy who underwent temporal lobe surgery at Cleveland Clinic, Mayo Clinic and University of Campinas were studied. We obtained volumetric measurements from the pre-operative T1-weighted MRI using NeuroQuant, a Food and Drug Administration approved software package. We created sets of statistical models to predict the probability of complete seizure-freedom or an Engel score of I at the last follow-up. The cohort was randomly split into training and testing sets, with a ratio of 7:3. Model discrimination was assessed using the concordance statistic (C-statistic). We compared four sets of models and selected the one with the highest concordance index. Volumetric differences in pre-surgical MRI located predominantly in the frontocentral and temporal regions were associated with poorer outcomes. The addition of volumetric measurements to the model with clinical variables alone increased the model’s C-statistic from 0.58 to 0.70 (right-sided surgery) and from 0.61 to 0.66 (left-sided surgery) for complete seizure freedom and from 0.62 to 0.67 (right-sided surgery) and from 0.68 to 0.73 (left-sided surgery) for an Engel I outcome score. 57% of patients with extra-temporal abnormalities were seizure-free at last follow-up, compared to 68% of those with no such abnormalities (P-value = 0.02). Adding quantitative MRI data increases the performance of a model developed to predict post-operative seizure outcomes. 
The distribution of the regions of interest included in the final model supports the notion that focal epilepsies are network disorders and that subtle cortical volume loss outside the surgical site influences seizure outcome.
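The C-statistic used above to assess discrimination can be illustrated for a binary outcome as the fraction of (event, non-event) pairs in which the event case received the higher predicted probability, with ties counted as one half. The sketch below is a generic implementation of that definition, not the authors' code; the function name and the toy data are hypothetical.

```python
def c_statistic(outcomes, predicted):
    """Concordance statistic for a binary outcome.

    outcomes: iterable of 0/1 labels (1 = event, e.g. seizure-free).
    predicted: iterable of predicted probabilities or scores, same length.
    Returns the share of (event, non-event) pairs ranked correctly,
    counting tied predictions as half-concordant.
    """
    event_scores = [p for y, p in zip(outcomes, predicted) if y == 1]
    non_event_scores = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not event_scores or not non_event_scores:
        raise ValueError("both outcome classes must be present")

    concordant = 0.0
    for e in event_scores:
        for n in non_event_scores:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(event_scores) * len(non_event_scores))


# Toy data, for illustration only: 3 of the 4 comparable pairs are concordant.
toy_c = c_statistic([1, 0, 1, 0], [0.9, 0.3, 0.4, 0.6])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading increases such as 0.58 to 0.70.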

221

Wang Y, Guo H, Li S, Wang L, Song X, Zhao X. Identify risk factors and predict the postoperative risk of ESCC using ensemble learning. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102784] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]

222

Buckley SJ, Harvey RJ, Shan Z. Application of the random forest algorithm to Streptococcus pyogenes response regulator allele variation: from machine learning to evolutionary models. Sci Rep 2021;11:12687. [PMID: 34135390 PMCID: PMC8209152 DOI: 10.1038/s41598-021-91941-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/15/2021] [Accepted: 05/27/2021] [Indexed: 02/07/2023] Open

223

Speiser JL. A random forest method with feature selection for developing medical prediction models with clustered and longitudinal data. J Biomed Inform 2021;117:103763. [PMID: 33781921 PMCID: PMC8131242 DOI: 10.1016/j.jbi.2021.103763] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2020] [Revised: 03/03/2021] [Accepted: 03/23/2021] [Indexed: 12/22/2022]

Abstract

BACKGROUND

Machine learning methodologies are gaining popularity for developing medical prediction models for datasets with a large number of predictors, particularly in the setting of clustered and longitudinal data. Binary Mixed Model (BiMM) forest is a promising machine learning algorithm which may be applied to develop prediction models for clustered and longitudinal binary outcomes. Although machine learning methods for clustered and longitudinal data, such as BiMM forest, exist, feature selection for these methods has not been analyzed via data simulations. Feature selection improves the practicality and ease of use of prediction models for clinicians by reducing the burden of data collection. Thus, feature selection procedures are not only beneficial, but are often necessary for development of medical prediction models. In this study, we aim to assess feature selection within the BiMM forest setting for modeling clustered and longitudinal binary outcomes.

METHODS

We conducted a simulation study to compare BiMM forest with feature selection (backward elimination or stepwise selection) to standard generalized linear mixed model feature selection methods (shrinkage and backward elimination). We also evaluated feature selection methods to develop models predicting mobility disability in older adults using the Health, Aging and Body Composition Study dataset as an example utilization of the proposed methodology.

RESULTS

BiMM forest with backward elimination generally offered higher computational efficiency, similar or higher predictive performance (accuracy and area under the receiver operating curve), and similar or higher ability to identify correct features compared to linear methods for the different simulated scenarios. For predicting mobility disability in older adults, methods generally performed similarly in terms of accuracy, area under the receiver operating curve, and specificity; however, BiMM forest with backward elimination had the highest sensitivity.

CONCLUSIONS

This study is novel because it is the first investigation of feature selection for developing random forest prediction models for clustered and longitudinal binary outcomes. Results from the simulation study reveal that BiMM forest with backward elimination has the highest accuracy (performance and identification of correct features) and lowest computation time compared to other feature selection methods in some scenarios and similar performance in other scenarios. Many informatics datasets have clustered and longitudinal outcomes and results from this study suggest that BiMM forest with backward elimination may be beneficial for developing medical prediction models.

224

Ellis CJ, Eaton S. Microclimates hold the key to spatial forest planning under climate change: Cyanolichens in temperate rainforest. GLOBAL CHANGE BIOLOGY 2021;27:1915-1926. [PMID: 33421251 DOI: 10.1111/gcb.15514] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/28/2020] [Revised: 12/23/2020] [Accepted: 12/27/2020] [Indexed: 06/12/2023]

Abstract

There is deepening interest in how microclimatic refugia can reduce species threat, if suitable climatic conditions are maintained locally, despite global climate change. Microclimates are a particularly important consideration in topographically heterogeneous landscapes, while in some habitats, such as forests and woodlands, microclimates are also extremely labile and affected by management practices that could consequently be used to offset climate change impact. This study explored a conservation priority guild (cyanolichen epiphytes in temperate rainforest), quantifying the niche response to macroclimate, and landscape or woodland stand structures that determine the microclimate. Based on epiphyte survey in a core region of European temperate rainforest (western Scotland), a 'random forest' machine-learning model confirmed a strong cyanolichen response to summer dryness, as well as the effects of distance to running water, topographic heatload and tree species identity, which modify the local moisture regime and/or lichen growth rates. By quantifying this response to macroclimate, landscape and stand structures, it was possible to estimate an extent to which woodland may be expanded in the future, to offset a negative effect of increasing summer dryness projected through to the 2080s. Using current policy as a yardstick, sufficient woodland expansion could be delivered relatively quickly for median impacted sites, but with times to woodland delivery extending over 10, 20 and 25 years for sites at the 75th, 90th and 95th percentiles of cyanolichen decline. Furthermore, the extent of new woodland required, and delivery times, increase almost threefold on average, as new woodland becomes distributed over wider riparian zones. These contrasting implications emphasize an urgent need for afforestation that achieves targeted spatial planning responsive to microclimates as refugia.

225

Alcalá-Rmz V, Galván-Tejada CE, García-Hernández A, Valladares-Salgado A, Cruz M, Galván-Tejada JI, Celaya-Padilla JM, Luna-Garcia H, Gamboa-Rosales H. Identification of People with Diabetes Treatment through Lipids Profile Using Machine Learning Algorithms. Healthcare (Basel) 2021;9:422. [PMID: 33917300 PMCID: PMC8067355 DOI: 10.3390/healthcare9040422] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2020] [Revised: 03/02/2021] [Accepted: 03/08/2021] [Indexed: 11/16/2022] Open

Affiliation(s)

Vanessa Alcalá-Rmz Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico; (V.A.-R.); (A.G.-H.); (J.I.G.-T.); (J.M.C.-P.); (H.L.-G.); (H.G.-R.)
Carlos E. Galván-Tejada Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico; (V.A.-R.); (A.G.-H.); (J.I.G.-T.); (J.M.C.-P.); (H.L.-G.); (H.G.-R.)
Alejandra García-Hernández Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico; (V.A.-R.); (A.G.-H.); (J.I.G.-T.); (J.M.C.-P.); (H.L.-G.); (H.G.-R.)
Adan Valladares-Salgado Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Av. Cuauhtémoc 330, Col. Doctores, Del. Cuauhtémoc, Mexico City 06720, Mexico; (A.V.-S.); (M.C.)
Miguel Cruz Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Av. Cuauhtémoc 330, Col. Doctores, Del. Cuauhtémoc, Mexico City 06720, Mexico; (A.V.-S.); (M.C.)
Jorge I. Galván-Tejada Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico; (V.A.-R.); (A.G.-H.); (J.I.G.-T.); (J.M.C.-P.); (H.L.-G.); (H.G.-R.)
Jose M. Celaya-Padilla Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico; (V.A.-R.); (A.G.-H.); (J.I.G.-T.); (J.M.C.-P.); (H.L.-G.); (H.G.-R.)
Huizilopoztli Luna-Garcia Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico; (V.A.-R.); (A.G.-H.); (J.I.G.-T.); (J.M.C.-P.); (H.L.-G.); (H.G.-R.)
Hamurabi Gamboa-Rosales Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juárez 147, Centro, Zacatecas 98000, Mexico; (V.A.-R.); (A.G.-H.); (J.I.G.-T.); (J.M.C.-P.); (H.L.-G.); (H.G.-R.)

226

Kim YJ, Jeon JS, Cho SE, Kim KG, Kang SG. Prediction Models for Obstructive Sleep Apnea in Korean Adults Using Machine Learning Techniques. Diagnostics (Basel) 2021;11:diagnostics11040612. [PMID: 33808100 PMCID: PMC8066462 DOI: 10.3390/diagnostics11040612] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2021] [Revised: 03/24/2021] [Accepted: 03/26/2021] [Indexed: 12/01/2022] Open

227

Kalina J, Neoral A, Vidnerová P. Effective Automatic Method Selection for Nonlinear Regression Modeling. Int J Neural Syst 2021;31:2150020. [PMID: 33787471 DOI: 10.1142/s0129065721500209] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

228

Buenafe RJQ, Kumanduri V, Sreenivasulu N. Deploying viscosity and starch polymer properties to predict cooking and eating quality models: A novel breeding tool to predict texture. Carbohydr Polym 2021;260:117766. [PMID: 33712124 PMCID: PMC7973724 DOI: 10.1016/j.carbpol.2021.117766] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2020] [Revised: 01/30/2021] [Accepted: 02/02/2021] [Indexed: 12/15/2022]

229

Maeda-Gutiérrez V, Galván-Tejada CE, Cruz M, Valladares-Salgado A, Galván-Tejada JI, Gamboa-Rosales H, García-Hernández A, Luna-García H, Gonzalez-Curiel I, Martínez-Acuña M. Distal Symmetric Polyneuropathy Identification in Type 2 Diabetes Subjects: A Random Forest Approach. Healthcare (Basel) 2021;9:138. [PMID: 33535510 PMCID: PMC7912731 DOI: 10.3390/healthcare9020138] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2020] [Revised: 01/23/2021] [Accepted: 01/25/2021] [Indexed: 12/05/2022] Open

Affiliation(s)

Valeria Maeda-Gutiérrez Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, 98000 Zacatecas, Zac, Mexico; (V.M.-G.); (J.I.G.-T.); (H.G.-R.); (A.G.-H.); (H.L.-G.)
Carlos E. Galván-Tejada Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, 98000 Zacatecas, Zac, Mexico; (V.M.-G.); (J.I.G.-T.); (H.G.-R.); (A.G.-H.); (H.L.-G.)
Miguel Cruz Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI. Instituto Mexicano del Seguro Social, Av. Cuauhtémoc 330, Col. Doctores, Del. Cuauhtémoc, Mexico City 06720, Mexico; (M.C.); (A.V.-S.)
Adan Valladares-Salgado Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI. Instituto Mexicano del Seguro Social, Av. Cuauhtémoc 330, Col. Doctores, Del. Cuauhtémoc, Mexico City 06720, Mexico; (M.C.); (A.V.-S.)
Jorge I. Galván-Tejada Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, 98000 Zacatecas, Zac, Mexico; (V.M.-G.); (J.I.G.-T.); (H.G.-R.); (A.G.-H.); (H.L.-G.)
Hamurabi Gamboa-Rosales Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, 98000 Zacatecas, Zac, Mexico; (V.M.-G.); (J.I.G.-T.); (H.G.-R.); (A.G.-H.); (H.L.-G.)
Alejandra García-Hernández Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, 98000 Zacatecas, Zac, Mexico; (V.M.-G.); (J.I.G.-T.); (H.G.-R.); (A.G.-H.); (H.L.-G.)
Huizilopoztli Luna-García Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, 98000 Zacatecas, Zac, Mexico; (V.M.-G.); (J.I.G.-T.); (H.G.-R.); (A.G.-H.); (H.L.-G.)
Irma Gonzalez-Curiel Unidad Académica de Ciencias Químicas, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, Zacatecas 98000, Mexico; (I.G.-C.); (M.M.-A.)
Mónica Martínez-Acuña Unidad Académica de Ciencias Químicas, Universidad Autónoma de Zacatecas, Jardín Juarez 147, Centro, Zacatecas 98000, Mexico; (I.G.-C.); (M.M.-A.)

230

Continual learning classification method with constant-sized memory cells based on the artificial immune system. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106673] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

231

Rahman T, Khandakar A, Hoque ME, Ibtehaz N, Kashem SB, Masud R, Shampa L, Hasan MM, Islam MT, Al-Maadeed S, Zughaier SM, Badran S, Doi SAR, Chowdhury MEH. Development and Validation of an Early Scoring System for Prediction of Disease Severity in COVID-19 Using Complete Blood Count Parameters. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2021;9:120422-120441. [PMID: 34786318 PMCID: PMC8545188 DOI: 10.1109/access.2021.3105321] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/17/2021] [Accepted: 08/07/2021] [Indexed: 05/08/2023]

Abstract

After its outbreak in Wuhan, coronavirus disease 2019 (COVID-19) spread throughout the world. Fast, reliable, and easily accessible clinical assessment of the severity of the disease can help in allocating and prioritizing resources to reduce mortality. The objective of the study was to develop and validate an early scoring tool to stratify the risk of death using readily available complete blood count (CBC) biomarkers. A retrospective study was conducted on twenty-three CBC blood biomarkers for predicting disease mortality for 375 COVID-19 patients admitted to Tongji Hospital, China from January 10 to February 18, 2020. Key biomarkers among the CBC parameters were identified as mortality predictors using machine learning. A multivariate logistic regression-based nomogram and a scoring system were developed to categorize the patients into three risk groups (low, moderate, and high) for predicting the mortality risk among COVID-19 patients. Lymphocyte count, neutrophil count, age, white blood cell count, monocytes (%), platelet count, and red blood cell distribution width, collected at hospital admission, were selected as important biomarkers for death prediction using a random forest feature selection technique. A CBC score was devised for calculating the death probability of the patients and was used to categorize the patients into three sub-risk groups: low (<=5%), moderate (>5% and <=50%), and high (>50%). The area under the curve (AUC) of the model for the development and internal validation cohorts was 0.961 and 0.88, respectively. The proposed model was further validated with an external cohort of 103 patients from Dhaka Medical College, Bangladesh, yielding an AUC of 0.963. The proposed CBC parameter-based prognostic model and the associated web application can help medical doctors improve management through early prediction of the mortality risk of COVID-19 patients in low-resource countries.

232

Jiang F, Kutia M, Sarkissian AJ, Lin H, Long J, Sun H, Wang G. Estimating the Growing Stem Volume of Coniferous Plantations Based on Random Forest Using an Optimized Variable Selection Method. SENSORS 2020;20:s20247248. [PMID: 33348807 PMCID: PMC7766647 DOI: 10.3390/s20247248] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/29/2020] [Revised: 12/08/2020] [Accepted: 12/14/2020] [Indexed: 11/16/2022]

Abstract

Forest growing stem volume (GSV) reflects the richness of forest resources as well as the quality of forest ecosystems. Remote sensing technology enables robust and efficient GSV estimation as it greatly reduces the survey time and cost while facilitating periodic monitoring. Given its red edge bands and a short revisit time period, Sentinel-2 images were selected for the GSV estimation in Wangyedian forest farm, Inner Mongolia, China. The variable combination was shown to significantly affect the accuracy of the estimation model. After extracting spectral variables, texture features, and topographic factors, a stepwise random forest (SRF) method was proposed to select variable combinations and establish random forest regressions (RFR) for GSV estimation. The linear stepwise regression (LSR), Boruta, Variable Selection Using Random Forests (VSURF), and random forest (RF) methods were then used as references for comparison with the proposed SRF for selection of predictors and GSV estimation. Combined with the observed GSV data and the Sentinel-2 images, the distributions of GSV were generated by the RFR models with the variable combinations determined by the LSR, RF, Boruta, VSURF, and SRF. The results show that the texture features of Sentinel-2’s red edge bands can significantly improve the accuracy of GSV estimation. The SRF method can effectively select the optimal variable combination, and the SRF-based model results in the highest estimation accuracy with the decreases of relative root mean square error by 16.4%, 14.4%, 16.3%, and 10.6% compared with those from the LSR-, RF-, Boruta-, and VSURF-based models, respectively. The GSV distribution generated by the SRF-based model matched that of the field observations well. The results of this study are expected to provide a reference for GSV estimation of coniferous plantations.

Affiliation(s)

Fugen Jiang Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China; (F.J.); (H.L.); (J.L.); (G.W.) Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha 410004, China Key Laboratory of National Forestry and Grassland Administration on Forest Resources Management and Monitoring in Southern Area, Changsha 410004, China
Mykola Kutia Bangor College China, Bangor University, 498 Shaoshan Rd., Changsha 410004, China; (M.K.); (A.J.S.)
Arbi J. Sarkissian Bangor College China, Bangor University, 498 Shaoshan Rd., Changsha 410004, China; (M.K.); (A.J.S.)
Hui Lin Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China; (F.J.); (H.L.); (J.L.); (G.W.) Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha 410004, China Key Laboratory of National Forestry and Grassland Administration on Forest Resources Management and Monitoring in Southern Area, Changsha 410004, China
Jiangping Long Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China; (F.J.); (H.L.); (J.L.); (G.W.) Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha 410004, China Key Laboratory of National Forestry and Grassland Administration on Forest Resources Management and Monitoring in Southern Area, Changsha 410004, China
Hua Sun Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China; (F.J.); (H.L.); (J.L.); (G.W.) Key Laboratory of Forestry Remote Sensing Based Big Data and Ecological Security for Hunan Province, Changsha 410004, China Key Laboratory of National Forestry and Grassland Administration on Forest Resources Management and Monitoring in Southern Area, Changsha 410004, China Correspondence: ; Tel.: +86-138-758-821-84
Guangxing Wang Research Center of Forestry Remote Sensing and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China; (F.J.); (H.L.); (J.L.); (G.W.) Department of Geography and Environmental Resources, Southern Illinois University, Carbondale, IL 62901, USA

233

On the Influence of Reference Mahalanobis Distance Space for Quality Classification of Complex Metal Parts Using Vibrations. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10238620] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

234

Remote Sensing of Lake Sediment Core Particle Size Using Hyperspectral Image Analysis. REMOTE SENSING 2020. [DOI: 10.3390/rs12233850] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

235

Glacier Mapping Based on Random Forest Algorithm: A Case Study over the Eastern Pamir. WATER 2020. [DOI: 10.3390/w12113231] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

236

Amato MP, Portaccio E, De Meo E. Understanding the pathophysiology of cognitive changes in MS: A step forward. Mult Scler 2020;27:4-5. [PMID: 33146049 DOI: 10.1177/1352458520968038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]

237

Bai X, Li J. The best configuration of collaborative knowledge innovation management from the perspective of artificial intelligence. KNOWLEDGE MANAGEMENT RESEARCH & PRACTICE 2020. [DOI: 10.1080/14778238.2020.1834886] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

238

Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest. INFORMATION 2020. [DOI: 10.3390/info11050270] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

239

An Optimized Object-Based Random Forest Algorithm for Marsh Vegetation Mapping Using High-Spatial-Resolution GF-1 and ZY-3 Data. REMOTE SENSING 2020. [DOI: 10.3390/rs12081270] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Abstract

Discriminating marsh vegetation is critical for the rapid assessment and management of wetlands. The study area, Honghe National Nature Reserve (HNNR), a typical freshwater wetland, is located in Northeast China. This study optimized the parameters (mtry and ntrees) of an object-based random forest (RF) algorithm to improve the applicability of marsh vegetation classification. Multidimensional datasets were used as the input variables for model training, then variable selection was performed on the variables to eliminate redundancy, which improved classification efficiency and overall accuracy. Finally, the performance of a new generation of Chinese high-spatial-resolution Gaofen-1 (GF-1) and Ziyuan-3 (ZY-3) satellite images for marsh vegetation classification was evaluated using the improved object-based RF algorithm with accuracy assessment. The specific conclusions of this study are as follows: (1) Optimized object-based RF classifications consistently produced more than 70.26% overall accuracy for all scenarios of GF-1 and ZY-3 at the 95% confidence interval. The performance of ZY-3 imagery applied to marsh vegetation mapping is lower than that of GF-1 imagery due to the coarse spatial resolution. (2) Parameter optimization of the object-based RF algorithm effectively improved the stability and classification accuracy of the algorithm. After parameter adjustment, scenario 3 for GF-1 data had the highest classification accuracy of 84% (ZY-3 is 74.72%) at the 95% confidence interval. (3) The introduction of multidimensional datasets improved the overall accuracy of marsh vegetation mapping, but with many redundant variables. Using three variable selection algorithms to remove redundant variables from the multidimensional datasets effectively improved the classification efficiency and overall accuracy. The recursive feature elimination (RFE)-based variable selection algorithm had the best performance. (4) Optical spectral bands, spectral indices, mean value of green and NIR bands in textural information, DEM, TWI, compactness, max difference, and shape index are valuable variables for marsh vegetation mapping. (5) GF-1 and ZY-3 images had higher classification accuracy for forest, cropland, shrubs, and open water.

240

A Random Forest Modelling Procedure for a Multi-Sensor Assessment of Tree Species Diversity. REMOTE SENSING 2020. [DOI: 10.3390/rs12071210] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

241

Antoniadi AM, Galvin M, Heverin M, Hardiman O, Mooney C. Prediction of caregiver burden in amyotrophic lateral sclerosis: a machine learning approach using random forests applied to a cohort study. BMJ Open 2020;10:e033109. [PMID: 32114464 PMCID: PMC7050406 DOI: 10.1136/bmjopen-2019-033109] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/12/2019] [Revised: 02/05/2020] [Accepted: 02/07/2020] [Indexed: 12/13/2022] Open

Abstract

OBJECTIVES

Amyotrophic lateral sclerosis (ALS) is a rare neurodegenerative disease that is characterised by the rapid degeneration of upper and lower motor neurons and has a fatal trajectory 3-4 years from symptom onset. Due to the nature of the condition patients with ALS require the assistance of informal caregivers whose task is demanding and can lead to high feelings of burden. This study aims to predict caregiver burden and identify related features using machine learning techniques.

DESIGN

This included demographic and socioeconomic information, quality of life, anxiety and depression questionnaires, for patients and carers, resource use of patients and clinical information. The method used for prediction was the Random forest algorithm.

SETTING AND PARTICIPANTS

This study investigates a cohort of 90 patients and their primary caregiver at three different time-points. The patients were attending the National ALS/Motor Neuron Disease Multidisciplinary Clinic at Beaumont Hospital, Dublin.

RESULTS

The caregiver's quality of life and psychological distress were the most predictive features of burden (0.92 sensitivity and 0.78 specificity). The most predictive features for Clinical Decision Support model were associated with the weekly caregiving duties of the primary caregiver as well as their age and health and also the patient's physical functioning and age of onset. However, this model had a lower sensitivity and specificity score (0.84 and 0.72, respectively). The ability of patients without gastrostomy to cut food and handle utensils was also highly predictive of burden in this study. Generally, our models are better in predicting the high-risk category, and we suggest that information related to the caregiver's quality of life and psychological distress is required.

CONCLUSION

This work demonstrates a proof of concept of an informatics solution to identifying caregivers at risk of burden that could be incorporated into future care pathways.

242

Predicting Microhabitat Suitability for an Endangered Small Mammal Using Sentinel-2 Data. REMOTE SENSING 2020. [DOI: 10.3390/rs12030562] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

Abstract

Accurate mapping is a main challenge for endangered small-sized terrestrial species. Freely available spatio-temporal data at high resolution from multispectral satellite offer excellent opportunities for improving predictive distribution models of such species based on fine-scale habitat features, thus making it easier to achieve comprehensive biodiversity conservation goals. However, there are still few examples showing the utility of remote-sensing-based products in mapping microhabitat suitability for small species of conservation concern. Here, we address this issue using Sentinel-2 sensor-derived habitat variables, used in combination with more commonly used explanatory variables (e.g., topography), to predict the distribution of the endangered Cabrera vole (Microtus cabrerae) in agrosilvopastorial systems. Based on vole surveys conducted in two different seasons over a ~176,000 ha landscape in Southern Portugal, we assessed the significance of each predictor in explaining Cabrera vole occurrence using the Boruta algorithm, a novel Random forest variant for dealing with high dimensionality of explanatory variables. Overall, results showed a strong contribution of Sentinel-2-derived variables for predicting microhabitat suitability of Cabrera voles. In particular, we found that photosynthetic activity (NDI45), specific spectral signal (SWIR1), and landscape heterogeneity (Rao’s Q) were good proxies of Cabrera voles’ microhabitat, mostly during temporally greener and wetter conditions. In addition to remote-sensing-based variables, the presence of road verges was also an important driver of voles’ distribution, highlighting their potential role as refuges and/or corridors. Overall, our study supports the use of remote-sensing data to predict microhabitat suitability for endangered small-sized species in marginal areas that potentially hold most of the biodiversity found in human-dominated landscapes. We believe our approach can be widely applied to other species, for which detailed habitat mapping over large spatial extents is difficult to obtain using traditional descriptors. This would certainly contribute to improving conservation planning, thereby contributing to global conservation efforts in landscapes that are managed for multiple purposes.

243

Chen J, Li Q, Wang H, Deng M. A Machine Learning Ensemble Approach Based on Random Forest and Radial Basis Function Neural Network for Risk Evaluation of Regional Flood Disaster: A Case Study of the Yangtze River Delta, China. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2019;17:E49. [PMID: 31861677 PMCID: PMC6982166 DOI: 10.3390/ijerph17010049] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/12/2019] [Revised: 12/07/2019] [Accepted: 12/17/2019] [Indexed: 11/16/2022]

Abstract

The Yangtze River Delta (YRD) is one of the most developed regions in China. This is also a flood-prone area where flood disasters are frequently experienced; the situations between the people-land nexus and the people-water nexus are very complicated. Therefore, the accurate assessment of flood risk is of great significance to regional development. The paper took the YRD urban agglomeration as the research case. The driving force, pressure, state, impact and response (DPSIR) conceptual framework was established to analyze the indexes of flood disasters. The random forest (RF) algorithm was used to screen important indexes of flood risk, and a risk assessment model based on the radial basis function (RBF) neural network was constructed to evaluate the flood risk level in this region from 2009 to 2018. The risk map showed the I-V level of flood risk in the YRD urban agglomeration from 2016 to 2018 by using the geographic information system (GIS). Further analysis indicated that the indexes such as flood season rainfall, urban impervious area ratio, gross domestic product (GDP) per square kilometer of land, water area ratio, population density and emergency rescue capacity of public administration departments have important influence on flood risk. The flood risk has been increasing in the YRD urban agglomeration during the past ten years under the urbanization background, and economic development status showed a significant positive correlation with flood risks. In addition, there were serious differences in the rising rate of flood risks and the status quo among provinces. There are still a few cities that have stabilized at a better flood-risk level through urban flood control measures from 2016 to 2018. These results were basically in line with the actual situation, which validated the effectiveness of the model. Finally, countermeasures and suggestions for reducing the urban flood risk in the YRD region were proposed, in order to provide decision support for flood control, disaster reduction and emergency management in the YRD region.
