Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Smith PF, Ganesh S, Liu P. A comparison of random forest regression and multiple linear regression for prediction in neuroscience. J Neurosci Methods 2013;220:85-91. [DOI: 10.1016/j.jneumeth.2013.08.024] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Received: 05/22/2013] [Revised: 08/13/2013] [Accepted: 08/28/2013] [Indexed: 11/20/2022]

Cited by Other Article(s)

Bonaccorso A, Ortis A, Musumeci T, Carbone C, Hussain M, Di Salvatore V, Battiato S, Pappalardo F, Pignatello R. Nose-to-Brain Drug Delivery and Physico-Chemical Properties of Nanosystems: Analysis and Correlation Studies of Data from Scientific Literature. Int J Nanomedicine 2024;19:5619-5636. [PMID: 38882536 PMCID: PMC11179666 DOI: 10.2147/ijn.s452316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 03/12/2024] [Indexed: 06/18/2024]

Abstract

Background

In the last few decades, nose-to-brain delivery has been investigated as an alternative route to deliver molecules to the Central Nervous System (CNS), bypassing the Blood-Brain Barrier. The use of nanotechnological carriers to promote drug transfer via this route has been widely explored. The exact mechanisms of transport remain unclear because different pathways (systemic or axonal) may be involved. Despite the large number of studies in this field, various aspects still need to be addressed. For example, what physicochemical properties should a suitable carrier possess in order to achieve this goal? To determine the correlation between carrier features (eg, particle size and surface charge) and drug targeting efficiency percentage (DTE%) and direct transport percentage (DTP%), correlation studies were performed using machine learning.

Methods

A detailed analysis of the literature from 2010 to 2021 was performed on PubMed in order to build the "NANOSE" database. Machine-learning regression analyses were then applied to the database.

Results

A total of 64 research articles were considered for building the NANOSE database (102 formulations). Particle-based formulations were characterized by an average size between 150 and 200 nm and presented a negative zeta potential (ZP) from -10 to -25 mV. The most general-purpose model for the regression of DTP/DTE values was Decision Tree regression, followed by the K-Nearest Neighbors Regressor (KNeighbor regression).

Conclusion

A literature review revealed that nose-to-brain delivery has been widely investigated in neurodegenerative diseases. Correlation studies between the physicochemical properties of nanosystems (mean size and ZP) and DTE/DTP parameters suggest that ZP may be more significant than particle size for DTP/DTE predictability.

Khadem H, Nemat H, Elliott J, Benaissa M. In Vitro Glucose Measurement from NIR and MIR Spectroscopy: Comprehensive Benchmark of Machine Learning and Filtering Chemometrics. Heliyon 2024;10:e30981. [PMID: 38778952 PMCID: PMC11108977 DOI: 10.1016/j.heliyon.2024.e30981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Revised: 05/08/2024] [Accepted: 05/08/2024] [Indexed: 05/25/2024]

Abstract

The quantitative analysis of glucose using spectroscopy is a topic of great significance and interest in science and industry. One conundrum in this area is deploying appropriate preprocessing and regression tools. To contribute to addressing this challenge, in this study, we conducted a comprehensive and novel comparative analysis of various machine learning and preprocessing filtering techniques applied to near-infrared, mid-infrared, and a combination of near-infrared and mid-infrared spectroscopy for glucose assay. Our objective was to evaluate the effectiveness of these techniques in accurately predicting glucose levels and to determine which approach was most optimal. Our investigation involved the acquisition of spectral data from samples of glucose solutions using the three aforementioned spectroscopy techniques. The data was subjected to several preprocessing filtering methods, including convolutional moving average, Savitzky-Golay, multiplicative scatter correction, and normalisation. We then applied representative machine learning algorithms from three categories: linear modelling, traditional nonlinear modelling, and artificial neural networks. The evaluation results revealed that linear models exhibited higher predictive accuracy than nonlinear models, whereas artificial neural network models demonstrated comparable performance. Additionally, the comparative analysis of various filtering methods demonstrated that the convolutional moving average and Savitzky-Golay filters yielded the most precise outcomes overall. In conclusion, our study provides valuable insights into the efficacy of different machine learning techniques for glucose measurement and highlights the importance of applying appropriate filtering methods in enhancing predictive accuracy. These findings have important implications for the development of new and improved glucose quantification technologies.
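
As an illustration of the simplest of the filters named in the abstract above, the following is a minimal Python sketch of a centred moving-average smoother. It is not the authors' code: the function name `moving_average`, the window size, the edge handling, and the sample values are all assumptions made for the example.

```python
def moving_average(signal, window=5):
    """Smooth a 1-D sequence with a centred, equal-weight moving average.

    Assumption: near the edges the window is shrunk so that the output
    keeps the same length as the input. Other edge policies (padding,
    truncation) are equally valid choices.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        segment = signal[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Made-up absorbance values with a single spike (illustrative only).
spectrum = [1.0, 2.0, 9.0, 2.0, 1.0]
smoothed = moving_average(spectrum, window=3)
```

A wider window suppresses more noise but also flattens genuine spectral peaks, which is the trade-off a preprocessing benchmark of this kind has to weigh.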

Pasokh Z, Seif M, Ghaem H, Rezaianzadeh A, Ghoddusi Johari M. Age at natural menopause and its determinants in female population of Kharameh cohort study: Comparison of regression, conditional tree and forests. PLoS One 2024;19:e0300448. [PMID: 38625988 PMCID: PMC11020934 DOI: 10.1371/journal.pone.0300448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 02/28/2024] [Indexed: 04/18/2024]

Vera Cruz G, Aboujaoude E, Rochat L, Bianchi-Demicheli F, Khazaal Y. Online dating: predictors of problematic tinder use. BMC Psychol 2024;12:106. [PMID: 38424651 PMCID: PMC10905798 DOI: 10.1186/s40359-024-01566-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 01/30/2024] [Indexed: 03/02/2024]

Abstract

BACKGROUND

Geolocation apps have radically transformed dating practices around the world, with profound sociocultural implications. Few studies, however, have explored their addictive potential or factors that are associated with their misuse.

OBJECTIVE

The present study aimed to assess the level of problematic Tinder use (PTU) in an adult sample, using a machine learning algorithm to determine, among 29 relevant variables, the most important predictors of PTU.

METHODS

A total of 1,387 Tinder users (18-74 years old; male = 50.3%; female = 49.1%) completed an online questionnaire, and a machine learning tool was used to analyze their responses.

RESULTS

On a 5-point scale, participants' mean PTU score was 1.91 (SD = 0.70), indicating a relatively low overall level of problematic app use. Among the most important predictors of problematic use were the use of Tinder for enhancement (reducing boredom and increasing positive emotions), coping with psychological problems, and increasing social connectedness. The number of "matches" (when two users show mutual interest), the number of online contacts on Tinder, and the number of resulting offline dates were also among the top predictors of PTU. Depressive mood and loneliness were among the middle-ranked predictors of PTU.

CONCLUSION

In accordance with the Interaction of Person-Affect-Cognition-Execution model of problematic internet use, the results suggest that PTU relates to how individual experience on the app interacts with dispositional and situational characteristics. However, variables that seemed to relate to PTU, including lack of self-esteem, negative mood states and loneliness, are not problems that online dating services as currently designed can be expected to resolve. This argues for increased digital services to identify and address potential problems helping drive the popularity of dating apps.

Sidorov P, Tsuji N. A Primer on 2D Descriptors in Selectivity Modeling for Asymmetric Catalysis. Chemistry 2024;30:e202302837. [PMID: 38010242 DOI: 10.1002/chem.202302837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/21/2023] [Accepted: 11/23/2023] [Indexed: 11/29/2023]

Manessa MDM, Ummam MAF, Efriana AF, Semedi JM, Ayu F. Assessing Derawan Island's Coral Reefs over Two Decades: A Machine Learning Classification Perspective. Sensors (Basel) 2024;24:466. [PMID: 38257559 PMCID: PMC10818429 DOI: 10.3390/s24020466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 12/23/2023] [Accepted: 01/09/2024] [Indexed: 01/24/2024]

Yousefmarzi F, Haratian A, Mahdavi Kalatehno J, Keihani Kamal M. Machine learning approaches for estimating interfacial tension between oil/gas and oil/water systems: a performance analysis. Sci Rep 2024;14:858. [PMID: 38195685 PMCID: PMC10776576 DOI: 10.1038/s41598-024-51597-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 01/07/2024] [Indexed: 01/11/2024]

Xu S, Yang X, Zhang S, Zheng X, Zheng F, Liu Y, Zhang H, Ye Q, Li L. Machine learning models for orthokeratology lens fitting and axial length prediction. Ophthalmic Physiol Opt 2023;43:1462-1468. [PMID: 37574762 DOI: 10.1111/opo.13212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 08/15/2023]

Dang T, Fermin ASR, Machizawa MG. oFVSD: a Python package of optimized forward variable selection decoder for high-dimensional neuroimaging data. Front Neuroinform 2023;17:1266713. [PMID: 37829329 PMCID: PMC10566623 DOI: 10.3389/fninf.2023.1266713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 09/08/2023] [Indexed: 10/14/2023]

Abstract

The complexity and high dimensionality of neuroimaging data pose problems for decoding information with machine learning (ML) models because the number of features is often much larger than the number of observations. Feature selection is one of the crucial steps for determining meaningful target features in decoding; however, optimizing the feature selection from such high-dimensional neuroimaging data has been challenging using conventional ML models. Here, we introduce an efficient and high-performance decoding package incorporating a forward variable selection (FVS) algorithm and hyper-parameter optimization that automatically identifies the best feature pairs for both classification and regression models, where a total of 18 ML models are implemented by default. First, the FVS algorithm evaluates the goodness-of-fit across different models using the k-fold cross-validation step that identifies the best subset of features based on a predefined criterion for each model. Next, the hyperparameters of each ML model are optimized at each forward iteration. Final outputs highlight an optimized number of selected features (brain regions of interest) for each model with its accuracy. Furthermore, the toolbox can be executed in a parallel environment for efficient computation on a typical personal computer. With the optimized forward variable selection decoder (oFVSD) pipeline, we verified the effectiveness of decoding sex classification and age range regression on 1,113 structural magnetic resonance imaging (MRI) datasets. 
Compared to ML models without the FVS algorithm and with the Boruta algorithm as a variable selection counterpart, we demonstrate that the oFVSD significantly outperformed across all of the ML models over the counterpart models without FVS (approximately 0.20 increase in correlation coefficient, r, with regression models and 8% increase in classification models on average) and with Boruta variable selection algorithm (approximately 0.07 improvement in regression and 4% in classification models). Furthermore, we confirmed the use of parallel computation considerably reduced the computational burden for the high-dimensional MRI data. Altogether, the oFVSD toolbox efficiently and effectively improves the performance of both classification and regression ML models, providing a use case example on MRI datasets. With its flexibility, oFVSD has the potential for many other modalities in neuroimaging. This open-source and freely available Python package makes it a valuable toolbox for research communities seeking improved decoding accuracy.
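
The forward-selection idea described in the abstract above can be sketched in a few lines of plain Python. This is a generic greedy forward selection, not the oFVSD package or its API: it substitutes leave-one-out validation of a 1-nearest-neighbour classifier for the k-fold step, omits hyperparameter optimization, and uses hypothetical names (`loo_accuracy`, `forward_select`) and made-up toy data.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # never compare a sample with itself
            dist = sum((row[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def forward_select(rows, labels, n_features):
    """Greedily add the feature that most improves the validated score;
    stop as soon as no remaining feature improves it."""
    selected, best_score = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [(loo_accuracy(rows, labels, selected + [f]), f)
                  for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Toy data (illustrative only): feature 0 separates the classes,
# features 1 and 2 are misleading noise.
rows = [
    [0.0, 5.0, 1.0], [0.1, 1.0, 9.0], [0.2, 8.0, 4.0],
    [1.0, 5.1, 1.1], [1.1, 1.1, 9.1], [1.2, 8.1, 4.1],
]
labels = [0, 0, 0, 1, 1, 1]
selected, best_score = forward_select(rows, labels, n_features=3)
```

Scoring every candidate on held-out data, rather than on the training fit, is what keeps the greedy search from simply accumulating features when the feature count exceeds the number of observations.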

Berni M, Veronesi F, Fini M, Giavaresi G, Marchiori G. Relations between Structure/Composition and Mechanics in Osteoarthritic Regenerated Articular Tissue: A Machine Learning Approach. Int J Mol Sci 2023;24:13374. [PMID: 37686179 PMCID: PMC10487849 DOI: 10.3390/ijms241713374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 08/22/2023] [Accepted: 08/25/2023] [Indexed: 09/10/2023]

Shi Y, Du Z, Zhang J, Han F, Chen F, Wang D, Liu M, Zhang H, Dong C, Sui S. Construction and evaluation of hourly average indoor PM2.5 concentration prediction models based on multiple types of places. Front Public Health 2023;11:1213453. [PMID: 37637795 PMCID: PMC10447970 DOI: 10.3389/fpubh.2023.1213453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/28/2023] [Indexed: 08/29/2023]

Abstract

Background

People usually spend most of their time indoors, so indoor fine particulate matter (PM2.5) concentrations are crucial for refining individual PM2.5 exposure evaluation. The development of indoor PM2.5 concentration prediction models is essential for the health risk assessment of PM2.5 in epidemiological studies involving large populations.

Methods

In this study, based on the monitoring data of multiple types of places, the classical multiple linear regression (MLR) method and random forest regression (RFR) algorithm of machine learning were used to develop hourly average indoor PM2.5 concentration prediction models. Indoor PM2.5 concentration data, which included 11,712 records from five types of places, were obtained by on-site monitoring. Moreover, the potential predictor variable data were derived from outdoor monitoring stations and meteorological databases. A ten-fold cross-validation was conducted to examine the performance of all proposed models.

Results

The final predictor variables incorporated in the MLR model were outdoor PM2.5 concentration, type of place, season, wind direction, surface wind speed, hour, precipitation, air pressure, and relative humidity. The ten-fold cross-validation results indicated that both models had good predictive performance, with determination coefficients (R2) of 72.20% and 60.35% for RFR and MLR, respectively. Generally, the RFR model had better predictive performance than the MLR model (for the RFR model developed using the same predictor variables as the MLR model, R2 = 71.86%). In terms of predictors, the variable importance results for both types of models suggested that outdoor PM2.5 concentration, type of place, season, hour, wind direction, and surface wind speed were the most important predictor variables.

Conclusion

In this research, hourly average indoor PM2.5 concentration prediction models based on multiple types of places were developed for the first time. Both the MLR and RFR models based on easily accessible indicators displayed promising predictive performance, in which the machine learning domain RFR model outperformed the classical MLR model, and this result suggests the potential application of RFR algorithms for indoor air pollutant concentration prediction.
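
The ten-fold cross-validation of R2 reported in the abstract above can be illustrated with a minimal, standard-library Python sketch. It is deliberately simplified and is not the authors' pipeline: it fits a single-predictor least-squares line rather than the multi-predictor MLR or the RFR model, pools the out-of-fold predictions before computing R2, and uses synthetic data. The names `fit_line`, `r_squared`, and `k_fold_r2` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_r2(xs, ys, k=10, seed=0):
    """Shuffle the indices, split them into k folds, predict each fold
    from a model fitted on the other k-1 folds, and score the pooled
    out-of-fold predictions."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in fold:
            y_true.append(ys[i])
            y_pred.append(slope * xs[i] + intercept)
    return r_squared(y_true, y_pred)


# Synthetic "outdoor" and "indoor" concentrations (illustrative only).
rng = random.Random(42)
outdoor = [rng.uniform(5, 150) for _ in range(200)]
indoor = [0.6 * x + 4.0 + rng.gauss(0, 5) for x in outdoor]
cv_r2 = k_fold_r2(outdoor, indoor, k=10)
```

Because every prediction comes from a model that never saw that observation, the cross-validated R2 estimates out-of-sample performance and is typically lower than the R2 of a fit evaluated on its own training data.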

Dargi M, Khamehchi E, Mahdavi Kalatehno J. Optimizing acidizing design and effectiveness assessment with machine learning for predicting post-acidizing permeability. Sci Rep 2023;13:11851. [PMID: 37481625 PMCID: PMC10363159 DOI: 10.1038/s41598-023-39156-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/15/2023] [Accepted: 07/20/2023] [Indexed: 07/24/2023]

Abstract

Formation damage poses a widespread challenge in the oil and gas industry, leading to diminished permeability, flow rates, and overall well productivity. Acidizing is a commonly employed technique aimed at mitigating damage and enhancing permeability. In this study, three machine learning models, namely artificial neural networks, random forest, and XGBoost, along with genetic programming, were used to estimate permeability changes following acidizing operations in oil and gas reservoirs. Training of the models involved a dataset comprising 218 acidizing operations conducted in diverse reservoirs across Iran. The input parameters, namely permeability, porosity, skin factor, calcite mineral fraction, acid injection rate, and injected acid volume, were optimized through the use of a genetic algorithm. Statistical and graphical analysis of the results demonstrates that genetic programming outperformed the other machine learning techniques, with R square and RMSE values of 0.82 and 17.65, respectively. Nevertheless, the other models also exhibited commendable performance, surpassing an R square value of 0.73. The post-acidizing permeability data obtained from core flooding experiments conducted on carbonate and sandstone cores were used to validate the models. The genetic programming model demonstrates an average error of 21.1%. The evaluation of post-acidizing permeability using genetic programming, in comparison with the results obtained from the core-flood test, revealed errors of 22.95% and 32.4% for carbonate and sandstone cores, respectively. Furthermore, a comparison between the calculated post-acidizing permeability derived from the GP model and previous studies indicated errors within the range of 8.6-26.59%. The findings highlight the potential of genetic programming and machine learning algorithms in accurately predicting post-acidizing permeability, thereby aiding in acidizing design, effectiveness assessment, and ultimately enhancing oil and gas production rates.

Vera Cruz G, Aboujaoude E, Rochat L, Bianchi-Demicheli F, Khazaal Y. Finding Intimacy Online: A Machine Learning Analysis of Predictors of Success. Cyberpsychol Behav Soc Netw 2023. [PMID: 37352415 DOI: 10.1089/cyber.2022.0367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/25/2023]

Amankulova K, Farmonov N, Akramova P, Tursunov I, Mucsi L. Comparison of PlanetScope, Sentinel-2, and landsat 8 data in soybean yield estimation within-field variability with random forest regression. Heliyon 2023;9:e17432. [PMID: 37408926 PMCID: PMC10319221 DOI: 10.1016/j.heliyon.2023.e17432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2022] [Revised: 06/12/2023] [Accepted: 06/16/2023] [Indexed: 07/07/2023]

Abstract

Accurate, timely, early-season estimation of within-field crop yield variability is important for precision farming and sustainable management applications. Therefore, the ability to estimate the within-field variability of grain yield is crucial for ensuring food security worldwide, especially under climate change. Several Earth observation systems have thus been developed to monitor crops and predict yields. Despite this, new research is required to combine multiplatform data integration, advancements in satellite technologies, data processing, and the application of this discipline to agricultural practices. This study provides further developments in soybean yield estimation by comparing multisource satellite data from PlanetScope (PS), Sentinel-2 (S2), and Landsat 8 (L8) and introducing topographic and meteorological variables. Herein, a new method of combining soybean yield, global positioning systems, harvester data, climate, topographic variables, and remote sensing images has been demonstrated. Soybean yield shape points were obtained from a combine-harvester-installed GPS and yield monitoring system from seven fields over the 2021 season. The yield estimation models were trained and validated using random forest, and four vegetation indices were tested. The results showed that soybean yield can be accurately predicted at 3-, 10-, and 30-m resolutions with mean absolute error (MAE) values of 0.091 t/ha for PS, 0.118 t/ha for S2, and 0.120 t/ha for L8 data (root mean square error (RMSE) of 0.111, 0.076). The combination of the environmental data with the original bands provided further improvements and an accurate yield estimation model within the soybean yield variability with MAE of 0.082 t/ha for PS, 0.097 t/ha for S2, and 0.109 t/ha for L8 (RMSE of 0.094, 0.069, and 0.108 t/ha). The results showed that the optimal date to predict the soybean yield within the field scale was approximately 60 or 70 days before harvesting periods during the beginning bloom stage. The developed model can be applied for other crops and locations when suitable training yield data, which are critical for precision farming, are available.

Gnyawali K, Dahal K, Talchabhadel R, Nirandjan S. Framework for rainfall-triggered landslide-prone critical infrastructure zonation. Sci Total Environ 2023;872:162242. [PMID: 36804983 DOI: 10.1016/j.scitotenv.2023.162242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 02/09/2023] [Accepted: 02/10/2023] [Indexed: 06/18/2023]

Abstract

Rainfall-induced landslides cause frequent disruptions to critical infrastructure in mountainous countries. Climate change is altering rainfall patterns and localizing extreme rainfall events, increasing the occurrence of landslides. For planning climate-resilient critical infrastructure in landslide-prone regions, it is urgent to understand the changing landslide susceptibility in relation to changing rainfall extremes and spatially overlay them with critical infrastructure to determine risk zones. As such, areas requiring financial reinforcements can be prioritized. In this paper, we develop a framework linking changing rainfall extremes to landslide susceptibility and intensity of critical infrastructure, exemplified on a national scale using Nepal as a case study. First, we define a set of 21 unique rainfall indices that describe extreme and localized rainfall. Second, we prepare a new annual (2016-2020) inventory of 107,900 landslides in Nepal mapped on PlanetScope satellite imagery. Third, we prepare a landslide susceptibility map by training a random forest model using the collected extreme rainfall indices and landslide locations in combination with spatial data on topography. Fourth, we construct a gridded critical infrastructure spatial density map that quantifies the intensity of infrastructure (i.e., transportation, energy, telecommunication, waste, water, health, and education) at each grid location using OpenStreetMap. The landslide susceptibility map classified Nepal's topography into low (36 %), medium (33 %), and high (32 %) rainfall-triggered landslide susceptibility zones. The landslide susceptibility map had an average area under the receiver operating characteristic curve value of 0.94. Finally, we overlay the landslide susceptibility map with the critical infrastructure intensity to identify areas needing financial reinforcement. Our framework reasonably mapped critical infrastructure hotspots in Nepal prone to landslides on a 1 km grid. The hotspots are mainly concentrated along major national highways and in provinces 4, 3, and 1, highlighting the need for improved land management practices. These hotspots need spatial prioritization regarding climate-resilient critical infrastructure financing and slope conservation policies. The research data, output maps, and code are publicly released via an ArcGIS WebApp and GitHub repository. The framework is scalable and can be used for developing infrastructure financing strategies for landslide-prone mountain regions and countries.

Lerebourg L, Saboul D, Clémençon M, Coquart JB. Prediction of Marathon Performance using Artificial Intelligence. Int J Sports Med 2023;44:352-360. [PMID: 36473492 DOI: 10.1055/a-1993-2371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]

Innovative Advances in Plant Genotyping. Methods Mol Biol 2023;2638:451-465. [PMID: 36781662 DOI: 10.1007/978-1-0716-3024-2_32] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/15/2023]

Vera Cruz G, Aboujaoude E, Khan R, Rochat L, Ben Brahim F, Courtois R, Khazaal Y. Smartphone apps for mental health and wellbeing: A usage survey and machine learning analysis of psychological and behavioral predictors. Digit Health 2023;9:20552076231152164. [PMID: 36714544 PMCID: PMC9880571 DOI: 10.1177/20552076231152164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 01/03/2023] [Indexed: 01/24/2023]

Abstract

Objective

Despite the availability of thousands of mental health applications, the extent to which they are used and the factors associated with their use remain largely unknown. The present study aims to (a) assess in a representative US-based population sample the use of smartphone apps for mental health and wellbeing (SAMHW), (b) determine the variables predicting the use of SAMHW, and (c) explore how a set of variables related to mental health, smartphone use, and smartphone "addiction" may be associated with the use of SAMHW.

Methods

Data were collected via an online questionnaire from 1,989 adults. The data gathered included information on smartphone use behavior, mental health, and the use of SAMHW. Latent class analysis was used to categorize participants. Machine learning and logistic regression analyses were used to determine the most important predictors of SAMHW use and associations between predictors and outcome variables.

Results

While two-thirds of participants had a statistically high probability of using SAMHW, considerably more had a high probability of using them to improve wellbeing than to address mental health problems (43% vs. 18%). In both groups, these participants were more likely to be female and in the younger adult age bracket than male and in the adult or older adult age bracket. According to the machine learning model, the most important predictors of using the relevant smartphone apps were variables associated with problematic smartphone use, COVID-19 impact, and mental health problems.

Conclusion

Findings from the present study confirm that the use of SAMHW is growing, particularly among younger adult and female individuals who are negatively impacted by problematic smartphone use, COVID-19, and mental health problems. These individuals tend to bypass traditional care via psychotherapy or psychopharmacology, relying instead on smartphones to address mental health conditions or improve wellbeing. Advising users of these apps to also seek professional help and promoting efforts to prove the efficacy and safety of SAMHW would seem necessary.

Kaya H, Guler E, Kırmacı V. Prediction of temperature separation of a nitrogen-driven vortex tube with linear, kNN, SVM, and RF regression models. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-08030-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]

Li X, Tang X, Cheng Q. Predicting the clinical citation count of biomedical papers using multilayer perceptron neural network. J Informetr 2022. [DOI: 10.1016/j.joi.2022.101333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

Couckuyt A, Seurinck R, Emmaneel A, Quintelier K, Novak D, Van Gassen S, Saeys Y. Challenges in translational machine learning. Hum Genet 2022;141:1451-1466. [PMID: 35246744 PMCID: PMC8896412 DOI: 10.1007/s00439-022-02439-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/09/2021] [Accepted: 02/08/2022] [Indexed: 11/25/2022]

Chung PY, Liao CT. Selection of parental lines for plant breeding via genomic prediction. Front Plant Sci 2022;13:934767. [PMID: 35968112 PMCID: PMC9363737 DOI: 10.3389/fpls.2022.934767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 07/01/2022] [Indexed: 06/15/2023]

Chen X, Zheng H, Wang H, Yan T. Can machine learning algorithms perform better than multiple linear regression in predicting nitrogen excretion from lactating dairy cows. Sci Rep 2022;12:12478. [PMID: 35864287 PMCID: PMC9304409 DOI: 10.1038/s41598-022-16490-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Accepted: 07/11/2022] [Indexed: 11/09/2022]

Santhanam P, Nath T, Lindquist MA, Cooper DS. Relationship Between TSH Levels and Cognition in the Young Adult: An Analysis of the Human Connectome Project Data. J Clin Endocrinol Metab 2022;107:1897-1905. [PMID: 35389477 DOI: 10.1210/clinem/dgac189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/15/2021] [Indexed: 11/19/2022]

A Geographically Weighted Random Forest Approach to Predict Corn Yield in the US Corn Belt. REMOTE SENSING 2022. [DOI: 10.3390/rs14122843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]

The advanced design of bioleaching process for metal recovery: A machine learning approach. Sep Purif Technol 2022. [DOI: 10.1016/j.seppur.2022.120919] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Patiyal S, Dhall A, Raghava GPS. Prediction of risk-associated genes and high-risk liver cancer patients from their mutation profile: Benchmarking of mutation calling techniques. Biol Methods Protoc 2022;7:bpac012. [PMID: 35734767 PMCID: PMC9204470 DOI: 10.1093/biomethods/bpac012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2021] [Revised: 05/20/2022] [Accepted: 05/20/2022] [Indexed: 11/12/2022] Open

Abstract

Identification of somatic mutations with high precision is one of the major challenges in the prediction of high-risk liver-cancer patients. In the past, a number of mutation calling techniques have been developed, including MuTect2, MuSE, Varscan2, and SomaticSniper. In this study, an attempt has been made to benchmark the potential of these techniques in predicting the prognostic biomarkers for liver cancer. Initially, we extracted somatic mutations in liver cancer patients using Variant Call Format (VCF) and Mutation Annotation Format (MAF) files from The Cancer Genome Atlas. In terms of size, the MAF files are 42 times smaller than VCF files and contain only high-quality somatic mutations. Further, machine learning based models have been developed for predicting high-risk cancer patients using mutations obtained from different techniques. The performance of different techniques and data files has been compared based on their potential to discriminate high and low-risk liver-cancer patients. Based on correlation analysis, we selected 80 genes having significant negative correlation with the overall survival of liver cancer patients. The univariate survival analysis revealed the prognostic role of highly mutated genes. Single-gene based analysis showed that the MuTect2 technique based MAF file achieved the maximum hazard ratio (HR_LAMC3) of 9.25 with p-value 1.78E-06. Further, we developed various prediction models using the risk-associated top-10 genes for each technique. Our results indicate that MuTect2 technique based VCF files outperform all other methods with a maximum Area Under the Receiver-Operating Characteristic (AUROC) curve of 0.765 and HR 4.50 (p-value 3.83E-15). Overall, VCF files generated using the MuTect2 technique perform better than those from the other mutation calling techniques for the prediction of high-risk liver cancer patients.
We hope that our findings will provide a useful and comprehensive comparison of various mutation calling techniques for the prognostic analysis of cancer patients. In order to serve the scientific community, we have provided a Python-based pipeline to develop the prediction models using mutation profiles (VCF/MAF) of cancer patients. It is available on GitHub at https://github.com/raghavagps/mutation_bench.

Quantifying the reproducibility of graph neural networks using multigraph data representation. Neural Netw 2022;148:254-265. [DOI: 10.1016/j.neunet.2022.01.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2021] [Revised: 01/10/2022] [Accepted: 01/26/2022] [Indexed: 11/20/2022]

A Meta-Model to Predict the Drag Coefficient of a Particle Translating in Viscoelastic Fluids: A Machine Learning Approach. Polymers (Basel) 2022;14:polym14030430. [PMID: 35160419 PMCID: PMC8838701 DOI: 10.3390/polym14030430] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2021] [Revised: 01/14/2022] [Accepted: 01/18/2022] [Indexed: 02/06/2023] Open

Merrigan JJ, Stone JD, Wagle JP, Hornsby WG, Ramadan J, Joseph M, Galster SM, Hagen JA. Using Random Forest Regression to Determine Influential Force-Time Metrics for Countermovement Jump Height: A Technical Report. J Strength Cond Res 2022;36:277-283. [PMID: 34941613 DOI: 10.1519/jsc.0000000000004154] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Abstract

Merrigan, JJ, Stone, JD, Wagle, JP, Hornsby, WG, Ramadan, J, Joseph, M, and Hagen, JA. Using random forest regression to determine influential force-time metrics for countermovement jump height: a technical report. J Strength Cond Res 36(1): 277-283, 2022. The purpose of this study was to identify the force-time metrics with the most influence on countermovement jump (CMJ) height using multiple statistical procedures. Eighty-two National Collegiate Athletic Association Division I American football players performed 2 maximal-effort, no arm-swing CMJs on force plates. The average absolute and relative (i.e., power/body mass) metrics were included as predictor variables, whereas jump height was the dependent variable within regression models (p < 0.05). Best subsets regression (8 metrics, R² = 0.95) included fewer metrics than stepwise regression (18 metrics, R² = 0.96), while explaining similar overall variance in jump height (p = 0.083). Random forest regression (RFR) models included 8 metrics, explained ∼93% of jump height variance, and were not significantly different from best subsets regression models (p > 0.05). Players achieved higher CMJs by attaining a deeper, faster, and more forceful countermovement with lower eccentric-to-concentric force ratios. An additional RFR was conducted on metrics scaled to body mass and revealed relative mean and peak concentric power to be the most influential. For exploratory purposes, additional RFRs were run for each positional group and suggested that the most influential variables may differ across positions. Thus, developing power output capabilities and providing coaching to improve technique during the countermovement may maximize jump height capabilities. Scientists and practitioners may use best subsets or RFR analyses to help identify which force-time metrics are of interest, reducing the number of multicollinear force-time metrics to monitor. These results may inform their training programs to maximize individual performance capabilities.

Hearing loss versus vestibular loss as contributors to cognitive dysfunction. J Neurol 2022;269:87-99. [PMID: 33387012 DOI: 10.1007/s00415-020-10343-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2020] [Revised: 11/23/2020] [Accepted: 12/04/2020] [Indexed: 02/02/2023]

Papaioannou A, Kalantzi E, Papageorgiou CC, Korombili K, Bokou A, Pehlivanidis A, Papageorgiou CC, Papaioannou G. Differences in Performance of ASD and ADHD Subjects Facing Cognitive Loads in an Innovative Reasoning Experiment. Brain Sci 2021;11:1531. [PMID: 34827530 PMCID: PMC8615740 DOI: 10.3390/brainsci11111531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2021] [Revised: 11/09/2021] [Accepted: 11/09/2021] [Indexed: 11/17/2022] Open

Abstract

We aim to investigate whether EEG dynamics differ in adults with ASD (Autism Spectrum Disorders) and ADHD (attention-deficit/hyperactivity disorder) compared with healthy subjects during the performance of an innovative cognitive task, Aristotle's valid and invalid syllogisms, and how these differences correlate with brain regions and behavioral data for each subject. We recorded EEGs from 14 scalp electrodes (channels) in 21 adults with ADHD, 21 with ASD, and 21 healthy, normal subjects. The subjects were exposed to a set of innovative cognitive tasks (inducing varying cognitive loads), Aristotle's two types of syllogism mentioned above. A set of 39 questions related to valid-invalid syllogisms was given to participants, as well as a separate set of questionnaires, in order to collect a number of demographic and behavioral data, with the aim of detecting shared information with values of a feature extracted from EEG, the multiscale entropy (MSE), in the 14 channels ('brain regions'). MSE, a nonlinear information-theoretic measure of complexity, was computed to extract a feature that quantifies the complexity of the EEG. Behavior Partial Least Squares Correlation (PLSC) was the method used to detect the correlation between two sets of data, brain and behavioral measures. Seed-PLSC, a variant of PLSC, was applied to build a functional connectivity of the brain regions involved in the reasoning tasks. Graph-theoretic measures were used to quantify the complexity of the functional networks. Based on the results of the analysis described in this work, a mixed 14 × 2 × 3 ANOVA showed significant main effects of the group factor and the brain region × syllogism factor, as well as a significant brain region × group interaction.
There are significant differences between the means of MSE (complexity) values at the 14 channels of the members of the 'pathological' groups of participants, i.e., between ASD and ADHD, while the difference in means of MSE between both ASD and ADHD and that of the control group is not significant. In conclusion, the valid-invalid type of syllogism generates significantly different complexity values, MSE, between ASD and ADHD. The complexity of activated brain regions of ASD participants increased significantly when switching from a valid to an invalid syllogism, indicating the need for more resources to 'face' the task's escalating difficulty in ASD subjects. This increase is not so evident in either the ADHD or the control group. Statistically significant differences were also found in the behavioral responses of ASD and ADHD subjects, compared with those of control subjects, based on the principal brain and behavior saliences extracted by PLSC. Specifically, two behavioral measures, the emotional state and the degree of confidence of participants in answering questions in Aristotle's valid-invalid syllogisms, and one demographic variable, age, significantly discriminate the three groups. The seed-PLSC-generated functional connectivity networks for ASD, ADHD, and control were 'projected' on the regions of the Default Mode Network (DMN), the 'reference' connectivity, of which the structural changes were found significant in distinguishing the three groups. The contribution of this work lies in the examination of the relationship between brain activity and behavioral responses of healthy and 'pathological' participants in the case of cognitive reasoning of the type of Aristotle's valid and invalid syllogisms, using PLSC, a machine learning approach, combined with MSE, a nonlinear method of extracting a feature based on EEGs that captures a broad spectrum of EEG linear and nonlinear characteristics.
The results seem promising for adopting this type of reasoning in the future, after further enhancements and experimental tests, as a supplementary instrument for examining the differences in brain activity and behavioral responses of ASD and ADHD patients. The combination of these two methods, after further elaboration and testing as new and complementary to the existing ones, may be considered as an analysis tool to help detect such disorders more effectively.

Sandhu K, Patil SS, Pumphrey M, Carter A. Multitrait machine- and deep-learning models for genomic selection using spectral information in a wheat breeding program. THE PLANT GENOME 2021;14:e20119. [PMID: 34482627 DOI: 10.1002/tpg2.20119] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/05/2021] [Accepted: 05/18/2021] [Indexed: 06/13/2023]

Aboveground Biomass Estimation in Short Rotation Forest Plantations in Northern Greece Using ESA’s Sentinel Medium-High Resolution Multispectral and Radar Imaging Missions. FORESTS 2021. [DOI: 10.3390/f12070902] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]

Mapping 30 m Fractional Forest Cover over China’s Three-North Region from Landsat-8 Data Using Ensemble Machine Learning Methods. REMOTE SENSING 2021. [DOI: 10.3390/rs13132592] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]

Abstract

The accurate monitoring of forest cover and its changes is essential for environmental change research, but current satellite products for forest coverage carry many uncertainties. This study used 30-m Landsat-8 data and aggregated 1-m GaoFen-2 (GF-2) satellite images to construct the training samples and used multiple machine learning algorithms (MLAs) to estimate the fractional forest cover (FFC) in China’s Three North Region (TNR). In this study, multiple MLAs were merged to construct stacked generalization (SG) models, and the performances of the MLAs in the FFC estimation were evaluated. The results of the 10-fold cross-validation showed that all non-linear algorithms performed well, with an R² value greater than 0.8 and a root-mean-square error (RMSE) of less than 0.05. In the bagging ensemble, the random forest (RF) (R² = 0.993, RMSE = 0.020) model performed the best, and in the boosting ensemble, the light gradient boosted machine (LGBM) (R² = 0.992, RMSE = 0.022) performed the best. Although the evaluation index of the RF is slightly better than that of the LGBM, the independent validation results show that the two models have similar performances. The model evaluation results of the independent datasets showed that, in the SG model, the performance of the SG(LGBM) (R² = 0.991, RMSE = 0.034) was better than that of the single or non-ensemble model. Comparing the FFC estimates of our model with those of existing datasets showed that our model exhibited more forest spatial distribution details and higher accuracy in complex landscapes. Overall, in this study, the method of using high-resolution remote sensing (RS) images to extract samples for FFC estimation is feasible. Our results demonstrate the potential of the ensemble MLAs to map the FFC. The research results also show that among many MLAs, the RF algorithm is the most suitable algorithm for estimating FFC, which provides a reference for future research.

Smith PF, Zheng Y. Applications of Multivariate Statistical and Data Mining Analyses to the Search for Biomarkers of Sensorineural Hearing Loss, Tinnitus, and Vestibular Dysfunction. Front Neurol 2021;12:627294. [PMID: 33746881 PMCID: PMC7966509 DOI: 10.3389/fneur.2021.627294] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2020] [Accepted: 02/01/2021] [Indexed: 11/24/2022] Open

Sep MSC, Joëls M, Geuze E. Individual differences in the encoding of contextual details following acute stress: An explorative study. Eur J Neurosci 2020;55:2714-2738. [PMID: 33249674 PMCID: PMC9291333 DOI: 10.1111/ejn.15067] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Revised: 11/05/2020] [Accepted: 11/21/2020] [Indexed: 12/19/2022]

Abstract

Information processing under stressful circumstances depends on many experimental conditions, like the information valence or the point in time at which brain function is probed. This also holds true for memorizing contextual details (or ‘memory contextualization’). Moreover, large interindividual differences appear to exist in (context‐dependent) memory formation after stress, but it is mostly unknown which individual characteristics are essential. Various characteristics were explored from a theory‐driven and data‐driven perspective, in 120 healthy men. In the theory‐driven model, we postulated that life adversity and trait anxiety shape the stress response, which impacts memory contextualization following acute stress. This was indeed largely supported by linear regression analyses, showing significant interactions depending on valence and time point after stress. Thus, during the acute phase of the stress response, reduced neutral memory contextualization was related to salivary cortisol level; moreover, certain individual characteristics correlated with memory contextualization of negatively valenced material: (a) life adversity, (b) α‐amylase reactivity in those with low life adversity and (c) cortisol reactivity in those with low trait anxiety. Better neutral memory contextualization during the recovery phase of the stress response was associated with (a) cortisol in individuals with low life adversity and (b) α‐amylase in individuals with high life adversity. The data‐driven Random Forest‐based variable selection also pointed to (early) life adversity (during the acute phase) and (moderate) α‐amylase reactivity (during the recovery phase) as individual characteristics related to better memory contextualization. Newly identified characteristics sparked novel hypotheses about non‐anxious personality traits, age, mood and states during retrieval of context‐related information.

Tian Y, He YL, Zhu QX. Soft Sensor Development Using Improved Whale Optimization and Regularization-Based Functional Link Neural Network. Ind Eng Chem Res 2020. [DOI: 10.1021/acs.iecr.0c03839] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Hisham S, Rasheed SA, Dsouza B. Application of Predictive Modelling to Improve the Discharge Process in Hospitals. Healthc Inform Res 2020;26:166-174. [PMID: 32819034 PMCID: PMC7438692 DOI: 10.4258/hir.2020.26.3.166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2020] [Accepted: 07/21/2001] [Indexed: 11/23/2022] Open

A Random Forest Modelling Procedure for a Multi-Sensor Assessment of Tree Species Diversity. REMOTE SENSING 2020. [DOI: 10.3390/rs12071210] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

Zhang S, Tan Z, Liu J, Xu Z, Du Z. Determination of the food dye indigotine in cream by near-infrared spectroscopy technology combined with random forest model. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2020;227:117551. [PMID: 31677907 DOI: 10.1016/j.saa.2019.117551] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/10/2019] [Revised: 09/09/2019] [Accepted: 09/17/2019] [Indexed: 06/10/2023]

Yuchi W, Gombojav E, Boldbaatar B, Galsuren J, Enkhmaa S, Beejin B, Naidan G, Ochir C, Legtseg B, Byambaa T, Barn P, Henderson SB, Janes CR, Lanphear BP, McCandless LC, Takaro TK, Venners SA, Webster GM, Allen RW. Evaluation of random forest regression and multiple linear regression for predicting indoor fine particulate matter concentrations in a highly polluted city. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2019;245:746-753. [PMID: 30500754 DOI: 10.1016/j.envpol.2018.11.034] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/22/2018] [Revised: 11/01/2018] [Accepted: 11/11/2018] [Indexed: 05/14/2023]

Affiliation(s)

Weiran Yuchi: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Enkhjargal Gombojav: School of Public Health, Mongolian National University of Medical Sciences, Zorig Street, Ulaanbaatar, 14210, Mongolia
Buyantushig Boldbaatar: School of Public Health, Mongolian National University of Medical Sciences, Zorig Street, Ulaanbaatar, 14210, Mongolia
Jargalsaikhan Galsuren: School of Public Health, Mongolian National University of Medical Sciences, Zorig Street, Ulaanbaatar, 14210, Mongolia
Sarangerel Enkhmaa: Institute of Meteorology and Environmental Monitoring, Ministry of Environment of Mongolia, Mongolia
Bolor Beejin: Mongolian National Center for Public Health, Olympic Street 2, Ulaanbaatar, Mongolia
Gerel Naidan: School of Public Health, Mongolian National University of Medical Sciences, Zorig Street, Ulaanbaatar, 14210, Mongolia
Chimedsuren Ochir: School of Public Health, Mongolian National University of Medical Sciences, Zorig Street, Ulaanbaatar, 14210, Mongolia
Bayarkhuu Legtseg: Sukhbaatar District Health Center, 11 Horoo, Tsagdaagiin Gudamj, Sukhbaatar District, Ulaanbaatar, Mongolia
Tsogtbaatar Byambaa: Ministry of Health of Mongolia, Olympic Street-2, Government Building VIII, Sukhbaatar District, Ulaanbaatar, Mongolia
Prabjit Barn: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Sarah B Henderson: Environmental Health Services, British Columbia Centre for Disease Control, 655 W. 12th Ave, Vancouver, BC, V5T 4R4, Canada
Craig R Janes: School of Public Health and Health Systems, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
Bruce P Lanphear: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Lawrence C McCandless: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Tim K Takaro: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Scott A Venners: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Glenys M Webster: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Ryan W Allen: Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada

GIS-Based Random Forest Weight for Rainfall-Induced Landslide Susceptibility Assessment at a Humid Region in Southern China. WATER 2018. [DOI: 10.3390/w10081019] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

Abstract

Landslide susceptibility assessment is presently considered an effective tool for landslide warning and forecasting. Under the assessment procedure, a credible index weight can greatly increase the rationality of the assessment result. Using the Beijiang River Basin, China, as a case study, this paper proposes a new weight-determining method based on random forest (RF) and uses the weighted linear combination (WLC) to evaluate the landslide susceptibility. The RF weight and eight indices were used to construct the assessment model. As a comparison, the entropy weight (EW) and the weight determined by the analytic hierarchy process (AHP) were also used to demonstrate the rationality of the proposed weight-determining method. The results show that: (1) the average error rates of training and testing based on RF are 18.12% and 15.83%, respectively, suggesting that the RF model can be considered rational and credible; (2) RF ranks the indices elevation (EL), slope (SL), maximum one-day precipitation (M1DP) and distance to fault (DF) as the four most important of the eight indices, accounting for 73.24% of the total, while the indices runoff coefficient (RC), normalized difference vegetation index (NDVI), shear resistance capacity (SRC) and available water capacity (AWC) are less consequential, with an index importance degree of only 26.76% of the total; and (3) the verification of landslide susceptibility indicates that the accuracy rate based on the RF weight reaches 75.41% but is only 59.02% and 72.13% for the other two weights (EW and AHP), respectively. This paper shows the potential to provide a new weight-determining method for landslide susceptibility assessment. Evaluation results are expected to provide a reference for landslide management, prevention and reduction in the studied basin.
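
The abstract above combines index weights with a weighted linear combination (WLC). A minimal sketch of that arithmetic follows, assuming the weights are importance scores normalized to sum to 1 and that each index has already been rescaled to a common range. The function names and all numeric values are hypothetical illustrations, not figures or code from the paper.

```python
def normalize_weights(importances):
    """Rescale raw importance scores so that the weights sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: value / total for name, value in importances.items()}


def wlc_score(weights, index_values):
    """Weighted linear combination: sum of weight * index value."""
    return sum(weights[name] * index_values[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical importance scores for four of the indices named in the
    # abstract; the real per-index values are not given in the abstract.
    raw = {"EL": 4.0, "SL": 3.0, "M1DP": 2.0, "DF": 1.0}
    weights = normalize_weights(raw)
    # Hypothetical index values for one map cell, assumed rescaled to [0, 1].
    cell = {"EL": 1.0, "SL": 0.0, "M1DP": 0.5, "DF": 0.0}
    print(weights)
    print(wlc_score(weights, cell))
```

The same two functions would apply to any other weighting scheme, such as the EW or AHP weights used for comparison, since only the weight dictionary changes.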

Zhao D, Wu Q. An approach to predict the height of fractured water-conducting zone of coal roof strata using random forest regression. Sci Rep 2018;8:10986. [PMID: 30030501 PMCID: PMC6054685 DOI: 10.1038/s41598-018-29418-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2018] [Accepted: 06/08/2018] [Indexed: 12/03/2022] Open

Liu B, Stevenson RJ. Improving assessment accuracy for lake biological condition by classifying lakes with diatom typology, varying metrics and modeling multimetric indices. THE SCIENCE OF THE TOTAL ENVIRONMENT 2017;609:263-271. [PMID: 28750229 DOI: 10.1016/j.scitotenv.2017.07.152] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/18/2017] [Revised: 07/16/2017] [Accepted: 07/17/2017] [Indexed: 06/07/2023]

Abstract

Site grouping by regions or typologies, site-specific modeling and varying metrics among site groups are four approaches that account for natural variation, which can be a major source of error in ecological assessments. Using a data set from the 2007 National Lakes Assessment project of the USEPA, we compared performances of multimetric indices (MMI) of biological condition that were developed: (1) with different lake grouping methods, ecoregions or diatom typologies; (2) by varying or not varying metrics among site groups; and (3) with different statistical techniques for modeling diatom metric values expected for minimally disturbed condition for each lake. Hierarchical modeling of MMIs, i.e. grouping sites by ecoregions or typologies and then modeling natural variability in metrics among lakes within groups, substantially improved MMI performance compared to using either ecoregions or site-specific modeling alone. Compared with MMIs based on ecoregion site groups, MMI precision and sensitivity to human disturbance were better when sites were grouped by diatom typologies and assessing performance nationwide. However, when MMI performance was evaluated at site group levels, as some government agencies often do, there was little difference in MMI performance between the two site grouping methods. Low numbers of reference and highly impacted sites in some typology groups likely limited MMI performance at the group level of analysis. Varying metrics among site groups did not improve MMI performance. Random forest models for site-specific expected metric values performed better than classification and regression tree and multiple linear regression, except when numbers of reference sites were small in site groups. Then classification and regression tree models were most precise. 
Based on our results, we recommend hierarchical modeling in future large scale lake assessments where lakes are grouped by ecoregions or diatom typologies and site-specific metric models are used to establish expected metric values.

Dimitriadis SI, Liparas D, Tsolaki MN. Random forest feature selection, fusion and ensemble strategy: Combining multiple morphological MRI measures to discriminate among healhy elderly, MCI, cMCI and alzheimer's disease patients: From the alzheimer's disease neuroimaging initiative (ADNI) database. J Neurosci Methods 2017;302:14-23. [PMID: 29269320 DOI: 10.1016/j.jneumeth.2017.12.010] [Citation(s) in RCA: 73] [Impact Index Per Article: 10.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2017] [Revised: 12/14/2017] [Accepted: 12/17/2017] [Indexed: 02/06/2023]

Abstract

BACKGROUND

In the era of computer-assisted diagnostic tools for various brain diseases, Alzheimer's disease (AD) covers a large percentage of neuroimaging research, with the main scope being its use in daily practice. However, there has been no study attempting to simultaneously discriminate among Healthy Controls (HC), early mild cognitive impairment (MCI), late MCI (cMCI) and stable AD, using features derived from a single modality, namely MRI.

NEW METHOD

Based on preprocessed MRI images from the organizers of a neuroimaging challenge, we attempted to quantify the prediction accuracy of multiple morphological MRI features to simultaneously discriminate among HC, MCI, cMCI and AD. We explored the efficacy of a novel scheme that includes multiple feature selections via Random Forest from subsets of the whole set of features (e.g. whole set, left/right hemisphere etc.), Random Forest classification using a fusion approach and ensemble classification via majority voting. From the ADNI database, 60 HC, 60 MCI, 60 cMCI and 60 AD were used as a training set with known labels. An extra dataset of 160 subjects (HC: 40, MCI: 40, cMCI: 40 and AD: 40) was used as an external blind validation dataset to evaluate the proposed machine learning scheme.

RESULTS

In the second blind dataset, we achieved a four-class classification accuracy of 61.9% by combining MRI-based features with a Random Forest-based Ensemble Strategy. We achieved the best classification accuracy of all teams that participated in this neuroimaging competition.

COMPARISON WITH EXISTING METHOD(S)

The results demonstrate the effectiveness of the proposed scheme to simultaneously discriminate among four groups using morphological MRI features for the very first time in the literature.

CONCLUSIONS

Hence, the proposed machine learning scheme can be used to define single and multi-modal biomarkers for AD.
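
The ensemble step named in the New Method paragraph, classification via majority voting, can be sketched as follows. This is an illustration only: the function name, the example labels, and the tie-breaking rule (the earliest classifier's label among the tied ones wins) are assumptions, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per subject.

    predictions: list of lists, one inner list per classifier, all of the
    same length (one label per subject).
    Ties are broken in favor of the earliest classifier whose label reached
    the top count (an assumption made for this sketch).
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_subjects = len(predictions[0])
    if any(len(p) != n_subjects for p in predictions):
        raise ValueError("all classifiers must label the same subjects")

    combined = []
    for labels in zip(*predictions):  # labels given to one subject
        counts = Counter(labels)
        top = max(counts.values())
        winner = next(label for label in labels if counts[label] == top)
        combined.append(winner)
    return combined


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers for three subjects.
    votes = [
        ["HC", "AD", "MCI"],
        ["HC", "cMCI", "MCI"],
        ["AD", "AD", "cMCI"],
    ]
    print(majority_vote(votes))
```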

Evaluating Site-Specific and Generic Spatial Models of Aboveground Forest Biomass Based on Landsat Time-Series and LiDAR Strip Samples in the Eastern USA. REMOTE SENSING 2017. [DOI: 10.3390/rs9060598] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Swanson A, Willette AA. Neuronal Pentraxin 2 predicts medial temporal atrophy and memory decline across the Alzheimer's disease spectrum. Brain Behav Immun 2016;58:201-208. [PMID: 27444967 PMCID: PMC5349324 DOI: 10.1016/j.bbi.2016.07.148] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 04/14/2016] [Revised: 06/29/2016] [Accepted: 07/16/2016] [Indexed: 12/21/2022] Open

Abstract

Chronic neuroinflammation is thought to potentiate medial temporal lobe (MTL) atrophy and memory decline in Alzheimer's disease (AD). It has become increasingly important to find novel immunological biomarkers of neuroinflammation or other processes that can track AD development and progression. Our study explored which pro- or anti-inflammatory cerebrospinal fluid (CSF) biomarkers best predicted AD neuropathology over 24 months. Using Alzheimer's Disease Neuroimaging Initiative data (N=285), CSF inflammatory biomarkers from mass spectrometry and multiplex panels were screened using stepwise regression, followed up with 50%/50% model retests for validation. Neuronal Pentraxin 2 (NPTX2) and Chitinase-3-like-protein-1 (C3LP1), biomarkers of glutamatergic synaptic plasticity and microglial activation respectively, were the only consistently significant biomarkers selected. Once these biomarkers were selected, linear mixed models were used to analyze their baseline and longitudinal associations with bilateral MTL volume, memory decline, global cognition, and established AD biomarkers including CSF amyloid and tau. Higher baseline NPTX2 levels corresponded to less MTL atrophy [R²=0.287, p<0.001] and substantially less memory decline [R²=0.560, p<0.001] by month 24. Conversely, higher C3LP1 modestly predicted more MTL atrophy [R²=0.083, p<0.001], yet did not significantly track memory decline over time. In conclusion, NPTX2 is a novel pro-inflammatory cytokine that predicts AD-related outcomes better than any immunological biomarker to date, substantially accounting for brain atrophy and especially memory decline. C3LP1 as the microglial biomarker, by contrast, performed modestly and did not predict longitudinal memory decline. This research may advance the current understanding of AD etiopathogenesis, while expanding early diagnostic techniques through the use of novel pro-inflammatory biomarkers, such as NPTX2. Future studies should also see if NPTX2 causally affects MTL morphometry and memory performance.


Smith PF. Age-Related Neurochemical Changes in the Vestibular Nuclei. Front Neurol 2016;7:20. [PMID: 26973593 PMCID: PMC4776078 DOI: 10.3389/fneur.2016.00020] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 11/02/2015] [Accepted: 02/09/2016] [Indexed: 12/18/2022] Open

Tetschke F, Schneider U, Schleussner E, Witte OW, Hoyer D. Assessment of fetal maturation age by heart rate variability measures using random forest methodology. Comput Biol Med 2016;70:157-162. [PMID: 26848727 DOI: 10.1016/j.compbiomed.2016.01.020] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/09/2015] [Revised: 01/14/2016] [Accepted: 01/16/2016] [Indexed: 11/17/2022]