Reference Citation Analysis

For: Sofaer HR, Hoeting JA, Jarnevich CS. The area under the precision‐recall curve as a performance metric for rare binary events. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13140] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Indexed: 11/30/2022]

Cited by Other Article(s)

Mouskeftara T, Deda O, Liapikos T, Panteris E, Karagiannidis E, Papazoglou AS, Gika H. Lipidomic-Based Algorithms Can Enhance Prediction of Obstructive Coronary Artery Disease. J Proteome Res 2024;23:3598-3611. [PMID: 39008891 DOI: 10.1021/acs.jproteome.4c00249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]

Joyce T, Tasci E, Jagasia S, Shephard J, Chappidi S, Zhuge Y, Zhang L, Cooley Zgela T, Sproull M, Mackey M, Camphausen K, Krauze AV. Serum CD133-Associated Proteins Identified by Machine Learning Are Connected to Neural Development, Cancer Pathways, and 12-Month Survival in Glioblastoma. Cancers (Basel) 2024;16:2740. [PMID: 39123468 PMCID: PMC11311306 DOI: 10.3390/cancers16152740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 07/24/2024] [Accepted: 07/26/2024] [Indexed: 08/12/2024]

Affiliation(s)

Thomas Joyce, Erdal Tasci, Sarisha Jagasia, Jason Shephard, Shreya Chappidi, Ying Zhuge, Longze Zhang, Theresa Cooley Zgela, Mary Sproull, Megan Mackey, Kevin Camphausen, Andra V. Krauze: Radiation Oncology Branch, Center for Cancer Research, National Cancer Institute NIH, 9000 Rockville Pike, Bethesda, MD 20892, USA
Shreya Chappidi (additional affiliation): Department of Computer Science and Technology, University of Cambridge, 15 JJ Thomson Ave, Cambridge CB3 0FD, UK

Villena OC, Arab A, Lippi CA, Ryan SJ, Johnson LR. Influence of environmental, geographic, socio-demographic, and epidemiological factors on presence of malaria at the community level in two continents. Sci Rep 2024;14:16734. [PMID: 39030306 PMCID: PMC11271557 DOI: 10.1038/s41598-024-67452-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 07/11/2024] [Indexed: 07/21/2024]

Mahawan T, Luckett T, Mielgo Iza A, Pornputtapong N, Caamaño Gutiérrez E. Robust and consistent biomarker candidates identification by a machine learning approach applied to pancreatic ductal adenocarcinoma metastasis. BMC Med Inform Decis Mak 2024;24:175. [PMID: 38902676 PMCID: PMC11191155 DOI: 10.1186/s12911-024-02578-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 06/14/2024] [Indexed: 06/22/2024]

Abstract

BACKGROUND

Machine Learning (ML) plays a crucial role in biomedical research. Nevertheless, it still has limitations in data integration and reproducibility. To address these challenges, robust methods are needed. Pancreatic ductal adenocarcinoma (PDAC), a highly aggressive cancer with low early detection and survival rates, is used as a case study. PDAC lacks reliable diagnostic biomarkers, especially metastatic biomarkers, which remains an unmet need. In this study, we propose an ML-based approach for discovering disease biomarkers, apply it to the identification of a PDAC metastatic composite biomarker candidate, and demonstrate the advantages of harnessing data resources.

METHODS

We utilised primary tumour RNAseq data from five public repositories, pooling samples to maximise statistical power and integrating data by correcting for technical variance. Data were split into train and validation sets. The train dataset underwent variable selection via a 10-fold cross-validation process that combined three algorithms in 100 models per fold. Genes found in at least 80% of models and five folds were considered robust to build a consensus multivariate model. A random forest model was constructed using selected genes from the train dataset and tested in the validation set. We also assessed the goodness of prediction by recalibrating a model using only the validation data. The biological context and relevance of signals were explored through enrichment and pathway analyses using QIAGEN Ingenuity Pathway Analysis and GeneMANIA.
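
The consensus rule in the methods above can be sketched in a few lines of Python. This is only an illustration under one reading of the abstract: a gene counts as selected in a fold when it appears in at least 80% of that fold's models, and as robust when this holds in at least five folds. The function name `robust_genes`, its parameters, and the toy gene labels are hypothetical and are not taken from the authors' GitHub code.

```python
from collections import Counter


def robust_genes(selections, model_frac=0.8, min_folds=5):
    """Return genes selected consistently across models and folds.

    selections: one entry per fold; each entry is a list with one set of
    selected gene names per fitted model (hypothetical data layout).
    """
    fold_hits = Counter()
    for fold_models in selections:
        n_models = len(fold_models)
        # Count in how many models of this fold each gene was selected.
        counts = Counter(gene for model in fold_models for gene in set(model))
        for gene, count in counts.items():
            if count >= model_frac * n_models:
                fold_hits[gene] += 1
    # Keep genes that passed the per-fold threshold in enough folds.
    return sorted(gene for gene, hits in fold_hits.items() if hits >= min_folds)


# Toy data: 10 folds with 5 models each (the study used 100 models per fold).
folds = []
for fold_index in range(10):
    models = []
    for model_index in range(5):
        genes = {"A"}                # selected by every model in every fold
        if fold_index < 4:
            genes.add("B")           # consistent, but only in 4 folds
        if model_index < 3:
            genes.add("C")           # in every fold, but only 60% of models
        models.append(genes)
    folds.append(models)

print(robust_genes(folds))  # prints ['A']
```

Only gene "A" passes both thresholds in the toy data; "B" fails the fold count and "C" fails the per-fold model fraction.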

RESULTS

We developed a pipeline that can detect robust signatures to build composite biomarkers. We tested the pipeline in PDAC, exploiting transcriptomics data from different sources, and propose a composite biomarker candidate comprising fifteen consistently selected genes that showed very promising predictive capability. Biological contextualisation revealed links with cancer progression and metastasis, underscoring their potential relevance. All code is available on GitHub.

CONCLUSION

This study establishes a robust framework for identifying composite biomarkers across various disease contexts. We demonstrate its potential by proposing a plausible composite biomarker candidate for PDAC metastasis. By reusing data from public repositories, we highlight the sustainability of our research and the wider applications of our pipeline. The preliminary findings shed light on a promising validation and application path.

Richardson E, Trevizani R, Greenbaum JA, Carter H, Nielsen M, Peters B. The receiver operating characteristic curve accurately assesses imbalanced datasets. Patterns (N Y) 2024;5:100994. [PMID: 39005487 PMCID: PMC11240176 DOI: 10.1016/j.patter.2024.100994] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/01/2023] [Revised: 03/05/2024] [Accepted: 05/03/2024] [Indexed: 07/16/2024]

Pham MP, Vu DD, Nguyen TT, Nguyen VS. Predictive ecological niche model for Cinnamomum parthenoxylon (Jack) Meisn. (Lauraceae) from Last Glacial Maximum to future in Vietnam. Biodivers Data J 2024;12:e122325. [PMID: 38827585 PMCID: PMC11140409 DOI: 10.3897/bdj.12.e122325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 04/26/2024] [Indexed: 06/04/2024]

Kougioumoutzis K, Constantinou I, Panitsa M. Rising Temperatures, Falling Leaves: Predicting the Fate of Cyprus's Endemic Oak under Climate and Land Use Change. Plants (Basel) 2024;13:1109. [PMID: 38674518 PMCID: PMC11053427 DOI: 10.3390/plants13081109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 04/11/2024] [Accepted: 04/14/2024] [Indexed: 04/28/2024]

Beyene KM, Chen DG, Kifle YG. A novel nonparametric time-dependent precision-recall curve estimator for right-censored survival data. Biom J 2024;66:e2300135. [PMID: 38637327 DOI: 10.1002/bimj.202300135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Revised: 10/04/2023] [Accepted: 12/27/2023] [Indexed: 04/20/2024]

Wang K, Zeng X, Zhou J, Liu F, Luan X, Wang X. BERT-TFBS: a novel BERT-based model for predicting transcription factor binding sites by transfer learning. Brief Bioinform 2024;25:bbae195. [PMID: 38701417 PMCID: PMC11066948 DOI: 10.1093/bib/bbae195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 03/26/2024] [Accepted: 04/10/2024] [Indexed: 05/05/2024]

Velásquez-López Y, Ruiz-Escudero A, Arrasate S, González-Díaz H. Implementation of IFPTML Computational Models in Drug Discovery Against Flaviviridae Family. J Chem Inf Model 2024;64:1841-1852. [PMID: 38466369 PMCID: PMC10966645 DOI: 10.1021/acs.jcim.3c01796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/26/2024] [Accepted: 02/27/2024] [Indexed: 03/13/2024]

Usuzaki T, Takahashi K, Inamori R. Be Careful About Metrics When Imbalanced Data Is Used for a Deep Learning Model. Chest 2024;165:e87-e89. [PMID: 38461027 DOI: 10.1016/j.chest.2023.10.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 10/20/2023] [Indexed: 03/11/2024]

Atimbire SA, Appati JK, Owusu E. Empirical exploration of whale optimisation algorithm for heart disease prediction. Sci Rep 2024;14:4530. [PMID: 38402276 PMCID: PMC10894250 DOI: 10.1038/s41598-024-54990-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 02/19/2024] [Indexed: 02/26/2024]

Cavaiola M, Cassola F, Sacchetti D, Ferrari F, Mazzino A. Hybrid AI-enhanced lightning flash prediction in the medium-range forecast horizon. Nat Commun 2024;15:1188. [PMID: 38331837 PMCID: PMC10853497 DOI: 10.1038/s41467-024-44697-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 12/27/2023] [Indexed: 02/10/2024]

Usuzaki T, Takahashi K, Inamori R. Letter to the editor on "Automated classification of fat-infiltrated axillary lymph nodes on screening mammograms". Br J Radiol 2024;97:479-480. [PMID: 38308039 DOI: 10.1093/bjr/tqad061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Accepted: 10/23/2023] [Indexed: 02/04/2024]

Ray DD, Flagel L, Schrider DR. IntroUNET: Identifying introgressed alleles via semantic segmentation. PLoS Genet 2024;20:e1010657. [PMID: 38377104 PMCID: PMC10906877 DOI: 10.1371/journal.pgen.1010657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 03/01/2024] [Accepted: 01/29/2024] [Indexed: 02/22/2024]

Abstract

A growing body of evidence suggests that gene flow between closely related species is a widespread phenomenon. Alleles that introgress from one species into a close relative are typically neutral or deleterious, but sometimes confer a significant fitness advantage. Given the potential relevance to speciation and adaptation, numerous methods have therefore been devised to identify regions of the genome that have experienced introgression. Recently, supervised machine learning approaches have been shown to be highly effective for detecting introgression. One especially promising approach is to treat population genetic inference as an image classification problem, and feed an image representation of a population genetic alignment as input to a deep neural network that distinguishes among evolutionary models (i.e. introgression or no introgression). However, if we wish to investigate the full extent and fitness effects of introgression, merely identifying genomic regions in a population genetic alignment that harbor introgressed loci is insufficient; ideally we would be able to infer precisely which individuals have introgressed material and at which positions in the genome. Here we adapt a deep learning algorithm for semantic segmentation, the task of correctly identifying the type of object to which each individual pixel in an image belongs, to the task of identifying introgressed alleles. Our trained neural network is thus able to infer, for each individual in a two-population alignment, which of that individual's alleles were introgressed from the other population. We use simulated data to show that this approach is highly accurate, and that it can be readily extended to identify alleles that are introgressed from an unsampled "ghost" population, performing comparably to a supervised learning method tailored specifically to that task.
Finally, we apply this method to data from Drosophila, showing that it is able to accurately recover introgressed haplotypes from real data. This analysis reveals that introgressed alleles are typically confined to lower frequencies within genic regions, suggestive of purifying selection, but are found at much higher frequencies in a region previously shown to be affected by adaptive introgression. Our method's success in recovering introgressed haplotypes in challenging real-world scenarios underscores the utility of deep learning approaches for making richer evolutionary inferences from genomic data.

Acharya V, Choi D, Yener B, Beamer G. Prediction of Tuberculosis From Lung Tissue Images of Diversity Outbred Mice Using Jump Knowledge Based Cell Graph Neural Network. IEEE Access 2024;12:17164-17194. [PMID: 38515959 PMCID: PMC10956573 DOI: 10.1109/access.2024.3359989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2024]

Abstract

Tuberculosis (TB), primarily affecting the lungs, is caused by the bacterium Mycobacterium tuberculosis and poses a significant health risk. Detecting acid-fast bacilli (AFB) in stained samples is critical for TB diagnosis. Whole Slide (WS) Imaging allows for digitally examining these stained samples. However, current deep-learning approaches to analyzing large-sized whole slide images (WSIs) often employ patch-wise analysis, potentially missing the complex spatial patterns observed in the granuloma essential for accurate TB classification. To address this limitation, we propose an approach that models cell characteristics and interactions as a graph, capturing both cell-level information and the overall tissue micro-architecture. This method differs from the strategies in related cell graph-based works that rely on edge thresholds based on sparsity/density in cell graph construction, emphasizing a biologically informed threshold determination instead. We introduce a cell graph-based jumping knowledge neural network (CG-JKNN) that operates on the cell graphs where the edge thresholds are selected based on the length of the mycobacteria's cords and the activated macrophage nucleus's size to reflect the actual biological interactions observed in the tissue. The primary process involves training a Convolutional Neural Network (CNN) to segment AFBs and macrophage nuclei, followed by converting large (42831 × 41159 pixels) lung histology images into cell graphs where an activated macrophage nucleus/AFB represents each node within the graph and their interactions are denoted as edges. To enhance the interpretability of our model, we employ Integrated Gradients and Shapley Additive Explanations (SHAP). Our analysis incorporated a combination of 33 graph metrics and 20 cell morphology features.
In terms of traditional machine learning models, Extreme Gradient Boosting (XGBoost) was the best performer, achieving an F1 score of 0.9813 and an Area under the Precision-Recall Curve (AUPRC) of 0.9848 on the test set. Among graph-based models, our CG-JKNN was the top performer, attaining an F1 score of 0.9549 and an AUPRC of 0.9846 on the held-out test set. The integration of graph-based and morphological features proved highly effective, with CG-JKNN and XGBoost showing promising results in classifying instances into AFB and activated macrophage nucleus. The features identified as significant by our models closely align with the criteria used by pathologists in practice, highlighting the clinical applicability of our approach. Future work will explore knowledge distillation techniques and graph-level classification into distinct TB progression categories.

Li D. Attention-enhanced architecture for improved pneumonia detection in chest X-ray images. BMC Med Imaging 2024;24:6. [PMID: 38166579 PMCID: PMC10763425 DOI: 10.1186/s12880-023-01177-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 12/07/2023] [Indexed: 01/04/2024]

Zhang L, Xu R, Zhao J. Learning technology for detection and grading of cancer tissue using tumour ultrasound images. J Xray Sci Technol 2024;32:157-171. [PMID: 37424493 DOI: 10.3233/xst-230085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]

Zimmer SN, Holsinger KW, Dawson CA. A field-validated ensemble species distribution model of Eriogonum pelinophilum, an endangered subshrub in Colorado, USA. Ecol Evol 2023;13:e10816. [PMID: 38107426 PMCID: PMC10721943 DOI: 10.1002/ece3.10816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 10/10/2023] [Accepted: 11/27/2023] [Indexed: 12/19/2023]

Abstract

Understanding the suitable habitat of endangered species is crucial for agencies such as the Bureau of Land Management to plan management and conservation. However, few species distribution models are directly validated, potentially limiting their application in management. In preparation for a Species Status Assessment of clay-loving wild buckwheat (Eriogonum pelinophilum), an endangered subshrub found in southwest Colorado, we ran a series of species distribution models to estimate the species' potential occupied habitat and validated these models in the field. A 1-meter resolution digital elevation model derived from LiDAR and a high-resolution geology mapping helped identify biologically relevant characteristics of the species' habitat. We employed a weighted ensemble model based on two Random Forest and one Boosted Regression Tree model, and discrimination performance of the ensemble model was high (AUC-PR = 0.793). We then conducted a systematic field survey of model habitat suitability predictions, during which we discovered 55 new subpopulations of the species and demonstrated that new species observations were strongly associated with model predictions (p < .0001, Cliff's delta = 0.575). We further refined our original models by incorporating the additional species occurrences collected in the field survey, a new explanatory variable, and a more diverse set of models. These iterative changes marginally improved performance of the ensemble model (AUC-PR = 0.825). Direct validation of species distribution models is extremely rare, and our field survey provides strong validation of our model results. This helps increase confidence to utilize predictions in planning. The final model predictions greatly improve the Bureau of Land Management's understanding of the species' habitat and increase our ability to consider potential habitat in planning land use activities such as road development and travel management.

Mulhern RE, Kondash AJ, Norman E, Johnson J, Levine K, McWilliams A, Napier M, Weber F, Stella L, Wood E, Lee Pow Jackson C, Colley S, Cajka J, MacDonald Gibson J, Hoponick Redmon J. Improved Decision Making for Water Lead Testing in U.S. Child Care Facilities Using Machine-Learned Bayesian Networks. Environ Sci Technol 2023;57:17959-17970. [PMID: 36932953 PMCID: PMC10666530 DOI: 10.1021/acs.est.2c07477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 03/06/2023] [Accepted: 03/07/2023] [Indexed: 06/18/2023]

Ghorbanali Z, Zare-Mirakabad F, Salehi N, Akbari M, Masoudi-Nejad A. DrugRep-HeSiaGraph: when heterogenous siamese neural network meets knowledge graphs for drug repurposing. BMC Bioinformatics 2023;24:374. [PMID: 37789314 PMCID: PMC10548718 DOI: 10.1186/s12859-023-05479-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 09/12/2023] [Indexed: 10/05/2023]

Ng S, Masarone S, Watson D, Barnes MR. The benefits and pitfalls of machine learning for biomarker discovery. Cell Tissue Res 2023;394:17-31. [PMID: 37498390 PMCID: PMC10558383 DOI: 10.1007/s00441-023-03816-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/30/2023] [Accepted: 07/12/2023] [Indexed: 07/28/2023]

Geleijnse J, Rutten M, de Villiers D, Bamwenda JT, Abraham E. Enhancing water access monitoring through mapping multi-source usage and disaggregated geographic inequalities with machine learning and surveys. Sci Rep 2023;13:13433. [PMID: 37596313 PMCID: PMC10439218 DOI: 10.1038/s41598-023-39917-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Accepted: 08/02/2023] [Indexed: 08/20/2023]

Abstract

Monitoring safe water access in developing countries relies primarily on household health survey and census data. These surveys are often incomplete: they tend to focus on the primary water source only, are spatially coarse, and usually happen every 5-10 years, during which significant changes can happen in urbanisation and infrastructure provision, especially in sub-Saharan Africa. In this work, we present a data-driven approach that utilises and complements survey-based data on water access to provide context-specific and disaggregated monitoring. The level of access to improved water and sanitation has been shown to vary with geographical inequalities related to the availability of water resources and terrain, population density and socio-economic determinants such as income and education. We use such data and successfully predict the level of water access in areas for which data is lacking, providing spatially explicit and community level monitoring possibilities for mapping geographical inequalities in access. This is showcased by applying three machine learning models that use such geographical data to predict the number of presences of water access points of eight different access types across Uganda, with a 1 km by 1 km grid resolution. Two Multi-Layer Perceptron (MLP) models and a Maximum Entropy (MaxEnt) model are developed and compared, where the former are shown to consistently outperform the latter. The best performing Neural Network model achieved a True Positive Rate of 0.89 and a False Positive Rate of 0.24, compared to 0.85 and 0.46 respectively for the MaxEnt model. The models improve on previous work on water point modeling through the use of neural networks, in addition to introducing the True Positive Rate and False Positive Rate as better evaluation metrics that can also be used to assess the MaxEnt model. We also present a scaling method to move from predicting only the relative probability of water point presences to predicting the absolute number of presences.
To challenge both the model results and the more standard health surveys, a new household level survey is carried out in Bushenyi, a mid-sized town in the South-West of Uganda, asking specifically about the multitude of water sources. On average, Bushenyi households reported using 1.9 water sources. The survey further showed that the actual presence of a source does not always imply that it is used. Therefore, relying solely on models for water access monitoring is not an option. Household surveys remain necessary for this purpose but should be extended with questions on the multiple sources that are used by households.
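
Because the abstract reports true and false positive rates, it may help to see how those two rates translate into precision, which also depends on how common presences are. The sketch below applies Bayes' rule to the reported rates. The prevalence values are hypothetical and do not come from the study, and `precision_from_rates` is an illustrative helper name.

```python
def precision_from_rates(tpr, fpr, prevalence):
    """Precision implied by a true positive rate, a false positive rate,
    and the fraction of cells that are true presences (prevalence)."""
    true_pos = tpr * prevalence            # expected share of true positives
    false_pos = fpr * (1.0 - prevalence)   # expected share of false positives
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged > 0 else float("nan")


# Rates reported in the abstract; prevalence values are assumptions.
reported_rates = {"best neural network": (0.89, 0.24), "MaxEnt": (0.85, 0.46)}
for model_name, (tpr, fpr) in reported_rates.items():
    for prevalence in (0.5, 0.1, 0.01):
        precision = precision_from_rates(tpr, fpr, prevalence)
        print(f"{model_name}: prevalence={prevalence:.2f} -> precision={precision:.3f}")
```

For fixed rates, the implied precision falls as presences become rarer, which is the situation in which precision-recall summaries are usually considered alongside rate-based metrics.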

Oh SS, Kuang I, Jeong H, Song JY, Ren B, Moon JY, Park EC, Kawachi I. Predicting Fetal Alcohol Spectrum Disorders Using Machine Learning Techniques: Multisite Retrospective Cohort Study. J Med Internet Res 2023;25:e45041. [PMID: 37463016 PMCID: PMC10394506 DOI: 10.2196/45041] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/13/2022] [Revised: 05/22/2023] [Accepted: 06/18/2023] [Indexed: 07/21/2023]

Abstract

BACKGROUND

Fetal alcohol syndrome (FAS) is a lifelong developmental disability that occurs among individuals with prenatal alcohol exposure (PAE). With improved prediction models, FAS can be diagnosed or treated early, if not completely prevented.

OBJECTIVE

In this study, we sought to compare different machine learning algorithms and their FAS predictive performance among women who consumed alcohol during pregnancy. We also aimed to identify which variables (eg, timing of exposure to alcohol during pregnancy and type of alcohol consumed) were most influential in generating an accurate model.

METHODS

Data from the collaborative initiative on fetal alcohol spectrum disorders from 2007 to 2017 were used to gather information about 595 women who consumed alcohol during pregnancy at 5 hospital sites around the United States. To obtain information about PAE, questionnaires or in-person interviews, as well as reviews of medical, legal, or social service records were used to gather information about alcohol consumption. Four different machine learning algorithms (logistic regression, XGBoost, light gradient-boosting machine, and CatBoost) were trained to predict the prevalence of FAS at birth, and model performance was measured by analyzing the area under the receiver operating characteristics curve (AUROC). Of the total cases, 80% were randomly selected for training, while 20% remained as test data sets for predicting FAS. Feature importance was also analyzed using Shapley values for the best-performing algorithm.

RESULTS

Overall, there were 20 cases of FAS within a total population of 595 individuals with PAE. Most of the drinking occurred in the first trimester only (n=491) or throughout all 3 trimesters (n=95); however, there were also reports of drinking in the first and second trimesters only (n=8), and 1 case of drinking in the third trimester only (n=1). The CatBoost method delivered the best performance in terms of AUROC (0.92) and area under the precision-recall curve (AUPRC 0.51), followed by the logistic regression method (AUROC 0.90; AUPRC 0.59), the light gradient-boosting machine (AUROC 0.89; AUPRC 0.52), and XGBoost (AUROC 0.86; AUPRC 0.45). Shapley values in the CatBoost model revealed that 12 variables were considered important in FAS prediction, with drinking throughout all 3 trimesters of pregnancy, maternal age, race, and type of alcoholic beverage consumed (eg, beer, wine, or liquor) scoring highly in overall feature importance. For most predictive measures, the best performance was obtained by the CatBoost algorithm, with an AUROC of 0.92, precision of 0.50, specificity of 0.29, F1 score of 0.29, and accuracy of 0.96.
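
With 20 cases among 595 individuals, the reported AUPRC values are easier to read against the chance-level baseline, which for a classifier that ranks at random is approximately the prevalence of the positive class. The following sketch computes that baseline and the ratio of each reported AUPRC to it, and includes a small average-precision function (one common estimator of AUPRC, not necessarily the one the authors used). It assumes the test-set prevalence matches the overall 20/595, which the abstract does not state. All function and variable names are illustrative.

```python
def average_precision(labels, scores):
    """Average precision: mean of the precision values observed at the rank
    of each positive example. Tied scores keep their input order."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else float("nan")


# Chance-level AUPRC is roughly the positive-class prevalence (about 0.034).
prevalence = 20 / 595

# AUPRC values as reported in the abstract.
reported_auprc = {
    "CatBoost": 0.51,
    "logistic regression": 0.59,
    "light gradient-boosting machine": 0.52,
    "XGBoost": 0.45,
}
lift = {name: value / prevalence for name, value in reported_auprc.items()}

print(f"baseline AUPRC ~ {prevalence:.3f}")
for name, ratio in lift.items():
    print(f"{name}: AUPRC is {ratio:.1f} times the baseline")

# Small check of the estimator on four hypothetical predictions.
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Under that assumption, every reported AUPRC sits well above the chance level even though the absolute values look modest next to the AUROC figures.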

CONCLUSIONS

Machine learning algorithms were able to identify FAS risk with a prediction performance higher than that of previous models among pregnant drinkers. For small training sets, which are common with FAS, boosting mechanisms like CatBoost may help alleviate certain problems associated with data imbalances and difficulties in optimization or generalization.

Stillman AN, Wilkerson RL, Kaschube DR, Siegel RB, Sawyer SC, Tingley MW. Incorporating pyrodiversity into wildlife habitat assessments for rapid post-fire management: A woodpecker case study. Ecol Appl 2023;33:e2853. [PMID: 36995347 DOI: 10.1002/eap.2853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 03/15/2023] [Accepted: 03/16/2023] [Indexed: 06/02/2023]

Lin HD, Lee TH, Lin CH, Wu HC. Optical Imaging Deformation Inspection and Quality Level Determination of Multifocal Glasses. SENSORS (BASEL, SWITZERLAND) 2023;23:s23094497. [PMID: 37177700 PMCID: PMC10181736 DOI: 10.3390/s23094497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/20/2023] [Revised: 05/01/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023]

Abstract

Multifocal glasses are a new type of lens that can fit both nearsighted and farsighted vision on the same lens. This property allows the glass to have various curvatures in distinct regions within the glass during the grinding process. However, when the curvature varies irregularly, the glass is prone to optical deformation during imaging. Most of the previous studies on imaging deformation focus on the deformation correction of optical lenses. Consequently, this research uses an automatic deformation defect detection system for multifocal glasses to replace professional assessors. To quantify the grade of deformation of curved multifocal glasses, we first digitally imaged a pattern of concentric circles through a test glass to generate an imaged image of the glass. Second, we preprocess the image to enhance the clarity of the concentric circles' appearance. A centroid-radius model is used to represent the form variation properties of every circle in the processed image. Third, the deviation of the centroid radius for detecting deformation defects is found by a slight deviation control scheme, and we gain a difference image indicating the detected deformed regions after comparing it with the norm pattern. Fourth, based on the deformation measure and occurrence location of multifocal glasses, we build fuzzy membership functions and inference regulations to quantify the deformation's severity. Finally, a mixed model incorporating a network-based fuzzy inference and a genetic algorithm is applied to determine a quality grade for the deformation severity of detected defects. Testing outcomes show that the proposed methods attain a 94% accuracy rate of the quality levels for deformation severity, an 81% recall rate of deformation defects, and an 11% false positive rate for multifocal glass detection. 
This research contributes solutions to the problems of imaging deformation inspection and provides computer-aided systems for determining quality levels that meet the demands of inspection and quality control.

Ghorbanali Z, Zare-Mirakabad F, Akbari M, Salehi N, Masoudi-Nejad A. DrugRep-KG: Toward Learning a Unified Latent Space for Drug Repurposing Using Knowledge Graphs. J Chem Inf Model 2023;63:2532-2545. [PMID: 37023229 PMCID: PMC10109243 DOI: 10.1021/acs.jcim.2c01291] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/15/2022] [Indexed: 04/08/2023]

Abstract

Drug repurposing or repositioning (DR) refers to finding new therapeutic applications for existing drugs. Current computational DR methods face data representation and negative data sampling challenges. Although retrospective studies attempt to operate various representations, it is a crucial step for an accurate prediction to aggregate these features and bring the associations between drugs and diseases into a unified latent space. In addition, the number of unknown associations between drugs and diseases, which is considered negative data, is much higher than the number of known associations, or positive data, leading to an imbalanced dataset. In this regard, we propose the DrugRep-KG method, which applies a knowledge graph embedding approach for representing drugs and diseases, to address these challenges. Despite the typical DR methods that consider all unknown drug-disease associations as negative data, we select a subset of unknown associations, provided the disease occurs because of an adverse reaction to a drug. DrugRep-KG has been evaluated based on different settings and achieves an AUC-ROC (area under the receiver operating characteristic curve) of 90.83% and an AUC-PR (area under the precision-recall curve) of 90.10%, which are higher than in previous works. Besides, we checked the performance of our framework in finding potential drugs for coronavirus infection and skin-related diseases: contact dermatitis and atopic eczema. DrugRep-KG predicted beclomethasone for contact dermatitis, and fluorometholone, clocortolone, fluocinonide, and beclomethasone for atopic eczema, all of which have previously been proven to be effective in other studies. Fluorometholone for contact dermatitis is a novel suggestion by DrugRep-KG that should be validated experimentally. DrugRep-KG also predicted the associations between COVID-19 and potential treatments suggested by DrugBank, in addition to new drug candidates provided with experimental evidence. 
The data and code underlying this article are available at https://github.com/CBRC-lab/DrugRep-KG.

Silagyi DV, Liu D. Prediction of severity of aviation landing accidents using support vector machine models. ACCIDENT; ANALYSIS AND PREVENTION 2023;187:107043. [PMID: 37086512 DOI: 10.1016/j.aap.2023.107043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 12/29/2022] [Accepted: 03/23/2023] [Indexed: 05/03/2023]

Chang S, Wilkho RS, Gharaibeh N, Sansom G, Meyer M, Olivera F, Zou L. Environmental, climatic, and situational factors influencing the probability of fatality or injury occurrence in flash flooding: a rare event logistic regression predictive model. NATURAL HAZARDS (DORDRECHT, NETHERLANDS) 2023;116:3957-3978. [PMID: 37974652 PMCID: PMC10653003 DOI: 10.1007/s11069-023-05845-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 01/30/2023] [Indexed: 11/19/2023]

Murch WS, Kairouz S, Dauphinais S, Picard E, Costes JM, French M. Using machine learning to retrospectively predict self-reported gambling problems in Quebec. Addiction 2023. [PMID: 36880253 DOI: 10.1111/add.16179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Accepted: 02/15/2023] [Indexed: 03/08/2023]

Abstract

BACKGROUND AND AIMS

Participating in online gambling is associated with an increased risk for experiencing gambling-related harms, driving calls for more effective, personalized harm prevention initiatives. Such initiatives depend on the development of models capable of detecting at-risk online gamblers. We aimed to determine whether machine learning algorithms can use site data to detect retrospectively at-risk online gamblers indicated by the Problem Gambling Severity Index (PGSI).

DESIGN

Exploratory comparison of six prominent supervised machine learning methods (decision trees, random forests, K-nearest neighbours, logistic regressions, artificial neural networks and support vector machines) to predict problem gambling risk levels reported on the PGSI.

SETTING

Lotoquebec.com (formerly espacejeux.com), an online gambling platform operated by Loto-Québec (a provincial Crown Corporation) in Quebec, Canada.

PARTICIPANTS

N = 9145 adults (18+) who completed the survey measure and placed at least one bet using real money on the site.

MEASUREMENTS

Participants completed the PGSI, a self-report questionnaire with validated cut-offs denoting a moderate-to-high-risk (PGSI 5+) or high-risk (PGSI 8+) for experiencing past-year gambling-related problems. Participants agreed to release additional data about the preceding 12 months from their user accounts. Predictor variables (144) were derived from users' transactions, apparent betting behaviours, listed demographics and use of responsible gambling tools on the platform.

FINDINGS

Our best classification models (random forests) for the PGSI 5+ and 8+ outcome variables accounted for 84.33% (95% CI = 82.24-86.41) and 82.52% (95% CI = 79.96-85.08) of the total area under their receiver operating characteristic curves, respectively. The most important factors in these models included the frequency and variability of participants' betting behaviour and repeat engagement on the site.

CONCLUSIONS

Machine learning algorithms appear to be able to classify at-risk online gamblers using data generated from their use of online gambling platforms. They may enable personalized harm prevention initiatives, but are constrained by trade-offs between their sensitivity and precision.
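
The trade-off between sensitivity and precision mentioned above can be sketched with a threshold sweep. The example below uses hypothetical toy labels and scores (not data from the study) and hypothetical function names; it computes precision and recall at each cutoff and a step-wise area under the resulting precision-recall curve.

```python
def precision_recall_points(labels, scores):
    """Return (threshold, precision, recall) for each distinct score used as a cutoff."""
    total_positive = sum(labels)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        flagged = [score >= threshold for score in scores]
        tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
        fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
        # tp + fp >= 1 because the threshold is itself one of the scores.
        points.append((threshold, tp / (tp + fp), tp / total_positive))
    return points


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve: sum of recall gain times precision."""
    area, previous_recall = 0.0, 0.0
    for _, precision, recall in precision_recall_points(labels, scores):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


# Hypothetical toy data: 3 at-risk players (label 1) among 10.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1]

for threshold, precision, recall in precision_recall_points(toy_labels, toy_scores):
    print(f"cutoff {threshold:.2f}: precision {precision:.2f}, recall {recall:.2f}")
print(f"average precision: {average_precision(toy_labels, toy_scores):.3f}")
```

In this toy example, lowering the cutoff raises recall from 1/3 to 1 while precision falls toward the prevalence (0.3), which is the kind of constraint the conclusion describes.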

Eysenbach G, Chao HJ, Chiang YC, Chen HY. Explainable Machine Learning Techniques To Predict Amiodarone-Induced Thyroid Dysfunction Risk: Multicenter, Retrospective Study With External Validation. J Med Internet Res 2023;25:e43734. [PMID: 36749620 PMCID: PMC9944157 DOI: 10.2196/43734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 12/25/2022] [Accepted: 01/16/2023] [Indexed: 02/08/2023] Open

Abstract

BACKGROUND

Machine learning offers new solutions for predicting life-threatening, unpredictable amiodarone-induced thyroid dysfunction. Traditional regression approaches for adverse-effect prediction without time-series consideration of features have yielded suboptimal predictions. Machine learning algorithms with multiple data sets at different time points may generate better performance in predicting adverse effects.

OBJECTIVE

We aimed to develop and validate machine learning models for forecasting individualized amiodarone-induced thyroid dysfunction risk and to optimize a machine learning-based risk stratification scheme with a resampling method and readjustment of the clinically derived decision thresholds.

METHODS

This study developed machine learning models using multicenter, delinked electronic health records. It included patients receiving amiodarone from January 2013 to December 2017. The training set was composed of data from Taipei Medical University Hospital and Wan Fang Hospital, while data from Taipei Medical University Shuang Ho Hospital were used as the external test set. The study collected stationary features at baseline and dynamic features at the first, second, third, sixth, ninth, 12th, 15th, 18th, and 21st months after amiodarone initiation. We used 16 machine learning models, including extreme gradient boosting, adaptive boosting, k-nearest neighbor, and logistic regression models, along with an original resampling method and 3 other resampling methods, including oversampling with the borderline-synthesized minority oversampling technique, undersampling-edited nearest neighbor, and over- and undersampling hybrid methods. The model performance was compared based on accuracy, precision, recall, F₁-score, geometric mean, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). Feature importance was determined by the best model. The decision threshold was readjusted to identify the best cutoff value and a Kaplan-Meier survival analysis was performed.

RESULTS

The training set contained 4075 patients from Taipei Medical University Hospital and Wan Fang Hospital, of whom 583 (14.3%) developed amiodarone-induced thyroid dysfunction, while the external test set included 2422 patients from Taipei Medical University Shuang Ho Hospital, of whom 275 (11.4%) developed amiodarone-induced thyroid dysfunction. The extreme gradient boosting oversampling machine learning model demonstrated the best predictive outcomes among all 16 models. The accuracy, precision, recall, F₁-score, G-mean, AUPRC, and AUROC were 0.923, 0.632, 0.756, 0.688, 0.845, 0.751, and 0.934, respectively. After readjusting the cutoff, the best value was 0.627, and the F₁-score reached 0.699. The best threshold was able to classify 286 of 2422 patients (11.8%) as high-risk subjects, among which 275 were true-positive patients in the testing set. A shorter treatment duration; higher levels of thyroid-stimulating hormone and high-density lipoprotein cholesterol; and lower levels of free thyroxin, alkaline phosphatase, and low-density lipoprotein were the most important features.

CONCLUSIONS

Machine learning models combined with resampling methods can predict amiodarone-induced thyroid dysfunction and serve as a support tool for individualized risk prediction and clinical decision support.
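
The metrics reported in this abstract are linked by simple formulas. The sketch below, with hypothetical helper names, recomputes the F₁-score as the harmonic mean of the reported precision and recall and re-derives the reported event percentages from the patient counts. It is only a consistency check on the published numbers, not the authors' code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100 * part / whole


# Values reported for the best model before the cutoff was readjusted.
reported_precision, reported_recall = 0.632, 0.756

f1 = f1_score(reported_precision, reported_recall)
train_event_rate = percent(583, 4075)  # training set
test_event_rate = percent(275, 2422)   # external test set

print(f"F1 from reported precision and recall: {f1:.3f}")
print(f"event rate, training set: {train_event_rate:.1f}%")
print(f"event rate, external test set: {test_event_rate:.1f}%")
```

The recomputed F₁-score (0.688) matches the reported value. The test-set event rate of about 11.4% is also the precision expected from a random ranking, which gives a reference point for the reported AUPRC of 0.751.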

National wetland mapping using remote-sensing-derived environmental variables, archive field data, and artificial intelligence. Heliyon 2023;9:e13482. [PMID: 36816231 PMCID: PMC9929292 DOI: 10.1016/j.heliyon.2023.e13482] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/22/2022] [Revised: 01/31/2023] [Accepted: 02/01/2023] [Indexed: 02/08/2023] Open

Predicting wind-driven spatial deposition through simulated color images using deep autoencoders. Sci Rep 2023;13:1394. [PMID: 36697487 PMCID: PMC9876895 DOI: 10.1038/s41598-023-28590-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 01/20/2023] [Indexed: 01/27/2023] Open

Fiorentino MC, Villani FP, Di Cosmo M, Frontoni E, Moccia S. A review on deep-learning algorithms for fetal ultrasound-image analysis. Med Image Anal 2023;83:102629. [PMID: 36308861 DOI: 10.1016/j.media.2022.102629] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Received: 01/27/2022] [Revised: 07/12/2022] [Accepted: 09/10/2022] [Indexed: 11/07/2022]

Kougioumoutzis K, Trigas P, Tsakiri M, Kokkoris IP, Koumoutsou E, Dimopoulos P, Tzanoudakis D, Iatrou G, Panitsa M. Climate and Land-Cover Change Impacts and Extinction Risk Assessment of Rare and Threatened Endemic Taxa of Chelmos-Vouraikos National Park (Peloponnese, Greece). PLANTS (BASEL, SWITZERLAND) 2022;11:3548. [PMID: 36559660 PMCID: PMC9784511 DOI: 10.3390/plants11243548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2022] [Revised: 12/04/2022] [Accepted: 12/10/2022] [Indexed: 06/17/2023]

Zhi X, Du H, Zhang M, Long Z, Zhong L, Sun X. Mapping the habitat for the moose population in Northeast China by combining remote sensing products and random forests. Glob Ecol Conserv 2022. [DOI: 10.1016/j.gecco.2022.e02347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022] Open

Langeland E, Johnsen IF, Sømme KK, Morken AM, Erevik EK, Kolberg E, Jonsson J, Mentzoni RA, Pallesen S. One size does not fit all. Should gambling loss limits be based on income? Front Psychiatry 2022;13:1005172. [PMID: 36465287 PMCID: PMC9709812 DOI: 10.3389/fpsyt.2022.1005172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 10/05/2022] [Indexed: 11/17/2022] Open

Levy TJ, Coppa K, Cang J, Barnaby DP, Paradis MD, Cohen SL, Makhnevich A, van Klaveren D, Kent DM, Davidson KW, Hirsch JS, Zanos TP. Development and validation of self-monitoring auto-updating prognostic models of survival for hospitalized COVID-19 patients. Nat Commun 2022;13:6812. [PMID: 36357420 PMCID: PMC9648888 DOI: 10.1038/s41467-022-34646-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/08/2022] [Accepted: 11/02/2022] [Indexed: 11/12/2022] Open

Affiliation(s)

Todd J Levy: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA
Kevin Coppa: Clinical Digital Solutions, Northwell Health, New Hyde Park, NY, 11042, USA
Jinxuan Cang: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA
Douglas P Barnaby: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Northwell Health, Hempstead, NY, 11549, USA
Marc D Paradis: Northwell Holdings, Northwell Health, Manhasset, NY, 11030, USA
Stuart L Cohen: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Northwell Health, Hempstead, NY, 11549, USA
Alex Makhnevich: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Northwell Health, Hempstead, NY, 11549, USA
David van Klaveren: Department of Public Health, Erasmus MC University Medical Center, Rotterdam, Netherlands; Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA, USA
David M Kent: Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA, USA
Karina W Davidson: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Northwell Health, Hempstead, NY, 11549, USA
Jamie S Hirsch: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Clinical Digital Solutions, Northwell Health, New Hyde Park, NY, 11042, USA; Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Northwell Health, Hempstead, NY, 11549, USA
Theodoros P Zanos: Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, 11030, USA; Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Northwell Health, Hempstead, NY, 11549, USA

Lotterhos KE, Fitzpatrick MC, Blackmon H. Simulation Tests of Methods in Evolution, Ecology, and Systematics: Pitfalls, Progress, and Principles. ANNUAL REVIEW OF ECOLOGY, EVOLUTION, AND SYSTEMATICS 2022;53:113-136. [PMID: 38107485 PMCID: PMC10723108 DOI: 10.1146/annurev-ecolsys-102320-093722] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/19/2023]

Hysen L, Nayeri D, Cushman S, Wan HY. Background sampling for multi-scale ensemble habitat selection modeling: Does the number of points matter? ECOL INFORM 2022. [DOI: 10.1016/j.ecoinf.2022.101914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]

Cicuttin A, Morales IR, Crespo ML, Carrato S, García LG, Molina RS, Valinoti B, Folla Kamdem J. A Simplified Correlation Index for Fast Real-Time Pulse Shape Recognition. SENSORS (BASEL, SWITZERLAND) 2022;22:7697. [PMID: 36298048 PMCID: PMC9607046 DOI: 10.3390/s22207697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Revised: 10/01/2022] [Accepted: 10/04/2022] [Indexed: 06/16/2023]

Jarnevich CS, Sofaer HR, Belamaric P, Engelstad P. Regional models do not outperform continental models for invasive species. NEOBIOTA 2022. [DOI: 10.3897/neobiota.77.86364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]

Abstract

AIM

Species distribution models can guide invasive species prevention and management by characterizing invasion risk across space. However, extrapolation and transferability issues pose challenges for developing useful models for invasive species. Previous work has emphasized the importance of including all available occurrences in model estimation, but managers attuned to local processes may be skeptical of models based on a broad spatial extent if they suspect the captured responses reflect those of other regions where data are more numerous. We asked whether species distribution models for invasive plants performed better when developed at national versus regional extents.

LOCATION

Continental United States.

METHODS

We developed ensembles of species distribution models trained nationally, on sagebrush habitat, or on sagebrush habitat within three ecoregions (Great Basin, eastern sagebrush, and Great Plains) for nine invasive plants of interest for early detection and rapid response at local or regional scales. We compared the performance of national versus regional models using spatially independent withheld test data from each of the three ecoregions.

RESULTS

We found that models trained using a national spatial extent tended to perform better than regionally trained models. Regional models did not outperform national ones even when considerable occurrence data were available for model estimation within the focal region. Information was often unavailable to fit informative regional models precisely in those areas of greatest interest for early detection and rapid response.

MAIN CONCLUSIONS

Habitat suitability models for invasive plant species trained at a continental extent can reduce extrapolation while maximizing information on species’ responses to environmental variation. Standard modeling methods can capture spatially varying limiting factors, while regional or hierarchical models may only be advantageous when populations differ in their responses to environmental conditions, a condition expected to be relatively rare at the expanding boundaries of invasive species’ distributions.

Perrot B, Hardouin JB, Thiabaud E, Saillard A, Grall-Bronnec M, Challet-Bouju G. Development and validation of a prediction model for online gambling problems based on players' account data. J Behav Addict 2022;11:874-889. [PMID: 36125924 PMCID: PMC9872531 DOI: 10.1556/2006.2022.00063] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/05/2021] [Revised: 03/03/2022] [Accepted: 08/13/2022] [Indexed: 02/03/2023] Open

Saak S, Huelsmeier D, Kollmeier B, Buhl M. A flexible data-driven audiological patient stratification method for deriving auditory profiles. Front Neurol 2022;13:959582. [PMID: 36188360 PMCID: PMC9520582 DOI: 10.3389/fneur.2022.959582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 08/11/2022] [Indexed: 11/13/2022] Open

Abstract

For characterizing the complexity of hearing deficits, it is important to consider different aspects of auditory functioning in addition to the audiogram. For this purpose, extensive test batteries have been developed aiming to cover all relevant aspects as defined by experts or model assumptions. However, as the assessment time of physicians is limited, such test batteries are often not used in clinical practice. Instead, fewer measures are used, which vary across clinics. This study aimed at proposing a flexible data-driven approach for characterizing distinct patient groups (patient stratification into auditory profiles) based on one prototypical database (N = 595) containing audiogram data, loudness scaling, speech tests, and anamnesis questions. To further maintain the applicability of the auditory profiles in clinical routine, we built random forest classification models based on a reduced set of audiological measures which are often available in clinics. Different parameterizations regarding binarization strategy, cross-validation procedure, and evaluation metric were compared to determine the optimum classification model. Our data-driven approach, involving model-based clustering, resulted in a set of 13 patient groups, which serve as auditory profiles. The 13 auditory profiles separate patients within certain ranges across audiological measures and are audiologically plausible. Both a normal hearing profile and profiles with varying extents of hearing impairments are defined. Further, a random forest classification model with a combination of a one-vs.-all and one-vs.-one binarization strategy, 10-fold cross-validation, and the kappa evaluation metric was determined as the optimal model. With the selected model, patients can be classified into 12 of the 13 auditory profiles with adequate precision (mean across profiles = 0.9) and sensitivity (mean across profiles = 0.84). The proposed approach, consequently, allows the generation of audiologically plausible and interpretable, data-driven clinical auditory profiles, providing an efficient way of characterizing hearing deficits, while maintaining clinical applicability. The method should by design be applicable to all audiological data sets from clinics or research, and in addition be flexible to summarize information across databases by means of profiles, as well as to expand the approach toward aided measurements, fitting parameters, and further information from databases.

Naing KM, Boonsang S, Chuwongin S, Kittichai V, Tongloy T, Prommongkol S, Dekumyoy P, Watthanakulpanich D. Automatic recognition of parasitic products in stool examination using object detection approach. PeerJ Comput Sci 2022;8:e1065. [PMID: 36092001 PMCID: PMC9455271 DOI: 10.7717/peerj-cs.1065] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/25/2022] [Accepted: 07/19/2022] [Indexed: 06/15/2023]

Abstract

BACKGROUND

Object detection is a new artificial intelligence approach to morphological recognition and labeling of parasitic pathogens. Due to the lack of equipment and trained personnel, artificial intelligence innovation for searching various parasitic products in stool examination will enable patients in remote areas of undeveloped countries to access diagnostic services. Because object detection is a developing approach that has been tested for its effectiveness in detecting intestinal parasitic objects such as protozoan cysts and helminthic eggs, it is suitable for use in rural areas where many factors supporting laboratory testing are still lacking. According to the literature, YOLOv4-Tiny produces faster results and uses less memory with the support of low-end GPU devices. In comparison to the YOLOv3 and YOLOv3-Tiny models, this study aimed to propose an automated object detection approach, specifically the YOLOv4-Tiny model, for automatic recognition of intestinal parasitic products in stools.

METHODS

To identify protozoan cysts and helminthic eggs in human feces, three YOLO approaches (YOLOv4-Tiny, YOLOv3, and YOLOv3-Tiny) were trained to recognize 34 intestinal parasitic classes using an image training dataset. Feces were processed using a modified direct smear method adapted from the simple direct smear and the modified Kato-Katz methods. The image dataset was collected from intestinal parasitic objects discovered during stool examination and the three YOLO models were trained to recognize the image datasets.

RESULTS

The non-maximum suppression technique and the threshold level were used to analyze the test dataset, yielding results of 96.25% precision and 95.08% sensitivity for YOLOv4-Tiny. Additionally, the YOLOv4-Tiny model had the best AUPRC performance of the three YOLO models, with a score of 0.963.

CONCLUSION

This study, to our knowledge, was the first to detect protozoan cysts and helminthic eggs in the 34 classes of intestinal parasitic objects in human stools.

Shao M, Fan J, Ma J, Wang L. Identifying the natural reserve area of Cistanche salsa under the effects of multiple host plants and climate change conditions using a maximum entropy model in Xinjiang, China. FRONTIERS IN PLANT SCIENCE 2022;13:934959. [PMID: 36061800 PMCID: PMC9432852 DOI: 10.3389/fpls.2022.934959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 07/21/2022] [Indexed: 06/15/2023]

Abstract

Cistanche salsa (C. A. Mey.) G. Beck, a holoparasitic desert medicine plant with multiple hosts, is regarded as a potential future desert economic plant. However, as a result of excessive exploitation and poaching, its wild resources have become scarce. Thus, before developing its desert economic value, this plant has to be protected, and the identification of its natural reserve is currently the top priority. However, in previous nature reserve prediction studies, the influence of host plants has been overlooked, particularly in holoparasitic plants with multiple hosts. In this study, we sought to identify the conservation areas of wild C. salsa by considering multiple host-plant interactions and climate change conditions using the MaxEnt model. Additionally, a Principal Component Analysis (PCA) was used to reduce the autocorrelation between environmental variables. The effects of the natural distribution of the host plants in terms of natural distribution from the perspective of niche similarities and extrapolation detection were considered by filtering the most influential hosts: Krascheninnikovia ceratoides (Linnaeus), Gueldenstaedt, and Nitraria sibirica Pall. Additionally, the change trends in these hosts based on climate change conditions combined with the change trends in C. salsa were used to identify a core protection area of 126483.5 km². In this article, we corrected and tried to avoid some of the common mistakes found in species distribution models based on the findings of previous research and fully considered the effects of host plants for multiple-host holoparasitic plants to provide a new perspective on the prediction of holoparasitic plants and to provide scientific zoning for biodiversity conservation in desert ecosystems. This research will hopefully serve as a significant reference for decision-makers.

Wade MW, Fisher M, Matich P. Comparison of two machine learning frameworks for predicting aggregatory behavior of sharks. J Appl Ecol 2022. [DOI: 10.1111/1365-2664.14273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]

Qiu L, Chen J, Fan L, Sun L, Zheng C. High-resolution mapping of wildfire drivers in California based on machine learning. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022;833:155155. [PMID: 35413339 DOI: 10.1016/j.scitotenv.2022.155155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Revised: 03/31/2022] [Accepted: 04/06/2022] [Indexed: 06/14/2023]

Quantifying congestion with player tracking data in Australian football. PLoS One 2022;17:e0272657. [PMID: 35939497 PMCID: PMC9359552 DOI: 10.1371/journal.pone.0272657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 07/24/2022] [Indexed: 12/04/2022] Open

Tomal JH, Welch WJ, Zamar RH. Robust ranking by ensembling of diverse models and assessment metrics. J STAT COMPUT SIM 2022. [DOI: 10.1080/00949655.2022.2093873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]