
For: Basu S, Kumbier K, Brown JB, Yu B. Iterative random forests to discover predictive and stable high-order interactions. Proc Natl Acad Sci U S A 2018;115:1943-8. [PMID: 29351989 DOI: 10.1073/pnas.1711236115] [Citation(s) in RCA: 112] [Impact Index Per Article: 18.7] [Indexed: 11/23/2022] Open



Cited by Other Article(s)

Zhang F, Gou J. Machine learning assessment of risk factors for depression in later adulthood. THE LANCET REGIONAL HEALTH. EUROPE 2022;18:100399. [PMID: 35586270 PMCID: PMC9109181 DOI: 10.1016/j.lanepe.2022.100399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Wang Z, Niu Y, Vashisth T, Li J, Madden R, Livingston TS, Wang Y. Nontargeted metabolomics-based multiple machine learning modeling boosts early accurate detection for citrus Huanglongbing. HORTICULTURE RESEARCH 2022;9:uhac145. [PMID: 36061619 PMCID: PMC9433982 DOI: 10.1093/hr/uhac145] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/06/2022] [Accepted: 06/20/2022] [Indexed: 06/15/2023]

Walker AM, Cliff A, Romero J, Shah MB, Jones P, Felipe Machado Gazolla JG, Jacobson DA, Kainer D. Evaluating the Performance of Random Forest and Iterative Random Forest Based Methods when Applied to Gene Expression Data. Comput Struct Biotechnol J 2022;20:3372-3386. [PMID: 35832622 PMCID: PMC9260260 DOI: 10.1016/j.csbj.2022.06.037] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/04/2022] [Revised: 06/14/2022] [Accepted: 06/14/2022] [Indexed: 11/30/2022] Open

Provable Boolean interaction recovery from tree ensemble obtained via random forests. Proc Natl Acad Sci U S A 2022;119:e2118636119. [PMID: 35609192 PMCID: PMC9295780 DOI: 10.1073/pnas.2118636119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022] Open

Sadique Z, Grieve R, Diaz-Ordaz K, Mouncey P, Lamontagne F, O’Neill S. A Machine-Learning Approach for Estimating Subgroup- and Individual-Level Treatment Effects: An Illustration Using the 65 Trial. Med Decis Making 2022;42:923-936. [PMID: 35607982 PMCID: PMC9459357 DOI: 10.1177/0272989x221100717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

You X, Dadwal UC, Lenburg ME, Kacena MA, Charles JF. Murine Gut Microbiome Meta-analysis Reveals Alterations in Carbohydrate Metabolism in Response to Aging. mSystems 2022;7:e0124821. [PMID: 35400171 PMCID: PMC9040766 DOI: 10.1128/msystems.01248-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Accepted: 03/28/2022] [Indexed: 11/23/2022] Open

Abstract

Compositional and functional alterations to the gut microbiota during aging are hypothesized to potentially impact our health. Thus, determining aging-specific gut microbiome alterations is critical for developing microbiome-based strategies to improve health and promote longevity in the elderly. In this study, we performed a meta-analysis of publicly available 16S rRNA gene sequencing data from studies investigating the effect of aging on the gut microbiome in mice. Aging reproducibly increased gut microbial alpha diversity and shifted the microbial community structure in mice. We applied the bioinformatic tool PICRUSt2 to predict microbial metagenome function and established a random forest classifier to differentiate between microbial communities from young and old hosts and to identify aging-specific metabolic features. In independent validation data sets, this classifier achieved an area under the receiver operating characteristic curve (AUC) of 0.75 to 0.97 in differentiating microbiomes from young and old hosts. We found that 50% of the most important predicted aging-specific metabolic features were involved in carbohydrate metabolism. Furthermore, fecal short-chain fatty acid (SCFA) concentrations were significantly decreased in old mice, and the expression of the SCFA receptor Gpr41 in the colon was significantly correlated with the relative abundances of gut microbes and microbial carbohydrate metabolic pathways. In conclusion, this study identified aging-specific alterations in the composition and function of the gut microbiome and revealed a potential relationship between aging, microbial carbohydrate metabolism, fecal SCFA, and colonic Gpr41 expression.
IMPORTANCE Aging-associated microbial alteration is hypothesized to play an important role in host health and longevity. However, investigations regarding specific gut microbes or microbial functional alterations associated with aging have had inconsistent results. We performed a meta-analysis across 5 independent studies to investigate the effect of aging on the gut microbiome in mice. Our analysis revealed that aging increased gut microbial alpha diversity and shifted the microbial community structure. To determine if we could reliably differentiate the gut microbiomes from young and old hosts, we established a random forest classifier based on predicted metagenome function and validated its performance against independent data sets. Alterations in microbial carbohydrate metabolism and decreased fecal short-chain fatty acid (SCFA) concentrations were key features of aging and correlated with host colonic expression of the SCFA receptor Gpr41. This study advances our understanding of the impact of aging on the gut microbiome and proposes a hypothesis that alterations in gut microbiota-derived SCFA-host GPR41 signaling are a feature of aging.
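As a reading aid for the AUC range quoted in this abstract, the following is a minimal sketch of how an AUC can be computed from classifier scores. It is not the authors' pipeline: the function name `roc_auc`, the labels, and the scores are all hypothetical, and the calculation uses the standard pairwise definition (the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half).

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = sample from an old host, 0 = sample from a young host.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy data, 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.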


Machine Learning algorithm unveils glutamatergic alterations in the post-mortem schizophrenia brain. NPJ SCHIZOPHRENIA 2022;8:8. [PMID: 35217646 PMCID: PMC8881508 DOI: 10.1038/s41537-022-00231-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 07/05/2021] [Accepted: 12/06/2021] [Indexed: 01/24/2023]

Abstract

Schizophrenia is a disorder of synaptic plasticity and aberrant connectivity in which a major dysfunction in glutamate synapse has been suggested. However, a multi-level approach tackling diverse clusters of interacting molecules of the glutamate signaling in schizophrenia is still lacking. We investigated in the post-mortem dorsolateral prefrontal cortex (DLPFC) and hippocampus of schizophrenia patients and non-psychiatric controls, the levels of neuroactive d- and l-amino acids (l-glutamate, d-serine, glycine, l-aspartate, d-aspartate) by HPLC. Moreover, by quantitative RT-PCR and western blotting we analyzed, respectively, the mRNA and protein levels of pre- and post-synaptic key molecules involved in the glutamatergic synapse functioning, including glutamate receptors (NMDA, AMPA, metabotropic), their interacting scaffolding proteins (PSD-95, Homer1b/c), plasma membrane and vesicular glutamate transporters (EAAT1, EAAT2, VGluT1, VGluT2), enzymes involved either in glutamate-dependent GABA neurotransmitter synthesis (GAD65 and 67), or in post-synaptic NMDA receptor-mediated signaling (CAMKIIα) and the pre-synaptic marker Synapsin-1. Univariable analyses revealed that none of the investigated molecules was differently represented in the post-mortem DLPFC and hippocampus of schizophrenia patients, compared with controls. Nonetheless, multivariable hypothesis-driven analyses revealed that the presence of schizophrenia was significantly affected by variations in neuroactive amino acid levels and glutamate-related synaptic elements. Furthermore, a Machine Learning hypothesis-free unveiled other discriminative clusters of molecules, one in the DLPFC and another in the hippocampus. Overall, while confirming a key role of glutamatergic synapse in the molecular pathophysiology of schizophrenia, we reported molecular signatures encompassing elements of the glutamate synapse able to discriminate patients with schizophrenia and normal individuals.


Minamikawa MF, Nonaka K, Hamada H, Shimizu T, Iwata H. Dissecting Breeders' Sense via Explainable Machine Learning Approach: Application to Fruit Peelability and Hardness in Citrus. FRONTIERS IN PLANT SCIENCE 2022;13:832749. [PMID: 35222489 PMCID: PMC8867066 DOI: 10.3389/fpls.2022.832749] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/10/2021] [Accepted: 01/17/2022] [Indexed: 06/14/2023]

Abstract

"Genomics-assisted breeding", which utilizes genomics-based methods, e.g., genome-wide association study (GWAS) and genomic selection (GS), has been attracting attention, especially in the field of fruit breeding. Low-cost genotyping technologies that support genome-assisted breeding have already been established. However, efficient collection of large amounts of high-quality phenotypic data is essential for the success of such breeding. Most of the fruit quality traits have been sensorily and visually evaluated by professional breeders. However, the fruit morphological features that serve as the basis for such sensory and visual judgments are unclear. This makes it difficult to collect efficient phenotypic data on fruit quality traits using image analysis. In this study, we developed a method to automatically measure the morphological features of citrus fruits by the image analysis of cross-sectional images of citrus fruits. We applied explainable machine learning methods and Bayesian networks to determine the relationship between fruit morphological features and two sensorily evaluated fruit quality traits: easiness of peeling (Peeling) and fruit hardness (FruH). In each of all the methods applied in this study, the degradation area of the central core of the fruit was significantly and directly associated with both Peeling and FruH, while the seed area was significantly and directly related to FruH alone. The degradation area of albedo and the area of flavedo were also significantly and directly related to Peeling and FruH, respectively, except in one or two methods. These results suggest that an approach that combines explainable machine learning methods, Bayesian networks, and image analysis can be effective in dissecting the experienced sense of a breeder. 
In breeding programs, collecting fruit images and efficiently measuring and documenting fruit morphological features that are related to fruit quality traits may increase the size of data for the analysis and improvement of the accuracy of GWAS and GS on the quality traits of the citrus fruits.


Ji X, Lin L, Fan J, Li Y, Wei Y, Shen S, Su L, Shafer A, Bjaanæs MM, Karlsson A, Planck M, Staaf J, Helland Å, Esteller M, Zhang R, Chen F, Christiani DC. Epigenome-wide three-way interaction study identifies a complex pattern between TRIM27, KIAA0226, and smoking associated with overall survival of early-stage NSCLC. Mol Oncol 2022;16:717-731. [PMID: 34932879 PMCID: PMC8807353 DOI: 10.1002/1878-0261.13167] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/03/2021] [Revised: 11/23/2021] [Accepted: 12/20/2021] [Indexed: 01/12/2023] Open

Affiliation(s)

Xinyu Ji Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
Lijuan Lin Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
Juanjuan Fan Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
Yi Li Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
Yongyue Wei Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China; Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA; China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, China
Sipeng Shen Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
Li Su Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Andrea Shafer Pulmonary and Critical Care Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
Maria Moksnes Bjaanæs Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
Anna Karlsson Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
Maria Planck Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
Johan Staaf Division of Oncology, Department of Clinical Sciences Lund and CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
Åslaug Helland Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway
Manel Esteller Josep Carreras Leukaemia Research Institute, Barcelona, Spain; Centro de Investigacion Biomedica en Red Cancer, Madrid, Spain; Institucio Catalana de Recerca i Estudis Avançats, Barcelona, Spain; Physiological Sciences Department, School of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
Ruyang Zhang Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China; Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA; China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, China
Feng Chen Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China; China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, China; State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, China; Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Cancer Center, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
David C. Christiani Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Pulmonary and Critical Care Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA


Prates ET, Garvin MR, Jones P, Miller JI, Sullivan KA, Cliff A, Gazolla JGFM, Shah MB, Walker AM, Lane M, Rentsch CT, Justice A, Pavicic M, Romero J, Jacobson D. Antiviral Strategies Against SARS-CoV-2: A Systems Biology Approach. Methods Mol Biol 2022;2452:317-351. [PMID: 35554915 DOI: 10.1007/978-1-0716-2111-0_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Affiliation(s)

Erica T Prates Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA. National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA.
Michael R Garvin Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA. National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA.
Piet Jones The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA.
J Izaak Miller Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA. National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA.
Kyle A Sullivan Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA. National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA.
Ashley Cliff The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA.
Joao Gabriel Felipe Machado Gazolla Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA. National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA.
Manesh B Shah Genome Science and Technology, University of Tennessee Knoxville, Knoxville, TN, USA.
Angelica M Walker The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA.
Matthew Lane The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA.
Christopher T Rentsch Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK. VA Connecticut Healthcare/General Internal Medicine, West Haven, CT, USA.
Amy Justice VA Connecticut Healthcare/General Internal Medicine, West Haven, CT, USA. Yale University School of Medicine, New Haven, CT, USA.
Mirko Pavicic Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA. National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA.
Jonathon Romero The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA.
Daniel Jacobson Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA. National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA. The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA. Genome Science and Technology, University of Tennessee Knoxville, Knoxville, TN, USA. Department of Psychology, NeuroNet Research Center, University of Tennessee Knoxville, Knoxville, TN, USA.


Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics. INFORMATION 2021. [DOI: 10.3390/info13010015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022] Open

Sanchez CD, Brown JB, Gal-Oz O, Singer E. EcoPLOT: dynamic analysis of biogeochemical data. Bioinformatics 2021;38:1480-1482. [PMID: 34927685 PMCID: PMC8825466 DOI: 10.1093/bioinformatics/btab842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2021] [Revised: 12/02/2021] [Accepted: 12/14/2021] [Indexed: 01/05/2023] Open

Branch CL, Semenov GA, Wagner DN, Sonnenberg BR, Pitera AM, Bridge ES, Taylor SA, Pravosudov VV. The genetic basis of spatial cognitive variation in a food-caching bird. Curr Biol 2021;32:210-219.e4. [PMID: 34735793 DOI: 10.1016/j.cub.2021.10.036] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/08/2021] [Revised: 09/15/2021] [Accepted: 10/14/2021] [Indexed: 01/02/2023]

A novel dimension reduction algorithm based on weighted kernel principal analysis for gene expression data. PLoS One 2021;16:e0258326. [PMID: 34644329 PMCID: PMC8513872 DOI: 10.1371/journal.pone.0258326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2020] [Accepted: 09/26/2021] [Indexed: 11/19/2022] Open

A novel random forest approach to revealing interactions and controls on chlorophyll concentration and bacterial communities during coastal phytoplankton blooms. Sci Rep 2021;11:19944. [PMID: 34620921 PMCID: PMC8497483 DOI: 10.1038/s41598-021-98110-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/15/2020] [Accepted: 08/24/2021] [Indexed: 11/12/2022] Open

Chen D, Sun Y, Shao G, Yu W, Zhang HT, Lin W. Coordinating directional switches in pigeon flocks: the role of nonlinear interactions. ROYAL SOCIETY OPEN SCIENCE 2021;8:210649. [PMID: 34631121 PMCID: PMC8479334 DOI: 10.1098/rsos.210649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2021] [Accepted: 09/03/2021] [Indexed: 06/13/2023]

Zabeti H, Dexter N, Safari AH, Sedaghat N, Libbrecht M, Chindelevitch L. INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis. Algorithms Mol Biol 2021;16:17. [PMID: 34376217 PMCID: PMC8353837 DOI: 10.1186/s13015-021-00198-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/11/2021] [Accepted: 07/23/2021] [Indexed: 12/13/2022] Open

Stell E, Warner D, Jian J, Bond-Lamberty B, Vargas R. Spatial biases of information influence global estimates of soil respiration: How can we improve global predictions? GLOBAL CHANGE BIOLOGY 2021;27:3923-3938. [PMID: 33934461 DOI: 10.1111/gcb.15666] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/08/2021] [Accepted: 03/31/2021] [Indexed: 06/12/2023]

Abstract

Soil respiration (Rs), the efflux of CO₂ from soils to the atmosphere, is a major component of the terrestrial carbon cycle, but is poorly constrained from regional to global scales. The global soil respiration database (SRDB) is a compilation of in situ Rs observations from around the globe that has been consistently updated with new measurements over the past decade. It is unclear whether the addition of data to new versions has produced better-constrained global Rs estimates. We compared two versions of the SRDB (v3.0 n = 5173 and v5.0 n = 10,366) to determine how additional data influenced global Rs annual sum, spatial patterns and associated uncertainty (1 km spatial resolution) using a machine learning approach. A quantile regression forest model parameterized using SRDBv3 yielded a global Rs sum of 88.6 Pg C year^-1, and associated uncertainty of 29.9 (mean absolute error) and 57.9 (standard deviation) Pg C year^-1, whereas parameterization using SRDBv5 yielded 96.5 Pg C year^-1 and associated uncertainty of 30.2 (mean average error) and 73.4 (standard deviation) Pg C year^-1. Empirically estimated global heterotrophic respiration (Rh) from v3 and v5 were 49.9-50.2 (mean 50.1) and 53.3-53.5 (mean 53.4) Pg C year^-1, respectively. SRDBv5's inclusion of new data from underrepresented regions (e.g., Asia, Africa, South America) resulted in overall higher model uncertainty. The largest differences between models parameterized with different SRDB versions were in arid/semi-arid regions. The SRDBv5 is still biased toward northern latitudes and temperate zones, so we tested an optimized global distribution of Rs measurements, which resulted in a global sum of 96.4 ± 21.4 Pg C year^-1 with an overall lower model uncertainty. These results support current global estimates of Rs but highlight spatial biases that influence model parameterization and interpretation and provide insights for design of environmental networks to improve global-scale Rs estimates.


Tansey W, Veitch V, Zhang H, Rabadan R, Blei DM. The Holdout Randomization Test for Feature Selection in Black Box Models. J Comput Graph Stat 2021. [DOI: 10.1080/10618600.2021.1923520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]

Mooney C, O'Boyle D, Finder M, Hallberg B, Walsh BH, Henshall DC, Boylan GB, Murray DM. Predictive modelling of hypoxic ischaemic encephalopathy risk following perinatal asphyxia. Heliyon 2021;7:e07411. [PMID: 34278022 PMCID: PMC8261660 DOI: 10.1016/j.heliyon.2021.e07411] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2021] [Revised: 05/29/2021] [Accepted: 06/23/2021] [Indexed: 01/03/2023] Open

Abstract

Hypoxic Ischemic Encephalopathy (HIE) remains a major cause of neurological disability. Early intervention with therapeutic hypothermia improves outcome, but prediction of HIE is difficult and no single clinical marker is reliable. Machine learning algorithms may allow identification of patterns in clinical data to improve prognostic power. Here we examine the use of a Random Forest machine learning algorithm and five-fold cross-validation to predict the occurrence of HIE in a prospective cohort of infants with perinatal asphyxia. Infants with perinatal asphyxia were recruited at birth and neonatal course was followed for the development of HIE. Clinical variables were recorded for each infant including maternal demographics, delivery details and infant's condition at birth. We found that the strongest predictors of HIE were the infant's condition at birth (as expressed by Apgar score), need for resuscitation, and the first postnatal measures of pH, lactate, and base deficit. Random Forest models combining features including Apgar score, most intensive resuscitation, maternal age and infant birth weight both with and without biochemical markers of pH, lactate, and base deficit resulted in a sensitivity of 56-100% and a specificity of 78-99%. This study presents a dynamic method of rapid classification that has the potential to be easily adapted and implemented in a clinical setting, with and without the availability of blood gas analysis. Our results demonstrate that applying machine learning algorithms to readily available clinical data may support clinicians in the early and accurate identification of infants who will develop HIE. We anticipate our models to be a starting point for the development of a more sophisticated clinical decision support system to help identify which infants will benefit from early therapeutic hypothermia.
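To make the evaluation terms in this abstract concrete, here is a minimal sketch of the bookkeeping behind five-fold cross-validation and the sensitivity/specificity figures. It is an illustration under stated assumptions, not the study's code: the helper names `k_fold_indices` and `sensitivity_specificity` are hypothetical, the labels and predictions are made up, and no Random Forest is trained here (any classifier could supply the predictions for each held-out fold).

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = condition present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical labels (1 = developed HIE) and hypothetical held-out predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

folds = k_fold_indices(len(y_true), k=5)  # each fold is held out once for testing
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"folds: {folds}")
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Sensitivity is the fraction of true cases that are detected, and specificity is the fraction of non-cases correctly ruled out. In practice the folds are usually shuffled and stratified by outcome so that each fold contains both classes; the contiguous split above is kept deliberately simple.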


DiMucci D, Kon M, Segrè D. BowSaw: Inferring Higher-Order Trait Interactions Associated With Complex Biological Phenotypes. Front Mol Biosci 2021;8:663532. [PMID: 34222331 PMCID: PMC8245782 DOI: 10.3389/fmolb.2021.663532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2021] [Accepted: 05/24/2021] [Indexed: 11/15/2022] Open

Gao H, Yang C, Fan J, Lan L, Pang D. Hereditary and breastfeeding factors are positively associated with the aetiology of mammary gland hyperplasia: a case-control study. Int Health 2021;13:240-247. [PMID: 32556322 PMCID: PMC8079319 DOI: 10.1093/inthealth/ihaa028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2020] [Revised: 04/10/2020] [Accepted: 05/18/2020] [Indexed: 11/30/2022] Open

Armstrong AJS, Quinn K, Fouquier J, Li SX, Schneider JM, Nusbacher NM, Doenges KA, Fiorillo S, Marden TJ, Higgins J, Reisdorph N, Campbell TB, Palmer BE, Lozupone CA. Systems Analysis of Gut Microbiome Influence on Metabolic Disease in HIV-Positive and High-Risk Populations. mSystems 2021;6:e01178-20. [PMID: 34006628 PMCID: PMC8269254 DOI: 10.1128/msystems.01178-20] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/09/2020] [Accepted: 04/15/2021] [Indexed: 12/20/2022] Open

Abstract

Poor metabolic health, characterized by insulin resistance and dyslipidemia, is higher in people living with HIV and has been linked with inflammation, antiretroviral therapy (ART) drugs, and ART-associated lipodystrophy (LD). Metabolic disease is associated with gut microbiome composition outside the context of HIV but has not been deeply explored in HIV infection or in high-risk men who have sex with men (HR-MSM), who have a highly altered gut microbiome composition. Furthermore, the contribution of increased bacterial translocation and associated systemic inflammation that has been described in HIV-positive and HR-MSM individuals has not been explored. We used a multiomic approach to explore relationships between impaired metabolic health, defined using fasting blood markers, gut microbes, immune phenotypes, and diet. Our cohort included ART-treated HIV-positive MSM with or without LD, untreated HIV-positive MSM, and HR-MSM. For HIV-positive MSM on ART, we further explored associations with the plasma metabolome. We found that elevated plasma lipopolysaccharide binding protein (LBP) was the most important predictor of impaired metabolic health and network analysis showed that LBP formed a hub joining correlated microbial and immune predictors of metabolic disease. Taken together, our results suggest the role of inflammatory processes linked with bacterial translocation and interaction with the gut microbiome in metabolic disease among HIV-positive and -negative MSM.
IMPORTANCE The gut microbiome in people living with HIV (PLWH) is of interest since chronic infection often results in long-term comorbidities. Metabolic disease is prevalent in PLWH even in well-controlled infection and has been linked with the gut microbiome in previous studies, but little attention has been given to PLWH. Furthermore, integrated analyses that consider gut microbiome, together with diet, systemic immune activation, metabolites, and demographics, have been lacking. In a systems-level analysis of predictors of metabolic disease in PLWH and men who are at high risk of acquiring HIV, we found that increased lipopolysaccharide-binding protein, an inflammatory marker indicative of compromised intestinal barrier function, was associated with worse metabolic health. We also found impaired metabolic health associated with specific dietary components, gut microbes, and host and microbial metabolites. This study lays the framework for mechanistic studies aimed at targeting the microbiome to prevent or treat metabolic endotoxemia in HIV-infected individuals.

Yu F, Wei C, Deng P, Peng T, Hu X. Deep exploration of random forest model boosts the interpretability of machine learning studies of complicated immune responses and lung burden of nanoparticles. SCIENCE ADVANCES 2021;7:eabf4130. [PMID: 34039604 PMCID: PMC8153727 DOI: 10.1126/sciadv.abf4130] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/25/2020] [Accepted: 04/05/2021] [Indexed: 05/22/2023]

Liu D, Zhang X, Zheng T, Shi Q, Cui Y, Wang Y, Liu L. Optimisation and evaluation of the random forest model in the efficacy prediction of chemoradiotherapy for advanced cervical cancer based on radiomics signature from high-resolution T2 weighted images. Arch Gynecol Obstet 2021;303:811-820. [PMID: 33394142 PMCID: PMC7960581 DOI: 10.1007/s00404-020-05908-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2020] [Accepted: 11/17/2020] [Indexed: 12/28/2022]

Sarkar P, Malik S, Laha S, Das S, Bunk S, Ray JG, Chatterjee R, Saha A. Dysbiosis of Oral Microbiota During Oral Squamous Cell Carcinoma Development. Front Oncol 2021;11:614448. [PMID: 33708627 PMCID: PMC7940518 DOI: 10.3389/fonc.2021.614448] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2020] [Accepted: 01/05/2021] [Indexed: 12/24/2022] Open

Jain R, Xu W. HDSI: High dimensional selection with interactions algorithm on feature selection and testing. PLoS One 2021;16:e0246159. [PMID: 33592034 PMCID: PMC7886179 DOI: 10.1371/journal.pone.0246159] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2020] [Accepted: 01/15/2021] [Indexed: 11/19/2022] Open

A Data-Driven and Data-Based Framework for Online Voltage Stability Assessment Using Partial Mutual Information and Iterated Random Forest. ENERGIES 2021. [DOI: 10.3390/en14030715] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Yu B, Barter R. The Data Science Process: One Culture. Int Stat Rev 2020. [DOI: 10.1111/insr.12416] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Dwivedi R, Tan YS, Park B, Wei M, Horgan K, Madigan D, Yu B. Stable Discovery of Interpretable Subgroups via Calibration in Causal Studies. Int Stat Rev 2020. [DOI: 10.1111/insr.12427] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

Khalili E, Kouchaki S, Ramazi S, Ghanati F. Machine Learning Techniques for Soybean Charcoal Rot Disease Prediction. FRONTIERS IN PLANT SCIENCE 2020;11:590529. [PMID: 33381132 PMCID: PMC7767839 DOI: 10.3389/fpls.2020.590529] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/01/2020] [Accepted: 11/23/2020] [Indexed: 06/01/2023]

Lawson CE, Martí JM, Radivojevic T, Jonnalagadda SVR, Gentz R, Hillson NJ, Peisert S, Kim J, Simmons BA, Petzold CJ, Singer SW, Mukhopadhyay A, Tanjore D, Dunn JG, Garcia Martin H. Machine learning for metabolic engineering: A review. Metab Eng 2020;63:34-60. [PMID: 33221420 DOI: 10.1016/j.ymben.2020.10.005] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2020] [Revised: 10/22/2020] [Accepted: 10/31/2020] [Indexed: 12/14/2022]

Affiliation(s)

Christopher E Lawson Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA
Jose Manuel Martí Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; DOE Agile BioFoundry, Emeryville, CA, 94608, USA
Tijana Radivojevic Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; DOE Agile BioFoundry, Emeryville, CA, 94608, USA
Sai Vamshi R Jonnalagadda Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; DOE Agile BioFoundry, Emeryville, CA, 94608, USA
Reinhard Gentz Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Nathan J Hillson Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; DOE Agile BioFoundry, Emeryville, CA, 94608, USA
Sean Peisert Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; University of California Davis, Davis, CA, 95616, USA
Joonhoon Kim Joint BioEnergy Institute, Emeryville, CA, 94608, USA; Pacific Northwest National Laboratory, Richland, WA, 99354, USA
Blake A Simmons Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; DOE Agile BioFoundry, Emeryville, CA, 94608, USA
Christopher J Petzold Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; DOE Agile BioFoundry, Emeryville, CA, 94608, USA
Steven W Singer Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA
Aindrila Mukhopadhyay Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, USA
Deepti Tanjore Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Advanced Biofuels and Bioproducts Process Development Unit, Emeryville, CA, 94608, USA
Joshua G Dunn Ginkgo Bioworks, Boston, MA, 02210, USA
Hector Garcia Martin Biological Systems and Engineering, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; Joint BioEnergy Institute, Emeryville, CA, 94608, USA; DOE Agile BioFoundry, Emeryville, CA, 94608, USA; Basque Center for Applied Mathematics, 48009, Bilbao, Spain; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, USA.

Friedberg R, Tibshirani J, Athey S, Wager S. Local Linear Forests. J Comput Graph Stat 2020. [DOI: 10.1080/10618600.2020.1831930] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Lucas TCD. A translucent box: interpretable machine learning in ecology. ECOL MONOGR 2020. [DOI: 10.1002/ecm.1422] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Parchure P, Joshi H, Dharmarajan K, Freeman R, Reich DL, Mazumdar M, Timsina P, Kia A. Development and validation of a machine learning-based prediction model for near-term in-hospital mortality among patients with COVID-19. BMJ Support Palliat Care 2020;12:bmjspcare-2020-002602. [PMID: 32963059 PMCID: PMC8049537 DOI: 10.1136/bmjspcare-2020-002602] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2020] [Revised: 08/09/2020] [Accepted: 08/18/2020] [Indexed: 02/06/2023]

Hu Q, Greene CS, Heller EA. Specific histone modifications associate with alternative exon selection during mammalian development. Nucleic Acids Res 2020;48:4709-4724. [PMID: 32319526 PMCID: PMC7229819 DOI: 10.1093/nar/gkaa248] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2019] [Revised: 03/23/2020] [Accepted: 04/02/2020] [Indexed: 12/29/2022] Open

Hao B, Zhang A, Cheng G. Sparse and Low-rank Tensor Estimation via Cubic Sketchings. IEEE TRANSACTIONS ON INFORMATION THEORY 2020;66:5927-5964. [PMID: 33746244 PMCID: PMC7978041 DOI: 10.1109/tit.2020.2982499] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Huang S, Blatti C, Sinha S, Parameswaran A. Uncovering Effective Explanations for Interactive Genomic Data Analysis. PATTERNS 2020;1:100093. [PMID: 33205133 PMCID: PMC7660438 DOI: 10.1016/j.patter.2020.100093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/10/2020] [Revised: 07/13/2020] [Accepted: 08/05/2020] [Indexed: 10/25/2022]

Schperberg AV, Boichard A, Tsigelny IF, Richard SB, Kurzrock R. Machine learning model to predict oncologic outcomes for drugs in randomized clinical trials. Int J Cancer 2020;147:2537-2549. [PMID: 32745254 DOI: 10.1002/ijc.33240] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2020] [Revised: 07/15/2020] [Accepted: 07/17/2020] [Indexed: 11/12/2022]

Abstract

Predicting oncologic outcome is challenging due to the diversity of cancer histologies and the complex network of underlying biological factors. In this study, we determine whether machine learning (ML) can extract meaningful associations between oncologic outcome and clinical trial, drug-related biomarker and molecular profile information. We analyzed therapeutic clinical trials corresponding to 1102 oncologic outcomes from 104 758 cancer patients with advanced colorectal adenocarcinoma, pancreatic adenocarcinoma, melanoma and nonsmall-cell lung cancer. For each intervention arm, a dataset with the following attributes was curated: line of treatment, the number of cytotoxic chemotherapies, small-molecule inhibitors, or monoclonal antibody agents, drug class, molecular alteration status of the clinical arm's population, cancer type, probability of drug sensitivity (PDS) (integrating the status of genomic, transcriptomic and proteomic biomarkers in the population of interest) and outcome. A total of 467 progression-free survival (PFS) and 369 overall survival (OS) data points were used as training sets to build our ML (random forest) model. Cross-validation sets were used for PFS and OS, obtaining correlation coefficients (r) of 0.82 and 0.70, respectively (outcome vs model's parameters). A total of 156 PFS and 110 OS data points were used as test sets. The Spearman correlation (r_s) between predicted and actual outcomes was statistically significant (PFS: r_s = 0.879, OS: r_s = 0.878, P < .0001). The better outcome arm was predicted in 81% (PFS: N = 59/73, z = 5.24, P < .0001) and 71% (OS: N = 37/52, z = 2.91, P = .004) of randomized trials. The success of our algorithm to predict clinical outcome may be exploitable as a model to optimize clinical trial design with pharmaceutical agents.

Ghazanfar S, Lin Y, Su X, Lin DM, Patrick E, Han ZG, Marioni JC, Yang JYH. Investigating higher-order interactions in single-cell data with scHOT. Nat Methods 2020;17:799-806. [PMID: 32661426 PMCID: PMC7610653 DOI: 10.1038/s41592-020-0885-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2019] [Accepted: 06/03/2020] [Indexed: 12/12/2022]

A mechanism-aware and multiomic machine-learning pipeline characterizes yeast cell growth. Proc Natl Acad Sci U S A 2020;117:18869-18879. [PMID: 32675233 DOI: 10.1073/pnas.2002959117] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Abstract

Metabolic modeling and machine learning are key components in the emerging next generation of systems and synthetic biology tools, targeting the genotype-phenotype-environment relationship. Rather than being used in isolation, it is becoming clear that their value is maximized when they are combined. However, the potential of integrating these two frameworks for omic data augmentation and integration is largely unexplored. We propose, rigorously assess, and compare machine-learning-based data integration techniques, combining gene expression profiles with computationally generated metabolic flux data to predict yeast cell growth. To this end, we create strain-specific metabolic models for 1,143 Saccharomyces cerevisiae mutants and we test 27 machine-learning methods, incorporating state-of-the-art feature selection and multiview learning approaches. We propose a multiview neural network using fluxomic and transcriptomic data, showing that the former increases the predictive accuracy of the latter and reveals functional patterns that are not directly deducible from gene expression alone. We test the proposed neural network on a further 86 strains generated in a different experiment, therefore verifying its robustness to an additional independent dataset. Finally, we show that introducing mechanistic flux features improves the predictions also for knockout strains whose genes were not modeled in the metabolic reconstruction. Our results thus demonstrate that fusing experimental cues with in silico models, based on known biochemistry, can contribute with disjoint information toward biologically informed and interpretable machine learning. Overall, this study provides tools for understanding and manipulating complex phenotypes, increasing both the prediction accuracy and the extent of discernible mechanistic biological insights.

Yu B, Barter R. The Data Science Process: One Culture. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1762615] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

Cheng FY, Joshi H, Tandon P, Freeman R, Reich DL, Mazumdar M, Kohli-Seth R, Levin MA, Timsina P, Kia A. Using Machine Learning to Predict ICU Transfer in Hospitalized COVID-19 Patients. J Clin Med 2020;9:jcm9061668. [PMID: 32492874 PMCID: PMC7356638 DOI: 10.3390/jcm9061668] [Citation(s) in RCA: 96] [Impact Index Per Article: 24.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2020] [Revised: 05/27/2020] [Accepted: 05/28/2020] [Indexed: 12/13/2022] Open

Abstract

OBJECTIVES

Approximately 20-30% of patients with COVID-19 require hospitalization, and 5-12% may require critical care in an intensive care unit (ICU). A rapid surge in cases of severe COVID-19 will lead to a corresponding surge in demand for ICU care. Because of constraints on resources, frontline healthcare workers may be unable to provide the frequent monitoring and assessment required for all patients at high risk of clinical deterioration. We developed a machine learning-based risk prioritization tool that predicts ICU transfer within 24 h, seeking to facilitate efficient use of care providers' efforts and help hospitals plan their flow of operations.

METHODS

A retrospective cohort comprised non-ICU COVID-19 admissions at a large acute care health system between 26 February and 18 April 2020. Time series data, including vital signs, nursing assessments, laboratory data, and electrocardiograms, were used as input variables for training a random forest (RF) model. The cohort was randomly split (70:30) into training and test sets. The RF model was trained using 10-fold cross-validation on the training set, and its predictive performance on the test set was then evaluated.

RESULTS

The cohort consisted of 1987 unique patients diagnosed with COVID-19 and admitted to non-ICU units of the hospital. The median time to ICU transfer was 2.45 days from the time of admission. Compared to actual admissions, the tool had 72.8% (95% CI: 63.2-81.1%) sensitivity, 76.3% (95% CI: 74.7-77.9%) specificity, 76.2% (95% CI: 74.6-77.7%) accuracy, and 79.9% (95% CI: 75.2-84.6%) area under the receiver operating characteristics curve.

CONCLUSIONS

An ML-based prediction model can be used as a screening tool to identify patients at risk of imminent ICU transfer within 24 h. This tool could improve the management of hospital resources and patient-throughput planning, thus delivering more effective care to patients hospitalized with COVID-19.

Affiliation(s)

Fu-Yuan Cheng Institute for Healthcare Delivery Science; Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA; (F.-Y.C.); (H.J.); (R.F.); (P.T.); (A.K.)
Himanshu Joshi Institute for Healthcare Delivery Science; Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA; (F.-Y.C.); (H.J.); (R.F.); (P.T.); (A.K.) Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA
Pranai Tandon Respiratory Institute, Icahn School of Medicine at Mount Sinai, 10 E 102nd St, New York, NY 10029, USA;
Robert Freeman Institute for Healthcare Delivery Science; Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA; (F.-Y.C.); (H.J.); (R.F.); (P.T.); (A.K.) Hospital Administration; The Mount Sinai Hospital, 1 Gustave L. Levy Place, New York, NY 10029, USA;
David L Reich Hospital Administration; The Mount Sinai Hospital, 1 Gustave L. Levy Place, New York, NY 10029, USA; Department of Anesthesiology, Perioperative and Pain Medicine, 1 Gustave L. Levy Place, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA;
Madhu Mazumdar Institute for Healthcare Delivery Science; Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA; (F.-Y.C.); (H.J.); (R.F.); (P.T.); (A.K.) Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA Correspondence: ; Tel.: +1-212-659-1470; Fax: +1-212-423-2998
Roopa Kohli-Seth Institute for Critical Care Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA;
Matthew A. Levin Department of Anesthesiology, Perioperative and Pain Medicine, 1 Gustave L. Levy Place, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, 1 Gustave L. Levy Place, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Prem Timsina Institute for Healthcare Delivery Science; Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA; (F.-Y.C.); (H.J.); (R.F.); (P.T.); (A.K.)
Arash Kia Institute for Healthcare Delivery Science; Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA; (F.-Y.C.); (H.J.); (R.F.); (P.T.); (A.K.)

Wang H, Sham P, Tong T, Pang H. Pathway-Based Single-Cell RNA-Seq Classification, Clustering, and Construction of Gene-Gene Interactions Networks Using Random Forests. IEEE J Biomed Health Inform 2020;24:1814-1822. [DOI: 10.1109/jbhi.2019.2944865] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Zhang X, Baer AG, Price JM, Jones PC, Garcia BJ, Romero J, Cliff AM, Mi W, Brown JB, Jacobson DA, Lydic R, Baghdoyan HA. Neurotransmitter networks in mouse prefrontal cortex are reconfigured by isoflurane anesthesia. J Neurophysiol 2020;123:2285-2296. [PMID: 32347157 PMCID: PMC7311717 DOI: 10.1152/jn.00092.2020] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open

Abstract

This study quantified eight small-molecule neurotransmitters collected simultaneously from prefrontal cortex of C57BL/6J mice (n = 23) during wakefulness and during isoflurane anesthesia (1.3%). Using isoflurane anesthesia as an independent variable enabled evaluation of the hypothesis that isoflurane anesthesia differentially alters concentrations of multiple neurotransmitters and their interactions. Machine learning was applied to reveal higher order interactions among neurotransmitters. Using a between-subjects design, microdialysis was performed during wakefulness and during anesthesia. Concentrations (nM) of acetylcholine, adenosine, dopamine, GABA, glutamate, histamine, norepinephrine, and serotonin in the dialysis samples are reported (means ± SD). Relative to wakefulness, acetylcholine concentration was lower during isoflurane anesthesia (1.254 ± 1.118 vs. 0.401 ± 0.134, P = 0.009), and concentrations of adenosine (29.456 ± 29.756 vs. 101.321 ± 38.603, P < 0.001), dopamine (0.0578 ± 0.0384 vs. 0.113 ± 0.084, P = 0.036), and norepinephrine (0.126 ± 0.080 vs. 0.219 ± 0.066, P = 0.010) were higher during anesthesia. Isoflurane reconfigured neurotransmitter interactions in prefrontal cortex, and the state of isoflurane anesthesia was reliably predicted by prefrontal cortex concentrations of adenosine, norepinephrine, and acetylcholine. A novel finding to emerge from machine learning analyses is that neurotransmitter concentration profiles in mouse prefrontal cortex undergo functional reconfiguration during isoflurane anesthesia. Adenosine, norepinephrine, and acetylcholine showed high feature importance, supporting the interpretation that interactions among these three transmitters may play a key role in modulating levels of cortical and behavioral arousal.

NEW & NOTEWORTHY This study discovered that interactions between neurotransmitters in mouse prefrontal cortex were altered during isoflurane anesthesia relative to wakefulness. Machine learning further demonstrated that, relative to wakefulness, higher order interactions among neurotransmitters were disrupted during isoflurane administration. These findings extend to the neurochemical domain the concept that anesthetic-induced loss of wakefulness results from a disruption of neural network connectivity.

Azodi CB, Tang J, Shiu SH. Opening the Black Box: Interpretable Machine Learning for Geneticists. Trends Genet 2020;36:442-455. [PMID: 32396837 DOI: 10.1016/j.tig.2020.03.005] [Citation(s) in RCA: 104] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2020] [Revised: 03/12/2020] [Accepted: 03/16/2020] [Indexed: 01/16/2023]

Rillig MC, Ryo M, Lehmann A, Aguilar-Trigueros CA, Buchert S, Wulf A, Iwasaki A, Roy J, Yang G. The role of multiple global change factors in driving soil functions and microbial biodiversity. Science 2019;366:886-890. [PMID: 31727838 DOI: 10.1126/science.aay2832] [Citation(s) in RCA: 251] [Impact Index Per Article: 62.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2019] [Revised: 08/28/2019] [Accepted: 10/15/2019] [Indexed: 01/06/2023]

Veridical data science. Proc Natl Acad Sci U S A 2020;117:3920-3929. [PMID: 32054788 PMCID: PMC7049126 DOI: 10.1073/pnas.1901326117] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023] Open

King DM, Hong CKY, Shepherdson JL, Granas DM, Maricque BB, Cohen BA. Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in mouse embryonic stem cells. eLife 2020;9:41279. [PMID: 32043966 PMCID: PMC7077988 DOI: 10.7554/elife.41279] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2018] [Accepted: 02/07/2020] [Indexed: 01/08/2023] Open

Abstract

In embryonic stem cells (ESCs), a core transcription factor (TF) network establishes the gene expression program necessary for pluripotency. To address how interactions between four key TFs contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of binding sites for SOX2, POU5F1 (OCT4), KLF4, and ESRRB. Comparisons between synthetic cis-regulatory elements and genomic sequences with comparable binding site configurations revealed some aspects of a regulatory grammar. The expression of synthetic elements is influenced by both the number and arrangement of binding sites. This grammar plays only a small role for genomic sequences, as the relative activities of genomic sequences are best explained by the predicted occupancy of binding sites, regardless of binding site identity and positioning. Our results suggest that the effects of transcription factor binding sites (TFBS) are influenced by the order and orientation of sites, but that in the genome the overall occupancy of TFs is the primary determinant of activity.

Transcription factors are proteins that flip genetic switches; their role is to control when and where genes are active. They do this by binding to short stretches of DNA called cis-regulatory sequences. Each sequence can have several binding sites for different transcription factors, but it is largely unclear whether the transcription factors binding to the same regulatory sequence actually work together.

It is possible that each transcription factor may work independently and there only needs to be a critical mass of transcription factors bound to throw the genetic switch. If this is the case, the most important features of a cis-regulatory sequence should be the number of binding sites it contains, and how tightly the transcription factors bind to those sites. The more transcription factors and the more strongly they bind, the more active the gene should be. An alternative option is that certain transcription factors may work better together, enhancing each other's effects such that the total effect is more than the sum of its parts. If this is true, the order, orientation and spacing of the binding sites within a sequence should matter more than the number.

One way to distinguish between these possibilities is to study mouse embryonic stem cells, which have a core set of four transcription factors. Looking directly at a real genome, however, can be confusing and it is difficult to measure the effects of different cis-regulatory sequences because genes differ in so many other ways. To tackle this problem, King et al. created a synthetic set of cis-regulatory sequences based on the four core transcription factors found in mouse stem cells.

The synthetic set had every combination of two, three or four of the binding sites, with each site either facing forwards or backwards along the DNA strand. King et al. attached each of the synthetic cis-regulatory sequences to a reporter gene to find out how well each sequence performed. This revealed that the cis-regulatory sequences with the most binding sites and the tightest binding affinities work best, suggesting that transcription factors mainly work independently.

There was evidence of some interaction between some transcription factors, because, of the synthetic sequences with four binding sites, some worked better than others, and there were patterns in the most effective binding site combinations. However, these effects were small and when King et al. went on to test sequences from the real mouse genome, the most important factor by far was the number of binding sites.

Synthetic libraries of DNA sequences allow researchers to examine gene regulation more clearly than is possible in real genomes. Yet this approach does have its limitations and it is impossible to capture every type of cis-regulatory sequence in one library. The next step to extend this work is to combine the two approaches, taking sequences from the real genome and manipulating them one by one. This could help to unravel the rules that govern how cis-regulatory sequences work in real cells.

Streich J, Romero J, Gazolla JGFM, Kainer D, Cliff A, Prates ET, Brown JB, Khoury S, Tuskan GA, Garvin M, Jacobson D, Harfouche AL. Can exascale computing and explainable artificial intelligence applied to plant biology deliver on the United Nations sustainable development goals? Curr Opin Biotechnol 2020;61:217-225. [DOI: 10.1016/j.copbio.2020.01.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Revised: 01/27/2020] [Accepted: 01/28/2020] [Indexed: 01/26/2023]