1
Kanjilal S. The modern alchemy of clinical pathology: turning the output of microbiology laboratory operations into gold. J Clin Microbiol 2024; 62:e0170922. [PMID: 38506516 PMCID: PMC11077955 DOI: 10.1128/jcm.01709-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/21/2024]
Abstract
The clinical microbiology laboratory generates a huge amount of high-quality data that play a vital role in clinical care. With proper extraction, cleaning, analysis, and validation pipelines, these data can serve multiple other purposes that include supporting laboratory operations, understanding local epidemiology, informing hospital-specific policies, and public health surveillance. In this review, I use one of the core activities of the microbiology laboratory, antimicrobial susceptibility testing (AST), to illustrate several potential applications of next-generation data analytics. The first involves continuous monitoring of commercial AST systems using comparisons of minimum inhibitory concentration (MIC) distributions over time to trigger re-verification when statistically significant differences are detected. An extension of this is temporal analysis of joint MIC distributions to understand performance for multidrug-resistant organisms. More sophisticated analyses involve linking microbiologic data to clinical metadata to gain insight into the clinical validity of AST data and to inform treatment policies. The elements of a robust, validated analysis engine using routine data streams already exist, but numerous challenges must be overcome to make it a reality. Most importantly, it will require the sustained collaboration and advocacy of hospital leadership, microbiologists, clinicians, antimicrobial stewardship, data scientists, and regulatory agencies. Though no small feat, achieving this vision would provide an important resource for microbiology laboratories facing a rapidly evolving practice landscape and further cement their role as an integral part of a learning health system.
Affiliation(s)
- Sanjat Kanjilal
- Department of Population Medicine, Harvard Pilgrim Healthcare Institute and Harvard Medical School, Boston, Massachusetts, USA
- Division of Infectious Diseases, Brigham & Women's Hospital, Boston, Massachusetts, USA
2
Zawadzki RS, Grill JD, Gillen DL. Frameworks for estimating causal effects in observational settings: comparing confounder adjustment and instrumental variables. BMC Med Res Methodol 2023; 23:122. [PMID: 37217854 PMCID: PMC10201752 DOI: 10.1186/s12874-023-01936-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 04/25/2023] [Indexed: 05/24/2023]
Abstract
To estimate causal effects, analysts performing observational studies in health settings utilize several strategies to mitigate bias due to confounding by indication. There are two broad classes of approaches for these purposes: use of confounders and instrumental variables (IVs). Because such approaches are largely characterized by untestable assumptions, analysts must operate under an indefinite paradigm that these methods will work imperfectly. In this tutorial, we formalize a set of general principles and heuristics for estimating causal effects in the two approaches when the assumptions are potentially violated. This crucially requires reframing the process of observational studies as hypothesizing potential scenarios where the estimates from one approach are less inconsistent than the other. While most of our discussion of methodology centers around the linear setting, we touch upon complexities in non-linear settings and flexible procedures such as targeted minimum loss-based estimation and double machine learning. To demonstrate the application of our principles, we investigate the use of donepezil off-label for mild cognitive impairment. We compare and contrast results from confounder and IV methods, traditional and flexible, within our analysis and to a similar observational study and clinical trial.
Affiliation(s)
- Roy S Zawadzki
- Department of Statistics, University of California, Irvine, Irvine, USA
- Joshua D Grill
- Department of Psychiatry and Human Behavior, University of California, Irvine, Irvine, USA
- Department of Neurobiology and Behavior, University of California, Irvine, Irvine, USA
- Daniel L Gillen
- Department of Statistics, University of California, Irvine, Irvine, USA
3
Moosavi N, Häggström J, de Luna X. The Costs and Benefits of Uniformly Valid Causal Inference with High-Dimensional Nuisance Parameters. Stat Sci 2023. [DOI: 10.1214/21-sts843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Affiliation(s)
- Niloofar Moosavi
- Niloofar Moosavi is Ph.D. Student, Department of Statistics, USBE, Umeå University, 901 87, Umeå, Sweden
- Jenny Häggström
- Jenny Häggström is Associate Professor, Department of Statistics, USBE, Umeå University, 901 87, Umeå, Sweden
- Xavier de Luna
- Xavier de Luna is Professor, Department of Statistics, USBE, Umeå University, 901 87, Umeå, Sweden
4
Li WX, Tong X, Yang PP, Zheng Y, Liang JH, Li GH, Liu D, Guan DG, Dai SX. Screening of antibacterial compounds with novel structure from the FDA approved drugs using machine learning methods. Aging (Albany NY) 2022; 14:1448-1472. [PMID: 35150482 PMCID: PMC8876917 DOI: 10.18632/aging.203887] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/27/2021] [Accepted: 01/28/2022] [Indexed: 11/25/2022]
Abstract
Bacterial infection is one of the most important factors affecting the human life span. Elderly people are more harmed by bacterial infections due to their deficits in immunity. Because of the lack of new antibiotics in recent years, bacterial resistance has increasingly become a serious problem globally. In this study, an antibacterial compound predictor was constructed using support vector machine and random forest methods and data on active and inactive antibacterial compounds from the ChEMBL database. The results showed that both models have excellent prediction performance (mean accuracy >0.9 and mean AUC >0.9 for the two models). We used the predictor to screen potential antibacterial compounds from FDA-approved drugs in the DrugBank database. The screening results showed that 1087 small-molecule drugs have potential antibacterial activity and 154 of them are FDA-approved antibacterial drugs, which accounts for 76.2% of the approved antibacterial drugs collected in this study. Through molecular fingerprint similarity analysis and common substructure analysis, we screened 8 predicted antibacterial small-molecule compounds with novel structures compared with known antibacterial drugs, and 5 of them are widely used in the treatment of various tumors. This study provides new insight into predicting antibacterial compounds from approved drugs; the predicted compounds might be used to treat bacterial infections and extend lifespan.
Affiliation(s)
- Wen-Xing Li
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China; Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou 510515, Guangdong, China
- Xin Tong
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
- Peng-Peng Yang
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
- Yang Zheng
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
- Ji-Hao Liang
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
- Gong-Hua Li
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, Yunnan, China
- Dahai Liu
- School of Medicine, Foshan University, Foshan 528000, Guangdong, China
- Dao-Gang Guan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, Guangdong, China; Guangdong Provincial Key Laboratory of Single Cell Technology and Application, Southern Medical University, Guangzhou 510515, Guangdong, China
- Shao-Xing Dai
- State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
5
Schuler A. Designing efficient randomized trials: power and sample size calculation when using semiparametric efficient estimators. Int J Biostat 2021; 18:151-171. [PMID: 34364314 DOI: 10.1515/ijb-2021-0039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/10/2021] [Accepted: 07/12/2021] [Indexed: 11/15/2022]
Abstract
Trials enroll a large number of subjects in order to attain power, making them expensive and time-consuming. Sample size calculations are often performed with the assumption of an unadjusted analysis, even if the trial analysis plan specifies a more efficient estimator (e.g. ANCOVA). This leads to conservative estimates of required sample sizes and an opportunity for savings. Here we show that a relatively simple formula can be used to estimate the power of any two-arm, single-timepoint trial analyzed with a semiparametric efficient estimator, regardless of the domain of the outcome or kind of treatment effect (e.g. odds ratio, mean difference). Since an efficient estimator attains the minimum possible asymptotic variance, this allows for the design of trials that are as small as possible while still attaining design power and control of type I error. The required sample size calculation is parsimonious and requires the analyst to provide only a small number of population parameters. We verify in simulation that the large-sample properties of trials designed this way attain their nominal values. Lastly, we demonstrate how to use this formula in the "design" (and subsequent reanalysis) of a real randomized trial and show that fewer subjects are required to attain the same design power when a semiparametric efficient estimator is accounted for at the design stage.
6
Applications of Machine Learning to the Problem of Antimicrobial Resistance: an Emerging Model for Translational Research. J Clin Microbiol 2021; 59:e0126020. [PMID: 33536291 DOI: 10.1128/jcm.01260-20] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Indexed: 01/05/2023]
Abstract
Antimicrobial resistance (AMR) remains one of the most challenging phenomena of modern medicine. Machine learning (ML) is a subfield of artificial intelligence that focuses on the development of algorithms that learn how to accurately predict outcome variables using large sets of predictor variables that are typically not hand selected and are minimally curated. Models are parameterized using a training data set and then applied to a test data set on which predictive performance is evaluated. The application of ML algorithms to the problem of AMR has garnered increasing interest in the past 5 years due to the exponential growth of experimental and clinical data, heavy investment in computational capacity, improvements in algorithm performance, and increasing urgency for innovative approaches to reducing the burden of disease. Here, we review the current state of research at the intersection of ML and AMR with an emphasis on three domains of work. The first is the prediction of AMR using genomic data. The second is the use of ML to gain insight into the cellular functions disrupted by antibiotics, which forms the basis for understanding mechanisms of action and developing novel anti-infectives. The third focuses on the application of ML for antimicrobial stewardship using data extracted from the electronic health record. Although the use of ML for understanding, diagnosing, treating, and preventing AMR is still in its infancy, the continued growth of data and interest ensures it will become an important tool for future translational research programs.
7
Burchill E, Lymberopoulos E, Menozzi E, Budhdeo S, McIlroy JR, Macnaughtan J, Sharma N. The Unique Impact of COVID-19 on Human Gut Microbiome Research. Front Med (Lausanne) 2021; 8:652464. [PMID: 33796545 PMCID: PMC8007773 DOI: 10.3389/fmed.2021.652464] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 01/13/2021] [Accepted: 02/19/2021] [Indexed: 12/14/2022]
Abstract
The coronavirus (COVID-19) pandemic has disrupted clinical trials globally, with unique implications for research into the human gut microbiome. In this mini-review, we explore the direct and indirect influences of the pandemic on the gut microbiome and how these can affect research and clinical trials. We explore the direct bidirectional relationships between the COVID-19 virus and the gut and lung microbiomes. We then consider the significant indirect effects of the pandemic, such as repeated lockdowns, increased hand hygiene, and changes to mood and diet, that could all lead to longstanding changes to the gut microbiome at an individual and a population level. Together, these changes may affect long-term microbiome research, in both observational and population studies, requiring urgent attention. Finally, we explore the unique implications for clinical trials using faecal microbiota transplants (FMT), which are increasingly investigated as potential treatments for a range of diseases. The pandemic introduces new barriers to participation in trials, while the direct and indirect effects laid out above can present a confounding factor. This affects recruitment and sample size, as well as study design and statistical analyses. Therefore, the potential impact of the pandemic on gut microbiome research is significant and needs to be specifically addressed by the research community and funders.
Affiliation(s)
- Ella Burchill
- Faculty of Life Sciences and Medicine, King's College London, London, United Kingdom
- Eva Lymberopoulos
- Department of Clinical and Movement Neurosciences, Institute of Neurology, University College London, London, United Kingdom
- Centre for Doctoral Training (CDT) AI-Enabled Healthcare Systems, Institute of Health Informatics, University College London, London, United Kingdom
- Elisa Menozzi
- Department of Clinical and Movement Neurosciences, Institute of Neurology, University College London, London, United Kingdom
- Sanjay Budhdeo
- Department of Clinical and Movement Neurosciences, Institute of Neurology, University College London, London, United Kingdom
- National Hospital for Neurology and Neurosurgery, University College London Hospitals National Health Service (NHS) Foundation Trust, London, United Kingdom
- Jane Macnaughtan
- Institute for Liver and Digestive Health, University College London, London, United Kingdom
- Nikhil Sharma
- Department of Clinical and Movement Neurosciences, Institute of Neurology, University College London, London, United Kingdom
- National Hospital for Neurology and Neurosurgery, University College London Hospitals National Health Service (NHS) Foundation Trust, London, United Kingdom
8
Le Borgne F, Chatton A, Léger M, Lenain R, Foucher Y. G-computation and machine learning for estimating the causal effects of binary exposure statuses on binary outcomes. Sci Rep 2021; 11:1435. [PMID: 33446866 PMCID: PMC7809122 DOI: 10.1038/s41598-021-81110-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/06/2020] [Accepted: 12/24/2020] [Indexed: 11/09/2022]
Abstract
In clinical research, there is a growing interest in the use of propensity score-based methods to estimate causal effects. G-computation is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary and is able to deal with small samples. We evaluated the performances of several methods, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner through simulations. We proposed six different scenarios characterised by various sample sizes, numbers of covariates and relationships between covariates, exposure statuses, and outcomes. We have also illustrated the application of these methods, in which they were used to estimate the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of G-computation, for estimating the individual outcome probabilities in two counterfactual worlds, we reported that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation associated with the super learner was a performant method for drawing causal inferences, even from small sample sizes.
Affiliation(s)
- Florent Le Borgne
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; IDBC-A2COM, Pacé, France
- Arthur Chatton
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; IDBC-A2COM, Pacé, France
- Maxime Léger
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; Département D'Anesthésie Réanimation, Centre Hospitalier Universitaire D'Angers, Angers, France
- Rémi Lenain
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; Lille University Hospital, Lille, France
- Yohann Foucher
- INSERM UMR 1246 - SPHERE, Nantes University, Tours University, 22 Boulevard Bénoni Goullin, 44200, Nantes, France; Nantes University Hospital, Nantes, France
9
Stern AD, Price WN. Regulatory oversight, causal inference, and safe and effective health care machine learning. Biostatistics 2020; 21:363-367. [PMID: 31742358 DOI: 10.1093/biostatistics/kxz044] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/25/2019] [Revised: 09/25/2019] [Accepted: 09/25/2019] [Indexed: 11/13/2022]
Abstract
In recent years, the applications of Machine Learning (ML) in the health care delivery setting have grown to become both abundant and compelling. Regulators have taken notice of these developments and the U.S. Food and Drug Administration (FDA) has been engaging actively in thinking about how best to facilitate safe and effective use. Although the scope of its oversight for software-driven products is limited, if FDA takes the lead in promoting and facilitating appropriate applications of causal inference as a part of ML development, that leadership is likely to have implications well beyond regulated products.
Affiliation(s)
- Ariel Dora Stern
- Harvard Business School and the Harvard-MIT Center for Regulatory Science, Morgan Hall 433, 15 Harvard Way, Boston, MA 02163, USA
- W Nicholson Price
- University of Michigan Law School, 625 State Street, Ann Arbor, MI, USA; University of Copenhagen Center for Advanced Studies in Biomedical Innovation Law (CeBIL), Copenhagen, Denmark
10
Rose S, Rizopoulos D. Machine learning for causal inference in Biostatistics. Biostatistics 2020; 21:336-338. [PMID: 31742360 DOI: 10.1093/biostatistics/kxz045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/25/2019] [Revised: 09/25/2019] [Accepted: 09/25/2019] [Indexed: 11/14/2022]
Affiliation(s)
- Sherri Rose
- Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave, Boston, MA, 02115, USA
- Dimitris Rizopoulos
- Department of Biostatistics, Erasmus University Medical Center, PO Box 2040, 3000 CA Rotterdam, the Netherlands
11
Goldstein ND, LeVasseur MT, McClure LA. On the Convergence of Epidemiology, Biostatistics, and Data Science. Harvard Data Science Review 2020; 2. [PMID: 35005710 DOI: 10.1162/99608f92.9f0215e6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022]
Abstract
Epidemiology, biostatistics, and data science are broad disciplines that incorporate a variety of substantive areas. Common among them is a focus on quantitative approaches for solving intricate problems. When the substantive area is health and health care, the overlap is further cemented. Researchers in these disciplines are fluent in statistics, data management and analysis, and health and medicine, to name but a few competencies. Yet there are important and perhaps mutually exclusive attributes of these fields that warrant a tighter integration. For example, epidemiologists receive substantial training in the science of study design, measurement, and the art of causal inference. Biostatisticians are well versed in the theory and application of methodological techniques, as well as the design and conduct of public health research. Data scientists receive equivalently rigorous training in computational and visualization approaches for high-dimensional data. Compared to data scientists, epidemiologists and biostatisticians may have less expertise in computer science and informatics, while data scientists may benefit from a working knowledge of study design and causal inference. Collaboration and cross-training offer the opportunity to share and learn the constructs, frameworks, theories, and methods of these fields, with the goal of offering fresh and innovative perspectives for tackling challenging problems in health and health care. In this article, we first describe the evolution of these fields, focusing on their convergence in the era of electronic health data, notably electronic medical records (EMRs). Next, we present how a collaborative team may design, analyze, and implement an EMR-based study. Finally, we review the curricula at leading epidemiology, biostatistics, and data science training programs, identifying gaps and offering suggestions for the fields moving forward.
Affiliation(s)
- Neal D Goldstein
- Neal D. Goldstein is an assistant research professor, Michael T. LeVasseur is a visiting assistant teaching professor, and Leslie A. McClure is a professor and chair of the Department of Epidemiology and Biostatistics at the Drexel University Dornsife School of Public Health, Philadelphia, PA, USA
- Michael T LeVasseur
- Neal D. Goldstein is an assistant research professor, Michael T. LeVasseur is a visiting assistant teaching professor, and Leslie A. McClure is a professor and chair of the Department of Epidemiology and Biostatistics at the Drexel University Dornsife School of Public Health, Philadelphia, PA, USA
- Leslie A McClure
- Neal D. Goldstein is an assistant research professor, Michael T. LeVasseur is a visiting assistant teaching professor, and Leslie A. McClure is a professor and chair of the Department of Epidemiology and Biostatistics at the Drexel University Dornsife School of Public Health, Philadelphia, PA, USA