1. Reinisch W, Pradhan V, Ahmad S, Zhang Z, Gale JD. Alternative Endoscopy Reading Paradigms Determine Score Reliability and Effect Size in Ulcerative Colitis. J Crohns Colitis 2024; 18:82-90. [PMID: 37616127 PMCID: PMC10821708 DOI: 10.1093/ecco-jcc/jjad134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Indexed: 08/25/2023]
Abstract
OBJECTIVE Central reading of endoscopy is advocated by regulatory agencies for clinical trials in ulcerative colitis [UC]. It is uncertain whether the local/site reader should be included in the reading paradigm. We explore whether using locally- and centrally-determined endoscopic Mayo subscores [eMS] provides a reliable final assessment and whether the paradigm used has an impact on effect size. METHODS eMS data from the TURANDOT [NCT01620255] study were used to retrospectively examine seven different reading paradigms (using the scores of local readers [LR], first central readers [CR1], second central readers [CR2], and various consensus reads [ConCR]) by assessing inter-rater reliabilities and their impact on the key study endpoint, endoscopic improvement. RESULTS More than 40% of eMS scores between two trained central readers were discordant. Central readers had wide variability in scorings at baseline (intraclass correlation coefficient [ICC] of 0.475 [0.339, 0.610] for CR1 vs CR2). Centrally-read scores had variable concordance with LR (LR vs CR1 ICC 0.682 [0.575, 0.788], and LR vs CR2 ICC 0.526 [0.399, 0.653]). Reading paradigms with LR and CR that included a consensus read raised ICC estimates to >0.8. At Week 12, without the consensus reads, the CR1 vs CR2 ICC estimates were 0.775 [0.710, 0.841], and with consensus reads the ICC estimates were >0.9. Consensus-based approaches were most favourable to detect a treatment difference. CONCLUSION The ICC between the eMS of two trained and experienced central readers is unexpectedly low, which reinforces that currently used central reading processes are still associated with several weaknesses.
Affiliation(s)
- Walter Reinisch
  - Department of Internal Medicine III, Division Gastroenterology & Hepatology, Medical University of Vienna, Vienna, Austria
- Vivek Pradhan
  - Statistics, Global Biometry and Data Management, Pfizer Inc., 1 Portland St, Cambridge, MA 02139, USA
- Saira Ahmad
  - Statistics and Programming, Quanticate, Hitchin, UK
- Zhen Zhang
  - Statistics, Global Biometry and Data Management, Pfizer Inc., 1 Portland St, Cambridge, MA 02139, USA
- Jeremy D Gale
  - Inflammation and Immunology Research Unit, Pfizer Inc., 1 Portland St, Cambridge, MA 02139, USA
2. Khanna R, Ma C, Jairath V, Vande Casteele N, Zou G, Feagan BG. Endoscopic Assessment of Inflammatory Bowel Disease Activity in Clinical Trials. Clin Gastroenterol Hepatol 2022; 20:727-736.e2. [PMID: 33338657 DOI: 10.1016/j.cgh.2020.12.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/11/2020] [Revised: 12/04/2020] [Accepted: 12/11/2020] [Indexed: 02/07/2023]
Abstract
In patients with Crohn's disease and ulcerative colitis, poor correlation between symptoms and active luminal inflammation has been well established. As a result, the field has moved towards the use of endoscopic assessment to evaluate inflammatory activity. Numerous endoscopic indices have been used for this purpose, although none are completely validated. The Simple Endoscopic Score for Crohn's Disease and the Crohn's Disease Endoscopic Index of Severity have been used most frequently; however, in addition to incomplete validation, they have important limitations for clinical use, including complexity of scoring and poor reliability of items such as stenosis. The Rutgeerts' score for postoperative Crohn's disease was developed primarily as a prognostic rather than evaluative tool and also requires additional validation. In ulcerative colitis, the Mayo endoscopic subscore has been used as the regulatory standard, although the Ulcerative Colitis Endoscopic Index of Severity may provide a more granular assessment of individual components of disease activity. The use of combined outcomes with patient reported outcomes (PROs) and endoscopic indices has found favor with regulatory bodies but requires further validation. This review describes the indications for endoscopic assessment in trials, the indices most frequently utilized for these purposes, and potential future approaches to assessment of disease activity.
Affiliation(s)
- Reena Khanna
  - Department of Medicine, University of Western Ontario, London, Ontario, Canada
- Christopher Ma
  - Department of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
  - Robarts Clinical Trials, London, Ontario, Canada
- Vipul Jairath
  - Department of Medicine, University of Western Ontario, London, Ontario, Canada
  - Robarts Clinical Trials, London, Ontario, Canada
  - Department of Epidemiology and Biostatistics, University of Western Ontario, London, Ontario, Canada
- Niels Vande Casteele
  - Robarts Clinical Trials, London, Ontario, Canada
  - Department of Medicine, University of California San Diego, La Jolla, California
- Guangyong Zou
  - Robarts Clinical Trials, London, Ontario, Canada
  - Department of Epidemiology and Biostatistics, University of Western Ontario, London, Ontario, Canada
- Brian G Feagan
  - Department of Medicine, University of Western Ontario, London, Ontario, Canada
  - Robarts Clinical Trials, London, Ontario, Canada
  - Department of Epidemiology and Biostatistics, University of Western Ontario, London, Ontario, Canada
3. Gottlieb K, Requa J, McGill J. Reply. Gastroenterology 2021; 161:1074. [PMID: 33901494 DOI: 10.1053/j.gastro.2021.04.043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2021] [Accepted: 04/22/2021] [Indexed: 12/02/2022]
Affiliation(s)
- James Requa
  - Eli Lilly and Company, Indianapolis, Indiana
- Jim McGill
  - Eli Lilly and Company, Indianapolis, Indiana
4. Bossuyt P, Bisschops R, Vermeire S, De Hertogh G. Variability in the Distribution of Histological Disease Activity in the Colon of Patients with Ulcerative Colitis. J Crohns Colitis 2021; 15:603-608. [PMID: 33053161 DOI: 10.1093/ecco-jcc/jjaa206] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/10/2023]
Abstract
BACKGROUND AND AIMS Histological activity scores have been developed and validated. However, data on the distribution of histological inflammation within one segment in patients with ulcerative colitis [UC] are lacking. This impacts on the reliability of histological activity scores. The aim of this study was to assess the variability in histological activity within one endoscopic segment in patients with UC. METHODS Biopsies were taken in sequential patients with UC in three adjacent contiguous regions within a macroscopically homogeneous colonic segment. Biopsies were scored for Geboes score [GS], Robarts histological index [RHI] and Nancy histological index [NHI]. Variability was assessed by Kappa statistics for categorical outcomes and intraclass correlation coefficient [ICC] for continuous outcomes. RESULTS A total of 161 biopsy sets from 55 endoscopic segments of 21 patients were analysed. Endoscopically active disease was present in 45% of segments. The continuous histological scores showed excellent agreement between the different regions. The ICC for RHI in all segments was 0.974 (95% confidence interval [CI] 0.958-0.984; p < 0.0001) and 0.98 [95% CI: 0.968-0.988; p < 0.0001] for the numerically converted GS. The categorical NHI showed higher variability: κ = 0.574 [95% CI: 0.571-0.577; p < 0.0001]. In all segments the highest variability was seen in samples with NHI = 2. When dichotomizing based on histological remission, substantial agreement was seen for all scores, with κ > 0.734 for all cut-offs. The homogeneity in the distribution of histological disease activity was comparable between colonic segments. CONCLUSION The distribution of histological disease activity in UC follows a homogeneous pattern in different locations of one segment.
Affiliation(s)
- Peter Bossuyt
  - Department of Gastroenterology and Hepatology, University Hospitals Leuven, Leuven, Belgium
  - Department of Gastroenterology, Imelda General Hospital, Bonheiden, Belgium
- Raf Bisschops
  - Department of Gastroenterology and Hepatology, University Hospitals Leuven, Leuven, Belgium
- Séverine Vermeire
  - Department of Gastroenterology and Hepatology, University Hospitals Leuven, Leuven, Belgium
- Gert De Hertogh
  - Department of Pathology, University Hospitals Leuven, Leuven, Belgium
5. Barua I, Vinsard DG, Jodal HC, Løberg M, Kalager M, Holme Ø, Misawa M, Bretthauer M, Mori Y. Artificial intelligence for polyp detection during colonoscopy: a systematic review and meta-analysis. Endoscopy 2021; 53:277-284. [PMID: 32557490 DOI: 10.1055/a-1201-7165] [Citation(s) in RCA: 120] [Impact Index Per Article: 40.0] [Indexed: 12/12/2022]
Abstract
BACKGROUND Artificial intelligence (AI)-based polyp detection systems are used during colonoscopy with the aim of increasing lesion detection and improving colonoscopy quality. PATIENTS AND METHODS We performed a systematic review and meta-analysis of prospective trials to determine the value of AI-based polyp detection systems for detection of polyps and colorectal cancer. We performed systematic searches in MEDLINE, EMBASE, and Cochrane CENTRAL. Independent reviewers screened studies and assessed eligibility, certainty of evidence, and risk of bias. We compared colonoscopy with and without AI by calculating relative and absolute risks and mean differences for detection of polyps, adenomas, and colorectal cancer. RESULTS Five randomized trials were eligible for analysis. Colonoscopy with AI increased adenoma detection rates (ADRs) and polyp detection rates (PDRs) compared to colonoscopy without AI (values given with 95 % CI). ADR with AI was 29.6 % (22.2 % - 37.0 %) versus 19.3 % (12.7 % - 25.9 %) without AI; relative risk (RR) 1.52 (1.31 - 1.77), with high certainty. PDR was 45.4 % (41.1 % - 49.8 %) with AI versus 30.6 % (26.5 % - 34.6 %) without AI; RR 1.48 (1.37 - 1.60), with high certainty. There was no difference in detection of advanced adenomas (mean advanced adenomas per colonoscopy 0.03 for each group, high certainty). Mean adenomas detected per colonoscopy was higher for small adenomas (≤ 5 mm) for AI versus non-AI (mean difference 0.15 [0.12 - 0.18]), but not for larger adenomas (> 5 - ≤ 10 mm, mean difference 0.03 [0.01 - 0.05]; > 10 mm, mean difference 0.01 [0.00 - 0.02]; high certainty). Data on cancer are unavailable. CONCLUSIONS AI-based polyp detection systems during colonoscopy increase detection of small nonadvanced adenomas and polyps, but not of advanced adenomas.
Affiliation(s)
- Ishita Barua
  - Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, and Department of Transplantation Medicine Oslo University Hospital, Oslo, Norway
- Daniela Guerrero Vinsard
  - Department of Internal Medicine, University of Connecticut Health Centre, Connecticut, USA
  - Department of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
- Henriette C Jodal
  - Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, and Department of Transplantation Medicine Oslo University Hospital, Oslo, Norway
- Magnus Løberg
  - Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, and Department of Transplantation Medicine Oslo University Hospital, Oslo, Norway
- Mette Kalager
  - Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, and Department of Transplantation Medicine Oslo University Hospital, Oslo, Norway
- Øyvind Holme
  - Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, and Department of Transplantation Medicine Oslo University Hospital, Oslo, Norway
- Masashi Misawa
  - Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
- Michael Bretthauer
  - Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, and Department of Transplantation Medicine Oslo University Hospital, Oslo, Norway
- Yuichi Mori
  - Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, and Department of Transplantation Medicine Oslo University Hospital, Oslo, Norway
  - Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
6. Gottlieb K, Daperno M, Usiskin K, Sands BE, Ahmad H, Howden CW, Karnes W, Oh YS, Modesto I, Marano C, Stidham RW, Reinisch W. Endoscopy and central reading in inflammatory bowel disease clinical trials: achievements, challenges and future developments. Gut 2021; 70:418-426. [PMID: 32699100 PMCID: PMC7815632 DOI: 10.1136/gutjnl-2020-320690] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/15/2020] [Revised: 06/04/2020] [Accepted: 06/13/2020] [Indexed: 12/19/2022]
Abstract
Central reading, that is, independent, off-site, blinded review or reading of imaging endpoints, has been identified as a crucial component in the conduct and analysis of inflammatory bowel disease clinical trials. Central reading is the final step in a workflow that has many parts, all of which can be improved. Furthermore, the best reading algorithm and the most intensive central reader training cannot make up for deficiencies in the acquisition stage (clinical trial endoscopy) or improve on the limitations of the underlying score (outcome instrument). In this review, academic and industry experts review scoring systems, and propose a theoretical framework for central reading that predicts when improvements in statistical power, affecting trial size and chances of success, can be expected: Multireader models can be conceptualised as statistical or non-statistical (social). Important organisational and operational factors, such as training and retraining of readers, optimal bowel preparation for colonoscopy, video quality, optimal or at least acceptable read duration times and other quality control matters, are addressed as well. The theory and practice of central reading and the conduct of endoscopy in clinical trials are interdisciplinary topics that should be of interest to many, regulators, clinical trial experts, gastroenterology societies and those in the academic community who endeavour to develop new scoring systems using traditional and machine learning approaches.
Affiliation(s)
- Klaus Gottlieb
  - Immunology, Eli Lilly and Company, Indianapolis, Indiana, USA
- Bruce E Sands
  - Dr Henry D Janowitz Division of Gastroenterology, Mount Sinai School of Medicine, New York, New York, USA
- Harris Ahmad
  - Immunoscience, Bristol-Myers Squibb Co, New York, New York, USA
- Colin W Howden
  - Gastroenterology, Univ Tennessee, Memphis, Tennessee, USA
- Young S Oh
  - Immunology, Genentech Inc, South San Francisco, California, USA
- Irene Modesto
  - Inflammation & Immunology, Pfizer Inc, New York, New York, USA
- Colleen Marano
  - Janssen Research & Development, Spring House, Pennsylvania, USA
- Walter Reinisch
  - Department of Medicine IV, Medical University Vienna, Vienna, Austria
7. Gottlieb K, Requa J, Karnes W, Chandra Gudivada R, Shen J, Rael E, Arora V, Dao T, Ninh A, McGill J. Central Reading of Ulcerative Colitis Clinical Trial Videos Using Neural Networks. Gastroenterology 2021; 160:710-719.e2. [PMID: 33098883 DOI: 10.1053/j.gastro.2020.10.024] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 06/24/2020] [Revised: 09/15/2020] [Accepted: 10/08/2020] [Indexed: 02/07/2023]
Abstract
BACKGROUND AND AIMS Endoscopic disease activity scoring in ulcerative colitis (UC) is useful in clinical practice but done infrequently. It is required in clinical trials, where it is expensive and slow because human central readers are needed. A machine learning algorithm automating the process could elevate clinical care and facilitate clinical research. Prior work using single-institution databases and endoscopic still images has been promising. METHODS Seven hundred and ninety-five full-length endoscopy videos were prospectively collected from a phase 2 trial of mirikizumab with 249 patients from 14 countries, totaling 19.5 million image frames. Expert central readers assigned each full-length endoscopy video 1 endoscopic Mayo score (eMS) and 1 Ulcerative Colitis Endoscopic Index of Severity (UCEIS) score. Initially, video data were cleaned and abnormality features extracted using convolutional neural networks. Subsequently, a recurrent neural network was trained on the features to predict eMS and UCEIS from individual full-length endoscopy videos. RESULTS The primary metric to assess the performance of the recurrent neural network model was quadratic weighted kappa (QWK) comparing the agreement of the machine-read endoscopy score with the human central reader score. QWK progressively penalizes disagreements that exceed 1 level. The model's agreement metric was excellent, with a QWK of 0.844 (95% confidence interval, 0.787-0.901) for eMS and 0.855 (95% confidence interval, 0.80-0.91) for UCEIS. CONCLUSIONS We found that a deep learning algorithm can be trained to predict levels of UC severity from full-length endoscopy videos. Our data set was prospectively collected in a multinational clinical trial, videos rather than still images were used, UCEIS and eMS were reported, and machine learning algorithm performance metrics met or exceeded those previously published for UC severity scores.
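The way quadratic weighted kappa penalizes larger disagreements more heavily, as described in this abstract, can be illustrated with a short sketch. This is not the authors' code: the function name `quadratic_weighted_kappa`, the coding of eMS as integers 0 to 3, and the toy scores are all assumptions made for illustration.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, n_levels):
    """Quadratic weighted kappa for two equal-length lists of integer
    scores coded 0..n_levels-1 (hypothetical helper, illustration only)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    marg_a = Counter(rater_a)                  # marginal counts, rater A
    marg_b = Counter(rater_b)                  # marginal counts, rater B
    max_dist = (n_levels - 1) ** 2
    weighted_observed = 0.0  # weighted disagreement actually seen
    weighted_chance = 0.0    # weighted disagreement expected by chance
    for i in range(n_levels):
        for j in range(n_levels):
            # Penalty grows with the square of the distance between levels.
            weight = (i - j) ** 2 / max_dist
            weighted_observed += weight * observed[(i, j)] / n
            weighted_chance += weight * marg_a[i] * marg_b[j] / (n * n)
    if weighted_chance == 0:
        raise ValueError("kappa undefined: both raters used one identical level")
    return 1.0 - weighted_observed / weighted_chance


if __name__ == "__main__":
    human = [0, 1, 2, 3, 0, 3]          # toy central-reader scores (assumed)
    off_by_one = [0, 1, 2, 3, 1, 3]     # one disagreement of 1 level
    off_by_three = [0, 1, 2, 3, 3, 3]   # one disagreement of 3 levels
    print(quadratic_weighted_kappa(human, off_by_one, 4))
    print(quadratic_weighted_kappa(human, off_by_three, 4))
```

In this toy data, a single one-level disagreement leaves kappa at about 0.94, while a single three-level disagreement lowers it to about 0.53, even though both score lists differ from the reference in exactly one video.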
Affiliation(s)
- Jie Shen
  - Eli Lilly and Company, Indianapolis, Indiana
- Vipin Arora
  - Eli Lilly and Company, Indianapolis, Indiana
8. Reinisch W, Mishkin DS, Oh YS, Schreiber S, Hussain F, Jacob R, Hassanali A, Daperno M. Impact of various central endoscopy reading models on treatment outcome in Crohn's disease using data from the randomized, controlled, exploratory cohort arm of the BERGAMOT trial. Gastrointest Endosc 2021; 93:174-182.e2. [PMID: 32464142 DOI: 10.1016/j.gie.2020.05.020] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/24/2020] [Accepted: 05/01/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND AND AIMS Endoscopic assessment of mucosal appearance by independent central reading has become the standard method to assess Crohn's disease activity in clinical trials. The performance characteristics of various endoscopy reading models have yet to be systematically evaluated. METHODS This substudy included patients with Crohn's disease in the exploratory induction cohort of the BERGAMOT trial (NCT02394028) randomly assigned to etrolizumab or placebo. Endoscopies conducted at baseline and week 14 were independently scored using the Simple Endoscopic Score for Crohn's Disease (SES-CD) by a local reader (LR) and 2 central readers (CRs). Five endoscopy reading models were compared: single LR, single CR, average of 2 CRs, and 2 models incorporating the LR and 1 or 2 CRs depending on alignment between the LR and the CR, defined according to a sliding scale applied to a range of scores. RESULTS Five hundred thirty-five videos were scored. Models involving 2 readers demonstrated lower placebo rates (3.4%) than the single LR (11.9%) and the single CR (6.8%) models. Treatment effect size based on endoscopic improvement (≥50% reduction in SES-CD from baseline) was highest with the 2 models incorporating the LR and 1 or 2 CRs (Δ = 16.2%). Further, in the etrolizumab arm, models with 2 readers demonstrated the lowest variability for the SES-CD. CONCLUSIONS Central endoscopy reading models in Crohn's disease have an impact on placebo response rates and effect size. Incorporating the LR appears to be important because models using both CRs and LRs resulted in the greatest treatment effect size for endoscopic improvement with etrolizumab, lower placebo rates, and reduced variability.
Affiliation(s)
- Young S Oh
  - Genentech, Inc, South San Francisco, California, USA
- Fez Hussain
  - Gastroenterology Center of Excellence, IQVIA, Reading
- Rhian Jacob
  - Roche Products Limited, Welwyn Garden City, United Kingdom
9. Abreu MT, Sandborn WJ. Defining Endpoints and Biomarkers in Inflammatory Bowel Disease: Moving the Needle Through Clinical Trial Design. Gastroenterology 2020; 159:2013-2018.e7. [PMID: 32961246 DOI: 10.1053/j.gastro.2020.07.064] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/10/2019] [Revised: 07/18/2020] [Accepted: 07/21/2020] [Indexed: 12/27/2022]
Affiliation(s)
- Maria T Abreu
  - Division of Gastroenterology, University of Miami Miller School of Medicine, Miami, Florida
- William J Sandborn
  - Division of Gastroenterology, University of California San Diego, La Jolla, California
10. Reeve R, Gottlieb K. Sequentially Determined Measures of Interobserver Agreement (Kappa) in Clinical Trials May Vary Independent of Changes in Observer Performance. Ther Innov Regul Sci 2019:2168479019874059. [PMID: 31569962 DOI: 10.1177/2168479019874059] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/28/2024]
Abstract
BACKGROUND Cohen's kappa is a statistic that estimates interobserver agreement. It was originally introduced to help develop diagnostic tests. Interpretative readings of 2 observers, for example, of a mammogram or other imaging, were compared at a single point in time. It is known that kappa depends on the prevalence of disease and that, therefore, kappas across different settings are hard to compare. METHODS Using simulation, we examine an analogous situation, not previously described, that occurs in clinical trials where sequential measurements are obtained to evaluate disease progression or clinical improvement over time. RESULTS We show that weighted kappa, used for multilevel outcomes, changes during the trial even if we keep the performance of the observer constant. CONCLUSIONS Kappa and closely related measures can therefore only be used with great difficulty, if at all, in quality assurance in clinical trials.
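The prevalence dependence of kappa mentioned in the background of this abstract can be shown with a small analytic sketch. It is not the simulation the authors ran: it assumes a binary outcome and two raters who err independently of each other with the same fixed sensitivity and specificity, and the function name `expected_cohen_kappa` is hypothetical.

```python
def expected_cohen_kappa(prevalence, sensitivity, specificity):
    """Expected Cohen's kappa for two conditionally independent raters that
    share one sensitivity and specificity (illustrative assumption)."""
    p, se, sp = prevalence, sensitivity, specificity
    # Probability that both raters call a case positive, or both negative.
    both_pos = p * se * se + (1 - p) * (1 - sp) * (1 - sp)
    both_neg = p * (1 - se) * (1 - se) + (1 - p) * sp * sp
    observed = both_pos + both_neg
    # Each rater's marginal probability of calling a case positive.
    q = p * se + (1 - p) * (1 - sp)
    chance = q * q + (1 - q) * (1 - q)
    if chance == 1.0:
        raise ValueError("kappa undefined: raters always give the same call")
    return (observed - chance) / (1 - chance)


if __name__ == "__main__":
    # Rater performance is held constant; only disease prevalence changes.
    for prev in (0.5, 0.3, 0.1):
        print(prev, round(expected_cohen_kappa(prev, 0.9, 0.9), 3))
```

With sensitivity and specificity fixed at 0.9, the expected kappa falls from 0.64 at 50% prevalence to about 0.39 at 10% prevalence. In a trial where patients improve over time, the mix of scores shifts in a comparable way, which is the setting the abstract examines for weighted kappa.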
11. Tucker JD, Day S, Tang W, Bayus B. Crowdsourcing in medical research: concepts and applications. PeerJ 2019; 7:e6762. [PMID: 30997295 PMCID: PMC6463854 DOI: 10.7717/peerj.6762] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Received: 11/05/2018] [Accepted: 03/11/2019] [Indexed: 12/23/2022]
Abstract
Crowdsourcing shifts medical research from a closed environment to an open collaboration between the public and researchers. We define crowdsourcing as an approach to problem solving which involves an organization having a large group attempt to solve a problem or part of a problem, then sharing solutions. Crowdsourcing allows large groups of individuals to participate in medical research through innovation challenges, hackathons, and related activities. The purpose of this literature review is to examine the definition, concepts, and applications of crowdsourcing in medicine. This multi-disciplinary review defines crowdsourcing for medicine, identifies conceptual antecedents (collective intelligence and open source models), and explores implications of the approach. Several critiques of crowdsourcing are also examined. Although several crowdsourcing definitions exist, there are two essential elements: (1) having a large group of individuals, including those with skills and those without skills, propose potential solutions; (2) sharing solutions through implementation or open access materials. The public can be a central force in contributing to formative, pre-clinical, and clinical research. A growing evidence base suggests that crowdsourcing in medicine can result in high-quality outcomes, broad community engagement, and more open science.
Affiliation(s)
- Joseph D. Tucker
  - Institute for Global Health and Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene & Tropical Medicine, University of London, London, UK
  - Social Entrepreneurship to Spur Health (SESH) Global, Guangzhou, China
- Suzanne Day
  - Institute for Global Health and Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of Social Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Weiming Tang
  - Institute for Global Health and Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Department of STD Control, Dermatology Hospital of Southern Medical University, Guangzhou, China
- Barry Bayus
  - Kenan-Flagler School of Business, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
12. de Lange T, Halvorsen P, Riegler M. Methodology to develop machine learning algorithms to improve performance in gastrointestinal endoscopy. World J Gastroenterol 2018; 24:5057-5062. [PMID: 30568383 PMCID: PMC6288655 DOI: 10.3748/wjg.v24.i45.5057] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/07/2018] [Revised: 10/25/2018] [Accepted: 11/02/2018] [Indexed: 02/06/2023]
Abstract
Assisted diagnosis using artificial intelligence has been a holy grail in medical research for many years, and recent developments in computer hardware have enabled the narrower area of machine learning to equip clinicians with potentially useful tools for computer assisted diagnosis (CAD) systems. However, training and assessing a computer's ability to diagnose like a human are complex tasks, and successful outcomes depend on various factors. We have focused our work on gastrointestinal (GI) endoscopy because it is a cornerstone for diagnosis and treatment of diseases of the GI tract. About 2.8 million luminal GI (esophageal, stomach, colorectal) cancers are detected globally every year, and although substantial technical improvements in endoscopes have been made over the last 10-15 years, a major limitation of endoscopic examinations remains operator variation. This translates into a substantial inter-observer variation in the detection and assessment of mucosal lesions, causing among other things an average polyp miss-rate of 20% in the colon and thus the subsequent development of a number of post-colonoscopy colorectal cancers. CAD systems might eliminate this variation and lead to more accurate diagnoses. In this editorial, we point out some of the current challenges in the development of efficient computer-based digital assistants. We give examples of proposed tools using various techniques, identify current challenges, and give suggestions for the development and assessment of future CAD systems.
Affiliation(s)
- Thomas de Lange
  - Department of Transplantation, Oslo University Hospital, Oslo 0424, Norway
  - Institute of Clinical Medicine, University of Oslo, Oslo 0316, Norway
- Pål Halvorsen
  - Center for Digital Engineering Simula Metropolitan, Fornebu 1364, Norway
  - Department for Informatics, University of Oslo, Oslo 0316, Norway
- Michael Riegler
  - Center for Digital Engineering Simula Metropolitan, Fornebu 1364, Norway
  - Department for Informatics, University of Oslo, Oslo 0316, Norway
13. Daperno M, Comberlato M, Bossa F, Armuzzi A, Biancone L, Bonanomi AG, Cassinotti A, Cosintino R, Lombardi G, Mangiarotti R, Papa A, Pica R, Grassano L, Pagana G, D'Incà R, Orlando A, Rizzello F. Training Programs on Endoscopic Scoring Systems for Inflammatory Bowel Disease Lead to a Significant Increase in Interobserver Agreement Among Community Gastroenterologists. J Crohns Colitis 2017; 11:556-561. [PMID: 28453758 DOI: 10.1093/ecco-jcc/jjw181] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/16/2016] [Accepted: 10/02/2016] [Indexed: 12/21/2022]
Abstract
BACKGROUND AND AIMS Endoscopic outcomes are increasingly used in clinical trials and in routine practice for inflammatory bowel disease [IBD] in order to reach more objective patient evaluations than possible using only clinical features. However, reproducibility of endoscopic scoring systems used to categorize endoscopic activity has been reported to be suboptimal. The aim of this study was to analyse the inter-rater agreement of non-dedicated gastroenterologists on IBD endoscopic scoring systems, and to explore the effects of a dedicated training programme on agreement. METHODS A total of 237 physicians attended training courses on IBD endoscopic scoring systems, and they independently scored a set of IBD endoscopic videos for ulcerative colitis [with the Mayo endoscopic subscore], post-operative Crohn's disease [with the Rutgeerts score] and luminal Crohn's disease [with the Simple Endoscopic Score for Crohn's Disease (SESCD) and the Crohn's Disease Endoscopic Index of Severity (CDEIS)]. A second round of scoring was collected after discussion about determinants of discrepancy. Interobserver agreement was measured by means of the Fleiss' kappa [kappa] or intraclass correlation coefficient [ICC] as appropriate. RESULTS The inter-rater agreement increased from kappa 0.51 [95% confidence interval (95% CI) 0.48-0.55] to 0.76 [95% CI 0.72-0.79] for the Mayo endoscopic subscore, and from 0.45 [95% CI 0.40-0.50] to 0.79 [0.74-0.83] for the Rutgeerts score before and after the training programme, respectively, and both differences were significant [P < 0.0001]. The ICC was 0.77 [95% CI 0.56-0.96] for SESCD and 0.76 [0.54-0.96] for CDEIS, respectively, with only one measurement. DISCUSSION The basal inter-rater agreement of inexperienced gastroenterologists focused on IBD management is moderate; however, a dedicated training programme can significantly impact on inter-rater agreement, increasing it to levels expected among expert central reviewers.
Affiliation(s)
- Marco Daperno
- Gastroenterology Unit, AO Ordine Mauriziano, Torino, TO, Italy
- Fabrizio Bossa
- Gastroenterology Unit, IRCCS 'Casa Sollievo della Sofferenza', San Giovanni Rotondo, Italy
- Livia Biancone
- Gastroenterology Unit, Tor Vergata University, Roma, Italy
- Rocco Cosintino
- Gastroenterology Unit, S. Camillo-Forlanini Hospital, Roma, Italy
- Alfredo Papa
- Gastroenterology Unit, Complesso integrato Columbus, Roma, Italy
- Roberta Pica
- Gastroenterology Unit, ASL Roma B, Ospedale Pertini, Roma, Italy
- Guido Pagana
- Politecnico di Torino, Torino, Italy; Istituto Mario Boella, Torino, Italy
- Renata D'Incà
- Department of Surgery, Oncology and Gastroenterology, Azienda Ospedaliera di Padova, Padova, Italy
- Ambrogio Orlando
- Internal Medicine Unit, AO Ospedali Riuniti Villa Sofia - Cervello, Palermo, Italy
- Fernando Rizzello
- Internal Medicine Unit, Policlinic S. Orsola Malpighi and Bologna University, Bologna, Italy
|
14
|
Toward reducing bias in clinical trials: central readers for endoscopic endpoints. Gastrointest Endosc 2016; 83:198-200. [PMID: 26706305 DOI: 10.1016/j.gie.2015.06.053] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/24/2015] [Accepted: 06/25/2015] [Indexed: 12/11/2022]
|
15
|
Abstract
BACKGROUND Central reading of endoscopy (CROE) is crucial in determining who qualifies for a trial but also has a role, independent of the selected scoring system, in decreasing measurement noise that can obscure separation between placebo and active drug. Benefits of CROE may not be independent of the method chosen, and controversy exists about the ideal approach. METHODS Literature review and concept development. RESULTS Components to be considered in the reading algorithm are blinding, number of central readers, independent voting versus consensus panel, video recordings versus still images, and involvement of the site reader. Key concepts considered are endpoints, bias, power, and sample size derived from the Food and Drug Administration and European Medicines Agency guidelines, as well as the technological requirements and recruitment, qualification, and revalidation of central readers as applied to CROE. CONCLUSIONS Recording and CROE should be standardized, and an imaging charter developed with research on the different components and its overall impact.
|
16
|
Ahmad HA, Gottlieb K, Hussain F. The 2 + 1 paradigm: an efficient algorithm for central reading of Mayo endoscopic subscores in global multicenter phase 3 ulcerative colitis clinical trials. Gastroenterol Rep (Oxf) 2015; 4:35-8. [PMID: 26361984 PMCID: PMC4760065 DOI: 10.1093/gastro/gov024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/28/2015] [Accepted: 05/19/2015] [Indexed: 01/09/2023]
Abstract
Despite its importance and potential impact in clinical trials, central reading continues to be an under-represented topic in the literature about inflammatory bowel disease (IBD) clinical trials. Although several IBD studies have incorporated central reading to date, none have fully detailed the specific methodology with which the reads were conducted. Here we outline key principles for designing an efficient central reading paradigm for an ulcerative colitis (UC) study that addresses regulatory, operational and clinical expectations. As a step towards standardization of read methodology for the growing number of multicenter phase 3 clinical trials in IBD, we have applied these principles to the design of an optimal read methodology that we call the ‘2 + 1 paradigm.’ The 2 + 1 paradigm involves the use of both site and central readers, validated scoring criteria and multiple measures for blinding readers, all of which contribute to reducing bias and generating a reliable endoscopic subscore that reflects endoscopic disease severity. The paradigm can be utilized while maintaining a practical workflow compatible with an operationally feasible clinical trial. The 2 + 1 paradigm represents a logical approach to endoscopic assessment in IBD clinical trials, one that should be considered attractive to prospective sponsors, contract research organizations, key opinion leaders and regulatory authorities and be ready for implementation and further evaluation.
Affiliation(s)
- Harris A Ahmad
- Rheumatology and Gastroenterology, BioClinica Inc., Princeton, NJ, USA
- Klaus Gottlieb
- Immunology and Internal Medicine, Quintiles Inc., Research Triangle Park, NC, USA
- Fez Hussain
- Immunology and Internal Medicine, Quintiles Inc., Research Triangle Park, NC, USA
|