1
El Homsi M, Fuqua L, Kim TH, Fernandes MC, Shia J, Widmar M, White C, Capanu M, Rodriguez L, Petkovska I. Accuracy of Post-Neoadjuvant Therapy MRI for the Assessment of Anal Sphincter Involvement in Patients with Rectal Cancer. Radiol Imaging Cancer 2025; 7:e240208. [PMID: 40340564 DOI: 10.1148/rycan.240208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2025]
Abstract
Purpose To assess the accuracy of post-neoadjuvant therapy (NAT) MRI, as compared with that of pathologic evaluation, to determine anal sphincter involvement in patients with rectal cancer. Materials and Methods This retrospective study included patients diagnosed with rectal cancer between January 2015 and December 2017 whose baseline MRI showed anal sphincter involvement and who then underwent NAT, post-NAT MRI, and abdominoperineal resection. Four radiologists (with 20 years, 5 years, 2 years, and 1 year of experience) independently reviewed MRI findings. Resected specimens were reviewed by a gastrointestinal pathologist. Interreader agreement between the radiologists and pathologist was assessed using the Cohen κ statistic. Conditional sensitivity, specificity, and positive predictive value (PPV) of the radiologists were calculated among patients for whom the radiologists and the pathologist agreed that the anal canal was involved. Results Thirty-two patients were included (mean age ± SD, 60 years ± 15; 19 male, 13 female). For the post-NAT assessment of anal sphincter involvement, agreement between readers 1, 2, and 4 and the pathologist was moderate (κ = 0.55 [95% CI: 0.18, 0.91], 0.45 [95% CI: -0.06, 0.82], and 0.53 [95% CI: 0, 0.89], respectively). There was fair agreement between reader 3 and the pathologist (κ = 0.30 [95% CI: -0.09, 0.67]). Radiologists had high sensitivity for the detection of anal sphincter involvement (88%-100%), high PPV (88%-96%), and moderate to high specificity (50%-80%); the senior radiologist had the highest sensitivity, PPV, and specificity. Conclusion Radiologists had fair to moderate interreader agreement with the pathologist for post-NAT assessment of anal sphincter involvement in patients with rectal cancer and showed high conditional sensitivity regardless of their level of experience. Keywords: Abdomen/GI, Rectum, Oncology, Post-Neoadjuvant Therapy MRI Supplemental material is available for this article. © RSNA, 2025.
Affiliation(s)
- Maria El Homsi
- Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065
- Louis Fuqua
- Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065
- Tae-Hyung Kim
- Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065
- Maria Clara Fernandes
- Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065
- Jinru Shia
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY
- Maria Widmar
- Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY
- Charlie White
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY
- Marinela Capanu
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY
- Lee Rodriguez
- Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065
- Iva Petkovska
- Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065
2
Warren EM, Handley JC, Sheets HD. Cross entropy and log likelihood ratio cost as performance measures for multi-conclusion categorical outcomes scales. J Forensic Sci 2025; 70:589-606. [PMID: 39655364 DOI: 10.1111/1556-4029.15686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Revised: 11/06/2024] [Accepted: 11/21/2024] [Indexed: 03/04/2025]
Abstract
The inconclusive category in forensics reporting is the appropriate response in many cases, but it poses challenges in estimating an "error rate". We discuss the use of a class of information-theoretic measures related to cross entropy as an alternative set of metrics that allows for performance evaluation of results presented using multi-category reporting scales. This paper shows how this class of performance metrics, and in particular the log likelihood ratio cost, which is already in use with likelihood ratio forensic reporting methods and in machine learning communities, can be readily adapted for use with the widely used multiple category conclusions scales. Bayesian credible intervals on these metrics can be estimated using numerical methods. The application of these metrics to published test results is shown. It is demonstrated, using these test results, that reducing the number of categories used in a proficiency test from five or six to three increases the cross entropy, indicating that the higher number of categories was justified, as they increased the level of agreement with ground truth.
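As a rough illustration of the log likelihood ratio cost named in this abstract, the sketch below computes the standard Cllr from likelihood ratios for same-source and different-source comparisons. The mapping from scale categories to likelihood ratios (`SCALE_TO_LR`) and the example conclusions are hypothetical values chosen only for illustration; the paper's own procedure for assigning values to categories may differ.

```python
import math


def cllr(lrs_same_source, lrs_different_source):
    """Log likelihood ratio cost in bits; lower is better, 1.0 equals an uninformative system."""
    # Penalty for same-source comparisons that received small likelihood ratios.
    penalty_same = sum(math.log2(1 + 1 / lr) for lr in lrs_same_source) / len(lrs_same_source)
    # Penalty for different-source comparisons that received large likelihood ratios.
    penalty_diff = sum(math.log2(1 + lr) for lr in lrs_different_source) / len(lrs_different_source)
    return 0.5 * (penalty_same + penalty_diff)


# Hypothetical likelihood ratios attached to a three-level conclusion scale (assumption).
SCALE_TO_LR = {"identification": 100.0, "inconclusive": 1.0, "exclusion": 0.01}

# Hypothetical examiner conclusions, grouped by ground truth.
same_source = ["identification", "identification", "inconclusive"]
different_source = ["exclusion", "inconclusive", "exclusion"]

value = cllr(
    [SCALE_TO_LR[c] for c in same_source],
    [SCALE_TO_LR[c] for c in different_source],
)
print(f"Cllr = {value:.3f} bits")
```

A system that always answers "inconclusive" with a likelihood ratio of 1 scores exactly 1 bit, so inconclusive results are penalized as uninformative rather than counted as errors.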
Affiliation(s)
- John C Handley
- Simon Business School, University of Rochester, Rochester, New York, USA
- H David Sheets
- Computer and Data Sciences, Merrimack College, North Andover, Massachusetts, USA
3
Petkovska I, Alus O, Rodriguez L, El Homsi M, Golia Pernicka JS, Fernandes MC, Zheng J, Capanu M, Otazo R. Clinical evaluation of accelerated diffusion-weighted imaging of rectal cancer using a denoising neural network. Eur J Radiol 2024; 181:111802. [PMID: 39467396 PMCID: PMC11614684 DOI: 10.1016/j.ejrad.2024.111802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Revised: 10/10/2024] [Accepted: 10/22/2024] [Indexed: 10/30/2024]
Abstract
BACKGROUND To evaluate the effectiveness of a deep learning denoising approach to accelerate diffusion-weighted imaging (DWI) and thus improve diagnostic accuracy and image quality in restaging rectal MRI following total neoadjuvant therapy (TNT). METHODS This retrospective single-center study included patients with locally advanced rectal cancer who underwent restaging rectal MRI between December 30, 2021, and June 1, 2022, following TNT. A convolutional neural network trained with DWI data was employed to denoise accelerated DWI acquisitions (i.e., acquisitions performed with a reduced number of repetitions compared to standard acquisitions). Image characteristics and residual disease were independently assessed by two radiologists across original and denoised images. Statistical analyses included the Wilcoxon signed-rank test to compare image quality scores across denoised and original images, weighted kappa statistics for inter-reader agreement assessment, and the calculation of measures of diagnostic accuracy. RESULTS In 46 patients (median age, 60 years [IQR: 47-72]; 37 men and 9 women), 8- and 16-fold accelerated images maintained or exhibited enhanced lesion visibility and image quality compared with original images acquired with 16 repetitions. Denoised images maintained diagnostic accuracy, with conditional specificities of up to 96%. Moderate-to-high inter-reader agreement indicated reliable image and diagnostic assessment. The overall test yield for denoised DWI reconstructions ranged from 76% to 98%, demonstrating a reduction in equivocal interpretations. CONCLUSION Applying a denoising network to accelerate rectal DWI acquisitions can reduce scan times and enhance image quality while maintaining diagnostic accuracy, presenting a potential pathway for more efficient rectal cancer management.
Affiliation(s)
- Iva Petkovska
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
- Or Alus
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Lee Rodriguez
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Maria El Homsi
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Junting Zheng
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Marinela Capanu
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Ricardo Otazo
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
4
Goldberg MS, Cockerell CJ, Rogers JH, Siegel JJ, Russell BH, Hosler GA, Marks E. Appropriate Statistical Methods to Assess Cross-study Diagnostic 23-Gene Expression Profile Test Performance for Cutaneous Melanocytic Neoplasms. Am J Dermatopathol 2024; 46:833-838. [PMID: 39141759 PMCID: PMC11573081 DOI: 10.1097/dad.0000000000002808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2024]
Abstract
ABSTRACT Comparing studies of molecular ancillary diagnostic tests for difficult-to-diagnose cutaneous melanocytic neoplasms presents a methodological challenge, given the disparate ways accuracy metrics are calculated. A recent report by Boothby-Shoemaker et al investigating the real-world accuracy of the 23-gene expression profile (23-GEP) test highlights this methodological difficulty, reporting lower accuracy than previously observed. However, their calculation method-with indeterminate test results defined as either false positive or false negative-was different than those used in previous studies. We corrected for these differences and recalculated their reported accuracy metrics in the same manner as the previous studies to enable appropriate comparison with previously published reports. This corrected analysis showed a sensitivity of 92.1% (95% confidence interval [CI], 82.1%-100%) and specificity of 94.4% (91.6%-96.9%). We then compared these results directly to previous studies with >25 benign and >25 malignant cases with outcomes and/or concordant histopathological diagnosis by ≥3 dermatopathologists. All studies assessed had enrollment imbalances of benign versus malignant patients (0.8-7.0 ratio), so balanced cohorts were resampled according to the lowest common denominator to calculate point estimates and CIs for accuracy metrics. Overall, we found no statistically significant differences in the ranges of 23-GEP sensitivity, 90.4%-96.3% (95% CI, 80.8%-100%), specificity, 87.3%-96.2% (78.2%-100%), positive predictive value, 88.5%-96.1% (81.5%-100%), or negative predictive value, 91.1%-96.3% (83.6%-100%) between previous studies and the cohort from Boothby-Shoemaker et al with this unified methodological approach. Rigorous standardization of calculation methods is necessary when the goal is direct cross-study comparability.
Affiliation(s)
- Matthew S Goldberg
- Castle Biosciences, Inc., Friendswood, TX
- Icahn School of Medicine at Mount Sinai, New York, NY
- Gregory A Hosler
- ProPath/Sonic Healthcare USA, Dallas, TX
- University of Texas Southwestern, Dallas, TX
- Etan Marks
- Department of Dermatopathology, Kansas City University-Graduate Medical Education Consortium, Oviedo, FL
- Advanced Dermatology and Cosmetic Surgery, Oviedo, FL
5
Huang Q, Trinquart L. Relative likelihood ratios for neutral comparisons of statistical tests in simulation studies. Biom J 2024; 66:e2200102. [PMID: 36642800 DOI: 10.1002/bimj.202200102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Revised: 11/11/2022] [Accepted: 11/15/2022] [Indexed: 01/17/2023]
Abstract
When comparing the performance of two or more competing tests, simulation studies commonly focus on statistical power. However, if the size of the tests being compared are either different from one another or from the nominal size, comparing tests based on power alone may be misleading. By analogy with diagnostic accuracy studies, we introduce relative positive and negative likelihood ratios to factor in both power and size in the comparison of multiple tests. We derive sample size formulas for a comparative simulation study. As an example, we compared the performance of six statistical tests for small-study effects in meta-analyses of randomized controlled trials: Begg's rank correlation, Egger's regression, Schwarzer's method for sparse data, the trim-and-fill method, the arcsine-Thompson test, and Lin and Chu's combined test. We illustrate that comparing power alone, or power adjusted or penalized for size, can be misleading, and how the proposed likelihood ratio approach enables accurate comparison of the trade-off between power and size between competing tests.
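The diagnostic-accuracy analogy in this abstract can be sketched as follows, assuming power plays the role of sensitivity and size the role of 1 minus specificity, so that the positive likelihood ratio is power/size and the negative likelihood ratio is (1 - power)/(1 - size). These definitions are inferred from that analogy rather than taken from the paper's formulas, and the power and size values are hypothetical.

```python
def likelihood_ratios(power, size):
    """Return (LR+, LR-), treating power like sensitivity and size like 1 - specificity."""
    if not (0 < size < 1 and 0 <= power <= 1):
        raise ValueError("size must lie in (0, 1) and power in [0, 1]")
    positive_lr = power / size
    negative_lr = (1 - power) / (1 - size)
    return positive_lr, negative_lr


def relative_likelihood_ratios(test_a, test_b):
    """Ratios of LR+ and LR- for test A versus test B; each test is a (power, size) pair.

    Assumes test B has power below 1, otherwise its LR- is zero and the ratio is undefined.
    """
    pos_a, neg_a = likelihood_ratios(*test_a)
    pos_b, neg_b = likelihood_ratios(*test_b)
    return pos_a / pos_b, neg_a / neg_b


# Hypothetical simulation results: A is more powerful but also more liberal than B.
test_a = (0.80, 0.10)  # (power, size)
test_b = (0.70, 0.05)
rel_pos, rel_neg = relative_likelihood_ratios(test_a, test_b)
print(f"relative LR+ = {rel_pos:.3f}, relative LR- = {rel_neg:.3f}")
```

With these numbers test A wins on power alone, yet its relative positive likelihood ratio is below 1 because its inflated size makes a rejection less informative than a rejection by test B.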
Affiliation(s)
- Qiuxi Huang
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
- Ludovic Trinquart
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
- Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, Massachusetts, USA
- Tufts Clinical and Translational Science Institute, Tufts University, Boston, Massachusetts, USA
6
LeBlanc M, Kang J, Costa AF. Can we rely on contrast-enhanced CT to identify pancreatic ductal adenocarcinoma? A population-based study in sensitivity and factors associated with false negatives. Eur Radiol 2023; 33:7656-7664. [PMID: 37266655 DOI: 10.1007/s00330-023-09758-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 03/20/2023] [Accepted: 03/27/2023] [Indexed: 06/03/2023]
Abstract
OBJECTIVES To determine the sensitivity of contrast-enhanced computed tomography (CECT) in detecting pancreatic ductal adenocarcinoma (PDAC) and identify factors associated with false negatives (FNs). METHODS Patients diagnosed with PDAC in 2014-2015 were retrospectively identified by a cancer registry. CECTs performed during the diagnostic interval were retrospectively classified as true positive (TP), indeterminate, or FN. Sensitivity TP/(TP+FN) was calculated for all CECTs and the following subgroups: protocol (uniphasic vs. biphasic); tumor size (≤ 2 cm vs. > 2 cm); and resectability (potentially resectable vs. unresectable). Multivariate logistic regression was performed to assess which of the following factors were associated with FN: clinical suspicion of PDAC; size >2 cm; presence of metastases; protocol; isoattenuating tumor; and potentially resectable disease on imaging. RESULTS In total, 176 CECTs (127 uniphasic; 49 biphasic) in 154 patients (90 men, mean age 72 ± 11 years) were included. Sensitivity was 125/149 (83.9%) overall and 87/106 (82.1%) and 38/43 (88.4%) for uniphasic and biphasic protocols, respectively. Sensitivity was decreased for tumors ≤ 2 cm (45.4% vs. 90.6%), no liver metastases (78.0% vs. 95.9%), and potentially resectable disease (65.3% vs. 93.0%). Factors significantly associated with FN were clinical suspicion (OR, 0.24, 95% CI: 0.07-0.75), size>2 cm (OR, 0.10, 95% CI: 0.02-0.44), absence of liver metastases (OR, 4.94, 95% CI: 1.29-22.99), and potentially resectable disease (OR, 4.13, 95% CI: 1.07-16.65). CONCLUSIONS In our population, the overall sensitivity of CECT to detect PDAC is 83.9%; however, this is substantially lower in several scenarios, including patients with potentially resectable disease. This finding has important implications for patient outcomes and efforts to maximize CECT sensitivity should be sought. 
CLINICAL RELEVANCE STATEMENT The sensitivity of CECT to detect PDAC is significantly decreased in the setting of sub-2 cm tumors and potentially resectable disease. A dedicated biphasic pancreatic CECT protocol has higher sensitivity and should be applied in patients with suspected pancreatic disease. KEY POINTS • The sensitivities of contrast-enhanced CT for the detection of PDAC were 87/106 (82.1%) and 38/43 (88.4%) for uniphasic and biphasic protocols, respectively. • The sensitivity of contrast-enhanced CT was decreased for small tumors ≤ 2 cm (45.4% vs. 90.6%), if there were no liver metastases (78.0% vs. 95.9%), and with potentially resectable disease (65.3% vs. 93.0%). • Absence of liver metastases (OR, 4.94, 95% CI: 1.29-22.99) and potentially resectable disease (OR, 4.13, 95% CI: 1.07-16.65) were associated with a false-negative (FN) CT result; suspicion of malignancy on the imaging requisition (OR, 0.24, 95% CI: 0.07-0.75) and size > 2 cm (OR, 0.10, 95% CI: 0.02-0.44) were negatively associated with FN.
Affiliation(s)
- Max LeBlanc
- Department of Diagnostic Radiology, Queen Elizabeth II Health Sciences Centre and Dalhousie University, Victoria General Building, 3rd floor, 1276 South Park Street, Halifax, Nova Scotia, B3H 2Y9, Canada
- Jessie Kang
- Department of Diagnostic Radiology, Queen Elizabeth II Health Sciences Centre and Dalhousie University, Victoria General Building, 3rd floor, 1276 South Park Street, Halifax, Nova Scotia, B3H 2Y9, Canada
- Andreu F Costa
- Department of Diagnostic Radiology, Queen Elizabeth II Health Sciences Centre and Dalhousie University, Victoria General Building, 3rd floor, 1276 South Park Street, Halifax, Nova Scotia, B3H 2Y9, Canada.
7
Poynard T, Deckmyn O, Peta V, Paradis V, Gautier JF, Brzustowski A, Bedossa P, Castera L, Pol S, Valla D. Prospective direct comparison of non-invasive liver tests in outpatients with type 2 diabetes using intention-to-diagnose analysis. Aliment Pharmacol Ther 2023; 58:888-902. [PMID: 37642160 DOI: 10.1111/apt.17688] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/05/2023] [Revised: 07/23/2023] [Accepted: 08/11/2023] [Indexed: 08/31/2023]
Abstract
BACKGROUND No prospective diagnostic studies have directly compared widespread non-invasive liver tests in patients with type 2 diabetes (T2D) using the intention-to-diagnose method for each of the three main histological features of metabolic dysfunction associated steatotic liver disease - namely fibrosis, metabolic dysfunction-associated steatohepatitis (MASH), and steatosis. AIMS To compare the performance of nine tests using the intention-to-diagnose rather than the standard method, which would exclude non-evaluable participants. METHODS Biopsy was used as the reference with predetermined cut-offs, advanced fibrosis being the main endpoint. The Nash-FibroTest panel including FibroTest-T2D, SteatoTest-T2D and MashTest-T2D was optimised for type 2 diabetes. FibroTest-T2D was compared to vibration-controlled transient elastography stiffness (VCTE), two-dimensional shear-wave elastography stiffness (TD-SWE), and Fibrosis-4 blood test. NashTest-T2D was compared to aspartate aminotransferase. SteatoTest-T2D was compared to the controlled attenuation parameter and the hepatorenal gradient. RESULTS Among 402 cases, non-evaluable tests were 6.7% for VCTE, 4.0% for hepatorenal gradient, 3.2% for controlled attenuation parameter, 1.5% for TD-SWE, 1.2% for NashTest-T2D, and 0.02% for Fibrosis-4, aspartate aminotransferase and SteatoTest-T2D. The VCTE AUROC for advanced fibrosis was over-estimated by 6% (0.83 [95% CI: 0.78-0.87]) by standard analysis compared to intention-to-diagnose (0.77 [0.72-0.81] p = 0.008). The AUROCs for advanced fibrosis did not differ significantly in intention-to-diagnose between FibroTest-T2D (0.77; 95% CI: 0.73-0.82), VCTE (0.77; 95% CI: 0.72-0.81) and TD-SWE (0.78; 0.74-0.83) but were all higher than the Fibrosis-4 score (0.70; 95% CI all differences ≥7%; p ≤ 0.03). For MASH, MashTest-T2D had a higher AUROC (0.76; 95% CI: 0.70-0.80) than aspartate aminotransferase (0.72; 95% CI: 0.66-0.77; p = 0.035).
For steatosis, AUROCs did not differ significantly between SteatoTest-T2D, controlled attenuation parameter and hepatorenal gradient. CONCLUSIONS In intention-to-diagnose analysis, FibroTest-T2D, TD-SWE and VCTE performed similarly for staging fibrosis, and out-performed Fibrosis-4 in outpatients with type 2 diabetes. The standard analysis over-estimated VCTE performance. ClinicalTrials.gov: NCT03634098.
Affiliation(s)
- Thierry Poynard
- Centre de Recherche Saint-Antoine (CRSA), INSERM, Institute of Cardiometabolism and Nutrition (ICAN), Sorbonne Université, Paris, France
- BioPredictive, Paris, France
- Valérie Paradis
- Department of Pathology, AP-HP, Beaujon Hospital, Clichy, France
- Jean-Francois Gautier
- Department of Diabetes and Endocrinology, APHP, INSERM U1138, Hôpital Lariboisière, Paris, France
- Pierre Bedossa
- Department of Pathology, AP-HP, Beaujon Hospital, Clichy, France
- Laurent Castera
- Department of Hepatology, AP-HP, Beaujon Hospital, Clichy, France
- Stanislas Pol
- Department of Hepatology, Cochin Hospital, Université Paris Descartes, Paris, France
- Dominique Valla
- Department of Hepatology, AP-HP, Beaujon Hospital, Clichy, France
8
Koop C, Kruus P, Hallik R, Lehemets H, Vettus E, Niin M, Ross P, Kingo K. A country-wide teledermatoscopy service in Estonia shows results comparable to those in experimental settings in management plan development and diagnostic accuracy: A retrospective database study. JAAD Int 2023; 12:81-89. [PMID: 37288150 PMCID: PMC10241971 DOI: 10.1016/j.jdin.2023.02.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/12/2023] [Indexed: 06/09/2023] Open
Abstract
Background Teledermatoscopy accuracy has been examined in experimental settings and is recommended for primary care despite lacking real-world implementation evidence. A teledermatoscopy service has been provided in Estonia since 2013, where lesions are evaluated based on the patient's or general practitioner's suggestion. Objective The management plan and diagnostic accuracy of a real-world store-and-forward teledermatoscopy service for melanoma diagnosis were evaluated. Methods A retrospective study analyzed 4748 cases from 3403 patients using the service between October 16, 2017 and August 30, 2019 by matching country-wide databases. Management plan accuracy was calculated as the percentage of melanoma found that was managed correctly. Diagnostic accuracy parameters were sensitivity, specificity, and positive and negative predictive values. Results Management plan accuracy for melanoma detection was 95.5% (95% CI, 77.2-99.9). Diagnostic accuracy showed a sensitivity of 90.48% (95% CI, 69.62-98.83) and a specificity of 92.57% (95% CI, 91.79-93.31). Limitations Matching the lesions was limited to SNOMED CT location standard precision. Diagnostic accuracy was calculated based on a combination of diagnosis and management plan data. Conclusion Teledermatoscopy for detecting and managing melanoma in real-world clinical practice displays results comparable with those in experimental setting studies.
Affiliation(s)
- Priit Kruus
- Dermtest OÜ, Tallinn, Estonia
- Department of Health Technologies, Tallinn University of Technology, School of Information Technology, Tallinn, Estonia
- Riina Hallik
- Department of Health Technologies, Tallinn University of Technology, School of Information Technology, Tallinn, Estonia
- Elen Vettus
- East Tallinn Central Hospital, Clinic of Internal Medicine, Centre of Oncology, Tallinn, Estonia
- Peeter Ross
- Department of Health Technologies, Tallinn University of Technology, School of Information Technology, Tallinn, Estonia
- East Tallinn Central Hospital, Tallinn, Estonia
- Külli Kingo
- Department of Dermatology and Venerology, Faculty of Medicine, Institute of Clinical Medicine, University of Tartu, Tartu, Estonia
- Tartu University Hospital, Dermatology Clinic, Tartu, Estonia
9
Stahlmann K, Reitsma JB, Zapf A. Missing values and inconclusive results in diagnostic studies - A scoping review of methods. Stat Methods Med Res 2023; 32:1842-1855. [PMID: 37559474 PMCID: PMC10540494 DOI: 10.1177/09622802231192954] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 08/11/2023]
Abstract
Most diagnostic studies exclude missing values and inconclusive results from the analysis or apply simple methods resulting in biased accuracy estimates. This may be due to the lack of availability or awareness of appropriate methods. This scoping review aimed to provide an overview of strategies to handle missing values and inconclusive results in the reference standard or index test in diagnostic accuracy studies. Conducting a systematic literature search in MEDLINE, Cochrane Library, and Web of Science, we could identify many articles proposing methods for addressing missing values in the reference standard. There are also several articles describing methods regarding missing values or inconclusive results in the index test. The latter encompass imputation, frequentist and Bayesian likelihood, model-based, and latent class methods. While methods for missing values in the reference standard are regularly applied in practice, this is not true for methods addressing missing values and inconclusive results in the index test. Our comprehensive overview and description of available methods may raise further awareness of these methods and will enhance their application. Future research is needed to compare the performance of these methods under different conditions to give valid and robust recommendations for their usage in various diagnostic accuracy research scenarios.
Affiliation(s)
- Katharina Stahlmann
- Institute of Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Germany
- Johannes B Reitsma
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, the Netherlands
- Antonia Zapf
- Institute of Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Germany
10
Heijboer WMP, Weir A, Vuckovic Z, Fullam K, Tol JL, Delahunt E, Serner A. Inter-examiner reliability of the Doha agreement meeting classification system of groin pain in male athletes. Scand J Med Sci Sports 2023; 33:189-196. [PMID: 36259124 PMCID: PMC10092143 DOI: 10.1111/sms.14248] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/26/2021] [Revised: 08/18/2022] [Accepted: 10/02/2022] [Indexed: 01/11/2023]
Abstract
The Doha agreement classification is used to classify groin pain in athletes. We evaluated the inter-examiner reliability of this classification system. We prospectively recruited 48 male athletes (66 symptomatic sides) with groin pain between 10-2017 and 03-2020 at a sports medicine hospital in Qatar. Two examiners (23 and 10 years of clinical experience) performed history taking, and a standardized clinical examination blinded to each other's findings. Examiners classified groin pain using the Doha agreement terminology (adductor-, inguinal-, iliopsoas-, pubic-, hip-related groin pain, or other causes of groin pain). Multiple entities were ranked in order of perceived clinical importance. Each side was classified separately for bilateral groin pain. Inter-examiner reliability was calculated using Cohen's Kappa statistic (κ). Inter-examiner reliability was slight to moderate for adductor- (κ = 0.40), inguinal- (κ = 0.44), iliopsoas- (κ = 0.57), and pubic-related groin pain (κ = 0.12), substantial for hip-related groin pain (κ = 0.62), and slight for "other causes of groin pain" (κ = 0.13). Ranking entities in order of perceived clinical importance improved inter-examiner reliability for adductor-, inguinal-, and iliopsoas-related groin pain (κ = 0.52-0.65), but not for pubic (κ = 0.12), hip (κ = 0.51), and "other causes of groin pain" (κ = 0.03). For participants with unilateral groin pain classified with a single entity (n = 7), there was 100% agreement between the two examiners. Inter-examiner reliability of the Doha agreement meeting classification system varied from slight to substantial, depending on the clinical entity. Agreement between examiners was perfect when athletes were classified with a single clinical entity of groin pain, but lower when athletes were classified with multiple clinical entities.
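For readers unfamiliar with the statistic reported here, Cohen's kappa compares observed agreement with the agreement expected by chance: κ = (p_o - p_e) / (1 - p_e). The sketch below is a minimal plain-Python version applied to hypothetical ratings from two examiners (1 = entity present, 0 = absent); the data are invented for illustration and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who each classified the same items once."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of ten symptomatic sides by two examiners.
examiner_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
examiner_2 = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohens_kappa(examiner_1, examiner_2)
print(f"kappa = {kappa:.2f}")
```

In this invented example the examiners agree on 8 of 10 sides, but because chance agreement is 0.52 the kappa is only about 0.58, which shows why raw percent agreement and kappa can tell different stories.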
Affiliation(s)
- Willem M P Heijboer
- Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar; Department of Orthopedic Surgery and Sports Medicine, Amsterdam UMC location University of Amsterdam, Amsterdam, The Netherlands; Amsterdam Movement Sciences, Musculoskeletal Health and Sports, Amsterdam, The Netherlands; REHABfysio, Rotterdam, The Netherlands
- Adam Weir
- Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar; Department of Orthopaedics and Sports Medicine, Erasmus MC University Medical Centre, Rotterdam, The Netherlands; Sport medicine and exercise clinic Haarlem (Sport en Beweeg Kliniek), Haarlem, The Netherlands
- Zarko Vuckovic
- Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar
- Karl Fullam
- DBC Chartered Physiotherapy Clinic, Institute for Sport & Health, University College Dublin, Dublin, Ireland
- Johannes L Tol
- Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar; Department of Orthopedic Surgery and Sports Medicine, Amsterdam UMC location University of Amsterdam, Amsterdam, The Netherlands; Amsterdam Movement Sciences, Musculoskeletal Health and Sports, Amsterdam, The Netherlands
- Eamonn Delahunt
- School of Public Health, Physiotherapy and Sports Science, University College Dublin, Dublin, Ireland; Institute for Sport and Health, University College Dublin, Dublin, Ireland
- Andreas Serner
- Aspetar Orthopaedic and Sports Medicine Hospital, Doha, Qatar; FIFA Medical, Fédération Internationale de Football Association, Zurich, Switzerland
11
Kang J, Abdolell M, Costa AF. Transabdominal ultrasound of pancreatic ductal adenocarcinoma: A multi-centered population-based study in sensitivity, associated diagnostic intervals, and survival. Curr Probl Diagn Radiol 2022; 51:842-847. [PMID: 35618553 DOI: 10.1067/j.cpradiol.2022.04.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Revised: 04/03/2022] [Accepted: 04/18/2022] [Indexed: 11/22/2022]
Abstract
OBJECTIVES To determine the sensitivity of ultrasound (US) in detecting pancreatic ductal adenocarcinoma in our region, to identify factors associated with US test result, and assess the impact on the diagnostic interval and survival. METHODS Patients diagnosed between January 1, 2014 and December 31, 2015 in Nova Scotia, Canada were identified by a cancer registry. US performed prior to diagnosis were retrospectively graded as true positive (TP), indeterminate or false negative (FN). Amongst US results, differences in age, weight and tumor size were assessed [one-way analysis of variance (ANOVA)]. Associations between result and sex, tumor location (proximal/distal), clinical suspicion of malignancy, and visualization of the pancreas, tumor, secondary signs and liver metastases were assessed (Chi-square). Mean follow-up imaging, diagnostic, and survival intervals were assessed (one-way ANOVA). RESULTS One hundred thirteen US of 107 patients (54 women; mean 70 ± 13 years) were graded as follows: 48/113 (42.5%) TPs; 42/113 (37.2%) indeterminates; and 23/113 (20.4%) FNs. Sensitivity was 48/71(67.6%). There was no difference in age, weight or tumor size amongst US result (P > 0.5). FNs had proportionally more men (P = 0.011) and lacked clinical suspicion of malignancy (P = 0.0006); TPs had proportionally more proximal tumors (P = 0.017). US result was associated with visualization of the pancreas, tumor, secondary signs and liver metastases (P < 0.005). FNs had longer mean follow-up imaging (P < 0.0001) and diagnostic (P = 0.0007) intervals, and worse mean survival (P = 0.034). CONCLUSIONS In our region, the sensitivity of US in detecting pancreatic ductal adenocarcinoma is 67.6%. A false negative US is associated with delayed diagnostic work-up and worse mean survival.
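The reported sensitivity depends on how indeterminate examinations are handled. The sketch below reproduces the abstract's figure, which excludes indeterminates from the denominator, and contrasts it with a stricter alternative that counts every indeterminate examination as a miss. The counts come from the abstract; the stricter calculation is an illustrative assumption, not an analysis the authors report as a sensitivity.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Counts reported in the abstract for 113 ultrasound examinations.
tp, indeterminate, fn = 48, 42, 23

# Approach used in the abstract: indeterminate results are left out of the denominator.
excluding_indeterminate = sensitivity(tp, fn)

# Stricter alternative (assumption): every indeterminate result is counted as a miss.
indeterminate_as_missed = sensitivity(tp, fn + indeterminate)

print(f"excluding indeterminates: {excluding_indeterminate:.1%}")
print(f"indeterminates as misses: {indeterminate_as_missed:.1%}")
```

The gap between the two figures (48/71 versus 48/113) shows how much the treatment of inconclusive results can move a headline accuracy estimate.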
Affiliation(s)
- Jessie Kang
- Department of Diagnostic Radiology, Queen Elizabeth II Health Sciences Centre and Dalhousie University, Victoria General Building, 3rd floor, 1276 South Park Street, Halifax, NS B3H 2Y9, Canada
- Mohamed Abdolell
- Department of Diagnostic Radiology, Queen Elizabeth II Health Sciences Centre and Dalhousie University, Victoria General Building, 3rd floor, 1276 South Park Street, Halifax, NS B3H 2Y9, Canada
- Andreu F Costa
- Department of Diagnostic Radiology, Queen Elizabeth II Health Sciences Centre and Dalhousie University, Victoria General Building, 3rd floor, 1276 South Park Street, Halifax, NS B3H 2Y9, Canada
|
12
|
Biedermann A. The strange persistence of (source) "identification" claims in forensic literature through descriptivism, diagnosticism and machinism. Forensic Sci Int Synerg 2022; 4:100222. [PMID: 35257092 PMCID: PMC8897692 DOI: 10.1016/j.fsisyn.2022.100222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Revised: 01/25/2022] [Accepted: 02/10/2022] [Indexed: 12/27/2022]
Abstract
Many forensic scientists consider that identification (individualisation) - in the sense of statements of the kind "the questioned item and the known item come from the same source" - is a concept that is central to their discipline. This is so despite decade-long, fundamental critiques levelled by both practitioners and academics against the conceptual and practical feasibility of forensic identification. Oddly, there is a constant stream of publications in (peer-reviewed) forensic science journals that treat forensic identification axiomatically as a valid object of study, sidestepping the fundamental critiques. This paper reviews and discusses three strands of publications that exemplify this persistent trend. These strands are called descriptivism, diagnosticism and machinism. The latter term refers to methods borrowed from the now increasingly popular approaches used in the field of machine learning. In turn, descriptivism and diagnosticism refer to general design aspects of mainstream research methods, illustrated here through a critical review of two recent papers on, respectively, forensic odontology and a framework for interpreting fingerprint evidence. The critique of the use of 'identification' in these strands of publication includes, but goes beyond, semantic details and the reiteration of long-known shortcomings of obsolete technical language such as 'match' and 'matching'. Specifically, this paper exposes deeper problems such as the subtle and argumentatively unfounded carrying-over of source conclusions to ultimate issues and the use of probability concepts for questions that require more than the mere quantification of uncertainty. This paper submits that in order to foster trust in an era of continually expanding publishing activities, it should be a vital interest to forensic science journals to better examine what identification-related research can and cannot legitimately purport to achieve.
Affiliation(s)
- Alex Biedermann
- University of Lausanne, School of Criminal Justice, 1015 Lausanne-Dorigny, Switzerland
|
13
|
Statistical methods for evaluating the fine needle aspiration cytology procedure in breast cancer diagnosis. BMC Med Res Methodol 2022; 22:40. [PMID: 35125097 PMCID: PMC8818244 DOI: 10.1186/s12874-022-01506-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/09/2021] [Accepted: 01/10/2022] [Indexed: 01/24/2023] Open
Abstract
Background Statistical issues that arise when evaluating a diagnostic procedure for breast cancer are not rare but are often ignored, leading to biased results. We aimed to evaluate the diagnostic accuracy of fine needle aspiration cytology (FNAC), a minimally invasive and rapid technique potentially used as a rule-in or rule-out test, while handling its statistical issues: suspect test results and verification bias. Methods We applied different statistical methods to handle suspect results by defining conditional estimates. When considering a partial verification bias, the Begg and Greenes method and multivariate imputation by chained equations were applied; when considering a differential verification bias, a Bayesian approach with respect to each gold standard was used. Finally, we extended the Begg and Greenes method to be applied conditionally on the suspect results. Results The specificity of the FNAC test, above 94%, was always higher than its sensitivity regardless of the proposed method. All positive likelihood ratios were higher than 10, with variations among methods. The positive and negative yields were high, defining precise discriminating properties of the test. Conclusion The FNAC test is more likely to be used as a rule-in test for diagnosing breast cancer. Our results contribute to advancing knowledge regarding the performance of the FNAC test and the methods to be applied for its evaluation. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01506-y.
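As a rough illustration of the partial verification correction named in the Methods, the following Python sketch applies the standard Begg and Greenes estimator to a 2x2 table in which only some tested patients received the reference standard. All counts are hypothetical, the function name is an assumption, and the sketch does not reproduce the authors' conditional extension or their data. The estimator assumes that the decision to verify depends only on the test result.

```python
def begg_greenes(a, b, c, d, n_pos, n_neg):
    """Begg and Greenes corrected sensitivity and specificity.

    a, b: verified test-positives with / without disease
    c, d: verified test-negatives with / without disease
    n_pos, n_neg: all test-positives / test-negatives, verified or not
    """
    p_dis_given_pos = a / (a + b)  # P(D+ | T+), estimated among verified
    p_dis_given_neg = c / (c + d)  # P(D+ | T-), estimated among verified

    # Expected numbers of diseased / non-diseased in the full tested cohort.
    dis_pos = n_pos * p_dis_given_pos
    dis_neg = n_neg * p_dis_given_neg
    nodis_pos = n_pos * (1 - p_dis_given_pos)
    nodis_neg = n_neg * (1 - p_dis_given_neg)

    sens = dis_pos / (dis_pos + dis_neg)
    spec = nodis_neg / (nodis_neg + nodis_pos)
    return sens, spec

# Hypothetical cohort: every test-positive verified, only 50 of 400 negatives verified.
naive_sens = 80 / (80 + 5)    # uses verified patients only
naive_spec = 45 / (45 + 20)
corr_sens, corr_spec = begg_greenes(a=80, b=20, c=5, d=45, n_pos=100, n_neg=400)

print(f"naive:     sens={naive_sens:.3f} spec={naive_spec:.3f}")
print(f"corrected: sens={corr_sens:.3f} spec={corr_spec:.3f}")
```

In this made-up example the naive analysis overstates sensitivity and understates specificity, which is the usual direction of partial verification bias when positives are verified preferentially.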
|
14
|
Mizrahi-Man O, Woehrmann MH, Webster TA, Gollub J, Bivol A, Keeble SM, Aull KH, Mittal A, Roter AH, Wong BA, Schmidt JP. Novel genotyping algorithms for rare variants significantly improve the accuracy of Applied Biosystems™ Axiom™ array genotyping calls: Retrospective evaluation of UK Biobank array data. PLoS One 2022; 17:e0277680. [PMID: 36395175 PMCID: PMC9671364 DOI: 10.1371/journal.pone.0277680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Accepted: 11/01/2022] [Indexed: 11/18/2022] Open
Abstract
The UK Biobank genotyped about 500k participants using Applied Biosystems Axiom microarrays. Participants were subsequently sequenced by the UK Biobank Exome Sequencing Consortium. Axiom genotyping was highly accurate in comparison to sequencing results, for almost 100,000 variants both directly genotyped on the UK Biobank Axiom array and via whole exome sequencing. However, in a study using the exome sequencing results of the first 50k individuals as reference (truth), it was observed that the positive predictive value (PPV) decreased along with the number of heterozygous array calls per variant. We developed a novel addition to the genotyping algorithm, Rare Heterozygous Adjusted (RHA), to significantly improve PPV in variants with minor allele frequency below 0.01%. The improvement in PPV was roughly equal when comparing to the exome sequencing of 50k individuals, or to the more recent ~200k individuals. Sensitivity was higher in the 200k data. The improved calling algorithm, along with enhanced quality control of array probesets, significantly improved the positive predictive value and the sensitivity of array data, making it suitable for the detection of ultra-rare variants.
Affiliation(s)
- Orna Mizrahi-Man
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
- Jeremy Gollub
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
- Adrian Bivol
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
- Sara M. Keeble
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
- Anuradha Mittal
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
- Alan H. Roter
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
- Brant A. Wong
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
- Jeanette P. Schmidt
- Thermo Fisher Scientific, Santa Clara, CA, United States of America
|
15
|
Vale L, Kunonga P, Coughlan D, Kontogiannis V, Astin M, Beyer F, Richmond C, Wilson D, Bajwa D, Javanbakht M, Bryant A, Akor W, Craig D, Lovat P, Labus M, Nasr B, Cunliffe T, Hinde H, Shawgi M, Saleh D, Royle P, Steward P, Lucas R, Ellis R. Optimal surveillance strategies for patients with stage 1 cutaneous melanoma post primary tumour excision: three systematic reviews and an economic model. Health Technol Assess 2021; 25:1-178. [PMID: 34792018 DOI: 10.3310/hta25640] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Malignant melanoma is the fifth most common cancer in the UK, with rates continuing to rise, resulting in considerable burden to patients and the NHS. OBJECTIVES The objectives were to evaluate the effectiveness and cost-effectiveness of current and alternative follow-up strategies for stage IA and IB melanoma. REVIEW METHODS Three systematic reviews were conducted. (1) The effectiveness of surveillance strategies. Outcomes were detection of new primaries, recurrences, metastases and survival. Risk of bias was assessed using the Cochrane Collaboration's Risk-of-Bias 2.0 tool. (2) Prediction models to stratify by risk of recurrence, metastases and survival. Model performance was assessed by study-reported measures of discrimination (e.g. D-statistic, Harrell's c-statistic), calibration (e.g. the Hosmer-Lemeshow 'goodness-of-fit' test) or overall performance (e.g. Brier score, R²). Risk of bias was assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). (3) Diagnostic test accuracy of fine-needle biopsy and ultrasonography. Outcomes were detection of new primaries, recurrences, metastases and overall survival. Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. Review data and data from elsewhere were used to model the cost-effectiveness of alternative surveillance strategies and the value of further research. RESULTS (1) The surveillance review included one randomised controlled trial. There was no evidence of a difference in new primary or recurrence detected (risk ratio 0.75, 95% confidence interval 0.43 to 1.31). Risk of bias was considered to be of some concern. Certainty of the evidence was low. (2) Eleven risk prediction models were identified. Discrimination measures were reported for six models, with the area under the receiver operating characteristic curve ranging from 0.59 to 0.88. Three models reported calibration measures, with coefficients of ≥ 0.88.
Overall performance was reported by two models. In one, the Brier score was slightly better than the American Joint Committee on Cancer scheme score. The other reported an R² of 0.47 (95% confidence interval 0.45 to 0.49). All studies were judged to have a high risk of bias. (3) The diagnostic test accuracy review identified two studies. One study considered fine-needle biopsy and the other considered ultrasonography. The sensitivity and specificity for fine-needle biopsy were 0.94 (95% confidence interval 0.90 to 0.97) and 0.95 (95% confidence interval 0.90 to 0.97), respectively. For ultrasonography, sensitivity and specificity were 1.00 (95% confidence interval 0.03 to 1.00) and 0.99 (95% confidence interval 0.96 to 0.99), respectively. For the reference standards and flow and timing domains, the risk of bias was rated as being high for both studies. The cost-effectiveness results suggest that, over a lifetime, less intensive surveillance than recommended by the National Institute for Health and Care Excellence might be worthwhile. There was considerable uncertainty. Improving the diagnostic performance of cancer nurse specialists and introducing a risk prediction tool could be promising. Further research on transition probabilities between different stages of melanoma and on improving diagnostic accuracy would be of most value. LIMITATIONS Overall, few data of limited quality were available, and these related to earlier versions of the American Joint Committee on Cancer staging. Consequently, there was considerable uncertainty in the economic evaluation. CONCLUSIONS Despite adoption of rigorous methods, too few data are available to justify changes to the National Institute for Health and Care Excellence recommendations on surveillance.
However, alternative strategies warrant further research, specifically on improving estimates of incidence, progression of recurrent disease; diagnostic accuracy and health-related quality of life; developing and evaluating risk stratification tools; and understanding patient preferences. STUDY REGISTRATION This study is registered as PROSPERO CRD42018086784. FUNDING This project was funded by the National Institute for Health Research Health Technology Assessment programme and will be published in full in Health Technology Assessment; Vol 25, No. 64. See the NIHR Journals Library website for further project information.
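The ultrasonography sensitivity in this abstract is reported as 1.00 with a 95% confidence interval of 0.03 to 1.00. An interval that wide is what an exact (Clopper-Pearson) binomial interval produces when the estimate rests on a single diseased case. The Python sketch below illustrates this. That the study had exactly one true positive and used an exact interval is an assumption made for illustration, not something the abstract states, and the function names are hypothetical.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect_increasing(f, lo=0.0, hi=1.0, iters=100):
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion x/n."""
    if x == 0:
        lower = 0.0
    else:
        # Lower bound: p such that P(X >= x | p) = alpha / 2.
        lower = _bisect_increasing(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    if x == n:
        upper = 1.0
    else:
        # Upper bound: p such that P(X <= x | p) = alpha / 2.
        upper = _bisect_increasing(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper

# Assumed scenario: one diseased patient, correctly detected.
lower, upper = clopper_pearson(1, 1)
print(f"sensitivity 1/1, 95% CI {lower:.3f} to {upper:.3f}")
```

With one case the lower bound is simply alpha/2 = 0.025, so a point estimate of 1.00 carries almost no information about the true sensitivity.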
Affiliation(s)
- Luke Vale
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Patience Kunonga
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Diarmuid Coughlan
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Margaret Astin
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Fiona Beyer
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Catherine Richmond
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Dor Wilson
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Dalvir Bajwa
- Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, UK
- Mehdi Javanbakht
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Andrew Bryant
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Wanwuri Akor
- Northumbria Healthcare NHS Foundation Trust, North Shields, UK
- Dawn Craig
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Penny Lovat
- Institute of Translation and Clinical Studies, Newcastle University, Newcastle upon Tyne, UK
- Marie Labus
- Business Development and Enterprise, Newcastle University, Newcastle upon Tyne, UK
- Batoul Nasr
- Dermatological Sciences, Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, UK
- Timothy Cunliffe
- Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Helena Hinde
- Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Mohamed Shawgi
- Radiology Department, James Cook University Hospital, Middlesbrough, UK
- Daniel Saleh
- Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK; Princess Alexandra Hospital Southside Clinical Unit, Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
- Pam Royle
- Patient representative, ITV Tyne Tees, Gateshead, UK
- Paul Steward
- Patient representative, Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Rachel Lucas
- Patient representative, Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Robert Ellis
- Institute of Translation and Clinical Studies, Newcastle University, Newcastle upon Tyne, UK; South Tees Hospitals NHS Foundation Trust, Middlesbrough, UK
|
16
|
Diagnostic performance of US for suspected appendicitis: Does multi-categorical reporting provide better estimates of disease in adults, and what factors are associated with false or indeterminate results? Eur J Radiol 2021; 144:109992. [PMID: 34634535 DOI: 10.1016/j.ejrad.2021.109992] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/18/2021] [Revised: 09/03/2021] [Accepted: 09/29/2021] [Indexed: 11/23/2022]
Abstract
PURPOSE To identify factors associated with false or indeterminate US result for suspected appendicitis, and assess whether multi-categorical reporting of US yields more precise estimates regarding the probability of appendicitis. METHODS 562 US examinations for suspected appendicitis between May 2013-April 2015 were categorized as true (77/562 true positives or true negatives) or false/indeterminate (485/562 false negatives, false positives or indeterminates) based on results from a prior study. Of 541 examinations with images available retrospectively, a category of A-E was assigned as follows: non-visualized appendix with secondary findings (A) absent or (B) present; appendix visualized and considered (C) negative, (D) equivocal, or (E) positive for appendicitis. The following factors were recorded: age; sex; scan time (daytime vs. off-hours); resident/fellow involvement; abdominal subspecialty radiologist; radiologist experience (>5 years or not); and tenderness on interrogation. Associations between factors and US result were assessed (t-tests, Fisher's exact test and multivariate logistic regression). RESULTS The true group had proportionally more males (18/77 (23.4%) vs. 66/485 (13.6%), p = 0.04) and patients with sonographic tenderness (43/77 (55.8%) vs. 132/353 (27.3%), p < 0.0001). There was no significant difference or association with other factors. On multivariate logistic regression, false/indeterminate results were 1.9 times (95% CIs 1.0-3.5) more likely among females and 3.8 times more likely in the absence of tenderness (95% CIs 2.3-6.4). The proportion of patients with appendicitis in categories A-E was 34/410 (8.3%), 24/44 (54.5%), 0/18 (0%), 0/3 (0%) and 61/66 (92.4%), respectively. CONCLUSIONS Females and absence of tenderness were associated with a false/indeterminate US. Categorical reporting provides more granular estimates of the post-test probability of appendicitis.
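The category-specific proportions in the Results can be read as post-test probabilities of appendicitis for each reporting category. The Python sketch below recomputes them from the counts given in the abstract and compares them with the overall frequency across the 541 categorized examinations. The variable names are illustrative only.

```python
# (appendicitis cases, examinations) per reporting category, from the abstract.
categories = {
    "A": (34, 410),  # appendix not visualized, secondary findings absent
    "B": (24, 44),   # appendix not visualized, secondary findings present
    "C": (0, 18),    # visualized, negative
    "D": (0, 3),     # visualized, equivocal
    "E": (61, 66),   # visualized, positive
}

total_cases = sum(cases for cases, _ in categories.values())
total_exams = sum(n for _, n in categories.values())
overall = total_cases / total_exams  # frequency before considering the category

post_test = {cat: cases / n for cat, (cases, n) in categories.items()}

print(f"overall: {total_cases}/{total_exams} = {overall:.1%}")
for cat, prob in post_test.items():
    print(f"category {cat}: {prob:.1%}")
```

Comparing each category with the overall frequency makes the point of the abstract's conclusion concrete: categories B and E raise the probability well above baseline, while A, C, and D lower it.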
|
17
|
Diagnostic Accuracy of Magnetic Resonance Imaging for International Federation of Gynecology and Obstetrics 2018 IB to IIB Cervical Cancer Staging: Comparison Among Magnetic Resonance Sequences and Pathologies. J Comput Assist Tomogr 2021; 45:829-836. [PMID: 34407060 DOI: 10.1097/rct.0000000000001210] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/26/2022]
Abstract
OBJECTIVE This study aimed to investigate the most accurate magnetic resonance (MR) sequence for tumor detection, maximal tumor diameter, and parametrial invasion compared with histopathologic diagnoses. METHODS Fifty-one patients with International Federation of Gynecology and Obstetrics 2018 IB1 to IIB cervical cancer underwent preoperative MR imaging and surgical resection. Two radiologists independently evaluated the tumor detection, parametrial invasion, and tumor size in each of T2-weighted image, diffusion-weighted image, and contrast-enhanced T1-weighted image. Results obtained for squamous cell carcinoma (SCC) and adenocarcinoma were also compared. RESULTS Neither the tumor detection rate nor parametrial invasion was found to be significantly different among sequences. Tumor size assessment using MR imaging with pathology showed good correlation: r = 0.63-0.72. The adenocarcinoma size tended to be more underestimated than SCC in comparison with the pathologic specimen. CONCLUSIONS Cervical cancer staging by MR images showed no significant difference among T2-weighted image, diffusion-weighted image, and contrast-enhanced T1-weighted image. Adenocarcinoma was prone to be measured as smaller than the pathologic specimen compared with SCC.
|
18
|
Netterström-Wedin F, Matthews M, Bleakley C. Diagnostic Accuracy of Clinical Tests Assessing Ligamentous Injury of the Talocrural and Subtalar Joints: A Systematic Review With Meta-Analysis. Sports Health 2021; 14:336-347. [PMID: 34286639 PMCID: PMC9109591 DOI: 10.1177/19417381211029953] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 12/26/2022] Open
Abstract
Context: Ankle sprains are the most common acute musculoskeletal injury. Clinical tests represent the first opportunity to assess the sprain’s severity, but no systematic review has compared these tests to contemporary reference standards. Objective: To determine the diagnostic accuracy of clinical tests assessing the talocrural and subtalar joint ligaments after ankle sprain. Data Sources: CINAHL, EMBASE, MEDLINE, hand-searching, and PubMed-related article searches (inception to November 18, 2020). Study Selection: Eligible diagnostic studies compared clinical examination (palpation, joint laxity) against imaging or surgery. Studies at a high risk of bias or with high concerns regarding applicability on Quality Assessment of Diagnostic Accuracy Studies-2 were excluded from the meta-analysis. Study Design: Systematic review and meta-analysis. Level of Evidence: Level 3a. Data Extraction: True-positive, false-negative, false-positive, and true-negative findings were extracted to calculate sensitivity, specificity, and likelihood ratios. If ordinal data were reported, these were extracted to calculate Cohen’s kappa. Results: A total of 14 studies met the inclusion criteria (6302 observations; 9 clinical tests). No test had both sensitivity and specificity exceeding 90%. Palpation of the anterior talofibular ligament is highly sensitive (sensitivity 95%-100%; specificity 0%-32%; min-max; n = 6) but less so for the calcaneofibular ligament (sensitivity 49%-100%; specificity 26%-79%; min-max; n = 6). Pooled data from 6 studies (885 observations) found a low sensitivity (54%; 95% CI 35%-71%) but high specificity (87%; 95% CI 63%-96%) for the anterior drawer test. Conclusion: The anterior talofibular ligament is best assessed using a cluster of palpation (rule out), and anterior drawer testing (rule in). The talar tilt test can rule in injury to the calcaneofibular ligament, but a sensitive clinical test for the ligament is lacking. 
It is unclear if ligamentous injury grading can be done beyond the binary (injured vs uninjured), and clinical tests of the subtalar joint ligaments are not well researched. The generalizability of our findings is limited by insufficient reporting on blinding and poor study quality. Registration: Prospero ID: CRD42020187848. Data Availability: Data are available in a public, open access repository on publication, including our RevMan file and the CSV file used for meta-analysis: http://doi.org/10.5281/zenodo.4917138
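The rule-in and rule-out language in the conclusion follows from likelihood ratios. As a rough sketch, the Python snippet below derives them from the pooled anterior drawer estimates reported above (sensitivity 54%, specificity 87%) and applies them to a pre-test probability. The 50% pre-test probability is a hypothetical value chosen for illustration, the function names are assumptions, and ratios computed from pooled point estimates only approximate what a full meta-analytic model would give.

```python
def likelihood_ratios(sens: float, spec: float):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return lr_pos, lr_neg

def post_test_probability(pre_test: float, lr: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

# Pooled anterior drawer test estimates from the abstract.
lr_pos, lr_neg = likelihood_ratios(sens=0.54, spec=0.87)

pre_test = 0.50  # hypothetical pre-test probability of ligament injury
after_positive = post_test_probability(pre_test, lr_pos)
after_negative = post_test_probability(pre_test, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"post-test probability after a positive test: {after_positive:.2f}")
print(f"post-test probability after a negative test: {after_negative:.2f}")
```

A positive test shifts the probability substantially while a negative test shifts it only modestly, which is consistent with describing the anterior drawer test as a rule-in rather than a rule-out test.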
Affiliation(s)
- Mark Matthews
- Sport and Exercise Science Research Institute, Ulster University, Belfast, UK
- Chris Bleakley
- School of Health Sciences, Faculty of Life and Health Sciences, Ulster University, Jordanstown Campus, Antrim, UK
- Chris Bleakley, PhD, Ulster University, Jordanstown Campus, Room 01F118, Shore Road, Newtownabbey, Co Antrim BT37 0QB, UK
|
19
|
Forensic science and the principle of excluded middle: "Inconclusive" decisions and the structure of error rate studies. Forensic Sci Int Synerg 2021; 3:100147. [PMID: 33981984 PMCID: PMC8082088 DOI: 10.1016/j.fsisyn.2021.100147] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/23/2021] [Revised: 03/20/2021] [Accepted: 03/29/2021] [Indexed: 11/21/2022]
Abstract
In a paper published recently in this journal, Dror and Scurich (2020) [20] critically discuss the notions of "inconclusive evidence" (i.e., test items for which it is difficult to render a categorical response) and "inconclusive decisions" (i.e., experts' conclusions or responses) in the context of forensic science error rate studies. They expose several ways in which the understanding and use of "inconclusives" in current forensic science research and practice can adversely affect the outcomes of error rate studies. A main cause of distortion, according to Dror and Scurich, is what they call "erroneous inconclusive" decisions, in particular the lack of acknowledgment of this type of erroneous conclusion in the computation of error rates. To overcome this complication, Dror and Scurich call for a more explicit monitoring of "inconclusives" using a modified error rate study design. Whilst we agree with several well-argued points raised by the authors, we disagree with their framing of "inconclusive decisions" as potential errors. In this paper, we argue that referring to an "inconclusive decision" as an error is a contradiction in terms, runs counter to an analysis based on decision logic and, hence, is questionable as a concept. We also reiterate that the very term "inconclusive decision" disregards the procedural architecture of the criminal justice system across modern jurisdictions, especially the fact that forensic experts have no decisional rights in the criminal process. These positions do not ignore the possibility that "inconclusives" - if used excessively - do raise problems in forensic expert reporting, in particular limited assertiveness (or, overcautiousness). However, these drawbacks derive from inherent limitations of experts rather than from the seemingly erroneous nature of "inconclusives" that needs to be fixed. 
More fundamentally, we argue that attempts to score "inconclusives" as errors amount to philosophical claims disguised as forensic methodology. Specifically, these attempts interfere with the metaphysical substrate underpinning empirical research. We point this out on the basis of the law of the excluded middle, i.e. the principle of "no third possibility being given" (tertium non datur).
|
20
|
Diagnostic performance and radiation dose of reduced vs. standard scan range abdominopelvic CT for evaluation of appendicitis. Eur Radiol 2021; 31:7817-7826. [PMID: 33856521 DOI: 10.1007/s00330-021-07945-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2020] [Revised: 03/17/2021] [Accepted: 03/25/2021] [Indexed: 10/21/2022]
Abstract
OBJECTIVE To compare the diagnostic performance and radiation dose of reduced vs. standard scan range CT in diagnosing appendicitis. METHODS We retrospectively evaluated 531 consecutive adults who underwent emergency contrast-enhanced CT for abdominal pain or suspected appendicitis between July 2018 and March 2019. One hundred eighty-one young adults (mean age, 26 ± 6 years) were imaged from L2 to the symphysis pubis (reduced protocol). A total of 350 older patients (mean age, 55 ± 17 years) and those with a wider differential diagnosis were imaged from the diaphragm to the ischium (standard protocol). The reference standard was histopathology (surgical cases) or 3 months of medical record follow-up (nonsurgical cases). Sensitivity, specificity, and accuracy were calculated. Mean dose-length products (DLP) were compared (t-test). Using an anthropomorphic phantom, organ doses were measured on CT scanners with (scanner 1) and without (scanner 2) automatic voltage selection; effective radiation doses were calculated. RESULTS The frequency of appendicitis was 57/181 (31.5%) and 80/350 (22.9%) in the reduced and standard groups, respectively. Results of the reduced and standard protocols respectively were as follows (95% CI in parentheses): sensitivity, 98.2% (90.4-99.9%) and 100.0% (95.3-100.0%); specificity, 99.2% (95.6-100.0%) and 99.6% (97.9-100.0%); accuracy, 97.8% and 97.4%; mean DLPs, 363 ± 191 mGy∙cm and 633 ± 591 mGy∙cm (p < 0.0001). Phantom-based measurements of effective dose were 47% lower on scanner 1 (4.64 vs. 2.48 mSv) and 26% lower on scanner 2 (4.68 vs. 3.45 mSv) with the reduced protocol. CONCLUSION For young adults with clinically suspected appendicitis, a reduced scan range CT protocol is as sensitive, specific, and accurate as a standard scan range CT and imparts significantly less radiation dose.
KEY POINTS • A reduced scan range CT protocol in young adults with high suspicion of appendicitis demonstrates similar diagnostic performance as a full-range abdominopelvic CT in undifferentiated adult patients. • The reduced scan range CT protocol imparts significantly less radiation dose: 57% based on dose-length product data and 26-47% based on anthropomorphic phantom data.
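The dose percentages quoted above can be rechecked from the reported means. The Python sketch below applies a simple relative-reduction formula to the phantom effective doses and to the mean DLPs. The function name is hypothetical. For the DLPs, the reduced-protocol mean works out to about 57% of the standard-protocol mean, that is, a reduction of about 43%. Reading the 57% figure in the key points this way is an interpretation, not something the abstract states.

```python
def percent_reduction(standard: float, reduced: float) -> float:
    """Relative reduction of `reduced` versus `standard`, in percent."""
    return 100 * (standard - reduced) / standard

# Phantom-based effective doses (mSv), standard vs. reduced protocol.
scanner_1 = percent_reduction(standard=4.64, reduced=2.48)
scanner_2 = percent_reduction(standard=4.68, reduced=3.45)

# Mean dose-length products (mGy*cm), standard vs. reduced protocol.
dlp_reduction = percent_reduction(standard=633, reduced=363)
dlp_ratio = 100 * 363 / 633  # reduced DLP as a percentage of standard DLP

print(f"scanner 1 effective dose reduction: {scanner_1:.0f}%")
print(f"scanner 2 effective dose reduction: {scanner_2:.0f}%")
print(f"mean DLP reduction: {dlp_reduction:.0f}% (ratio {dlp_ratio:.0f}%)")
```

Note that the two protocol groups differ in age and indication, so the DLP comparison describes the observed cohorts rather than a like-for-like dose saving.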
|
21
|
Martucciello A, Vitale N, Mazzone P, Dondo A, Archetti I, Chiavacci L, Cerrone A, Gamberale F, Schiavo L, Pacciarini ML, Boniotti MB, De Carlo E. Field Evaluation of the Interferon Gamma Assay for Diagnosis of Tuberculosis in Water Buffalo (Bubalus bubalis) Comparing Four Interpretative Criteria. Front Vet Sci 2020; 7:563792. [PMID: 33335916 PMCID: PMC7736034 DOI: 10.3389/fvets.2020.563792] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/19/2020] [Accepted: 11/05/2020] [Indexed: 12/30/2022] Open
Abstract
Bovine tuberculosis (bTB) is a worldwide zoonosis that affects many species of domestic and wild animals. Mycobacterium bovis is the main cause of infection in water buffalo (Bubalus bubalis) and bovines and is of great concern for human health and for buffalo producers in Italy. The bTB eradication programme is based on slaughterhouse surveillance and intradermal skin tests. Other in vivo diagnostic methods such as the interferon-gamma (IFN-γ) assay have been developed and are widely used in cattle to accelerate the elimination of bTB positive animals. The present study is the first to assess the use and performance of the IFN-γ assay, which is used as an ancillary test for bTB diagnosis in water buffalo, and presents the results of a field evaluation of the assay from 2012 to 2019 during the buffalo bTB eradication programme in Italy. The study involved 489 buffaloes with a positive result to the single intradermal tuberculin test (SITT). The IFN-γ assay and the single intradermal comparative tuberculin test were used as confirmation tests. Then, a total of 458 buffaloes, reared on officially tuberculosis-free (OTF) herds, that were confirmed bTB-free for at least the last 6 years were subjected to IFN-γ testing. Furthermore, to evaluate the IFN-γ test in an OTF herd with Paratuberculosis (PTB) infection, 103 buffaloes were subjected to SITT and IFN-γ test simultaneously. Four interpretative criteria were used, and the IFN-γ test showed high levels of accuracy, with sensitivity levels between 75.3% (CI 95% 71.2–79.0%) and 98.4% (CI 95% 96.7–99.4%) and specificity levels between 94.3% (CI 95% 91.2–96.5%) and 98.5% (CI 95% 96.9–99.4%), depending on the criterion used.
Finally, in the buffalo OTF herd with PTB infection, the IFN-γ test displayed high specificity values according to all 4 interpretative criteria, with specificity levels between 96.7% (CI 95% 88.4–99.5%) and 100% (CI 95% 96.2–100%), while SITT specificity proved unsatisfactory, with a level of 45.3% (CI 95% 35.0–55.7%). Our results showed that the IFN-γ test in the buffalo species could reach high sensitivity and specificity values, and that the level of sensitivity and specificity could be chosen based on the interpretative criterion and the antigens used, depending on the health status of the herd and the epidemiological context of the territory. The IFN-γ test and the use of different interpretative criteria proved to be useful to implement bTB diagnostic strategies in buffalo herds, with the possibility of a flexible use of the assay.
Affiliation(s)
- Alessandra Martucciello
- National Reference Centre for Hygiene and Technologies of Water Buffalo Farming and Productions, Istituto Zooprofilattico Sperimentale del Mezzogiorno, Salerno, Italy
- Nicoletta Vitale
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Turin, Italy
- Piera Mazzone
- Istituto Zooprofilattico Sperimentale dell'Umbria e delle Marche "Togo Rosati", Perugia, Italy
- Alessandro Dondo
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Turin, Italy
- Ivonne Archetti
- National Reference Centre for Bovine Tuberculosis, Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna, Brescia, Italy
- Laura Chiavacci
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Turin, Italy
- Anna Cerrone
- National Reference Centre for Hygiene and Technologies of Water Buffalo Farming and Productions, Istituto Zooprofilattico Sperimentale del Mezzogiorno, Salerno, Italy
- Lorena Schiavo
- National Reference Centre for Hygiene and Technologies of Water Buffalo Farming and Productions, Istituto Zooprofilattico Sperimentale del Mezzogiorno, Salerno, Italy
- Maria Lodovica Pacciarini
- National Reference Centre for Bovine Tuberculosis, Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna, Brescia, Italy
- Maria Beatrice Boniotti
- National Reference Centre for Bovine Tuberculosis, Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna, Brescia, Italy
- Esterina De Carlo
- National Reference Centre for Hygiene and Technologies of Water Buffalo Farming and Productions, Istituto Zooprofilattico Sperimentale del Mezzogiorno, Salerno, Italy
|
22
|
Petersen LJ, Johansen MN, Strandberg J, Stenholt L, Zacho HD. Reporting and handling of equivocal imaging findings in diagnostic studies of bone metastasis in prostate cancer. Acta Radiol 2020; 61:1096-1104. [PMID: 31821767 DOI: 10.1177/0284185119890087] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
BACKGROUND Equivocal scanning results occur. It remains unclear how these results are presented and how their management influences diagnostic characteristics. PURPOSE To investigate the reporting and handling of equivocal imaging findings in diagnostic studies of bone metastases, and to assess the impact on diagnostic performance of the methods used to analyze equivocal findings. The conceptual issue was reified based on two actual observations. MATERIAL AND METHODS A recent meta-analysis of bone metastases in prostate cancer was conducted and data were obtained from a large clinical trial with a true reference of bone metastasis, where diagnostic characteristics were calculated with equivocal scans handled in four ways: removed; considered malignant; considered benign; and analyzed by intention-to-diagnose. RESULTS The meta-analysis included 18 trials where the median proportion of reported equivocal results was 27%. Eleven (61%) studies reported an equivocal option for the index test; 42% reported equivocal results and described how these were analyzed. The clinical trial included 583 prostate cancer patients with 20% equivocal results. The different methods of managing equivocal findings resulted in highly variable outcomes: sensitivity = 85%-100%; specificity = 78%-99%; and positive and negative predictive values = 44%-94% and 97%-100%, respectively. The diagnostic performances obtained using the four methods were differentially susceptible to the proportion of equivocal imaging findings and the prevalence of bone metastases. CONCLUSION Reporting of equivocal results was inadequate in bone imaging trials. The handling of equivocal findings strongly influenced diagnostic accuracy.
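The four ways of handling equivocal scans listed in this abstract can be made concrete with a 3 × 2 table (index test result by reference standard). What follows is a minimal sketch, not the authors' analysis code: the counts are invented for illustration, the class and function names are hypothetical, and the intention-to-diagnose rule is assumed to count every equivocal scan as a wrong result (a false negative when disease is present, a false positive when it is absent).

```python
from dataclasses import dataclass


@dataclass
class ThreeByTwo:
    """Index-test results cross-tabulated against the reference standard."""
    tp: int      # test positive, disease present
    fn: int      # test negative, disease present
    eq_pos: int  # test equivocal, disease present
    fp: int      # test positive, disease absent
    tn: int      # test negative, disease absent
    eq_neg: int  # test equivocal, disease absent


def sens_spec(t: ThreeByTwo, method: str) -> tuple[float, float]:
    """Return (sensitivity, specificity) under one equivocal-handling method."""
    if method == "removal":
        # Equivocal scans are dropped from the analysis.
        tp, fn, fp, tn = t.tp, t.fn, t.fp, t.tn
    elif method == "malignant":
        # Every equivocal scan is read as positive.
        tp, fn, fp, tn = t.tp + t.eq_pos, t.fn, t.fp + t.eq_neg, t.tn
    elif method == "benign":
        # Every equivocal scan is read as negative.
        tp, fn, fp, tn = t.tp, t.fn + t.eq_pos, t.fp, t.tn + t.eq_neg
    elif method == "intention_to_diagnose":
        # Assumption: every equivocal scan counts against the test.
        tp, fn, fp, tn = t.tp, t.fn + t.eq_pos, t.fp + t.eq_neg, t.tn
    else:
        raise ValueError(f"unknown method: {method}")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, chosen only to show how the estimates move.
table = ThreeByTwo(tp=40, fn=5, eq_pos=5, fp=10, tn=80, eq_neg=10)
for m in ("removal", "malignant", "benign", "intention_to_diagnose"):
    se, sp = sens_spec(table, m)
    print(f"{m:>22}: Se={se:.2f} Sp={sp:.2f}")
```

Even with a modest share of equivocal scans, the same data give different sensitivity and specificity estimates under each rule, which is the effect the abstract reports.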
Affiliation(s)
- Lars J Petersen
- Department of Nuclear Medicine, Clinical Cancer Research Centre, Aalborg University Hospital, Aalborg, Denmark
- Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
- Jesper Strandberg
- Department of Nuclear Medicine, Clinical Cancer Research Centre, Aalborg University Hospital, Aalborg, Denmark
- Louise Stenholt
- The Medical Library, Aalborg University Hospital, Aalborg, Denmark
- Helle D Zacho
- Department of Nuclear Medicine, Clinical Cancer Research Centre, Aalborg University Hospital, Aalborg, Denmark
- Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
|
23
|
Landsheer JA. Impact of the Prevalence of Cognitive Impairment on the Accuracy of the Montreal Cognitive Assessment: The Advantage of Using two MoCA Thresholds to Identify Error-prone Test Scores. Alzheimer Dis Assoc Disord 2020; 34:248-253. [PMID: 31934880 PMCID: PMC7497609 DOI: 10.1097/wad.0000000000000365] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2019] [Accepted: 11/10/2019] [Indexed: 11/26/2022]
Abstract
OBJECTIVES The focus of this study is the classification accuracy of the Montreal Cognitive Assessment (MoCA) for the detection of cognitive impairment (CI). Classification accuracy can be low when the prevalence of CI is either high or low in a clinical sample. A more robust result can be expected when avoiding the range of test scores within which most classification errors are expected, with adequate predictive values for more clinical settings. METHODS The classification methods have been applied to the MoCA data of 5019 patients in the Uniform Data Set of the University of Washington's National Alzheimer's Coordinating Center, to which 30 Alzheimer Disease Centers (ADCs) contributed. RESULTS The ADCs show sample prevalence of CI varying from 0.22 to 0.87. Applying an optimal cutoff score of 23, the MoCA showed both a positive predictive value (PPV) and a negative predictive value (NPV) ≥0.8 for only 3 of 30 ADCs, a PPV ≥0.8 for 18, and an NPV ≥0.8 for 13. Overall, the test scores between 22 and 25 have low odds of true against false decisions of 1.14 and contain 55.3% of all errors when applying the optimal dichotomous cut-point. Excluding the range 22 to 25 offers higher classification accuracies for the samples of the individual ADCs. Sixteen of 30 ADCs showed both NPV and PPV ≥0.8, 25 showed a PPV ≥0.8, and 21 showed an NPV ≥0.8. CONCLUSION In comparison to a dichotomous threshold, considering the most error-prone test scores as uncertain enables a classification that offers adequate classification accuracies in a larger number of clinical settings.
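The two-threshold idea in this abstract can be written as a three-way decision rule. This is an illustrative sketch only: the function name is hypothetical, the uncertain band of 22 to 25 is taken from the abstract and treated as inclusive, and the code relies on the general facts that MoCA totals run from 0 to 30 and that lower scores indicate more impairment.

```python
def classify_moca(score: int, lower: int = 22, upper: int = 25) -> str:
    """Three-way classification with an uncertain band [lower, upper]."""
    if not 0 <= score <= 30:
        raise ValueError("MoCA total scores range from 0 to 30")
    if score < lower:
        return "impaired"        # below the uncertain band
    if score > upper:
        return "not impaired"    # above the uncertain band
    return "uncertain"           # error-prone scores: defer the decision


for s in (18, 22, 24, 25, 27):
    print(s, classify_moca(s))
```

Predictive values are then computed only over the scores that receive a firm label, which is why they can stay adequate across a wider range of prevalences than a single cut-point allows.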
Affiliation(s)
- Johannes A Landsheer
- Department of Methods and Statistics, Faculty of Social Sciences, Utrecht University, Utrecht, The Netherlands
|
24
|
Ultrasound and CT in the Diagnosis of Appendicitis: Accuracy With Consideration of Indeterminate Examinations According to STARD Guidelines. AJR Am J Roentgenol 2020; 215:639-644. [PMID: 32406773 DOI: 10.2214/ajr.19.22370] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
Abstract
OBJECTIVE. The objective of our study was to determine the accuracy of ultrasound (US) and CT in diagnosing appendicitis at our institution while taking into account the number of indeterminate examinations in accordance with the Standards for Reporting Diagnostic Accuracy (STARD) guidelines. MATERIALS AND METHODS. We retrospectively evaluated 790 patients who underwent US, CT, or both for evaluation of suspected appendicitis between May 1, 2013, and April 30, 2015. Patient characteristics and US and CT examination results were recorded. The reference standard was histopathology or 3 months of medical record follow-up if surgery was not performed; 3 × 2 tables were generated, and sensitivity, specificity, overall test yield, and accuracy were calculated according to STARD guidelines. For surgical cases, time to surgery (one-way ANOVA) was compared among patients who underwent US alone, CT alone, or both US and CT. RESULTS. A total of 473 of 562 US examinations had indeterminate findings (overall test yield, 15.8%); sensitivity and specificity in the 89 diagnostic examinations were 98.5% and 54.2%, respectively. Thirteen of 522 CT examinations were indeterminate (overall test yield, 97.5%); sensitivity and specificity in the remaining 509 CT examinations were 98.9% and 97.2%, respectively. Taking indeterminate studies into account, the accuracy was 13.7% for US and 95.6% for CT. The negative appendectomy rates were 17.7% (11/62) for US and 3.3% (9/276) for CT (p = 0.0002). Time to surgery was longer for patients who underwent US and CT (mean ± SD, 17.7 ± 8.9 hours) than US alone (12.9 ± 6.4 hours; p = 0.002) but was not longer for patients who underwent CT alone (16.3 ± 8.4 hours; p = 0.45). CONCLUSION. At our institution, a large proportion of US examinations are indeterminate for appendicitis. CT is the preferred first-line imaging test for evaluating appendicitis in nonobstetric adult patients.
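The overall test yield quoted in this abstract is consistent with the share of examinations that gave a diagnostic (non-indeterminate) result. The short sketch below checks that reading against the counts in the abstract; the function name is hypothetical and the definition of yield is inferred from the reported numbers rather than quoted from the STARD guidelines.

```python
def overall_yield(total_exams: int, indeterminate: int) -> float:
    """Percentage of examinations with a diagnostic (non-indeterminate) result."""
    if total_exams <= 0 or not 0 <= indeterminate <= total_exams:
        raise ValueError("need total_exams > 0 and 0 <= indeterminate <= total_exams")
    return 100.0 * (total_exams - indeterminate) / total_exams


# Counts taken from the abstract.
us_yield = overall_yield(total_exams=562, indeterminate=473)
ct_yield = overall_yield(total_exams=522, indeterminate=13)
print(f"US yield: {us_yield:.1f}%")
print(f"CT yield: {ct_yield:.1f}%")
```

With these counts the sketch reproduces the reported yields of 15.8% for US and 97.5% for CT.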
|
25
|
Nikoloulopoulos AK. A multinomial quadrivariate D-vine copula mixed model for meta-analysis of diagnostic studies in the presence of non-evaluable subjects. Stat Methods Med Res 2020; 29:2988-3005. [PMID: 32323626 PMCID: PMC7682507 DOI: 10.1177/0962280220913898] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Diagnostic test accuracy studies observe the result of a gold standard procedure that defines the presence or absence of a disease and the result of a diagnostic test. They typically report the number of true positives, false positives, true negatives and false negatives. However, diagnostic test outcomes can also be either non-evaluable positives or non-evaluable negatives. We propose a novel model for the meta-analysis of diagnostic studies in the presence of non-evaluable outcomes, which assumes independent multinomial distributions for the true and non-evaluable positives, and, the true and non-evaluable negatives, conditional on the latent sensitivity, specificity, probability of non-evaluable positives and probability of non-evaluable negatives in each study. For the random effects distribution of the latent proportions, we employ a drawable vine copula that can successively model the dependence in the joint tails. Our methodology is demonstrated with an extensive simulation study and applied to data from diagnostic accuracy studies of coronary computed tomography angiography for the detection of coronary artery disease. The comparison of our method with the existing approaches yields findings in the real data application that change the current conclusions.
|
26
|
Mishra H, Reeve BWP, Palmer Z, Caldwell J, Dolby T, Naidoo CC, Jackson JG, Schumacher SG, Denkinger CM, Diacon AH, van Helden PD, Marx FM, Warren RM, Theron G. Xpert MTB/RIF Ultra and Xpert MTB/RIF for diagnosis of tuberculosis in an HIV-endemic setting with a high burden of previous tuberculosis: a two-cohort diagnostic accuracy study. THE LANCET. RESPIRATORY MEDICINE 2020; 8:368-382. [PMID: 32066534 DOI: 10.1016/s2213-2600(19)30370-4] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/18/2019] [Revised: 09/19/2019] [Accepted: 09/20/2019] [Indexed: 01/26/2023]
Abstract
BACKGROUND Xpert MTB/RIF Ultra (Ultra) is a new test for tuberculosis undergoing global roll-out. We assessed the performance of Ultra compared with Xpert MTB/RIF (Xpert) in an HIV-endemic setting where previous tuberculosis is frequent and current test performance is suboptimal. METHODS In this two-cohort diagnostic accuracy study, we used sputum samples from patients in South Africa to evaluate the accuracy of Ultra and Xpert against a single culture reference standard. For the first cohort (cohort A), we recruited adults (aged ≥18 years) with symptoms of presumptive tuberculosis at Scottsdene clinic in Cape Town, South Africa. We collected three sputum samples from each patient in cohort A: two at the first visit, of which one was tested using Xpert and the other was tested using culture, and one sample the next morning, which was tested using Ultra. In a separate cohort of patients with presumptive tuberculosis and recent previous tuberculosis (≤2 years) who had submitted sputum samples to the National Health Laboratory Services (cohort B), decontaminated sediments were, after processing, randomly allocated (1:1) for testing with Ultra or Xpert. For both cohorts we calculated the sensitivity and specificity of Ultra and Xpert and evaluated the effects of different methods of interpreting Ultra trace results. FINDINGS Between Feb 6, 2016, and Feb 2, 2018, we recruited 302 people into cohort A, all of whom provided sputum samples; 239 were included in the head-to-head analyses of Ultra and Xpert. For cohort B, we collected sputum samples from eligible patients who had submitted samples between Dec 6, 2016, and Dec 21, 2017, to give a cohort of 831 samples, of which 352 were eligible for inclusion in analyses and randomly assigned to Ultra (n=173) or Xpert (n=179). In cohort A, Ultra gave more non-actionable results (not positive or negative) than did Xpert (28 [10%] of 275 vs 14 [5%] of 301; p=0·011).
In the head-to-head analysis, in smear-negative patients, sensitivity of Ultra was 80% (95% CI 64-90) and of Xpert was 73% (57-85; p=0·45). Overall, specificity of Ultra was lower than that of Xpert (90% [84-94] vs 99% [95-100]; p=0·001). In cohort B, overall sensitivity was 92% (81-98) for Xpert versus 86% (73-95; p=0·36) for Ultra and overall specificity was 69% (60-77) for Ultra versus 84% (78-91; p=0·005) for Xpert. Ultra specificity estimates improved after reclassification of results with the lowest Ultra-positive semiquantitation category (trace) to negative (15% [8-22]). In cohort A, the positive predictive value (PPV) for Ultra was 78% (67-87) and for Xpert was 96% (87-99; p=0·004); in cohort B, the PPV for Ultra was 50% (43-57) and for Xpert was 70% (61-78; p=0·014). Ultra PPV estimates in previously treated patients were low: at 15% tuberculosis prevalence, half of Ultra-positive patients with presumptive tuberculosis would be culture negative, increasing to approximately 70% in patients with recent previous tuberculosis. In cohort B, 21 (28%) of 76 samples that were Ultra positive were rifampicin indeterminate (all trace) and, like cohort A, most were culture negative (19 [90%] of 21). INTERPRETATION In a setting with a high burden of previous tuberculosis, Ultra generated more non-actionable results and had diminished specificity compared with Xpert. In patients with recent previous tuberculosis, a quarter of Ultra-positive samples were indeterminate for rifampicin resistance and culture negative, suggesting that additional drug-resistance testing will probably be unsuccessful. Our data have implications for the handling of Ultra-positive results in patients with previous tuberculosis in high burden settings. FUNDING South African Medical Research Council, the EDCTP2 program, and the Faculty of Medicine and Health Sciences, Stellenbosch University.
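The dependence of PPV on prevalence described in this abstract follows from Bayes' rule. The sketch below illustrates the relationship only; the function name is hypothetical and the sensitivity and specificity values are placeholders, not the study's estimates, so it does not attempt to reproduce the reported figures.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence                # diseased and test positive
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # healthy and test positive
    return true_pos / (true_pos + false_pos)


# Placeholder accuracy values, held fixed while prevalence varies.
for prev in (0.05, 0.15, 0.30):
    print(f"prevalence {prev:.0%}: PPV = {ppv(0.90, 0.90, prev):.2f}")
```

Holding accuracy fixed, a lower prevalence or a lower specificity means a larger share of positive results are false positives, which is why a drop in specificity matters most in low-prevalence settings.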
Affiliation(s)
- Hridesh Mishra
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa
- Byron W P Reeve
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa
- Zaida Palmer
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa
- Tania Dolby
- National Health Laboratory Services, Cape Town, South Africa
- Charissa C Naidoo
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa
- Jennifer G Jackson
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa
- Claudia M Denkinger
- FIND, Geneva, Switzerland; University of Heidelberg, Division of Tropical Medicine, Center of Infectious Diseases, Heidelberg, Germany
- Andreas H Diacon
- Faculty of Medicine and Health Sciences, Division of Medical Physiology, Stellenbosch University, Cape Town, South Africa
- Paul D van Helden
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa
- Florian M Marx
- Desmond Tutu TB Centre, Department of Paediatrics and Child Health, Stellenbosch University, Cape Town, South Africa; DST-NRF South African Centre of Excellence in Epidemiological Modelling and Analysis (SACEMA), Stellenbosch University, Cape Town, South Africa
- Robin M Warren
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa
- Grant Theron
- NRF-DST Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Cape Town, South Africa.
|
27
|
The Clinical Utility of Chest Radiography for Identifying Pneumonia: Accounting for Diagnostic Uncertainty in Radiology Reports. AJR Am J Roentgenol 2019; 213:1207-1212. [PMID: 31509449 DOI: 10.2214/ajr.19.21521] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
OBJECTIVE. Currently, chest radiography is the first-line imaging test for identifying pneumonia; chest CT is considered the reference standard. The purpose of this study was to calculate the statistical measures of performance of chest radiography for identifying pneumonia when taking into account uncertain results of both chest radiography and CT examinations. MATERIALS AND METHODS. Statistical measures of performance of chest radiography, using CT as the reference standard, were calculated with 95% CIs by varying uncertain radiology report impressions of both chest radiography and CT to all negative or all positive. The resulting scenarios were as follows: scenario 1, uncertain chest radiography and CT impressions are considered positive for pneumonia; scenario 2, uncertain chest radiography impressions are positive but uncertain CT impressions are negative; scenario 3, uncertain chest radiography impressions are negative and uncertain CT impressions are positive; scenario 4, uncertain chest radiography and CT impressions are negative; and scenario 5, uncertain chest radiography and CT impressions are excluded. RESULTS. A retrospective analysis of 2411 patient visits revealed the prevalence of uncertain radiology report impressions to be 31.8% for chest radiography and 21.7% for CT. Scenario 1 yielded the following performance values: sensitivity, 51.9%; specificity, 71.3%; PPV, 59.4%; and NPV, 64.5%. Scenario 2 produced the following performance values: sensitivity, 59.6%; specificity, 67.1%; PPV, 59.6%; and NPV, 67.1%. Scenario 3 showed the following performance values: sensitivity, 13.4%; specificity, 97.7%; PPV, 82.6%; and NPV, 58.1%. Scenario 4 yielded the following performance values: sensitivity, 19.6%; specificity, 96.4%; PPV, 81.6%; and NPV, 59.5%. Scenario 5 produced the following performance values: sensitivity, 32.7%; specificity, 96.8%; PPV, 89.2%; and NPV, 63.8%. CONCLUSION. 
Uncertain chest radiography results for the evaluation of pneumonia are prevalent. A chest radiography impression using the strongest language in support of a pneumonia diagnosis is useful to rule in pneumonia radiographically, but a negative result performs poorly at ruling out disease.
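The five scenarios in this abstract differ only in how an uncertain impression is recoded for the index test (chest radiography) and for the reference standard (CT). The following is a small sketch under stated assumptions: the labels, function names, and report pairs are hypothetical, and scenario 5 is read as dropping any visit in which either impression is uncertain.

```python
def recode(label: str, uncertain_as: str):
    """Map an impression to True/False, or None when the visit is excluded."""
    if label == "uncertain":
        if uncertain_as == "exclude":
            return None
        return uncertain_as == "positive"
    return label == "positive"


def two_by_two(pairs, cxr_rule: str, ct_rule: str) -> dict:
    """Count tp/fp/fn/tn for (chest radiograph, CT) impression pairs."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for cxr, ct in pairs:
        index, ref = recode(cxr, cxr_rule), recode(ct, ct_rule)
        if index is None or ref is None:
            continue  # excluded from the analysis
        key = ("t" if index == ref else "f") + ("p" if index else "n")
        counts[key] += 1
    return counts


# (chest radiography rule, CT rule) for scenarios 1 to 5 in the abstract.
SCENARIOS = {
    1: ("positive", "positive"),
    2: ("positive", "negative"),
    3: ("negative", "positive"),
    4: ("negative", "negative"),
    5: ("exclude", "exclude"),
}

# Hypothetical report pairs, for illustration only.
pairs = [
    ("positive", "positive"), ("uncertain", "positive"),
    ("negative", "uncertain"), ("uncertain", "uncertain"),
    ("negative", "negative"),
]
for number, (cxr_rule, ct_rule) in SCENARIOS.items():
    print(number, two_by_two(pairs, cxr_rule, ct_rule))
```

Because the recoding changes both the index result and the reference result, the same set of reports can yield quite different sensitivity and specificity across scenarios.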
|
28
|
Riddle DL. Letter to the Editor on "Unexplained Painful Hip Arthroplasty: What Should We Find? Diagnostic Approach and Results". J Arthroplasty 2019; 34:2195-2196. [PMID: 31253448 DOI: 10.1016/j.arth.2019.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/13/2019] [Accepted: 06/03/2019] [Indexed: 02/01/2023] Open
Affiliation(s)
- Daniel L Riddle
- Departments of Physical Therapy, Orthopaedic Surgery and Rheumatology, Virginia Commonwealth University, Richmond, VA
|
29
|
Fitzpatrick M, Rac VE, Mitsakakis N, Abrahamyan L, Pechlivanoglou P, Chung S, Carcone SM, Pham B, Kendzerska T, Zwarenstein M, Gottschalk R, George C, Kashgari A, Krahn M. SIESTA - Home sleep study with BresoDx for obstructive sleep apnea: a randomized controlled trial. Sleep Med 2019; 65:45-53. [PMID: 31707288 DOI: 10.1016/j.sleep.2019.07.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/13/2018] [Revised: 07/03/2019] [Accepted: 07/08/2019] [Indexed: 11/25/2022]
Abstract
STUDY OBJECTIVES The objectives of this study were to evaluate (1) the accuracy of the clinical diagnosis of obstructive sleep apnea (OSA) informed by the home sleep study with a Type 4 portable monitor BresoDx® versus Type 1 polysomnography (PSG); and (2) agreement of the apnea-hypopnea index (AHI) between BresoDx and PSG. MATERIAL AND METHODS This was a randomized, parallel, multicentre, single-blind, pragmatic controlled trial enrolling adults referred to three Ontario sleep clinics for suspected OSA. Participants were randomized to one of two testing sequences: BresoDx followed by PSG (one night apart) or PSG followed by BresoDx. The primary outcomes included the accuracy of clinical diagnosis and OSA severity measured by AHI between tests. RESULTS In sum, 233 participants completed both sleep studies and 206 completed physician consultation visits. The agreement between clinical diagnosis informed by PSG versus BresoDx was fair (Cohen's kappa coefficient = 0.28). The sensitivity of BresoDx-informed clinical diagnosis against PSG was between 0.86 and 0.89, and the specificity between 0.38 and 0.44. For an AHI cut-off of ≥5 events/hour, the sensitivity, specificity, and positive and negative predictive values were 0.85, 0.48, 0.81 and 0.54, respectively. CONCLUSIONS Home sleep apnea testing with BresoDx can be used in a referral population with a high pretest probability of OSA, similar to other Type IV devices. This study complements the existing body of evidence suggesting that home testing with portable devices plays a valuable role in diagnosing OSA in a variety of settings. SIESTA TRIAL REGISTRATION: www.clinicaltrials.gov (Identifier: NCT02003729).
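Cohen's kappa, the agreement statistic reported in this abstract, corrects the observed agreement for the agreement expected by chance. The sketch below computes it from a square agreement table; the function name and the example counts are hypothetical and unrelated to the trial data.

```python
def cohens_kappa(table: list[list[int]]) -> float:
    """Cohen's kappa for a square table (rows: rater A, columns: rater B)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of the two raters' marginal proportions.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: rows = diagnosis with test A, columns = with test B.
agreement = [[20, 5],
             [10, 15]]
print(f"kappa = {cohens_kappa(agreement):.2f}")
```

A kappa well below the raw percentage agreement, as in this example, shows how much of the apparent agreement the two diagnoses would reach by chance alone.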
Affiliation(s)
- Valeria E Rac
- Ted Rogers Centre for Heart Research, Peter Munk Cardiac Centre, Toronto General Hospital, University Health Network, Toronto, Ontario, Canada; Toronto Health Economics and Technology Assessment (THETA) Collaborative, Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada; Institute of Health Policy, Management and Evaluation (IHPME), University of Toronto, Toronto, Ontario, Canada
- Nicholas Mitsakakis
- Toronto Health Economics and Technology Assessment (THETA) Collaborative, Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada; Institute of Health Policy, Management and Evaluation (IHPME), University of Toronto, Toronto, Ontario, Canada
- Lusine Abrahamyan
- Toronto Health Economics and Technology Assessment (THETA) Collaborative, Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada; Institute of Health Policy, Management and Evaluation (IHPME), University of Toronto, Toronto, Ontario, Canada
- Petros Pechlivanoglou
- Institute of Health Policy, Management and Evaluation (IHPME), University of Toronto, Toronto, Ontario, Canada; Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada
- Suzanne Chung
- Toronto Health Economics and Technology Assessment (THETA) Collaborative, Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada
- Steven M Carcone
- Toronto Health Economics and Technology Assessment (THETA) Collaborative, Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada
- Ba' Pham
- Toronto Health Economics and Technology Assessment (THETA) Collaborative, Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada
- Charles George
- Department of Medicine, Western University, London, Ontario, Canada
- Alia Kashgari
- Department of Medicine, Western University, London, Ontario, Canada
- Murray Krahn
- Toronto Health Economics and Technology Assessment (THETA) Collaborative, Toronto General Hospital Research Institute, University Health Network, Toronto, Ontario, Canada; Institute of Health Policy, Management and Evaluation (IHPME), University of Toronto, Toronto, Ontario, Canada; Department of Medicine, University of Toronto, Toronto, Ontario, Canada
|
30
|
Jackson TJ, Williams RD, Brok J, Chowdhury T, Ronghe M, Powis M, Pritchard-Jones K, Vujanić GM. The diagnostic accuracy and clinical utility of pediatric renal tumor biopsy: Report of the UK experience in the SIOP UK WT 2001 trial. Pediatr Blood Cancer 2019; 66:e27627. [PMID: 30761727 PMCID: PMC6522371 DOI: 10.1002/pbc.27627] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/08/2018] [Revised: 11/19/2018] [Accepted: 12/06/2018] [Indexed: 01/13/2023]
Abstract
INTRODUCTION The International Society of Paediatric Oncology (SIOP) protocols recommend preoperative chemotherapy appropriate for Wilms tumors (WTs) in children with renal tumors aged ≥6 months, reserving biopsy for "atypical" cases. The Children's Cancer and Leukaemia Group (CCLG) joined the SIOP-WT-2001 study but continued the national practice of biopsy at presentation. METHOD Retrospective study of concordance between locally reported renal tumor biopsies and central pathology review nephrectomy diagnoses of children enrolled by CCLG centers in the SIOP-WT-2001 study. RESULTS Biopsy reports were available for 552/787 children with unilateral tumors. 36 of 552 (6.5%) were nondiagnostic: 2 normal tissue, 12 necrotic, 9 insufficient sample, and 13 indeterminate results (disproportionately non-WTs). The sensitivity and specificity of biopsy to identify tumors that did not require SIOP empirical preoperative chemotherapy were 86.0% and 99.6%, respectively. 13 of 548 (2.4%) biopsy results were discordant with nephrectomy; non-WTs other than renal cell carcinoma and clear cell sarcoma of the kidney (CCSK) were poorly recognized. In children aged 6-119 months, 480 of 518 (91.6%) had WT or nephroblastomatosis. 5 of 518 (1%) had benign tumors, only one of which was diagnosed on biopsy. Biopsy results correctly changed clinical management in 25 of 518 (4.8%), including identifying 19 of 20 CCSKs, but would have led to overtreatment in 5 of 518 (1%) or undertreatment in 4 of 518 (0.8%). In children aged ≥10 years, biopsy correctly changed management in 5 of 19 (26%) cases with no discordance. CONCLUSION Biopsy is less effective at identifying non-WTs than WTs and rarely changes management in younger children. Biopsy should be reserved in SIOP protocols for children ≥10 years and for younger children with clinical or radiological features inconsistent with WT.
Affiliation(s)
- Thomas J Jackson
- University College London Great Ormond Street Institute of Child Health, London
- Richard D Williams
- University College London Great Ormond Street Institute of Child Health, London
- Jesper Brok
- University College London Great Ormond Street Institute of Child Health, London
- Department of Paediatric Oncology and Haematology, Rigshospitalet, Copenhagen, Denmark
- Tanzina Chowdhury
- University College London Great Ormond Street Institute of Child Health, London
- Department of Oncology, Great Ormond Street Hospital NHS Foundation Trust, London WC1N 3JH
- Milind Ronghe
- Department of Paediatric Oncology, Royal Hospital for Children, Glasgow
- Mark Powis
- Department of Paediatric Surgery, Leeds Teaching Hospital NHS Trust, Leeds
- Gordan M. Vujanić
- Department of Cellular Pathology, University Hospital of Wales, Cardiff, UK
- Department of Pathology, Sidra Medicine, Doha, Qatar
|
31
|
Edler L, Ittrich C. Biostatistical Methods for the Validation of Alternative Methods for In Vitro Toxicity Testing. Altern Lab Anim 2019; 31 Suppl 1:5-41. [PMID: 15595899 DOI: 10.1177/026119290303101s02] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Abstract
Statistical methods for the validation of toxicological in vitro test assays are developed and applied. Validation is performed either in comparison with in vivo assays or in comparison with other in vitro assays of established validity. Biostatistical methods are presented which are of potential use and benefit for the validation of alternative methods for the risk assessment of chemicals, providing at least an equivalent level of protection through in vitro toxicity testing to that obtained through the use of current in vivo methods. Characteristic indices are developed and determined. Qualitative outcomes are characterised by the rates of false-positive and false-negative predictions, sensitivity and specificity, and predictive values. Quantitative outcomes are characterised by regression coefficients derived from predictive models. The receiver operating characteristics (ROC) technique, applicable when a continuum of cut-off values is considered, is discussed in detail, in relation to its use for statistical modelling and statistical inference. The methods presented are examined for their use for the proof of safety and for toxicity detection and testing. We emphasise that the final validation of toxicity testing is human toxicity, and that the in vivo test itself is only a predictor with an inherent uncertainty. Therefore, the validation of the in vitro test has to account for the vagueness and uncertainty of the "gold standard" in vivo test. We address model selection and model validation, and a four-step scheme is proposed for the conduct of validation studies. Gaps and research needs are formulated to improve the validation of alternative methods for in vitro toxicity testing.
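The ROC technique mentioned in this abstract traces sensitivity against the false-positive rate as the cut-off moves across a continuous test result. Below is a minimal pure-Python sketch, not the authors' method: the function names and the example scores are hypothetical, a result at or above the cut-off is called positive, and the area under the curve is computed with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """(false-positive rate, true-positive rate) at every observed cut-off."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # cut-off above every score: nothing called positive
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical assay readouts; label 1 = toxic according to the reference test.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print(f"AUC = {auc(curve):.3f}")
```

Each point on the curve corresponds to one choice of cut-off, so the curve summarises the whole trade-off between false-positive and false-negative predictions rather than a single operating point.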
Affiliation(s)
- Lutz Edler
- Biostatistics Unit, C060, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
|
32
|
Pellerin O, Pereira H, Van Ngoc Ty C, Moussa N, Del Giudice C, Pernot S, Déan C, Chatellier G, Sapoval M. Is dual-phase C-arm CBCT sufficiently accurate for the diagnosis of colorectal cancer liver metastasis during liver intra-arterial treatment? Eur Radiol 2019; 29:5253-5263. [PMID: 30937583 DOI: 10.1007/s00330-019-06173-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/11/2018] [Revised: 03/05/2019] [Accepted: 03/15/2019] [Indexed: 01/01/2023]
Abstract
PURPOSE This study aimed to estimate the accuracy of dual-phase C-arm cone beam computed tomography (CBCT) for the detection of colorectal cancer liver metastases, as compared with multidetector computed tomography (MDCT). MATERIALS AND METHODS Between March 2014 and December 2016, 49 consecutive patients referred for intra-arterial treatment for colorectal cancer liver metastases were enrolled in a single-center observational study. All patients were examined with MDCT and with dual-phase C-arm cone beam computed tomography performed after iodine injection in the proper hepatic artery before intra-arterial treatment. Two blinded observers independently reviewed all examinations. Diagnostic accuracy was determined using both a six-cell matrix method and a "worst-case scenario." RESULTS Readers identified at MDCT 264 colorectal liver metastases and 43 other liver lesions. The early and late arterial phases showed 240 and 277 liver lesions, respectively. A certainty of the diagnosis was obtained in 63% and 85% at the early (EAP) and late arterial phase (LAP), respectively. Streak artifacts, liver segment truncation, or inadequate enhancement was responsible for the inability to see or to correctly adjudicate a lesion to a diagnosis in 27% and 15% of the cases at the EAP and LAP. The "worst-case scenario" yielded a Se and Sp of 58% and 51%, respectively, at EAP and 84% and 70%, respectively, at LAP. CONCLUSION On CBCT, EAP showed limited accuracy. LAP provided the best tumor detectability. KEY POINTS • The early arterial phase (EAP) yielded poor accuracy: Se = 58% and Sp = 51% (p < 0.0001). • The late arterial phase (LAP) yielded good accuracy: Se = 84% and Sp = 70% (p = 0.02). • The probability of a correct diagnosis at the EAP was 60%.
Affiliation(s)
- Olivier Pellerin
- INSERM U970, Paris, France. .,Université Paris Descartes, Sorbonne Paris Cité, Paris, France. .,Department of Interventional Radiology, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, 20 rue Leblanc, 75015, Paris, France.
| | - Helena Pereira
- Clinical Research Unit, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France.,INSERM U1418, Paris, France
| | | | - Nadia Moussa
- Université Paris Descartes, Sorbonne Paris Cité, Paris, France.,Department of Interventional Radiology, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, 20 rue Leblanc, 75015, Paris, France
| | - Costantino Del Giudice
- INSERM U970, Paris, France.,Université Paris Descartes, Sorbonne Paris Cité, Paris, France.,Department of Interventional Radiology, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, 20 rue Leblanc, 75015, Paris, France
| | - Simon Pernot
- Université Paris Descartes, Sorbonne Paris Cité, Paris, France.,Department of Digestive Oncology, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France
| | - Carole Déan
- Department of Interventional Radiology, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, 20 rue Leblanc, 75015, Paris, France
| | - Gilles Chatellier
- Université Paris Descartes, Sorbonne Paris Cité, Paris, France.,Clinical Research Unit, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France.,INSERM U1418, Paris, France
| | - Marc Sapoval
- INSERM U970, Paris, France.,Université Paris Descartes, Sorbonne Paris Cité, Paris, France.,Department of Interventional Radiology, Hôpital Européen Georges Pompidou, Assistance Publique - Hôpitaux de Paris, 20 rue Leblanc, 75015, Paris, France
|
33
|
Blunt Thoracolumbar-Spine Trauma Evaluation in the Emergency Department: A Meta-Analysis of Diagnostic Accuracy for History, Physical Examination, and Imaging. J Emerg Med 2018; 56:153-165. [PMID: 30598296 DOI: 10.1016/j.jemermed.2018.10.032] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 03/25/2018] [Revised: 10/16/2018] [Accepted: 10/25/2018] [Indexed: 01/04/2023]
Abstract
BACKGROUND Delayed diagnoses of unstable thoracolumbar spine (TL-spine) fractures can result in neurologic deficits and avoidable pain, so it is important for clinicians to reach prompt diagnostic decisions. There are no validated decision aids for determining which trauma patients warrant TL-spine imaging. OBJECTIVE Our aim was to quantify the diagnostic accuracy of the injury mechanism, physical examination, associated injuries, clinical decision aids, and imaging for evaluating blunt TL-spine trauma patients. METHODS A search strategy for studies including adult blunt TL-spine trauma using PubMed, Embase, Scopus, CENTRAL, Cochrane Database of Systematic Reviews, and ClinicalTrials.gov was performed. Excluded studies lacked data to construct 2 × 2 tables, were duplicates, were not primary research, did not focus on blunt trauma, examined associated injuries without any utility in identifying TL-spine injuries, only studied cervical-spine fractures, were non-English, had a pediatric setting, or were cadaver/autopsy reports. Risk of bias was assessed using the Quality Assessment Tool for Diagnostic Accuracy Studies. Diagnostic predictors were analyzed with a meta-analysis of sensitivity, specificity, and likelihood ratios. RESULTS In blunt trauma patients in the emergency department, the weighted pretest probability of a TL-spine fracture was 15%. The estimates for detection of TL-spine fractures with plain film were: positive likelihood ratio (+LR) = 25.0 (95% confidence interval [CI] 4.1-152.2; I2 = 94%; p < 0.001) and negative likelihood ratio (-LR) = 0.43 (95% CI 0.32-0.59; I2 = 84%; p < 0.001), and for computed tomography (CT) were: +LR = 81.1 (95% CI 14.1-467.9; I2 = 87%; p < 0.001) and -LR = 0.04 (95% CI 0.02-0.08; I2 = 23%; p = 0.26). CONCLUSIONS CT is more accurate than plain films for detecting TL-spine fractures. Injury mechanism, physical examination, and associated injuries alone are not accurate to rule-in or rule-out TL-spine fractures.
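To see what the reported likelihood ratios mean for an individual patient, the pretest probability can be converted to odds, multiplied by the likelihood ratio, and converted back. The sketch below applies that standard identity to the point estimates quoted in the abstract (pretest probability 15%; CT +LR 81.1 and -LR 0.04; plain film +LR 25.0 and -LR 0.43). The function name is hypothetical, and the confidence intervals around the ratios are ignored here.

```python
def post_test_probability(pretest: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pretest probability via odds."""
    if not 0.0 < pretest < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pretest_odds = pretest / (1.0 - pretest)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


PRETEST = 0.15  # weighted pretest probability reported in the abstract

# Point estimates from the abstract; interval uncertainty is not propagated.
ct_positive = post_test_probability(PRETEST, 81.1)
ct_negative = post_test_probability(PRETEST, 0.04)
plain_positive = post_test_probability(PRETEST, 25.0)
plain_negative = post_test_probability(PRETEST, 0.43)

print(f"CT:         positive {ct_positive:.1%}, negative {ct_negative:.1%}")
print(f"Plain film: positive {plain_positive:.1%}, negative {plain_negative:.1%}")
```

Under these point estimates a negative CT lowers the probability of fracture to under 1%, while a negative plain film leaves it at roughly 7%, which is consistent with the abstract's conclusion that CT is the more accurate modality.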
|
34
|
Paterson C, Ghaemi J, Alashkham A, Biyani CS, Coles B, Baker L, Szewczyk-Bieda M, Nabi G. Diagnostic accuracy of image-guided biopsies in small (<4 cm) renal masses with implications for active surveillance: a systematic review of the evidence. Br J Radiol 2018; 91:20170761. [PMID: 29888978 DOI: 10.1259/bjr.20170761] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 02/06/2023] Open
Abstract
OBJECTIVE: To determine the safety and diagnostic accuracy of renal tumour biopsies (RTB) in a defined population of small renal masses (SRMs) <4 cm using a 3 × 2 table, intention-to-diagnose approach. The 3 × 2 table approach treats indeterminate results as a separate category rather than forcing them into the traditional 2 × 2 table (four-cell matrix) approach. METHODS: A highly sensitive search was performed in the Cochrane Library, Database of Abstracts of Reviews of Effects; MEDLINE and MEDLINE in Process, EMBASE and conference proceedings (1966-2016) for the acquisition of data on the diagnostic accuracy and complications of RTB in patients with SRM <4 cm. Methodological quality and risk of bias were assessed using QUADAS-2. Test characteristics were calculated using conventional 2 × 2 contingency table analysis excluding non-diagnostic biopsies, and an intention-to-diagnose approach with a 3 × 2 table for pooled estimates of the sensitivity and specificity. RESULTS: A total of 20 studies were included with a total sample size of 974. Univariate analysis using the 2 × 2 table gave pooled estimates for RTB of sensitivity 0.952 [confidence interval (CI) 0.908-0.979] and specificity 0.824 (CI 0.566-0.962). Using the 3 × 2 table and intention-to-diagnose principle, both decreased: sensitivity to 0.947 (CI 0.925-0.965) and specificity to 0.609 (CI 0.385-0.803). CONCLUSION: RTB in SRMs (<4 cm) is associated with a high diagnostic sensitivity but poor specificity when non-diagnostic results are included by a 3 × 2 table for analysis (intention-to-diagnose approach). Risk of non-diagnostic results and poor quality of research need addressing through future studies, preferably by a well-designed prospective study appropriately powered for diagnostic accuracy using valid reference standards. ADVANCES IN KNOWLEDGE: A comprehensive synthesis of literature on image-guided biopsies in SRMs using a different methodology and study design.
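The difference between the 2 × 2 and 3 × 2 analyses can be made concrete with a small calculation. The sketch below assumes one common intention-to-diagnose convention, in which non-diagnostic results in diseased patients count against sensitivity and non-diagnostic results in non-diseased patients count against specificity; the abstract does not spell out the exact rule used, and all counts and function names are hypothetical.

```python
def accuracy_2x2(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Sensitivity and specificity with non-diagnostic results excluded."""
    return tp / (tp + fn), tn / (tn + fp)


def accuracy_3x2(tp: int, fp: int, fn: int, tn: int,
                 nd_diseased: int, nd_healthy: int) -> tuple[float, float]:
    """Intention-to-diagnose: non-diagnostic results stay in the denominators.

    Assumed convention: a non-diagnostic result never counts as a correct
    classification, so it lowers sensitivity or specificity accordingly.
    """
    sensitivity = tp / (tp + fn + nd_diseased)
    specificity = tn / (tn + fp + nd_healthy)
    return sensitivity, specificity


# Hypothetical biopsy counts, not taken from the review.
counts = dict(tp=90, fp=4, fn=5, tn=16)
sens_2x2, spec_2x2 = accuracy_2x2(**counts)
sens_3x2, spec_3x2 = accuracy_3x2(**counts, nd_diseased=5, nd_healthy=5)

print(f"2 x 2: sensitivity {sens_2x2:.3f}, specificity {spec_2x2:.3f}")
print(f"3 x 2: sensitivity {sens_3x2:.3f}, specificity {spec_3x2:.3f}")
```

Because benign masses are the smaller group in this hypothetical sample, the same number of non-diagnostic biopsies depresses specificity far more than sensitivity, the same direction of effect the review reports.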
Affiliation(s)
- Catherine Paterson
- 1 School of Nursing and Midwifery, Robert Gordon University , Garthdee, Aberdeen , UK
| | - Joseph Ghaemi
- 2 Academic Section of Urology, Division of Cancer, School of Medicine, Ninewells Hospital , Dundee , UK
| | - Abduelmenem Alashkham
- 3 Centre for Human Anatomy, School of Biomedical Sciences, University of Edinburgh , Edinburgh , UK
| | - Chandra Shekhar Biyani
- 4 Department of Urology, St James's University Hospital, Leeds Teaching Hospitals NHS Trust , Leeds, West Yorkshire , UK
| | - Bernadette Coles
- 5 Site Librarian, University Library Service, Cardiff University, Cancer Research Wales Library, Velindre Cancer Centre , Cardiff , Wales
| | - Lee Baker
- 6 Chi-Squared Innovations , Dundee , UK
| | - Magdalena Szewczyk-Bieda
- 2 Academic Section of Urology, Division of Cancer, School of Medicine, Ninewells Hospital , Dundee , UK
| | - Ghulam Nabi
- 2 Academic Section of Urology, Division of Cancer, School of Medicine, Ninewells Hospital , Dundee , UK
|
35
|
Leung KSS, Siu GKH, Tam KKG, Ho PL, Wong SSY, Leung EKC, Yu SH, Ma OCK, Yam WC. Diagnostic evaluation of an in-house developed single-tube, duplex, nested IS6110 real-time PCR assay for rapid pulmonary tuberculosis diagnosis. Tuberculosis (Edinb) 2018; 112:120-125. [PMID: 30205964 DOI: 10.1016/j.tube.2018.08.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 05/25/2018] [Revised: 08/17/2018] [Accepted: 08/20/2018] [Indexed: 12/30/2022]
Abstract
OBJECTIVE To perform a prospective evaluation on the diagnostic performance of an in-house developed, duplex nested IS6110 real-time Polymerase-Chain-Reaction (PCR) assay (IS6110-qPCR assay) for rapid pulmonary TB diagnosis. METHODS A total of 503 sputum specimens were prospectively collected from July 2016 to November 2016. Diagnostic accuracy and optimal cut-off Cycle-threshold (Ct) value for IS6110-qPCR assay was determined by Receiver Operating Characteristic (ROC) curve. Using the optimal cut-off Ct, diagnostic performance of IS6110-qPCR assay was assessed with reference to both bacteriological and clinical information. Meanwhile, limit of detection (LOD) was calculated using Mycobacterium tuberculosis H37Rv as reference strain. RESULT ROC curve analysis of IS6110-qPCR assay showed a high Area Under the Curve (AUC) value (0.948) with optimal Ct value at 24.140. Prospective analysis of IS6110-qPCR assay with cut-off Ct = 24.140 showed a high overall sensitivity and specificity of 97.2% and 99.7%, respectively. No cross reactivity was observed among all non-tuberculous mycobacteria specimens in this study. LOD analysis on MTB-spiked sputum showed an average detection limit of 5.0 CFU/mL at Ct = 23.18 (±SD, 0.57). CONCLUSION IS6110-qPCR assay is a highly accurate and cost-effective assay developed for primary screening of suspected TB cases, which is particularly suitable for regions with limited resources but high TB burden.
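The abstract reports an optimal cut-off Ct derived from the ROC curve but does not say which optimality criterion was applied. One widely used choice is Youden's J (sensitivity + specificity - 1), shown here purely as an assumption. Since a lower Ct indicates more target DNA, a specimen is called positive when its Ct is at or below the cut-off. The data and the function name are hypothetical.

```python
def youden_cutoff(ct_values: list[float], is_tb: list[bool]) -> tuple[float, float]:
    """Return (cutoff, J) maximising Youden's J over the observed Ct values.

    A specimen is classified positive when Ct <= cutoff. Ties in J are
    resolved in favour of the lowest cut-off.
    """
    n_pos = sum(is_tb)
    n_neg = len(is_tb) - n_pos
    best_cutoff, best_j = float("nan"), -1.0
    for cutoff in sorted(set(ct_values)):
        tp = sum(1 for ct, tb in zip(ct_values, is_tb) if tb and ct <= cutoff)
        tn = sum(1 for ct, tb in zip(ct_values, is_tb) if not tb and ct > cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical Ct values: six TB-positive and five TB-negative specimens.
ct = [18.2, 20.5, 22.1, 23.9, 27.0, 31.0, 24.8, 28.4, 30.2, 33.0, 35.5]
tb = [True] * 6 + [False] * 5

cutoff, j = youden_cutoff(ct, tb)
print(f"cut-off Ct = {cutoff}, Youden's J = {j:.3f}")
```

Other criteria, such as the point closest to the top-left corner of the ROC plot or a cost-weighted rule, can select a different cut-off from the same data.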
Affiliation(s)
- Kenneth Siu-Sing Leung
- Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong Special Administrative Region
| | - Gilman Kit-Hang Siu
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region
| | - Kingsley King-Gee Tam
- Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong Special Administrative Region
| | - Pak-Leung Ho
- Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong Special Administrative Region
| | - Samson Sai-Yin Wong
- Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong Special Administrative Region
| | - Eunice Ka-Chun Leung
- Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong Special Administrative Region
| | - Shi Hui Yu
- KingMed Diagnostics, Science Park, Hong Kong Special Administrative Region
| | - Oliver Chiu-Kit Ma
- KingMed Diagnostics, Science Park, Hong Kong Special Administrative Region
| | - Wing-Cheong Yam
- Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong Special Administrative Region.
|
36
|
Pillay S, Cheddie S, Moodley Y. Fibroadenoma of the breast in a South African population - a pilot study of the diagnostic accuracy of fine needle aspirate cytology and breast ultrasonography. Afr Health Sci 2018; 18:273-280. [PMID: 30602953 PMCID: PMC6306964 DOI: 10.4314/ahs.v18i2.11] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022] Open
Abstract
Background The triple assessment of clinical breast exam (CBE), fine needle aspirate cytology (FNAC) and breast ultrasonography (US) is used in many settings for the diagnosis of fibroadenoma (FA). The diagnostic accuracy of FNAC and US for FA in South African (SA) women with palpable breast masses (PBM) is unknown. Objective To report the diagnostic accuracy of FNAC/US for FA in SA women with PBM. Methods We conducted a retrospective pilot diagnostic study of 91 women who presented with PBM to a SA regional academic hospital. Data for CBE, US, unguided FNAC, and open biopsies was collected from study participant medical records and analyzed using diagnostic accuracy tables. Results A total of 57/91 (62.6%) study participants had uninterpretable FNAC results. No study participants had uninterpretable US results. The overall diagnostic accuracy of FNAC for FA was 36.3% (95% Confidence Interval - CI: 27.1–46.5%). The overall diagnostic accuracy of US for FA was 83.5% (95% CI: 74.6–89.8%). Conclusion The yield of interpretable test results for FNAC was poor in our study. The diagnostic accuracy of US for FA appears to be superior to that of FNAC. Omission of FNAC from the triple assessment in our setting should be considered.
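The abstract gives 95% confidence intervals for the accuracy proportions without naming the interval method. The Wilson score interval is one common choice for a binomial proportion and is used below as an assumption. The count of 33 correct FNAC classifications out of 91 is inferred from the reported 36.3% and is not stated in the abstract; the function name is hypothetical.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (default z gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1.0 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# 33/91 is inferred from the reported overall FNAC accuracy of 36.3%.
fnac_low, fnac_high = wilson_interval(33, 91)
print(f"FNAC accuracy 33/91 = {33 / 91:.1%}, 95% CI {fnac_low:.1%} to {fnac_high:.1%}")
```

Under these assumptions the interval lands close to the reported 27.1-46.5%, which suggests, but does not confirm, that a score-type interval was used.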
Affiliation(s)
- Sumana Pillay
- University of KwaZulu-Natal College of Health Sciences, Department of Surgery
| | | | - Yoshan Moodley
- University of Kwazulu-Natal, Discipline of Anaesthetics and Critical Care
|
37
|
Landsheer JA. The Clinical Relevance of Methods for Handling Inconclusive Medical Test Results: Quantification of Uncertainty in Medical Decision-Making and Screening. Diagnostics (Basel) 2018; 8:diagnostics8020032. [PMID: 29747402 PMCID: PMC6023344 DOI: 10.3390/diagnostics8020032] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/06/2018] [Revised: 05/04/2018] [Accepted: 05/07/2018] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Although the existence of inconclusive medical test results or bio-markers is widely recognized, there are indications that this inherent diagnostic uncertainty is sometimes ignored. This paper discusses three methods for defining and determining inconclusive medical test results, which use different definitions and differ in clinical relevance. METHODS The TG-ROC (two graphs receiver operating characteristics) method is the easiest to use, while the grey zone method and the uncertain interval method require more extensive calculations. RESULTS This paper discusses the technical details of the methods, as well as advantages and disadvantages for their clinical use. TG-ROC and the grey zone method can help in the acquisition of high rates of diagnostic certainty, but can exclude large groups. The uncertain interval method can prevent decisions that are the most uncertain, invalid and unreliable, while excluding smaller groups. CONCLUSIONS The identification of uncertain test scores is relevant, because these scores indicate the need to obtain better information or to await further developments. The methods presented help to determine inconclusive test scores and can help to reduce erroneous decisions. However, further research and development is desirable.
Affiliation(s)
- Johannes A Landsheer
- Department of Methodology and Statistics, Faculty of Social and Behavioural Sciences, Utrecht University, 3508 TA Utrecht, The Netherlands.
|
38
|
Reporting and Handling of Indeterminate Bone Scan Results in the Staging of Prostate Cancer: A Systematic Review. Diagnostics (Basel) 2018; 8:diagnostics8010009. [PMID: 29337860 PMCID: PMC5871992 DOI: 10.3390/diagnostics8010009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/29/2017] [Revised: 01/10/2018] [Accepted: 01/12/2018] [Indexed: 01/01/2023] Open
Abstract
Bone scintigraphy is key in imaging skeletal metastases in newly diagnosed prostate cancer. Unfortunately, a notable proportion of scans are not readily classified as positive or negative but deemed indeterminate. The extent of reporting of indeterminate bone scans and how such scans are handled in clinical trials are not known. A systematic review was conducted using electronic databases up to October 2016. The main outcome of interest was the reporting of indeterminate bone scans, analyses of how such scans were managed, and exploratory analyses of the association of study characteristics and the reporting of indeterminate bone scan results. Seventy-four eligible clinical trials were identified. The trials were mostly retrospective (85%), observational (95%), large trials (median 195 patients) from five continents published over four decades. The majority of studies had university affiliation (72%), and an author with imaging background (68%). Forty-five studies (61%) reported an indeterminate option for the bone scan and 23 studies reported the proportion of indeterminate scans (median 11.4%). Most trials (44/45, 98%) reported how to handle indeterminate scans. Most trials (n = 39) used add-on supplementary imaging, follow-up bone scans, or both. Exploratory analyses showed a significant association of reporting of indeterminate results and number of patients in the study (p = 0.024) but failed to reach statistical significance with other variables tested. Indeterminate bone scan for staging of prostate cancer was insufficiently reported in clinical trials. In the case of indeterminate scans, most studies provided adequate measures to obtain the final status of the patients.
|
39
|
Calès P, Sacher-Huvelin S, Valla D, Bureau C, Olivier A, Oberti F, Boursier J, Galmiche JP. Large oesophageal varice screening by a sequential algorithm using a cirrhosis blood test and optionally capsule endoscopy. Liver Int 2018. [PMID: 28622450 DOI: 10.1111/liv.13497] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 02/13/2023]
Abstract
BACKGROUND & AIMS Large oesophageal varice (LEV) screening is recommended in cirrhosis. We performed a prospective study to improve non-invasive LEV screening. DESIGN 287 patients with cirrhosis had upper gastrointestinal endoscopy (LEV reference), oesophageal capsule endoscopy (ECE), liver elastography and blood marker analyses. CirrhoMeter (cirrhosis blood test), the most accurate non-invasive LEV test, was segmented for cirrhosis (reference comparator) or LEV. VariScreen, a sequential and partially minimally invasive diagnostic algorithm, was developed by multivariate analysis. It uses CirrhoMeter first, then ECE if CirrhoMeter cannot rule LEV out or in, and finally endoscopy if CirrhoMeter+ECE combination remains uninformative. RESULTS Diagnostic effectiveness rates for LEV were: cirrhosis-segmented CirrhoMeter: 14.6%, LEV-segmented CirrhoMeter: 34.6%, ECE: 60.6% and VariScreen: 66.4% (P ≤ .001 for overall or pair comparison). The respective missed LEV rates were: 2.8%, 5.6%, 8.3% and 5.6% (P = .789). Spared endoscopy rates were, respectively: 15.6%, 36.0%, 70.6% and 69%, (P < .001 for overall or paired comparison except ECE vs VariScreen: P = .743). VariScreen spared 38% of ECE and reduced missed LEV by 87% compared to classical ECE performed in all patients. Excepting cirrhosis-segmented CirrhoMeter, these spared endoscopy rates were significantly higher than that of the Baveno VI recommendation (using platelets and Fibroscan): 18.4% (P < .001). Ascites and Child-Pugh class independently predicted endoscopy sparing by VariScreen: from 86.0% in compensated Child Pugh class A to 24.1% in Child-Pugh class C with ascites. CONCLUSION VariScreen algorithm significantly reduced the missed LEV rate with ECE by 87%, ECE use by 38% and endoscopy requirement by 69%, and even 86% in compensated cirrhosis.
Affiliation(s)
- Paul Calès
- Department of Liver-Gastroenterology, University Hospital, HIFIH Laboratory, UNIV Angers, Bretagne Loire University, Angers, France
| | - Sylvie Sacher-Huvelin
- CIC 1413, INSERM, CHU, Nantes, France.,Department of Gastroenterology, IMAD, CHU and UNIV Nantes, Bretagne Loire University, Nantes, France
| | - Dominique Valla
- Liver Unit, DHU UNITY, Beaujon Hospital, HUPNVS, APHP, INSERM UMR U1149, University Paris Diderot, Clichy, France
| | | | - Anne Olivier
- Department of Liver-Gastroenterology, University Hospital, HIFIH Laboratory, UNIV Angers, Bretagne Loire University, Angers, France
| | - Frédéric Oberti
- Department of Liver-Gastroenterology, University Hospital, HIFIH Laboratory, UNIV Angers, Bretagne Loire University, Angers, France
| | - Jérôme Boursier
- Department of Liver-Gastroenterology, University Hospital, HIFIH Laboratory, UNIV Angers, Bretagne Loire University, Angers, France
| | - Jean Paul Galmiche
- Department of Gastroenterology, IMAD, CHU and UNIV Nantes, Bretagne Loire University, Nantes, France
|
40
|
Thompson KA, Rayburn MC, Chigerwe M. Evaluation of the immunocrit method to detect failure of passively acquired immunity in dairy calves. J Am Vet Med Assoc 2017; 251:702-705. [DOI: 10.2460/javma.251.6.702] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
|
41
|
Diagnostic feasibility and safety of CT-guided core biopsy for lung nodules less than or equal to 8 mm: A single-institution experience. Eur Radiol 2017; 28:796-806. [DOI: 10.1007/s00330-017-5027-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/14/2017] [Revised: 07/01/2017] [Accepted: 08/09/2017] [Indexed: 12/21/2022]
|
42
|
Pawloski L, Plikaytis B, Martin M, Martin S, Prince H, Lape-Nixon M, Tondella ML. Evaluation of Commercial Assays for Single-Point Diagnosis of Pertussis in the US. J Pediatric Infect Dis Soc 2017; 6:e15-e21. [PMID: 27451419 PMCID: PMC8574169 DOI: 10.1093/jpids/piw035] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/15/2015] [Accepted: 06/02/2016] [Indexed: 11/14/2022]
Abstract
BACKGROUND Pertussis serodiagnosis is increasingly being used in the United States despite the lack of a US Food and Drug Administration-approved, commercially available assay. To better understand the utility of these assays in diagnosing pertussis, serology assays were evaluated for analytical parameters and clinical accuracy. METHODS Forty-three antigen-antibody combinations were evaluated for single-point diagnosis of pertussis. Serum panels included sera from laboratory-confirmed cases, an international reference standard, and healthy donors. Phase I panel (n = 20) of sera was used to assess precision, linearity, and accuracy; Phase II panel (n = 226) followed with positive percent agreement (PPA) and negative percent agreement (NPA) estimates. Analytical analyses included coefficients of variation (CV) and concordance correlation coefficients (rc). RESULTS Intra-analyst variability was found to be relatively low among samples per assay, with only 6% (78 of 1240) having CV >20%, primarily with the highly concentrated immunoglobulin (Ig)G anti-pertussis toxin (PT) specimens and IgM assays. The rc measurements to assess linearity ranged between 0.282 and 0.994, 0.332 and 0.999, and -0.056 and 0.482 for IgA, IgG, and IgM, respectively. Analytical accuracy for calibrated IgG anti-PT assays was 86%-115%. The PPA and NPA varied greatly for all assays; PPA/NPA ranges for IgA, IgG, and IgM assays, with culture and/or polymerase chain reaction positivity as control, were 29-90/13-100, 26-96/27-100, and 0-73/42-100, respectively. In IgG assays, mixing filamentous hemagglutinin antigen with PT increased PPA but decreased NPA. CONCLUSIONS Seroassays varied substantially under both analytical and clinical parameters; however, those that were calibrated to a reference standard were highly accurate. Our findings support incorporation of calibrated pertussis seroassays to the pertussis case definition for improved diagnosis and surveillance.
Affiliation(s)
- Lucia Pawloski
- Centers for Disease Control and Prevention, Atlanta, GA 30329-4027
| | - Brian Plikaytis
- Centers for Disease Control and Prevention, Atlanta, GA 30329-4027
| | - Monte Martin
- Centers for Disease Control and Prevention, Atlanta, GA 30329-4027
| | - Stacey Martin
- Centers for Disease Control and Prevention, Atlanta, GA 30329-4027
| | - Harry Prince
- Focus Diagnostics, San Juan Capistrano, CA 92675
|
43
|
The Diagnostic Accuracy of Special Tests for Rotator Cuff Tear: The ROW Cohort Study. Am J Phys Med Rehabil 2017; 96:176-183. [PMID: 27386812 DOI: 10.1097/phm.0000000000000566] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Indexed: 02/06/2023]
Abstract
OBJECTIVE The aim was to assess diagnostic accuracy of 15 shoulder special tests for rotator cuff tears. DESIGN From February 2011 to December 2012, 208 participants with shoulder pain were recruited in a cohort study. RESULTS Among tests for supraspinatus tears, Jobe test had a sensitivity of 88% (95% confidence interval [CI], 80%-96%), specificity of 62% (95% CI, 53%-71%), and likelihood ratio of 2.30 (95% CI, 1.79-2.95). The full can test had a sensitivity of 70% (95% CI, 59%-82%) and a specificity of 81% (95% CI, 74%-88%). Among tests for infraspinatus tears, external rotation lag signs at 0 degrees had a specificity of 98% (95% CI, 96%-100%) and a likelihood ratio of 6.06 (95% CI, 1.30-28.33), and the Hornblower sign had a specificity of 96% (95% CI, 93%-100%) and likelihood ratio of 4.81 (95% CI, 1.60-14.49). CONCLUSIONS Jobe test and full can test had high sensitivity and specificity for supraspinatus tears, and Hornblower sign performed well for infraspinatus tears. In general, special tests described for subscapularis tears have high specificity but low sensitivity. These data can be used in clinical practice to diagnose rotator cuff tears and may reduce the reliance on expensive imaging.
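The likelihood ratios quoted in this abstract follow from sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below applies them to the rounded Jobe and full can test figures from the abstract. The function name is hypothetical, and because the inputs are rounded percentages the results only approximate the published values.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    if not (0.0 < sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in (0, 1] and specificity in (0, 1)")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded point estimates taken from the abstract.
jobe_lr_pos, jobe_lr_neg = likelihood_ratios(0.88, 0.62)
full_can_lr_pos, full_can_lr_neg = likelihood_ratios(0.70, 0.81)

print(f"Jobe test:     LR+ {jobe_lr_pos:.2f}, LR- {jobe_lr_neg:.2f}")
print(f"Full can test: LR+ {full_can_lr_pos:.2f}, LR- {full_can_lr_neg:.2f}")
```

For the Jobe test this gives an LR+ of about 2.3, in line with the 2.30 reported; the small difference is what one would expect from working with rounded inputs.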
|
44
|
Carpenter CR, Meisel ZF. Overcoming the Tower of Babel in Medical Science by Finding the "EQUATOR": Research Reporting Guidelines. Acad Emerg Med 2017; 24:1030-1033. [PMID: 28493596 DOI: 10.1111/acem.13225] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 01/17/2023]
Affiliation(s)
| | - Zachary F. Meisel
- Center for Emergency Care Policy Research; Department of Emergency Medicine; Perelman School of Medicine; University of Pennsylvania; Philadelphia PA
|
45
|
Diagnostic utility of intravenous contrast for MR imaging in pediatric appendicitis. Pediatr Radiol 2017; 47:398-403. [PMID: 28108797 DOI: 10.1007/s00247-016-3775-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/17/2016] [Revised: 10/23/2016] [Accepted: 12/23/2016] [Indexed: 12/14/2022]
Abstract
BACKGROUND Magnetic resonance imaging (MRI) is increasingly employed as a diagnostic modality for suspected appendicitis in children. However, there is uncertainty as to which MRI sequences are sufficient for safe, timely and accurate diagnosis. Several recent studies have described different MRI protocols, including exams both with and without the use of intravenous contrast. OBJECTIVE We hypothesized that intravenous contrast may be useful in some patients but could be safely omitted in others. MATERIALS AND METHODS All MRI examinations (n=112) performed at our institution for evaluating appendicitis in children were retrospectively reevaluated. Exams were reread by pediatric radiologists under three conditions: With postcontrast images, Without postcontrast images, and Without/With - selective use of postcontrast sequences only when needed for diagnostic certainty. Samples were scored as positive, negative or equivocal for appendicitis. Findings were compared to pathological or clinical follow-up in the medical record. RESULTS Without the use of intravenous contrast yielded more equivocal results (12.4%) compared to With contrast (3.4%). By selectively using postcontrast sequences, the Without/With group yielded fewer equivocal results (1.1%) compared to Without while also reducing contrast use 79.8% compared to the With contrast group. No significant differences in conditional sensitivity or conditional specificity were detected among the three groups. CONCLUSION MRI diagnosis of acute appendicitis can be performed without contrast for most patients; injection of contrast can be reserved for only those patients with equivocal non-contrast imaging.
|
46
|
Cohen JF, Korevaar DA, Altman DG, Bruns DE, Gatsonis CA, Hooft L, Irwig L, Levine D, Reitsma JB, de Vet HCW, Bossuyt PMM. STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. BMJ Open 2016; 6:e012799. [PMID: 28137831 PMCID: PMC5128957 DOI: 10.1136/bmjopen-2016-012799] [Citation(s) in RCA: 1516] [Impact Index Per Article: 168.4] [Received: 05/26/2016] [Revised: 08/03/2016] [Accepted: 08/25/2016] [Indexed: 12/11/2022] Open
Abstract
Diagnostic accuracy studies are, like other clinical studies, at risk of bias due to shortcomings in design and conduct, and the results of a diagnostic accuracy study may not apply to other patient groups and settings. Readers of study reports need to be informed about study design and conduct, in sufficient detail to judge the trustworthiness and applicability of the study findings. The STARD statement (Standards for Reporting of Diagnostic Accuracy Studies) was developed to improve the completeness and transparency of reports of diagnostic accuracy studies. STARD contains a list of essential items that can be used as a checklist, by authors, reviewers and other readers, to ensure that a report of a diagnostic accuracy study contains the necessary information. STARD was recently updated. All updated STARD materials, including the checklist, are available at http://www.equator-network.org/reporting-guidelines/stard . Here, we present the STARD 2015 explanation and elaboration document. Through commented examples of appropriate reporting, we clarify the rationale for each of the 30 items on the STARD 2015 checklist, and describe what is expected from authors in developing sufficiently informative study reports.
Affiliation(s)
- Jérémie F Cohen
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Pediatrics, INSERM UMR 1153, Necker Hospital, AP-HP, Paris Descartes University, Paris, France
- Daniël A Korevaar
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
- Douglas G Altman
  - Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Centre for Statistics in Medicine, University of Oxford, Oxford, UK
- David E Bruns
  - Department of Pathology, University of Virginia School of Medicine, Charlottesville, Virginia, USA
- Constantine A Gatsonis
  - Department of Biostatistics, Brown University School of Public Health, Providence, Rhode Island, USA
- Lotty Hooft
  - Cochrane Netherlands, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, University of Utrecht, Utrecht, The Netherlands
- Les Irwig
  - Screening and Diagnostic Test Evaluation Program, School of Public Health, University of Sydney, Sydney, New South Wales, Australia
- Deborah Levine
  - Department of Radiology, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
  - Radiology Editorial Office, Boston, Massachusetts, USA
- Johannes B Reitsma
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, University of Utrecht, Utrecht, The Netherlands
- Henrica C W de Vet
  - Department of Epidemiology and Biostatistics, EMGO Institute for Health and Care Research, VU University Medical Center, Amsterdam, The Netherlands
- Patrick M M Bossuyt
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
|
47
|
Minca EC, Al-Rohil RN, Wang M, Harms PW, Ko JS, Collie AM, Kovalyshyn I, Prieto VG, Tetzlaff MT, Billings SD, Andea AA. Comparison between melanoma gene expression score and fluorescence in situ hybridization for the classification of melanocytic lesions. Mod Pathol 2016; 29:832-43. [PMID: 27174586 DOI: 10.1038/modpathol.2016.84] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 11/17/2015] [Revised: 03/22/2016] [Accepted: 03/24/2016] [Indexed: 11/09/2022]
Abstract
Melanoma accounts for most skin cancer-related deaths and has an increasing incidence. Accurate diagnosis and distinction from atypical nevi can be at times difficult using light microscopy alone. Fluorescence in situ hybridization (FISH) and melanoma gene expression score (myPath, Myriad Genetics) have emerged as ancillary tools to further aid in this differential diagnosis. Our aim in this study was to correlate FISH results, gene expression score, consensus histopathologic impression and clinical outcome on a series of 117 challenging melanocytic lesions collected from three separate institutions. The lesions were separated into two groups: 39 histopathologically unequivocal lesions (15 malignant, 24 benign) and 78 challenging lesions interpreted by expert consensus (27 favor malignant, 30 favor benign, and 21 ambiguous). Melanoma-FISH was performed using probes for 6p25, 11q13, 8q24, and 9p21/CEP9 and scored according to established criteria. Analysis by myPath gene expression score was performed and interpreted by the manufacturer as 'benign', 'indeterminate,' or 'malignant'. In the unequivocal group, melanoma-FISH and myPath score showed 97 and 83% agreement with the histopathologic diagnosis, respectively, with 93 and 62% sensitivity, 100 and 95% specificity, and 80% inter-test agreement. In the challenging group, FISH and the myPath score showed 70 and 64% agreement with the histopathologic interpretation, respectively, with 70% inter-test agreement and similar sensitivities and specificities. The inter-test agreement was 73% overall, excluding indeterminate results. Discordant test results occurred in 27/117 cases from both unequivocal and challenging groups. Melanoma-FISH and gene expression score are valuable ancillary tools, though both have limitations and return discordant results in a subset of cases. Follow-up studies with more extensive clinical outcome data are warranted to establish the accuracy of these tests for the classification of melanocytic lesions.
Affiliation(s)
- Eugen C Minca
  - Department of Pathology, Cleveland Clinic, Cleveland, OH, USA
  - Department of Dermatology, Cleveland Clinic, Cleveland, OH, USA
- Rami N Al-Rohil
  - Department of Pathology, MD Anderson Cancer Center, The University of Texas, Houston, TX, USA
- Min Wang
  - Department of Pathology, University of Michigan Medical Center, Ann Arbor, MI, USA
- Paul W Harms
  - Department of Pathology, University of Michigan Medical Center, Ann Arbor, MI, USA
- Jennifer S Ko
  - Department of Pathology, Cleveland Clinic, Cleveland, OH, USA
  - Department of Dermatology, Cleveland Clinic, Cleveland, OH, USA
- Angela M Collie
  - Department of Pathology, Cleveland Clinic, Cleveland, OH, USA
  - Department of Dermatology, Cleveland Clinic, Cleveland, OH, USA
- Ivanka Kovalyshyn
  - Department of Pathology, Cleveland Clinic, Cleveland, OH, USA
  - Department of Dermatology, Cleveland Clinic, Cleveland, OH, USA
- Victor G Prieto
  - Department of Pathology, MD Anderson Cancer Center, The University of Texas, Houston, TX, USA
  - Department of Dermatology, MD Anderson Cancer Center, The University of Texas, Houston, TX, USA
- Michael T Tetzlaff
  - Department of Pathology, MD Anderson Cancer Center, The University of Texas, Houston, TX, USA
  - Department of Translational and Molecular Pathology, MD Anderson Cancer Center, The University of Texas, Houston, TX, USA
- Steven D Billings
  - Department of Pathology, Cleveland Clinic, Cleveland, OH, USA
  - Department of Dermatology, Cleveland Clinic, Cleveland, OH, USA
- Aleodor A Andea
  - Department of Pathology, University of Michigan Medical Center, Ann Arbor, MI, USA
|
48
|
Determination of neonatal serum immunoglobulin G concentrations associated with mortality during the first 4 months of life in dairy heifer calves. J Dairy Res 2015; 82:400-6. [PMID: 26383079 DOI: 10.1017/s0022029915000503] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Indexed: 11/07/2022]
Abstract
Colostral administration practices on dairy farms have significantly improved over the last 15-20 years resulting in prevalence of calves ingesting insufficient colostrum decreasing from 35-40% to 19%. Despite these improvements, the serum immunoglobulin G (IgG) concentration of ≥ 1000 mg/dl and serum total protein (TP) concentrations of ≥ 5.2 g/dl are considered indicative of adequate transfer of immunity. We hypothesised that the current serum IgG concentration of ≥ 1000 mg/dl is too low to indicate adequate transfer of colostral immunity on modern dairies. The objective of this study was to determine the serum IgG and TP concentrations indicating adequate transfer of passive immunity in dairy heifer calves. A cohort study of 1290 heifers from a calf raising facility for 48 dairy farms was performed. Heifers were assigned into strata based on serum IgG and TP concentrations. Mortality events were recorded for the heifers for 4 months. Interval likelihood ratios for mortality were calculated for heifers in each stratum of serum IgG or TP concentrations. Logistic regression to predict probability of mortality events was performed. Estimates of probability of survival were evaluated using survival analysis. Serum strata of ≤ 1500, 1501-2000 or >2500 were not significant predictors of mortality during the 120 d of rearing. Serum IgG concentration was not a significant predictor of hazard for mortality. In contrast to previous studies, serum IgG and TP concentrations of 2001-2500 mg/dl and 5.8-6.3 g/dl respectively, were considered optimum for indicating adequate passive transfer of colostral immunity in dairy calves based on the likelihood ratios. On dairies with optimum colostral feeding practices, serum IgG and TP concentrations of 2001-2500 mg/dl and 5.8-6.3 g/dl are recommended as endpoints to indicate adequate passive immunity in dairy calves.
|
49
|
Chiesa C, Pacifico L, Osborn JF, Bonci E, Hofer N, Resch B. Early-Onset Neonatal Sepsis: Still Room for Improvement in Procalcitonin Diagnostic Accuracy Studies. Medicine (Baltimore) 2015; 94:e1230. [PMID: 26222858 PMCID: PMC4554116 DOI: 10.1097/md.0000000000001230] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 12/17/2022]
Abstract
To perform a systematic review assessing accuracy and completeness of diagnostic studies of procalcitonin (PCT) for early-onset neonatal sepsis (EONS) using the Standards for Reporting of Diagnostic Accuracy (STARD) initiative. EONS, diagnosed during the first 3 days of life, remains a common and serious problem. Increased PCT is a potentially useful diagnostic marker of EONS, but reports in the literature are contradictory. There are several possible explanations for the divergent results including the quality of studies reporting the clinical usefulness of PCT in ruling in or ruling out EONS. We systematically reviewed PubMed, Scopus, and the Cochrane Library databases up to October 1, 2014. Studies were eligible for inclusion in our review if they provided measures of PCT accuracy for diagnosing EONS. A data extraction form based on the STARD checklist and adapted for neonates with EONS was used to appraise the quality of the reporting of included studies. We found 18 articles (1998-2014) fulfilling our eligibility criteria which were included in the final analysis. Overall, the results of our analysis showed that the quality of studies reporting diagnostic accuracy of PCT for EONS was suboptimal leaving ample room for improvement. Information on key elements of design, analysis, and interpretation of test accuracy were frequently missing. Authors should be aware of the STARD criteria before starting a study in this field. We welcome stricter adherence to this guideline. Well-reported studies with appropriate designs will provide more reliable information to guide decisions on the use and interpretations of PCT test results in the management of neonates with EONS.
Affiliation(s)
- Claudio Chiesa
- From the Institute of Translational Pharmacology, National Research Council (CC), Department of Pediatrics and Child Neuropsychiatry (LP), Department of Public Health and Infectious Diseases (JFO), Department of Experimental Medicine, Sapienza University of Rome, Rome, Italy (EB); and Research Unit for Neonatal Infectious Diseases and Epidemiology, Division of Neonatology, Department of Pediatrics and Adolescent Medicine, Medical University of Graz, Graz, Austria (NH, BR)
|
50
|
Menke J, Kowalski J. Diagnostic accuracy and utility of coronary CT angiography with consideration of unevaluable results: A systematic review and multivariate Bayesian random-effects meta-analysis with intention to diagnose. Eur Radiol 2015; 26:451-8. [DOI: 10.1007/s00330-015-3831-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 11/17/2014] [Revised: 04/25/2015] [Accepted: 04/28/2015] [Indexed: 12/21/2022]
|