1
Tseng AS, Shelly-Cohen M, Attia IZ, Noseworthy PA, Friedman PA, Oh JK, Lopez-Jimenez F. Spectrum bias in algorithms derived by artificial intelligence: a case study in detecting aortic stenosis using electrocardiograms. European Heart Journal. Digital Health 2021; 2:561-567. [PMID: 36713099 PMCID: PMC9707965 DOI: 10.1093/ehjdh/ztab061] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/25/2021] [Revised: 05/03/2021] [Indexed: 02/01/2023]
Abstract
Aims: Spectrum bias can arise when a diagnostic test is derived from study populations with different disease spectra than the target population, resulting in poor generalizability. We used a real-world artificial intelligence (AI)-derived algorithm to detect severe aortic stenosis (AS) to experimentally assess the effect of spectrum bias on test performance. Methods and results: All adult patients at the Mayo Clinic between 1 January 1989 and 30 September 2019 with transthoracic echocardiograms within 180 days after electrocardiogram (ECG) were identified. Two models were developed from two distinct patient cohorts: a whole-spectrum cohort comparing severe AS to any non-severe AS and an extreme-spectrum cohort comparing severe AS to no AS at all. Model performance was assessed. Overall, 258 607 patients had valid ECG and echocardiogram pairs. The area under the receiver operating characteristic curve was 0.87 and 0.91 for the whole-spectrum and extreme-spectrum models, respectively. Sensitivity and specificity for the whole-spectrum model were 80% and 81%, respectively, while for the extreme-spectrum model they were 84% and 84%, respectively. When applying the AI-ECG derived from the extreme-spectrum cohort to patients in the whole-spectrum cohort, the sensitivity, specificity, and area under the curve dropped to 83%, 73%, and 0.86, respectively. Conclusion: While the algorithm performed robustly in identifying severe AS, this study shows that limiting datasets to clearly positive or negative labels leads to overestimation of test performance when testing an AI algorithm in the setting of classifying severe AS using ECG data. While the effect of the bias may be modest in this example, clinicians should be aware of the existence of such a bias in AI-derived algorithms.
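The mechanism described in this abstract can be illustrated with a small sketch. All scores below are hypothetical and are not data from the study; the function name and the 0.5 threshold are likewise assumptions made for illustration. The point is only that adding intermediate (non-severe AS) cases to the negative class tends to lower specificity at a fixed threshold, while sensitivity is unaffected.

```python
def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Return (sensitivity, specificity) for the rule 'score >= threshold is positive'."""
    true_pos = sum(s >= threshold for s in scores_pos)
    true_neg = sum(s < threshold for s in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


# Hypothetical model outputs (NOT study data).
severe_as = [0.9, 0.8, 0.7, 0.6, 0.4]      # reference-standard positives
no_as = [0.1, 0.2, 0.2, 0.3, 0.6]          # clearly negative patients
non_severe_as = [0.3, 0.5, 0.6, 0.7, 0.4]  # mild or moderate AS, also "negative"

threshold = 0.5  # assumed operating point

# Extreme-spectrum evaluation: severe AS versus no AS at all.
extreme = sensitivity_specificity(severe_as, no_as, threshold)

# Whole-spectrum evaluation: severe AS versus any non-severe AS.
whole = sensitivity_specificity(severe_as, no_as + non_severe_as, threshold)

print("extreme spectrum:", extreme)
print("whole spectrum:  ", whole)
```

In this toy setup the intermediate cases score closer to the positives, so more of them cross the threshold and count as false positives once they are included among the negatives.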
Affiliation(s)
- Andrew S Tseng
  - Department of Cardiovascular Medicine, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA
- Michal Shelly-Cohen
  - Department of Cardiovascular Medicine, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA
- Itzhak Z Attia
  - Department of Cardiovascular Medicine, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA
- Peter A Noseworthy
  - Department of Cardiovascular Medicine, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA
- Paul A Friedman
  - Department of Cardiovascular Medicine, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA
- Jae K Oh
  - Department of Cardiovascular Medicine, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA
- Francisco Lopez-Jimenez
  - Department of Cardiovascular Medicine, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA. Corresponding author. Tel: +1 507 284 8087, Fax: +1 507 266 7929
2
Viegas-Costa LC, Friesen R, Flores-Mir C, McGaw T. Diagnostic performance of serology against histologic assessment to diagnose Sjogren's syndrome: a systematic review. Clin Rheumatol 2021; 40:4817-4828. [PMID: 34142295 DOI: 10.1007/s10067-021-05813-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2021] [Revised: 05/20/2021] [Accepted: 06/06/2021] [Indexed: 10/21/2022]
Abstract
The objective of this review was to assess and evaluate whether the published diagnostic accuracy studies provide evidence to sustain the current diagnostic guidelines put forth by ACR/EULAR used for patients with suspected Sjögren's syndrome (SS). Literature databases, including Medline, Embase, and EBM Reviews, were searched for relevant studies on the correlation between ACR/EULAR criteria, particularly those with a direct comparison between their accuracy in diagnosing Sjögren's syndrome. We followed Cochrane, QUADAS-2, and STARD guidelines and the four-phase flow diagram by the PRISMA Statement. Reports in several languages were eligible, but only human studies were considered. Three studies assessed the accuracy of the current diagnostic tests, and these did not present adequate designs that would allow a well-supported conclusion with a high level of certainty. Due to significant clinical and methodological heterogeneity, a meta-analysis was not performed. A qualitative review of the papers was undertaken. Neither the comparative nor the non-comparative study designs permit conclusive recommendations regarding an alternative diagnostic pathway for SS. Well-designed studies of the diagnostic accuracy of SS tests are needed to validate current guidelines or to suggest changes to the current guidelines.
Affiliation(s)
- Luiz Claudio Viegas-Costa
  - Department of Dentistry - Division of Oral Medicine, Oral Pathology and Radiology & Division of Orthodontics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
- Reid Friesen
  - Department of Dentistry - Division of Oral Medicine, Oral Pathology and Radiology & Division of Orthodontics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
- Carlos Flores-Mir
  - Department of Dentistry - Division of Oral Medicine, Oral Pathology and Radiology & Division of Orthodontics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
- Timothy McGaw
  - Department of Dentistry - Division of Oral Medicine, Oral Pathology and Radiology & Division of Orthodontics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
  - Edmonton Clinic Health Academy, Room 5-357, 11405 87 Avenue NW, Edmonton, AB, Canada
3
Meyer Sauteur PM, Krautter S, Ambroggio L, Seiler M, Paioni P, Relly C, Capaul R, Kellenberger C, Haas T, Gysin C, Bachmann LM, van Rossum AMC, Berger C. Improved Diagnostics Help to Identify Clinical Features and Biomarkers That Predict Mycoplasma pneumoniae Community-acquired Pneumonia in Children. Clin Infect Dis 2020; 71:1645-1654. [PMID: 31665253 PMCID: PMC7108170 DOI: 10.1093/cid/ciz1059] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 07/25/2019] [Accepted: 10/23/2019] [Indexed: 12/21/2022]
Abstract
BACKGROUND There are no reliable signs or symptoms that differentiate Mycoplasma pneumoniae (Mp) infection in community-acquired pneumonia (CAP) from other etiologies. Additionally, current diagnostic tests do not reliably distinguish between Mp infection and carriage. We previously determined that the measurement of Mp-specific immunoglobulin M antibody-secreting cells (ASCs) by enzyme-linked immunospot assay allowed for differentiation between infection and carriage. Using this new diagnostic test, we aimed to identify clinical and laboratory features associated with Mp infection. METHODS This is a prospective cohort study of children, 3-18 years of age, with CAP from 2016 to 2017. Clinical features and biomarkers were compared between Mp-positive and -negative groups by Mann-Whitney U test or Fisher exact test, as appropriate. Area under the receiver operating characteristic curve (AUC) differences and optimal thresholds were determined by using the DeLong test and Youden J statistic, respectively. RESULTS Of 63 CAP patients, 29 were Mp-positive (46%). Mp positivity was statistically associated with older age (median, 8.6 vs 4.7 years), no underlying disease, family with respiratory symptoms, prior antibiotic treatment, prolonged prodromal respiratory symptoms and fever, and extrapulmonary (skin) manifestations. Lower levels of C-reactive protein, white blood cell count, absolute neutrophil count, and procalcitonin (PCT), specifically PCT <0.25 μg/L, were statistically associated with Mp infection. A combination of age >5 years (AUC = 0.77), prodromal fever and respiratory symptoms >6 days (AUC = 0.79), and PCT <0.25 μg/L (AUC = 0.81) improved diagnostic performance (AUC = 0.90) (P = .05). CONCLUSIONS A combination of clinical features and biomarkers may aid physicians in identifying patients at high risk for Mp CAP.
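The Youden J statistic mentioned in the methods picks the cut-off that maximizes J = sensitivity + specificity - 1. A minimal sketch follows, assuming a marker where higher values indicate disease; the function name and the age and label values are hypothetical and are not taken from the study.

```python
def youden_optimal_threshold(values, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its value is >= threshold.
    `labels` holds 1 for reference-standard positives and 0 for negatives.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_threshold, best_j = None, -1.0
    for candidate in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if y == 1 and v >= candidate)
        true_neg = sum(1 for v, y in zip(values, labels) if y == 0 and v < candidate)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:  # ties keep the lowest candidate threshold
            best_threshold, best_j = candidate, j
    return best_threshold, best_j


# Hypothetical ages (years) and infection labels, for illustration only.
ages = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
mp_positive = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]

threshold, j_statistic = youden_optimal_threshold(ages, mp_positive)
print(threshold, round(j_statistic, 2))
```

For a marker where lower values indicate disease (as with procalcitonin in this abstract), the same sketch applies after negating the values or flipping the comparison.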
Affiliation(s)
- Patrick M Meyer Sauteur
  - Division of Infectious Diseases and Hospital Epidemiology, University Children’s Hospital Zurich, Zurich, Switzerland
- Selina Krautter
  - Division of Infectious Diseases and Hospital Epidemiology, University Children’s Hospital Zurich, Zurich, Switzerland
- Lilliam Ambroggio
  - Emergency Medicine and Hospital Medicine, Children’s Hospital Colorado, Denver, Colorado, USA
- Michelle Seiler
  - Emergency Department, University Children’s Hospital Zurich, Zurich, Switzerland
- Paolo Paioni
  - Division of Infectious Diseases and Hospital Epidemiology, University Children’s Hospital Zurich, Zurich, Switzerland
- Christa Relly
  - Division of Infectious Diseases and Hospital Epidemiology, University Children’s Hospital Zurich, Zurich, Switzerland
- Riccarda Capaul
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Christian Kellenberger
  - Division of Diagnostic Imaging, University Children’s Hospital Zurich, Zurich, Switzerland
- Thorsten Haas
  - Division of Anesthesiology, University Children’s Hospital Zurich, Zurich, Switzerland
- Claudine Gysin
  - Division of Otolaryngology, University Children’s Hospital Zurich, Zurich, Switzerland
- Annemarie M C van Rossum
  - Department of Pediatrics, Division of Pediatric Infectious Diseases and Immunology, Erasmus MC University Medical Center–Sophia Children’s Hospital, Rotterdam, The Netherlands
- Christoph Berger
  - Division of Infectious Diseases and Hospital Epidemiology, University Children’s Hospital Zurich, Zurich, Switzerland
4
Faes L, Liu X, Wagner SK, Fu DJ, Balaskas K, Sim DA, Bachmann LM, Keane PA, Denniston AK. A Clinician's Guide to Artificial Intelligence: How to Critically Appraise Machine Learning Studies. Transl Vis Sci Technol 2020; 9:7. [PMID: 32704413 PMCID: PMC7346877 DOI: 10.1167/tvst.9.2.7] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Received: 10/01/2019] [Accepted: 10/04/2019] [Indexed: 12/16/2022]
Abstract
In recent years, there has been considerable interest in the prospect of machine learning models demonstrating expert-level diagnosis in multiple disease contexts. However, there is concern that the excitement around this field may be associated with inadequate scrutiny of methodology and insufficient adoption of scientific good practice in the studies involving artificial intelligence in health care. This article aims to empower clinicians and researchers to critically appraise studies of clinical applications of machine learning, through: (1) introducing basic machine learning concepts and nomenclature; (2) outlining key applicable principles of evidence-based medicine; and (3) highlighting some of the potential pitfalls in the design and reporting of these studies.
Affiliation(s)
- Livia Faes
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
  - Eye Clinic, Cantonal Hospital of Lucerne, Lucerne, Switzerland
- Xiaoxuan Liu
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
  - Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - Health Data Research UK, London, UK
- Siegfried K. Wagner
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Dun Jack Fu
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Konstantinos Balaskas
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Dawn A. Sim
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Pearse A. Keane
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Alastair K. Denniston
  - Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - Health Data Research UK, London, UK
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
  - Centre for Patient Reported Outcome Research, Institute of Applied Health Research, University of Birmingham, Birmingham, UK
5
Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, Mahendiran T, Moraes G, Shamdas M, Kern C, Ledsam JR, Schmid MK, Balaskas K, Topol EJ, Bachmann LM, Keane PA, Denniston AK. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health 2019; 1:e271-e297. [PMID: 33323251 DOI: 10.1016/s2589-7500(19)30123-2] [Citation(s) in RCA: 724] [Impact Index Per Article: 144.8] [Received: 05/05/2019] [Revised: 08/06/2019] [Accepted: 08/14/2019] [Indexed: 02/06/2023]
Abstract
BACKGROUND Deep learning offers considerable promise for medical diagnostics. We aimed to evaluate the diagnostic accuracy of deep learning algorithms versus health-care professionals in classifying diseases using medical imaging. METHODS In this systematic review and meta-analysis, we searched Ovid-MEDLINE, Embase, Science Citation Index, and Conference Proceedings Citation Index for studies published from Jan 1, 2012, to June 6, 2019. Studies comparing the diagnostic performance of deep learning models and health-care professionals based on medical imaging, for any disease, were included. We excluded studies that used medical waveform data graphics material or investigated the accuracy of image segmentation rather than disease classification. We extracted binary diagnostic accuracy data and constructed contingency tables to derive the outcomes of interest: sensitivity and specificity. Studies undertaking an out-of-sample external validation were included in a meta-analysis, using a unified hierarchical model. This study is registered with PROSPERO, CRD42018091176. FINDINGS Our search identified 31 587 studies, of which 82 (describing 147 patient cohorts) were included. 69 studies provided enough data to construct contingency tables, enabling calculation of test accuracy, with sensitivity ranging from 9·7% to 100·0% (mean 79·1%, SD 0·2) and specificity ranging from 38·9% to 100·0% (mean 88·3%, SD 0·1). An out-of-sample external validation was done in 25 studies, of which 14 made the comparison between deep learning models and health-care professionals in the same sample. 
Comparison of the performance of deep learning models and health-care professionals in these 14 studies, when restricting the analysis to the contingency table for each study reporting the highest accuracy, found a pooled sensitivity of 87·0% (95% CI 83·0-90·2) for deep learning models and 86·4% (79·9-91·0) for health-care professionals, and a pooled specificity of 92·5% (95% CI 85·1-96·4) for deep learning models and 90·5% (80·6-95·7) for health-care professionals. INTERPRETATION Our review found the diagnostic performance of deep learning models to be equivalent to that of health-care professionals. However, a major finding of the review is that few studies presented externally validated results or compared the performance of deep learning models and health-care professionals using the same sample. Additionally, poor reporting is prevalent in deep learning studies, which limits reliable interpretation of the reported diagnostic accuracy. New reporting standards that address specific challenges of deep learning could improve future studies, enabling greater confidence in the results of future evaluations of this promising technology. FUNDING None.
Affiliation(s)
- Xiaoxuan Liu
  - Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Health Data Research UK, London, UK
- Livia Faes
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Eye Clinic, Cantonal Hospital of Lucerne, Lucerne, Switzerland
- Aditya U Kale
  - Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Siegfried K Wagner
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Dun Jack Fu
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Alice Bruynseels
  - Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Thushika Mahendiran
  - Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Gabriella Moraes
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Mohith Shamdas
  - Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Christoph Kern
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; University Eye Hospital, Ludwig Maximilian University of Munich, Munich, Germany
- Martin K Schmid
  - Eye Clinic, Cantonal Hospital of Lucerne, Lucerne, Switzerland
- Konstantinos Balaskas
  - Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Eric J Topol
  - Scripps Research Translational Institute, La Jolla, California
- Pearse A Keane
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK; Health Data Research UK, London, UK
- Alastair K Denniston
  - Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; Centre for Patient Reported Outcome Research, Institute of Applied Health Research, University of Birmingham, Birmingham, UK; NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK; Health Data Research UK, London, UK
6
Bosmans JE, Coupé VMH, Knottnerus BJ, Geerlings SE, Moll van Charante EP, ter Riet G. Cost-effectiveness of different strategies for diagnosis of uncomplicated urinary tract infections in women presenting in primary care. PLoS One 2017; 12:e0188818. [PMID: 29186185 PMCID: PMC5706710 DOI: 10.1371/journal.pone.0188818] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/01/2016] [Accepted: 11/14/2017] [Indexed: 11/19/2022]
Abstract
Background Uncomplicated urinary tract infections (UTIs) are common in primary care, resulting in substantial costs. Since resistance to the antibiotics used for UTIs is rising, accurate diagnosis is needed in settings with low rates of multidrug-resistant bacteria. Objective To compare the cost-effectiveness of different strategies to diagnose UTIs in women who contacted their general practitioner (GP) with painful and/or frequent micturition between 2006 and 2008 in and around Amsterdam, The Netherlands. Methods This is a model-based cost-effectiveness analysis using data from 196 women who underwent four tests (history, urine dipstick, sediment, and dipslide) and the gold standard, a urine culture. Decision trees were constructed reflecting 15 diagnostic strategies comprising different parallel and sequential combinations of the four tests. Using the decision trees, for each strategy the costs and the proportion of women with a correct positive or negative diagnosis were estimated. Probabilistic sensitivity analysis was used to estimate uncertainty surrounding costs and effects. Uncertainty was presented using cost-effectiveness planes and acceptability curves. Results Most sequential testing strategies resulted in higher proportions of correctly classified women and lower costs than parallel testing strategies.
For different willingness to pay thresholds, the most cost-effective strategies were: 1) performing a dipstick after a positive history for thresholds below €10 per additional correctly classified patient, 2) performing both a history and dipstick for thresholds between €10 and €17 per additional correctly classified patient, 3) performing a dipstick if history was negative, followed by a sediment if the dipstick was negative for thresholds between €17 and €118 per additional correctly classified patient, 4) performing a dipstick if history was negative, followed by a dipslide if the dipstick was negative for thresholds above €118 per additional correctly classified patient. Conclusion Depending on decision makers’ willingness to pay for one additional correctly classified woman, the strategy consisting of performing a history and dipstick simultaneously (ceiling ratios between €10 and €17) or performing a sediment if history and subsequent dipstick are negative (ceiling ratios between €17 and €118) are the most cost-effective strategies to diagnose a UTI.
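The threshold ranges reported in the results can be read as a simple lookup from willingness to pay to preferred strategy. The sketch below encodes only what the abstract states; the function name is hypothetical, and assigning the exact boundary values (€10, €17, €118) to the higher band is an assumption, since the abstract does not say which side they fall on.

```python
def most_cost_effective_strategy(threshold_eur):
    """Map a willingness-to-pay threshold (EUR per additional correctly
    classified patient) to the strategy the abstract reports as most
    cost-effective. Boundary handling is an assumption (see lead-in)."""
    if threshold_eur < 0:
        raise ValueError("threshold must be non-negative")
    if threshold_eur < 10:
        return "dipstick after a positive history"
    if threshold_eur < 17:
        return "history and dipstick together"
    if threshold_eur < 118:
        return "dipstick if history negative, then sediment if dipstick negative"
    return "dipstick if history negative, then dipslide if dipstick negative"


for wtp in (5, 12, 50, 200):
    print(wtp, "->", most_cost_effective_strategy(wtp))
```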
Affiliation(s)
- Judith E. Bosmans
  - Department of Health Sciences, Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam Public Health research institute, Amsterdam, The Netherlands
- Veerle M. H. Coupé
  - Department of Epidemiology and Biostatistics, VU University Medical Centre, Amsterdam, the Netherlands
- Bart J. Knottnerus
  - Department of General Practice, Academic Medical Center, Amsterdam, the Netherlands
- Suzanne E. Geerlings
  - Department of Internal Medicine / Infectious Diseases, Academic Medical Center, Amsterdam, The Netherlands
- Gerben ter Riet
  - Department of General Practice, Academic Medical Center, Amsterdam, the Netherlands
7
Bouwmans AEP, Weber WEJ, Leentjens AFG, Mess WH. Transcranial sonography findings related to depression in parkinsonian disorders: cross-sectional study in 126 patients. PeerJ 2016; 4:e2037. [PMID: 27231659 PMCID: PMC4878362 DOI: 10.7717/peerj.2037] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/18/2015] [Accepted: 04/23/2016] [Indexed: 12/02/2022]
Abstract
Background. Transcranial sonography (TCS) has emerged as a potential diagnostic tool for Parkinson’s disease. Recent research has suggested that abnormal echogenicity of substantia nigra, raphe nuclei and third ventricle is associated with increased risk of depression among these patients. We sought to reproduce these findings in an ongoing larger study of patients with parkinsonian syndromes. Methods. A total of 126 patients with parkinsonian symptoms underwent the Hamilton Depression Scale, and TCS of the substantia nigra (SN) (n = 126), the raphe nuclei (RN) (n = 80) and the third ventricle (n = 57). We then calculated the correlation between depression and hyper-echogenic SN, hypo-echogenic RN and a wider third ventricle. Results. In patients with PD we found no significant difference of the SN between non-depressed and depressed patients (46% vs. 22%; p = 0.18). Non-depressed patients with other parkinsonisms more often had hyperechogenicity of the SN than depressed patients (51% vs. 0%; p = 0.01). We found no relation between depression and the echogenicity of the RN or the width of the third ventricle. Conclusions. In patients with parkinsonian syndromes, we found no association between depression and hyper-echogenic SN, hypo-echogenic RN or a wider third ventricle, as determined by transcranial sonography.
Affiliation(s)
- Wim E J Weber
  - Department of Neurology, Maastricht University Medical Centre, Maastricht, Netherlands
- Albert F G Leentjens
  - Department of Psychiatry, Maastricht University Medical Centre, Maastricht, Netherlands
- Werner H Mess
  - Department of Clinical Neurophysiology, Maastricht University Medical Centre, Maastricht, Netherlands
9
Hernaez R. Reliability and agreement studies: a guide for clinical investigators. Gut 2015; 64:1018-27. [PMID: 25873640 DOI: 10.1136/gutjnl-2014-308619] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 01/15/2015] [Accepted: 03/23/2015] [Indexed: 01/20/2023]
10
Wu LM, Xu JR, Gu HY, Hua J, Chen J, Zhu J, Zhang W, Hu J. Is liver-specific gadoxetic acid-enhanced magnetic resonance imaging a reliable tool for detection of hepatocellular carcinoma in patients with chronic liver disease? Dig Dis Sci 2013; 58:3313-25. [PMID: 23884757 DOI: 10.1007/s10620-013-2790-y] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 05/21/2012] [Accepted: 07/02/2013] [Indexed: 12/18/2022]
Abstract
BACKGROUND Gadoxetic acid is a recently developed hepatobiliary-specific contrast material used for magnetic resonance imaging (MRI) which enables highly sensitive detection of hepatocellular carcinoma (HCC). AIM We performed a meta-analysis of all available studies of the diagnostic performance of gadoxetic acid-enhanced MRI (Gd-EOB-MRI) for detection of HCC in patients with chronic liver disease. METHODS Databases including MEDLINE and EMBASE were searched for relevant original articles published from January 2000 to April 2012. Pooled estimation and subgroup analysis data were obtained by statistical analysis. RESULTS Across 10 studies of 570 patients, Gd-EOB-MRI sensitivity was 0.91 (95 % CI 0.77, 0.97) and specificity was 0.93 (95 % CI 0.85, 0.97). Overall, LR+ was 13.6 (95 % CI 5.6, 33.2), LR- was 0.10 (95 % CI 0.04, 0.27), and DOR was 140.36 (95 % CI 28, 696). Among patients with high pre-test probabilities, MRI enabled confirmation of HCC; among patients with low pre-test probabilities, MRI enabled exclusion of HCC. Worst-case-scenario (pre-test probability, 50 %) post-test probabilities were 93 and 9 % for positive and negative MRI results, respectively. In studies in which both Gd-EOB-MRI and contrast enhanced computed tomography (CE-CT) were performed, Gd-EOB-MRI was more sensitive than CE-CT (0.93 vs. 0.78; p < 0.05). Subgroup analysis suggested average lesion size (<2 vs. >2 cm) did not affect the diagnostic accuracy of the test (p > 0.05). CONCLUSIONS A limited number of small studies suggest Gd-EOB-MRI has good diagnostic performance in the detection of HCC among patients with chronic liver disease. It is also confirmed to be a reliable tool for evaluation of small early-stage HCC.
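The post-test probabilities quoted in this abstract follow from the likelihood ratios through the odds form of Bayes' theorem: post-test odds = pre-test odds × likelihood ratio. A minimal sketch is given below; the function name is hypothetical, while the inputs (pre-test probability 50%, LR+ 13.6, LR- 0.10) are the values reported in the abstract.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability
    using post-test odds = pre-test odds * likelihood ratio."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Values reported in the abstract: pre-test probability 50%, LR+ 13.6, LR- 0.10.
after_positive_mri = post_test_probability(0.50, 13.6)
after_negative_mri = post_test_probability(0.50, 0.10)

print(f"positive MRI: {after_positive_mri:.0%}")
print(f"negative MRI: {after_negative_mri:.0%}")
```

With these inputs the sketch gives roughly 93% and 9%, which agrees with the post-test probabilities stated in the abstract.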
Affiliation(s)
- Lian-Ming Wu
  - Department of Radiology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, 1630 Dongfang Road, Shanghai, 200127, China
11
Whiting PF, Rutjes AWS, Westwood ME, Mallett S. A systematic review classifies sources of bias and variation in diagnostic test accuracy studies. J Clin Epidemiol 2013; 66:1093-104. [PMID: 23958378 DOI: 10.1016/j.jclinepi.2013.05.014] [Citation(s) in RCA: 190] [Impact Index Per Article: 17.3] [Received: 07/18/2012] [Revised: 05/08/2013] [Accepted: 05/15/2013] [Indexed: 11/15/2022]
Abstract
OBJECTIVE To classify the sources of bias and variation and to provide an updated summary of the evidence of the effects of each source of bias and variation. STUDY DESIGN AND SETTING We conducted a systematic review of studies of any design with the main objective of addressing bias or variation in the results of diagnostic accuracy studies. We searched MEDLINE, EMBASE, BIOSIS, the Cochrane Methodology Register, and Database of Abstracts of Reviews of Effects (DARE) from 2001 to October 2011. Citation searches based on three key papers were conducted, and studies from our previous review (search to 2001) were eligible. One reviewer extracted data on the study design, objective, sources of bias and/or variation, and results. A second reviewer checked the extraction. RESULTS We summarized the number of studies providing evidence of an effect arising from each source of bias and variation on the estimates of sensitivity, specificity, and overall accuracy. CONCLUSIONS We found consistent evidence for the effects of case-control design, observer variability, availability of clinical information, reference standard, partial and differential verification bias, demographic features, and disease prevalence and severity. Effects were generally stronger for sensitivity than for specificity. Evidence for other sources of bias and variation was limited.
Affiliation(s)
- Penny F Whiting
  - Kleijnen Systematic Reviews Ltd, Unit 6, Escrick Business Park, Riccall Road, Escrick, York YO19 6FD, United Kingdom
12
The role of 11C-choline and 18F-fluorocholine positron emission tomography (PET) and PET/CT in prostate cancer: a systematic review and meta-analysis. Eur Urol 2013; 64:106-17. [PMID: 23628493 DOI: 10.1016/j.eururo.2013.04.019] [Citation(s) in RCA: 255] [Impact Index Per Article: 23.2] [Received: 02/23/2013] [Accepted: 04/10/2013] [Indexed: 01/28/2023]
Abstract
CONTEXT The role of positron emission tomography (PET) and PET/computed tomography (PET/CT) in prostate cancer (PCa) imaging is still debated, although guidelines for their use have emerged over the last few years. OBJECTIVE To systematically review and conduct a meta-analysis of the available evidence of PET and PET/CT using 11C-choline and 18F-fluorocholine as tracers in imaging PCa patients in staging and restaging settings. EVIDENCE ACQUISITION PubMed, Embase, and Web of Science (by citation of reference) were searched. Reference lists of review articles and included articles were checked to complement electronic searches. EVIDENCE SYNTHESIS In staging patients with proven but untreated PCa, the results of the meta-analysis on a per-patient basis (10 studies, n = 637) showed pooled sensitivity, specificity, and diagnostic odds ratio (DOR) of 84% (95% confidence interval [CI], 68-93%), 79% (95% CI, 53-93%), and 20.4 (95% CI, 9.9-42.0), respectively. The positive and negative likelihood ratios were 4.02 (95% CI, 1.73-9.31) and 0.20 (95% CI, 0.11-0.37), respectively. On a per-lesion basis (11 studies, n = 5117), these values were 66% (95% CI, 56-75%), 92% (95% CI, 78-97%), and 22.7 (95% CI, 8.9-58.0), respectively, for pooled sensitivity, specificity, and DOR; and 8.29 (95% CI, 3.05-22.54) and 0.36 (95% CI, 0.29-0.46), respectively, for positive and negative likelihood ratios. In restaging patients with biochemical failure after local treatment with curative intent, the meta-analysis results on a per-patient basis (12 studies, n = 1055) showed pooled sensitivity, specificity, and DOR of 85% (95% CI, 79-89%), 88% (95% CI, 73-95%), and 41.4 (95% CI, 19.7-86.8), respectively; the positive and negative likelihood ratios were 7.06 (95% CI, 3.06-16.27) and 0.17 (95% CI, 0.13-0.22), respectively. 
CONCLUSIONS PET and PET/CT imaging with 11C-choline and 18F-fluorocholine in restaging of patients with biochemical failure after local treatment for PCa might help guide further treatment decisions. In staging of patients with proven but untreated, high-risk PCa, there is limited but promising evidence warranting further studies. However, the current evidence shows crucial limitations in terms of its applicability in common clinical scenarios.
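As a rough consistency check on the per-patient staging figures above, likelihood ratios and the DOR can be approximated from a single sensitivity/specificity pair. The sketch below uses the standard textbook definitions; the function name is illustrative and not taken from the study. Because a meta-analysis pools each measure separately, values derived this way should only be expected to approximate the reported pooled estimates.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a sensitivity/specificity pair (proportions in (0, 1))."""
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    dor = lr_pos / lr_neg                        # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Per-patient staging estimates quoted in the abstract: sensitivity 84%, specificity 79%.
lr_pos, lr_neg, dor = likelihood_ratios(0.84, 0.79)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

The derived values (about 4.0, 0.20, and just under 20) sit close to the reported pooled 4.02, 0.20, and 20.4, which is what one would expect if the pooled measures are mutually consistent.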
|
13
|
Wu LM, Xu JR, Lu Q, Hua J, Chen J, Hu J. A pooled analysis of diffusion-weighted imaging in the diagnosis of hepatocellular carcinoma in chronic liver diseases. J Gastroenterol Hepatol 2013. [PMID: 23190006 DOI: 10.1111/jgh.12054] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 12/13/2022]
Abstract
BACKGROUND AND AIM The purpose of this study was to perform a meta-analysis of all available studies of the diagnostic performance of diffusion-weighted imaging (DWI) in the detection of hepatocellular carcinoma (HCC) in patients with chronic liver disease. METHODS Databases including MEDLINE and EMBASE were searched for relevant original articles published from January 2000 to April 2012. Pooled estimation and subgroup analysis data were obtained by statistical analysis. RESULTS Across the nine studies (476 patients), DWI sensitivity was 81% (95% CI: 67%-90%), and specificity was 89% (95% CI: 76%-95%). Overall, the positive likelihood ratio was 7.11 (95% CI: 3.50-14.48), the negative likelihood ratio was 0.21 (95% CI: 0.12-0.37), and the diagnostic odds ratio (DOR) was 33.48 (95% CI: 16.67-67.25). The area under the summary receiver operating characteristic (ROC) curve was 0.92 (95% CI: 0.89-0.94). In studies in which both DWI and conventional contrast-enhanced magnetic resonance imaging (CE-MRI) were performed, comparison of the two methods suggested no major difference in performance (P > 0.05). DWI combined with CE-MRI had higher pooled sensitivity than DWI alone (93% vs 73%; P < 0.05). CONCLUSION DWI has good diagnostic performance in the detection of HCC in patients with chronic liver disease and is equivalent to conventional CE-MRI. Combining CE-MRI and DWI can improve the diagnostic accuracy of MRI. Larger prospective studies are still needed to establish its value for detecting HCC in patients with chronic liver disease.
Affiliation(s)
- Lian-Ming Wu
- Department of Radiology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
|
14
|
Bouwmans AEP, Vlaar AMM, Mess WH, Kessels A, Weber WEJ. Specificity and sensitivity of transcranial sonography of the substantia nigra in the diagnosis of Parkinson's disease: prospective cohort study in 196 patients. BMJ Open 2013; 3:bmjopen-2013-002613. [PMID: 23550093 PMCID: PMC3641465 DOI: 10.1136/bmjopen-2013-002613] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVE Numerous ultrasound studies have suggested that a typical enlarged area of echogenicity in the substantia nigra (SN+) can help diagnose idiopathic Parkinson's disease (IPD). Almost all these studies were retrospective and involved patients with well-established diagnoses and long disease duration. In this study, the diagnostic accuracy of transcranial sonography (TCS) of the substantia nigra was evaluated in patients with an undiagnosed parkinsonian syndrome of recent onset. DESIGN Prospective cohort study for diagnostic accuracy. SETTING Neurology outpatient clinics of two teaching hospitals in the Netherlands. PATIENTS 196 consecutive patients who were referred to two neurology outpatient clinics for analysis of clinically unclear parkinsonism. Within 2 weeks of inclusion all patients also underwent a TCS and a 123I-ioflupane single photon emission CT (FP-CIT SPECT) scan of the brain (n=176). OUTCOME MEASURES After 2 years, patients were re-examined by two movement disorder specialist neurologists for a final clinical diagnosis, which served as a surrogate gold standard for our study. RESULTS Temporal acoustic windows were insufficient in 45 of 241 patients (18.67%). The final clinical diagnosis was IPD in 102 (52.0%) patients. Twenty-four (12.3%) patients were diagnosed with atypical parkinsonisms (APS), of whom 8 (4.0%) had multisystem atrophy (MSA), 6 (3.1%) progressive supranuclear palsy (PSP), 6 (3.1%) Lewy body dementia, and 4 (2%) corticobasal degeneration. Twenty-one (10.7%) patients had a diagnosis of vascular parkinsonism, 20 (10.2%) essential tremor, and 7 (3.6%) drug-induced parkinsonism, and 22 (11.2%) patients had no parkinsonism but an alternative diagnosis. The sensitivity of SN+ for the diagnosis of IPD was 0.40 (CI 0.30 to 0.50) and the specificity 0.61 (CI 0.52 to 0.70). The corresponding positive predictive value (PPV) was 0.53 and the negative predictive value (NPV) 0.48.
The sensitivity and specificity of FP-CIT SPECT scans for diagnosing IPD were 0.88 (CI 0.1 to 0.95) and 0.68 (CI 0.58 to 0.76), respectively, with a PPV of 0.75 and an NPV of 0.84. CONCLUSIONS The diagnostic accuracy of TCS in early stage Parkinson's disease is not sufficient for routine clinical use. CLINICALTRIALS.GOV IDENTIFIER: NCT0036819.
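The reported predictive values follow from sensitivity, specificity, and disease prevalence via Bayes' theorem. The sketch below is illustrative only: the function name is hypothetical, and it assumes the IPD prevalence of 0.52 (102 of 196 patients) applies to the patients who were scanned, which the abstract does not state explicitly.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' theorem; all inputs are proportions in (0, 1)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# TCS (SN+) figures from the abstract: sensitivity 0.40, specificity 0.61.
# Assumption: prevalence of IPD taken as 0.52 (102 of 196 patients).
tcs_ppv, tcs_npv = predictive_values(0.40, 0.61, 0.52)
print(f"TCS: PPV = {tcs_ppv:.2f}, NPV = {tcs_npv:.2f}")
```

Under that assumption the calculation gives a PPV of about 0.53 and an NPV of about 0.48, in line with the values quoted for TCS. It also shows why these predictive values would shift in a setting with a different prevalence of IPD.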
Affiliation(s)
- Angela E P Bouwmans
- Department of Neurology, Maastricht University Medical Centre, Maastricht, The Netherlands
|
15
|
Caraguel C, Stryhn H, Gagné N, Dohoo I, Hammell L. A modelling approach to predict the variation of repeatability and reproducibility of a RT-PCR assay for infectious salmon anaemia virus across infection prevalences and infection stages. Prev Vet Med 2012; 103:63-73. [DOI: 10.1016/j.prevetmed.2011.08.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/21/2010] [Revised: 07/18/2011] [Accepted: 08/29/2011] [Indexed: 10/17/2022]
|
16
|
Oliveira MRFD, Gomes ADC, Toscano CM. QUADAS and STARD: evaluating the quality of diagnostic accuracy studies. Rev Saude Publica 2011; 45:416-22. [DOI: 10.1590/s0034-89102011000200021] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 04/06/2010] [Accepted: 08/25/2010] [Indexed: 11/21/2022] Open
Abstract
OBJECTIVE: To compare two approaches based on criteria from the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) and the Standards for Reporting Studies of Diagnostic Accuracy (STARD) for assessing the quality of validation studies of the OptiMal® rapid test for malaria diagnosis. METHODS: A search for validation articles on the rapid test was conducted in the Medline bibliographic database, accessed through PubMed, in 2007. Thirteen articles were retrieved. Twelve QUADAS criteria and three STARD criteria were combined and compared with the QUADAS criteria alone. Articles of fair to good quality were considered to be those meeting at least 50% of the QUADAS criteria. RESULTS: Of the 13 articles retrieved, 12 met at least 50% of the QUADAS criteria, and only two met the combined criteria. Considering the combination of the two sets of criteria (> 6 QUADAS and > 3 STARD), two studies (15.4%) showed good methodological quality. The number of articles selected using the proposed combination ranged from two to eight, depending on the number of items used as the cutoff. CONCLUSIONS: Combining QUADAS with STARD has the potential to bring greater rigor to quality assessments of published articles on the validation of diagnostic tests for malaria, because it incorporates checks of relevant information that cannot be obtained with QUADAS alone.
|
17
|
Transcranial sonography for the discrimination of idiopathic Parkinson's disease from the atypical parkinsonian syndromes. INTERNATIONAL REVIEW OF NEUROBIOLOGY 2011. [PMID: 20692498 DOI: 10.1016/s0074-7742(10)90009-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 10/13/2023]
Abstract
We reviewed eight studies on transcranial sonography (TCS) as a tool for differentiating idiopathic Parkinson's disease (IPD) from atypical parkinsonian syndromes (APS) and included some first data on TCS findings in the subforms of PSP. Changes in specific structures on TCS, such as the substantia nigra (SN), lenticular nucleus (LN), and the third ventricle, are discussed, as well as how they can help to differentiate between IPD, multiple system atrophy (MSA), progressive supranuclear palsy (PSP), Lewy body disease (LBD), and corticobasal degeneration (CBD). We finish with an algorithm that may be used to employ TCS as a diagnostic instrument delineating IPD from the APS and discerning among the APS themselves. As TCS is at present the most promising tool for this particular diagnostic problem, this algorithm might be a suitable hypothesis to study in future research.
|
18
|
Turkelson C, Jacobs JJ. Role of technology assessment in orthopaedics. Clin Orthop Relat Res 2009; 467:2570-6. [PMID: 19404712 PMCID: PMC2745459 DOI: 10.1007/s11999-009-0859-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2008] [Accepted: 04/09/2009] [Indexed: 01/31/2023]
Abstract
A technology assessment is a literature-based research project that seeks to determine whether a medical device, drug, procedure, or biologic is effective or to summarize literature on a given technology. A well-conducted assessment is a form of secondary research that employs the same steps used in primary research studies (ie, well-designed clinical trials). The primary difference is that in technology assessment the investigator does not collect the raw data. Rather, (s)he must use data collected by someone else. Nevertheless, a well-designed assessment, like a well-designed study, employs the scientific method, which is a method designed to combat bias. When there is little available information, such as with new technologies, unbiased examinations can typically show that enthusiasm for that technology is not backed by much data. When there is more information, assessments can not only determine whether a technology is effective, but also how effective it is. Technology assessments can provide busy orthopaedic surgeons (who do not have the time to keep up with and critically evaluate current literature) with succinct information that enables them to rapidly determine what is and what is not known about any given medical technology.
Affiliation(s)
- Charles Turkelson
- Department of Research and Scientific Affairs, American Academy of Orthopaedic Surgeons, Rosemont, IL USA
- Joshua J. Jacobs
- Department of Orthopaedic Surgery, Rush University Medical Center, Chicago, IL 60612 USA
|
19
|
Knottnerus BJ, Bindels PJE, Geerlings SE, Moll van Charante EP, ter Riet G. Optimizing the diagnostic work-up of acute uncomplicated urinary tract infections. BMC FAMILY PRACTICE 2008; 9:64. [PMID: 19063737 PMCID: PMC2607275 DOI: 10.1186/1471-2296-9-64] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 12/14/2007] [Accepted: 12/08/2008] [Indexed: 01/08/2023]
Abstract
BACKGROUND Most diagnostic tests for acute uncomplicated urinary tract infections (UTIs) have been previously studied in so-called single-test evaluations. In practice, however, clinicians use more than one test in the diagnostic work-up. Since test results carry overlapping information, results from single-test studies may be confounded. The primary objective of the Amsterdam Cystitis/Urinary Tract Infection Study (ACUTIS) is to determine the (additional) diagnostic value of relevant tests from patient history and laboratory investigations, taking into account their mutual dependencies. Consequently, after suitable validation, an easy-to-use, multivariable diagnostic rule (clinical index) will be derived. METHODS Women who contact their GP with painful and/or frequent micturition undergo a series of possibly relevant tests, consisting of patient history questions and laboratory investigations. Using urine culture as the reference standard, two multivariable models (diagnostic indices) will be generated: a model which assumes that patients attend the GP surgery and a model based on telephone contact only. Models will be made more robust using the bootstrap. Discrimination will be visualized in high-resolution histograms of the posterior UTI probabilities and summarized as the 5th, 10th, 25th, 50th, 75th, 90th, and 95th centiles of these, the Brier score, and the area under the receiver operating characteristic (ROC) curve with 95% confidence intervals. Using the regression coefficients of the independent diagnostic indicators, a diagnostic rule will be derived, consisting of an efficient set of tests and their diagnostic values. The course of the presenting complaints is studied using 7-day patient diaries. To learn more about the natural history of UTIs, patients will be offered the opportunity to postpone the use of antibiotics.
DISCUSSION We expect that our diagnostic rule will allow efficient diagnosis of UTIs, necessitating the collection of diagnostic indicators with proven added value. GPs may use the rule (preferably after suitable validation) to estimate UTI probabilities for women with different combinations of test results. Finally, in a subcohort, an attempt is made to identify which indicators (including antibiotic treatment) are useful to prognosticate recovery from painful and/or frequent micturition.
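The summary measures named in the methods (Brier score and area under the ROC curve with a confidence interval) can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names and the data are hypothetical, and the percentile bootstrap used for the interval is an assumption, since the abstract does not say how the confidence intervals will be computed.

```python
import random


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def roc_auc(probs, labels):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(probs, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc([probs[i] for i in idx], ys))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical posterior UTI probabilities and urine culture results (1 = positive).
probs = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
brier = brier_score(probs, labels)
auc = roc_auc(probs, labels)
ci = bootstrap_auc_ci(probs, labels)
print(f"Brier = {brier:.4f}, AUC = {auc:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

A lower Brier score indicates better-calibrated and sharper probabilities, while the AUC captures discrimination only, which is one reason a protocol might report both.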
Affiliation(s)
- Bart J Knottnerus
- Department of General Practice, Academic Medical Center – University of Amsterdam, Amsterdam, the Netherlands
- Patrick JE Bindels
- Department of General Practice, Academic Medical Center – University of Amsterdam, Amsterdam, the Netherlands
- Suzanne E Geerlings
- Department of Infectious Diseases, Tropical Medicine & AIDS, Center for Infection and Immunity Amsterdam (CINIMA), Academic Medical Center – University of Amsterdam, Amsterdam, the Netherlands
- Eric P Moll van Charante
- Department of General Practice, Academic Medical Center – University of Amsterdam, Amsterdam, the Netherlands
- Gerben ter Riet
- Department of General Practice, Academic Medical Center – University of Amsterdam, Amsterdam, the Netherlands
- Horten Centre, University of Zurich, Zurich, Switzerland
|
20
|
Umbehr M, Bachmann LM, Held U, Kessler TM, Sulser T, Weishaupt D, Kurhanewicz J, Steurer J. Combined magnetic resonance imaging and magnetic resonance spectroscopy imaging in the diagnosis of prostate cancer: a systematic review and meta-analysis. Eur Urol 2008; 55:575-90. [PMID: 18952365 DOI: 10.1016/j.eururo.2008.10.019] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.9] [Received: 07/17/2008] [Accepted: 10/07/2008] [Indexed: 01/10/2023]
Abstract
CONTEXT Magnetic resonance imaging (MRI) combined with magnetic resonance spectroscopy imaging (MRSI) emerged as a promising test in the diagnosis of prostate cancer and showed encouraging results. OBJECTIVE The aim of this systematic review is to meta-analyse the diagnostic accuracy of combined MRI/MRSI in prostate cancer and to explore risk profiles with highest benefit. EVIDENCE ACQUISITION The authors searched the MEDLINE and EMBASE databases and the Cochrane Library, screened reference lists, and contacted experts. There were no language restrictions. The last search was performed in August 2008. EVIDENCE SYNTHESIS We identified 31 test-accuracy studies (1765 patients); 16 studies (17 populations) with a total of 581 patients were suitable for meta-analysis. Nine combined MRI/MRSI studies (10 populations) examining men with pathologically confirmed prostate cancer (297 patients; 1518 specimens) had a pooled sensitivity and specificity at the prostate subpart level of 68% (95% CI, 56-78%) and 85% (95% CI, 78-90%), respectively. Compared with patients at high risk for clinically relevant cancer (six studies), low-risk patients (four studies) had lower sensitivity (58% [46-69%] vs 74% [58-85%]; p>0.05) but higher specificity (91% [86-94%] vs 78% [70-84%]; p<0.01). Seven studies examining patients with suspected prostate cancer at combined MRI/MRSI (284 patients) had an overall pooled sensitivity and specificity at the patient level of 82% (59-94%) and 88% (80-95%), respectively. In the low-risk group (five studies) these values were 75% (39-93%) and 91% (77-97%), respectively. CONCLUSIONS A limited number of small studies suggest that MRI combined with MRSI could be a rule-in test for low-risk patients. This finding needs further confirmation in larger studies and cost-effectiveness needs to be established.
Affiliation(s)
- Martin Umbehr
- Horten Centre for Patient-Oriented Research and Knowledge Transfer, University of Zurich, Zurich, Switzerland.
|