1
Redondo MJ, Harrall KK, Glueck DH, Tosur M, Uysal S, Muir A, Atkinson EG, Shapiro MR, Yu L, Winter WE, Weedon M, Brusko TM, Oram R, Vehik K, Hagopian W, Atkinson MA, Dabelea D. Diabetes Study of Children of Diverse Ethnicity and Race: Study design. Diabetes Metab Res Rev 2024; 40:e3744. [PMID: 37888801; PMCID: PMC10939959; DOI: 10.1002/dmrr.3744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 08/16/2023] [Accepted: 10/03/2023] [Indexed: 10/28/2023]
Abstract
AIMS Determining diabetes type in children has become increasingly difficult due to an overlap in typical characteristics between type 1 diabetes (T1D) and type 2 diabetes (T2D). The Diabetes Study in Children of Diverse Ethnicity and Race (DISCOVER) programme is a National Institutes of Health (NIH)-supported multicenter, prospective, observational study that enrols children and adolescents with non-secondary diabetes. The primary aim of the study was to develop improved models to differentiate between T1D and T2D in diverse youth. MATERIALS AND METHODS The proposed models will evaluate the utility of three existing T1D genetic risk scores in combination with data on islet autoantibodies and other parameters typically available at the time of diabetes onset. Low non-fasting serum C-peptide (<0.6 nmol/L) between 3 and 10 years after diabetes diagnosis will be considered a biomarker for T1D as it reflects the loss of insulin secretion ability. Participating centres are enrolling youth (<19 years old) either with established diabetes (duration 3-10 years) for a cross-sectional evaluation or with recent onset diabetes (duration 3 weeks-15 months) for the longitudinal observation with annual visits for 3 years. Cross-sectional data will be used to develop models. Longitudinal data will be used to externally validate the best-fitting model. RESULTS The results are expected to improve the ability to classify diabetes type in a large and growing subset of children who have an unclear form of diabetes at diagnosis. CONCLUSIONS Accurate and timely classification of diabetes type will help establish the correct clinical management early in the course of the disease.
Affiliation(s)
- Maria J. Redondo
  - Diabetes and Endocrinology Division, Department of Pediatrics, Texas Children’s Hospital, Baylor College of Medicine, Houston, TX, USA
- Kylie K. Harrall
  - Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, USA
- Deborah H. Glueck
  - Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, USA
- Mustafa Tosur
  - Diabetes and Endocrinology Division, Department of Pediatrics, Texas Children’s Hospital, Baylor College of Medicine, Houston, TX, USA
  - Children’s Nutrition Research Center, USDA/ARS, Houston, TX, USA
- Serife Uysal
  - Diabetes and Endocrinology Division, Department of Pediatrics, Texas Children’s Hospital, Baylor College of Medicine, Houston, TX, USA
- Andrew Muir
  - Department of Pediatrics, Emory University, Atlanta, GA, USA
- Elizabeth G. Atkinson
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Melanie R. Shapiro
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
- Liping Yu
  - Barbara Davis Center for Diabetes, University of Colorado School of Medicine, Aurora, CO, USA
- William E. Winter
  - Departments of Pathology and Pediatrics, University of Florida, Gainesville, FL, USA
- Michael Weedon
  - Institute of Biomedical and Clinical Science, University of Exeter Medical School, Exeter, UK
- Todd M. Brusko
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
  - Department of Pediatrics, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
- Richard Oram
  - Institute of Biomedical and Clinical Science, University of Exeter Medical School, Exeter, UK
- Kendra Vehik
  - Health Informatics Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Mark A. Atkinson
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
  - Department of Pediatrics, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
- Dana Dabelea
  - Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, USA
2
Classification and diagnosis of cervical lesions based on colposcopy images using deep fully convolutional networks: a man-machine comparison cohort study. Fundamental Research 2022. [DOI: 10.1016/j.fmre.2022.09.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
3
Bogdanet D, O'Shea PM, Halperin J, Dunne F. Plasma glycated CD59 (gCD59), a novel biomarker for the diagnosis, management and follow up of women with Gestational Diabetes (GDM) - protocol for prospective cohort study. BMC Pregnancy Childbirth 2020; 20:412. [PMID: 32682411; PMCID: PMC7368790; DOI: 10.1186/s12884-020-03090-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/10/2020] [Accepted: 07/03/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND The prevalence of Gestational Diabetes (GDM) is rising and with it the number of mothers and children at risk of adverse outcomes. As treatment has been shown to reduce adverse events, it is imperative that we identify all at-risk pregnant women. In Ireland, the national standard of care is selective screening with a 2-hour 75 g oral glucose tolerance test (OGTT). Universal screening is the goal, but it is difficult to achieve given the length and impracticality of the OGTT. We aim to assess whether the novel biomarker glycated CD59 (gCD59) is a suitable alternative to the OGTT for identifying women with GDM. METHODS In this prospective cohort study, the study participants will be consecutive pregnant women at Galway University Hospital, Galway, Ireland. Samples for the plasma gCD59 biomarker will be taken together with routine bloods at the first antenatal visit, at weeks 24-28 at the time of the routine 75 g OGTT, in trimester 3 and, for women with GDM, at 12 weeks post-partum at the time of their routine post-partum 75 g OGTT. The constructed database will contain baseline information on each study participant, baseline laboratory data, follow-up laboratory data and pregnancy-related outcomes. We aim to recruit a total of 2,000 participants over the project period; with a national GDM prevalence of 12-13%, we expect 240-260 participants to meet OGTT criteria for GDM. Based on regional prevalence, we expect 34-37 women to develop either diabetes or pre-diabetes in the early post-partum period. The sensitivity and specificity of plasma gCD59 to predict the results of the OGTT will be assessed using nonparametric estimates of the receiver operating characteristic (ROC) curves and the respective area under the ROC curve (AUROC). DISCUSSION A body of clinical and experimental evidence supports a link between the complement system, complement regulatory proteins, and the pathogenesis of diabetes complications. Building on this research, our study will examine the capacity of plasma gCD59 to classify pregnant women as having normal or abnormal glucose tolerance, and will also assess whether plasma gCD59 can be used as an early predictor of GDM, adverse pregnancy outcomes and/or post-partum glucose intolerance.
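The recruitment arithmetic and the nonparametric AUROC mentioned in this protocol can be illustrated with a minimal sketch. It assumes the common Mann-Whitney formulation of the nonparametric AUROC (the protocol does not say which estimator will be used), and the biomarker values below are invented for illustration, not study data. The function names `expected_cases` and `auroc` are hypothetical.

```python
def expected_cases(n_participants, prevalence):
    """Expected number of cases for a given cohort size and prevalence."""
    return round(n_participants * prevalence)


def auroc(scores_pos, scores_neg):
    """Nonparametric AUROC (Mann-Whitney form): the fraction of
    positive/negative pairs in which the positive scores higher,
    counting ties as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Cohort size and prevalence range are taken from the abstract.
low, high = expected_cases(2000, 0.12), expected_cases(2000, 0.13)
print(low, high)

# Hypothetical gCD59 values (arbitrary units), invented for illustration.
gdm = [1.4, 1.1, 0.9, 1.6]
non_gdm = [0.8, 1.0, 0.7, 0.9, 1.2]
print(auroc(gdm, non_gdm))
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a rank-based computation would be the usual choice for a cohort of 2,000.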
Affiliation(s)
- D Bogdanet
  - College of Medicine Nursing and Health Sciences, National University of Ireland Galway, Galway, Ireland
  - Diabetic Day Centre, Galway University Hospital, Galway, Ireland
- P M O'Shea
  - Diabetic Day Centre, Galway University Hospital, Galway, Ireland
- J Halperin
  - Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- F Dunne
  - College of Medicine Nursing and Health Sciences, National University of Ireland Galway, Galway, Ireland
  - Diabetic Day Centre, Galway University Hospital, Galway, Ireland
4
Santos HGD, Nascimento CFD, Izbicki R, Duarte YADO, Porto Chiavegatto Filho AD. [Machine learning for predictive analyses in health: an example of an application to predict death in the elderly in São Paulo, Brazil]. Cad Saude Publica 2019; 35:e00050818. [PMID: 31365698; DOI: 10.1590/0102-311x00050818] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/13/2018] [Accepted: 05/20/2019] [Indexed: 01/15/2023]
Abstract
This study aims to present the stages related to the use of machine learning algorithms for predictive analyses in health. An application was performed in a database of elderly residents in the city of São Paulo, Brazil, who participated in the Health, Well-Being, and Aging Study (SABE) (n = 2,808). The outcome variable was the occurrence of death within five years of the elder's entry into the study (n = 423), and the predictors were 37 variables related to the elder's demographic, socioeconomic, and health profile. The application was organized according to the following stages: division of data in training (70%) and testing (30%), pre-processing of the predictors, learning, and assessment of the models. The learning stage used 5 algorithms to adjust the models: logistic regression with and without penalization, neural networks, gradient boosted trees, and random forest. The algorithms' hyperparameters were optimized by 10-fold cross-validation to select those corresponding to the best models. For each algorithm, the best model was assessed in test data via area under the ROC curve (AUC) and related measures. All the models presented AUC ROC greater than 0.70. For the three models with the highest AUC ROC (neural networks and logistic regression with LASSO penalization and without penalization, respectively), quality measures of the predicted probability were also assessed. The expectation is that with the increased availability of data and trained human capital, it will be possible to develop predictive machine learning models with the potential to help health professionals make the best decisions.
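The data-partitioning stages described in this abstract (a 70%/30% train/test split followed by 10-fold cross-validation on the training portion) can be sketched as follows. This is an illustration only: the abstract does not say whether the split was stratified or how folds were assigned, so simple shuffling and round-robin fold assignment are assumed here, and the helper names `train_test_split` and `k_fold` are hypothetical.

```python
import random


def train_test_split(indices, test_fraction, seed):
    """Shuffle the indices and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k):
    """Yield (training, validation) index lists for k-fold cross-validation,
    assigning indices to folds round-robin."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# n = 2,808 participants, as reported in the abstract.
train, test = train_test_split(range(2808), test_fraction=0.30, seed=0)
splits = list(k_fold(train, k=10))
print(len(train), len(test), len(splits))
```

Hyperparameters would be tuned using only the folds drawn from `train`, so that `test` remains untouched until the final AUC assessment.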
Affiliation(s)
- Rafael Izbicki
  - Centro de Ciências Exatas e de Tecnologia, Universidade Federal de São Carlos, São Carlos, Brasil
5
Johnson KM, Tan WC, Bourbeau J, Sin DD, Sadatsafavi M. The diagnostic performance of patient symptoms in screening for COPD. Respir Res 2018; 19:147. [PMID: 30075717; PMCID: PMC6090694; DOI: 10.1186/s12931-018-0853-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/03/2018] [Accepted: 07/26/2018] [Indexed: 11/24/2022]
Abstract
It is recommended that screening for COPD be restricted to symptomatic individuals, but supporting evidence is lacking. We determined the performance of wheeze, cough, phlegm, and dyspnea in discriminating COPD versus non-COPD in a population-based sample of 1332 adults. The area under the receiver operating characteristic curve (AUC) indicated that symptoms had modest performance whether assessed individually (AUCs 0.55–0.62) or in combination (AUC 0.64 with the number of symptoms as the predictor). AUC improved with the inclusion of multiple other factors (AUC 0.71). Restricting screening to symptomatic individuals is unlikely to substantially improve the yield of general population screening for undiagnosed COPD.
Affiliation(s)
- Kate M Johnson
  - Respiratory Evaluation Sciences Program, Collaboration for Outcomes Research and Evaluation, Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
- Wan C Tan
  - Centre for Heart Lung Innovation (the James Hogg Research Centre), St. Paul's Hospital, Vancouver, Canada
- Jean Bourbeau
  - Respiratory Epidemiology and Clinical Research Unit, McGill University, Montreal, Canada
- Don D Sin
  - Centre for Heart Lung Innovation (the James Hogg Research Centre), St. Paul's Hospital, Vancouver, Canada
  - Institute for Heart and Lung Health, Department of Medicine, The University of British Columbia, Vancouver, Canada
- Mohsen Sadatsafavi
  - Respiratory Evaluation Sciences Program, Collaboration for Outcomes Research and Evaluation, Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
  - Centre for Clinical Epidemiology and Evaluation, Vancouver Coastal Health Institute, Vancouver, Canada
  - Institute for Heart and Lung Health, Department of Medicine, The University of British Columbia, Vancouver, Canada
6
Menke JM, Ahsan MS, Khoo SP. More Accurate Oral Cancer Screening with Fewer Salivary Biomarkers. Biomarkers in Cancer 2017; 9:1179299X17732007. [PMID: 29085239; PMCID: PMC5648090; DOI: 10.1177/1179299x17732007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/13/2017] [Accepted: 08/21/2017] [Indexed: 01/05/2023]
Abstract
Signal detection and Bayesian inferential tools were applied to salivary biomarkers to improve screening accuracy and efficiency in detecting oral squamous cell carcinoma (OSCC). Potential cancer biomarkers are identified by significant differences in assay concentrations, receiver operating characteristic areas under the curve (AUCs), sensitivity, and specificity. However, the end goal is to report to individual patients their risk of having disease given positive or negative test results. Likelihood ratios (LRs) and Bayes factors (BFs) estimate evidential support and compile biomarker information to optimize screening accuracy. In total, 26 of 77 biomarkers were mentioned as having been tested at least twice in 137 studies and published in 16 summary papers through 2014. Studies represented 10 212 OSCC and 25 645 healthy patients. The measure of biomarker and panel information value was number of biomarkers needed to approximate 100% positive predictive value (PPV). As few as 5 biomarkers could achieve nearly 100% PPV for a disease prevalence of 0.2% when biomarkers were ordered from highest to lowest LR. When sequentially interpreting biomarker tests, high specificity was more important than test sensitivity in achieving rapid convergence toward a high PPV. Biomarkers ranked from highest to lowest LR were more informative and easier to interpret than AUC or Youden index. The proposed method should be applied to more recently published biomarker data to test its screening value.
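The sequential use of likelihood ratios described in this abstract follows the odds form of Bayes' theorem: posterior odds = prior odds × LR. The sketch below shows how the PPV rises as positive results accumulate from a 0.2% prevalence. It assumes the tests are conditionally independent, which the abstract does not state, and the likelihood ratios are invented for illustration rather than taken from the paper. The function names are hypothetical.

```python
def positive_lr(sensitivity, specificity):
    """Positive likelihood ratio: sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def sequential_ppv(prevalence, likelihood_ratios):
    """Return the post-test probability after each successive positive
    result, multiplying the odds by each LR in turn (assumes the tests
    are conditionally independent)."""
    odds = prevalence / (1.0 - prevalence)
    trajectory = []
    for lr in likelihood_ratios:
        odds *= lr
        trajectory.append(odds / (1.0 + odds))
    return trajectory


# Hypothetical positive LRs, ordered from highest to lowest.
lrs = [20.0, 15.0, 10.0, 8.0, 5.0]
ppv_path = sequential_ppv(0.002, lrs)
for step, ppv in enumerate(ppv_path, start=1):
    print(step, round(ppv, 4))
```

Because the positive LR has (1 - specificity) in its denominator, small gains in specificity near 1 raise the LR sharply, which is consistent with the abstract's remark that high specificity drives rapid convergence toward a high PPV.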
Affiliation(s)
- Md Shahidul Ahsan
  - Department of Oral Pathology, Radiology and Medicine, College of Dentistry and Dental Clinics, The University of Iowa, Iowa City, IA, USA
- Suan Phaik Khoo
  - Department of Oral Diagnostic and Surgical Sciences, School of Dentistry, International Medical University (IMU), Kuala Lumpur, Malaysia
7
Jong VL, Ahout IML, van den Ham HJ, Jans J, Zaaraoui-Boutahar F, Zomer A, Simonetti E, Bijl MA, Brand HK, van IJcken WFJ, de Jonge MI, Fraaij PL, de Groot R, Osterhaus ADME, Eijkemans MJ, Ferwerda G, Andeweg AC. Transcriptome assists prognosis of disease severity in respiratory syncytial virus infected infants. Sci Rep 2016; 6:36603. [PMID: 27833115; PMCID: PMC5105123; DOI: 10.1038/srep36603] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/04/2016] [Accepted: 10/17/2016] [Indexed: 12/17/2022]
Abstract
Respiratory syncytial virus (RSV) causes infections that range from common cold to severe lower respiratory tract infection requiring high-level medical care. Prediction of the course of disease in individual patients remains challenging at the first visit to the pediatric wards and RSV infections may rapidly progress to severe disease. In this study we investigate whether there exists a genomic signature that can accurately predict the course of RSV. We used early blood microarray transcriptome profiles from 39 hospitalized infants that were followed until recovery and of which the level of disease severity was determined retrospectively. Applying support vector machine learning on age by sex standardized transcriptomic data, an 84 gene signature was identified that discriminated hospitalized infants with eventually less severe RSV infection from infants that suffered from most severe RSV disease. This signature yielded an area under the receiver operating characteristic curve (AUC) of 0.966 using leave-one-out cross-validation on the experimental data and an AUC of 0.858 on an independent validation cohort consisting of 53 infants. A combination of the gene signature with age and sex yielded an AUC of 0.971. Thus, the presented signature may serve as the basis to develop a prognostic test to support clinical management of RSV patients.
Affiliation(s)
- Victor L. Jong
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
  - Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
- Inge M. L. Ahout
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Jop Jans
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Aldert Zomer
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Elles Simonetti
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Maarten A. Bijl
  - Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
- H. Kim Brand
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Marien I. de Jonge
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Pieter L. Fraaij
  - Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
  - Department of Pediatrics, Erasmus Medical Center, Rotterdam, The Netherlands
- Ronald de Groot
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Albert D. M. E. Osterhaus
  - Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
  - Research Institute for Infectious Diseases and Zoonoses, Veterinary University Hannover, Germany
- Marinus J. Eijkemans
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
- Gerben Ferwerda
  - Department of Pediatrics, Laboratory of Pediatric Infectious Diseases, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Arno C. Andeweg
  - Department of Viroscience, Erasmus Medical Center, Rotterdam, The Netherlands
8
Jong VL, Novianti PW, Roes KCB, Eijkemans MJC. Selecting a classification function for class prediction with gene expression data. Bioinformatics 2016; 32:1814-22. [PMID: 26873933; DOI: 10.1093/bioinformatics/btw034] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/12/2015] [Accepted: 01/15/2016] [Indexed: 11/13/2022]
Abstract
MOTIVATION Class prediction with gene expression data is widely used to generate diagnostic and/or prognostic models. The literature reveals that classification functions perform differently across gene expression datasets. The question of which classification function should be used for a given dataset remains to be answered. In this study, a predictive model for choosing an optimal function for class prediction on a given dataset was devised. RESULTS To achieve this, gene expression data were simulated for different values of gene-pair correlations, sample size, genes' variances, differentially expressed genes and fold changes. For each simulated dataset, ten classifiers were built and evaluated using ten classification functions. The resulting accuracies from 1152 different simulation scenarios by ten classification functions were then modeled using a linear mixed effects regression on the studied data characteristics, yielding a model that predicts the accuracy of the functions on a given dataset. An application of our model on eight real-life datasets showed positive correlations (0.33-0.82) between the predicted and expected accuracies. CONCLUSION The predictive model presented here might serve as a guide to choosing an optimal classification function among the 10 studied functions for any given gene expression dataset. AVAILABILITY AND IMPLEMENTATION The R source code for the analysis and an R-package 'SPreFuGED' are available at Bioinformatics online. CONTACT v.l.jong@umcutecht.nl SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Victor L Jong
  - Biostatistics & Research Support, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, 3508 GA, Utrecht, The Netherlands
  - Viroscience Lab, Erasmus Medical Center Rotterdam, Rotterdam, CE 3015, The Netherlands
- Putri W Novianti
  - Biostatistics & Research Support, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, 3508 GA, Utrecht, The Netherlands
  - Epidemiology & Biostatistics Department, Vrije University Medical Center Amsterdam, HV Amsterdam 1081, The Netherlands
- Kit C B Roes
  - Biostatistics & Research Support, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, 3508 GA, Utrecht, The Netherlands
- Marinus J C Eijkemans
  - Biostatistics & Research Support, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, 3508 GA, Utrecht, The Netherlands
9
Blood Transcriptional Biomarkers for Active Tuberculosis among Patients in the United States: a Case-Control Study with Systematic Cross-Classifier Evaluation. J Clin Microbiol 2015; 54:274-82. [PMID: 26582831; DOI: 10.1128/jcm.01990-15] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 08/06/2015] [Accepted: 11/03/2015] [Indexed: 01/04/2023]
Abstract
Blood transcriptional signatures are promising for tuberculosis (TB) diagnosis but have not been evaluated among U.S. patients. To be used clinically, transcriptional classifiers need reproducible accuracy in diverse populations that vary in genetic composition, disease spectrum and severity, and comorbidities. In a prospective case-control study, we identified novel transcriptional classifiers for active TB among U.S. patients and systematically compared their accuracy to classifiers from published studies. Blood samples from HIV-uninfected U.S. adults with active TB, pneumonia, or latent TB infection underwent whole-transcriptome microarray. We used support vector machines to classify disease state based on transcriptional patterns. We externally validated our classifiers using data from sub-Saharan African cohorts and evaluated previously published transcriptional classifiers in our population. Our classifier distinguishing active TB from pneumonia had an area under the receiver operating characteristic curve (AUC) of 96.5% (95.4% to 97.6%) among U.S. patients, but the AUC was lower (90.6% [89.6% to 91.7%]) in HIV-uninfected sub-Saharan Africans. Previously published comparable classifiers had AUC values of 90.0% (87.7% to 92.3%) and 82.9% (80.8% to 85.1%) when tested in U.S. patients. Our classifier distinguishing active TB from latent TB had AUC values of 95.9% (95.2% to 96.6%) among U.S. patients and 95.3% (94.7% to 96.0%) among sub-Saharan Africans. Previously published comparable classifiers had AUC values of 98.0% (97.4% to 98.7%) and 94.8% (92.9% to 96.8%) when tested in U.S. patients. Blood transcriptional classifiers accurately detected active TB among U.S. adults. The accuracy of classifiers for active TB versus other diseases decreased when tested in new populations with different disease controls, suggesting additional studies are required to enhance generalizability. Classifiers that distinguish active TB from latent TB are accurate and generalizable across populations and can be explored as screening assays.
10
Lim R, Lappas M, Riley C, Borregaard N, Moller HJ, Ahmed N, Rice GE. Investigation of human cationic antimicrobial protein-18 (hCAP-18), lactoferrin and CD163 as potential biomarkers for ovarian cancer. J Ovarian Res 2013; 6:5. [PMID: 23339669; PMCID: PMC3557177; DOI: 10.1186/1757-2215-6-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 11/14/2012] [Accepted: 01/18/2013] [Indexed: 01/12/2023]
Abstract
BACKGROUND Epithelial ovarian cancer is one of the leading causes of gynaecological cancer morbidity and mortality in women. Early stage ovarian cancer is usually asymptomatic and is therefore often first diagnosed when it is widely disseminated. Currently available diagnostics lack the requisite sensitivity and specificity to be implemented as community-based screening tests. The identification of additional biomarkers may improve the diagnostic efficiency of multivariate index assays. The aims of this study were to characterise and compare the ovarian tissue immunohistochemical localisation and plasma concentrations of three putative ovarian cancer biomarkers: human cationic antimicrobial protein-18 (hCAP-18); lactoferrin; and CD163 in normal healthy women and women with ovarian cancer. METHODS In this case-control cohort study, ovarian tissue and blood samples were obtained from 164 women (73 controls, including 28 women with benign pelvic masses; 91 cancer, including 21 women with borderline tumours). Localisation of each antigen within the ovary was assessed by immunohistochemistry and serum concentrations determined by ELISA. RESULTS Immunoreactive (ir) hCAP-18 and lactoferrin were identified in epithelial cells, while CD163 was predominantly localised in stromal cells. Tissue ir CD163 increased significantly (P < 0.05) with disease grade. Median plasma concentrations of soluble (s)CD163 were significantly greater in the cases (3220 ng/ml) than in controls (2488 ng/ml) (P < 0.01). Median plasma concentrations of hCAP-18 and lactoferrin were not significantly different between cases and controls. The classification efficiency of each biomarker (as determined by the area under the receiver operator characteristic curve; AUC) was: 0.67 ± 0.04; 0.62 ± 0.08 and 0.51 ± 0.07 for sCD163, hCAP-18 and lactoferrin, respectively. When the 3 biomarkers were modelled using stochastic gradient boosted logistic regression, the AUC increased to 0.95 ± 0.03. CONCLUSIONS The data obtained in this study establish the localisation and concentrations of CD163, hCAP-18, and lactoferrin in ovarian tumours and peripheral blood. Individually, the 3 biomarkers display only modest diagnostic efficiency as assessed by AUC. When combined in a multivariate index assay, however, diagnostic efficiency increases significantly. As such, the utility of the biomarker panel, as an aid in the diagnosis of cancer in symptomatic women, is worthy of further investigation in a larger phase 2 biomarker trial.
Affiliation(s)
- Ratana Lim
  - Department of Obstetrics and Gynaecology, University of Melbourne, Melbourne, VIC, Australia
11
Karageorgiou E, Schulz SC, Gollub RL, Andreasen NC, Ho BC, Lauriello J, Calhoun VD, Bockholt HJ, Sponheim SR, Georgopoulos AP. Neuropsychological testing and structural magnetic resonance imaging as diagnostic biomarkers early in the course of schizophrenia and related psychoses. Neuroinformatics 2012; 9:321-33. [PMID: 21246418; DOI: 10.1007/s12021-010-9094-6] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 12/26/2022]
Abstract
Making an accurate diagnosis of schizophrenia and related psychoses early in the course of the disease is important for initiating treatment and counseling patients and families. In this study, we developed classification models for early disease diagnosis using structural MRI (sMRI) and neuropsychological (NP) testing. We used sMRI measurements and NP test results from 28 patients with recent-onset schizophrenia and 47 healthy subjects, drawn from the larger sample of the Mind Clinical Imaging Consortium. We developed diagnostic models based on Linear Discriminant Analysis (LDA) following two approaches; namely, (a) stepwise (STP) LDA on the original measurements, and (b) LDA on variables created through Principal Component Analysis (PCA) and selected using the Humphrey-Ilgen parallel analysis. Error estimation of the modeling algorithms was evaluated by leave-one-out external cross-validation. These analyses were performed on sMRI and NP variables separately and in combination. The following classification accuracy was obtained for different variables and modeling algorithms. sMRI only: (a) STP-LDA: 64.3% sensitivity and 76.6% specificity, (b) PCA-LDA: 67.9% sensitivity and 72.3% specificity. NP only: (a) STP-LDA: 71.4% sensitivity and 80.9% specificity, (b) PCA-LDA: 78.5% sensitivity and 91.5% specificity. Combined sMRI-NP: (a) STP-LDA: 64.3% sensitivity and 83.0% specificity, (b) PCA-LDA: 89.3% sensitivity and 93.6% specificity. (i) Maximal diagnostic accuracy was achieved by combining sMRI and NP variables. (ii) NP variables were more informative than sMRI, indicating that cognitive deficits can be detected earlier than volumetric structural abnormalities. (iii) PCA-LDA yielded more accurate classification than STP-LDA. As these sMRI and NP tests are widely available, they can increase accuracy of early intervention strategies and possibly be used in evaluating treatment response.
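The sensitivity and specificity figures in this abstract can be related to confusion-matrix counts with a short sketch. The group sizes (28 patients, 47 healthy subjects) come from the abstract; the per-cell counts below are an assumption, back-calculated as the whole numbers that reproduce the reported percentages, and are not values stated by the authors.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of patients correctly classified as patients."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of healthy subjects correctly classified as healthy."""
    return true_neg / (true_neg + false_pos)


# Assumed counts (back-calculated, not reported) for combined sMRI-NP, PCA-LDA:
# 25 of 28 patients and 44 of 47 healthy subjects classified correctly.
sens = sensitivity(true_pos=25, false_neg=3)
spec = specificity(true_neg=44, false_pos=3)
print(round(100 * sens, 1), round(100 * spec, 1))

# Assumed counts for sMRI only, STP-LDA: 18 of 28 and 36 of 47.
print(round(100 * sensitivity(18, 10), 1), round(100 * specificity(36, 11), 1))
```

With only 28 patients, each misclassified patient shifts sensitivity by about 3.6 percentage points, which is worth keeping in mind when comparing the modeling approaches.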
Affiliation(s)
- Elissaios Karageorgiou
- Brain Sciences Center (11B), Veterans Affairs Medical Center, One Veterans Drive, Minneapolis, MN 55417, USA.
|
12
|
Brooks JD, Cairns P, Shore RE, Klein CB, Wirgin I, Afanasyeva Y, Zeleniuch-Jacquotte A. DNA methylation in pre-diagnostic serum samples of breast cancer cases: results of a nested case-control study. Cancer Epidemiol 2011; 34:717-23. [PMID: 20627767 DOI: 10.1016/j.canep.2010.05.006] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2009] [Revised: 05/06/2010] [Accepted: 05/08/2010] [Indexed: 11/27/2022]
Abstract
BACKGROUND Promoter methylation of tumor suppressor genes is a frequent and early event in breast carcinogenesis. Paired tumor tissue and serum samples from women with breast cancer show that promoter methylation is detectable in both sample types, with good concordance. This suggests the potential for these serum markers to be used for breast cancer detection. METHODS The current study was a case-control study, nested within the prospective New York University Women's Health Study cohort, that aimed to assess the ability of promoter methylation in serum to detect pre-clinical disease. Cases were women with blood samples collected within the 6 months preceding breast cancer diagnosis (n=50). Each case was matched to 2 healthy cancer-free controls and 1 cancer-free control with a history of benign breast disease (BBD). RESULTS Promoter methylation analysis of four cancer-related genes (RASSF1A, GSTP1, APC and RARβ2) was conducted using quantitative methylation-specific PCR. Results showed that the frequency of methylation was lower than expected among cases and higher than expected among controls. Methylation was detected in the promoter region of: RASSF1A in 22.0%, 22.9% and 17.2% of cases, BBD controls and healthy controls respectively; GSTP1 in 4%, 10.4% and 7.1% respectively; APC in 2.0%, 4.4% and 4.2% respectively and RARβ2 in 6.7%, 2.3% and 1.1% respectively. CONCLUSION Methylation status of the four genes included in this study was unable to distinguish between cases and either control group. This study highlights some methodological issues to be addressed in planning prospective studies to evaluate methylation markers as diagnostic biomarkers.
Affiliation(s)
- Jennifer D Brooks
- Division of Epidemiology, Department of Environmental Medicine, New York University School of Medicine, New York, NY 10016, United States.
|
13
|
Franco R, Caraglia M, Facchini G, Abbruzzese A, Botti G. The role of tissue microarray in the era of target-based agents. Expert Rev Anticancer Ther 2011; 11:859-69. [PMID: 21707283 DOI: 10.1586/era.11.65] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]
Abstract
Tissue microarray (TMA) technologies have been developed over recent years, mainly to identify biomarkers useful for the correct identification and characterization of tumors. Moreover, TMA has been implemented in retrospective studies in order to identify predictive biomarkers of response to a given therapy and/or to find potential new targets for biological therapy. We analyzed the fields of application of TMA technology and how the design of TMAs varies according to the objectives to be studied. In this article, the reader will learn how to design TMAs in order to cover the objectives of clinical trials based upon the use of target-based agents. The main limits and advantages of TMA and the results achieved in cancer diagnosis will also be described. Tissue microarray technology should be systematically applied to define critical markers, in retrospective studies and in the screening of most human tumors, in order to find new possible molecular targets and to molecularly define the diagnosis of neoplastic diseases. TMAs have substantially improved the field of translational studies, even in the design and follow-up of studies based upon the use of target-based agents in cancer therapy.
Affiliation(s)
- Renato Franco
- Pathology Department, National Institute of Tumors of Naples Fondazione G Pascale, Naples, Italy
|
14
|
Wang Y, Chen H, Schwartz T, Duan N, Parcesepe A, Lewis-Fernández R. Assessment of a disease screener by hierarchical all-subset selection using area under the receiver operating characteristic curves. Stat Med 2011; 30:1751-60. [DOI: 10.1002/sim.4246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2010] [Accepted: 02/15/2011] [Indexed: 11/08/2022]
|
15
|
Diagnostic accuracy and receiver-operating characteristics curve analysis in surgical research and decision making. Ann Surg 2011; 253:27-34. [PMID: 21294285 DOI: 10.1097/sla.0b013e318204a892] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
In surgical research, the ability to correctly classify one type of condition or specific outcome from another is of great importance for variables influencing clinical decision making. Receiver-operating characteristic (ROC) curve analysis is a useful tool in assessing the diagnostic accuracy of any variable with a continuous spectrum of results. In order to rule a disease state in or out with a given test, the test results are usually binary, with arbitrarily chosen cut-offs for defining disease versus health, or for grading of disease severity. In the postgenomic era, the translation from bench-to-bedside of biomarkers in various tissues and body fluids requires appropriate tools for analysis. In contrast to predetermining a cut-off value to define disease, the advantages of applying ROC analysis include the ability to test diagnostic accuracy across the entire range of variable scores and test outcomes. In addition, ROC analysis can easily examine visual and statistical comparisons across tests or scores. ROC is also favored because it is thought to be independent from the prevalence of the condition under investigation. ROC analysis is used in various surgical settings and across disciplines, including cancer research, biomarker assessment, imaging evaluation, and assessment of risk scores. With appropriate use, ROC curves may help identify the most appropriate cutoff value for clinical and surgical decision making and avoid confounding effects seen with subjective ratings. ROC curve results should always be put in perspective, because a good classifier does not guarantee the expected clinical outcome. In this review, we discuss the fundamental roles, suggested presentation, potential biases, and interpretation of ROC analysis in surgical research.
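As a minimal sketch of what testing accuracy "across the entire range" of cutoffs means, the following Python builds an ROC curve by sweeping the cutoff over every observed score and integrates it with the trapezoidal rule. The scores, labels, and function names are hypothetical and are not taken from the review.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the cutoff over every
    observed score; a sample is called positive when score >= cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical continuous test results; 1 = diseased, 0 = healthy.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
print(auc(curve))
```

The resulting area equals the fraction of (diseased, healthy) pairs in which the diseased subject has the higher score, which is why the AUC summarizes discrimination without committing to any single cutoff.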
|
16
|
Wang Y, Chen H, Li R, Duan N, Lewis-Fernández R. Prediction-based structured variable selection through the receiver operating characteristic curves. Biometrics 2010; 67:896-905. [PMID: 21175555 DOI: 10.1111/j.1541-0420.2010.01533.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
In many clinical settings, a commonly encountered problem is to assess accuracy of a screening test for early detection of a disease. In these applications, predictive performance of the test is of interest. Variable selection may be useful in designing a medical test. An example is a research study conducted to design a new screening test by selecting variables from an existing screener with a hierarchical structure among variables: there are several root questions followed by their stem questions. The stem questions will only be asked after a subject has answered the root question. It is therefore unreasonable to select a model that only contains stem variables but not their root variable. In this work, we propose methods to perform variable selection with structured variables when predictive accuracy of a diagnostic test is the main concern of the analysis. We take a linear combination of individual variables to form a combined test. We then maximize a direct summary measure of the predictive performance of the test, the area under a receiver operating characteristic curve (AUC of an ROC), subject to a penalty function to control for overfitting. Since maximizing empirical AUC of the ROC of a combined test is a complicated nonconvex problem (Pepe, Cai, and Longton, 2006, Biometrics 62, 221-229), we explore the connection between the empirical AUC and a support vector machine (SVM). We cast the problem of maximizing predictive performance of a combined test as a penalized SVM problem and apply a reparametrization to impose the hierarchical structure among variables. We also describe a penalized logistic regression variable selection procedure for structured variables and compare it with the ROC-based approaches. We use simulation studies based on real data to examine performance of the proposed methods. Finally, we apply the developed methods to design a structured screener to be used in primary care clinics to refer potentially psychotic patients for further specialty diagnostics and treatment.
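The root/stem constraint described in this abstract can be made concrete with a small sketch. This is a brute-force enumeration, not the penalized SVM reparametrization the authors propose, and the variable names (`root_a`, `stem_a1`, and so on) are hypothetical.

```python
from itertools import combinations


def valid_subsets(hierarchy):
    """Yield every non-empty variable subset in which a stem question
    appears only together with its root question.

    `hierarchy` maps each root question to the list of its stem questions.
    """
    parent = {stem: root for root, stems in hierarchy.items() for stem in stems}
    variables = list(hierarchy) + list(parent)
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            chosen = set(subset)
            # Keep the subset only if every selected stem has its root selected.
            if all(parent[v] in chosen for v in chosen if v in parent):
                yield chosen


# Hypothetical screener: one root with two stems, one root with none.
hierarchy = {"root_a": ["stem_a1", "stem_a2"], "root_b": []}
subsets = list(valid_subsets(hierarchy))
print(len(subsets))
```

Enumeration grows exponentially with the number of questions, which is one reason a penalized formulation is attractive for larger screeners.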
Affiliation(s)
- Yuanjia Wang
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York 10032, USA.
|
17
|
Miecznikowski JC, Wang D, Liu S, Sucheston L, Gold D. Comparative survival analysis of breast cancer microarray studies identifies important prognostic genetic pathways. BMC Cancer 2010; 10:573. [PMID: 20964848 PMCID: PMC2972286 DOI: 10.1186/1471-2407-10-573] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2010] [Accepted: 10/21/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND An estimated 12% of females in the United States will develop breast cancer in their lifetime. Although there are advances in treatment options including surgery and chemotherapy, breast cancer is still the second most lethal cancer in women. Thus, there is a clear need for better methods to predict prognosis for each breast cancer patient. With the advent of large genetic databases and the reduction in cost for the experiments, researchers are faced with choosing from a large pool of potential prognostic markers from numerous breast cancer gene expression profile studies. METHODS Five microarray datasets related to breast cancer were examined using gene set analysis and the cancers were categorized into different subtypes using a scoring system based on genetic pathway activity. RESULTS We have observed that significant genes in the individual studies show little reproducibility across the datasets. From our comparative analysis, using gene pathways with clinical variables is more reliable across studies and shows promise in assessing a patient's prognosis. CONCLUSIONS This study concludes that, in light of clinical variables, there are significant gene pathways in common across the datasets. Specifically, several pathways can further significantly stratify patients for survival. These candidate pathways should help to develop a panel of significant biomarkers for the prognosis of breast cancer patients in a clinical setting.
|
18
|
Foulkes AS, Azzoni L, Li X, Johnson MA, Smith C, Mounzer K, Montaner LJ. Prediction based classification for longitudinal biomarkers. Ann Appl Stat 2010; 4:1476-1497. [PMID: 21274424 DOI: 10.1214/10-aoas326] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
Assessment of circulating CD4 count change over time in HIV-infected subjects on antiretroviral therapy (ART) is a central component of disease monitoring. The increasing number of HIV-infected subjects starting therapy and the limited capacity to support CD4 count testing within resource-limited settings have fueled interest in identifying correlates of CD4 count change such as total lymphocyte count, among others. The application of modeling techniques will be essential to this endeavor due to the typically non-linear CD4 trajectory over time and the multiple input variables necessary for capturing CD4 variability. We propose a prediction based classification approach that involves first stage modeling and subsequent classification based on clinically meaningful thresholds. This approach draws on existing analytical methods described in the receiver operating characteristic curve literature while presenting an extension for handling a continuous outcome. Application of this method to an independent test sample results in greater than 98% positive predictive value for CD4 count change. The prediction algorithm is derived based on a cohort of n = 270 HIV-1 infected individuals from the Royal Free Hospital, London who were followed for up to three years from initiation of ART. A test sample comprised of n = 72 individuals from Philadelphia and followed for a similar length of time is used for validation. Results suggest that this approach may be a useful tool for prioritizing limited laboratory resources for CD4 testing after subjects start antiretroviral therapy.
Affiliation(s)
- A S Foulkes
- Division of Biostatistics, School of Public Health and Health Sciences, University of Massachusetts, Amherst, MA USA
|
19
|
Liu Z, Magder LS, Hyslop T, Mao L. Survival associated pathway identification with group Lp penalized global AUC maximization. Algorithms Mol Biol 2010; 5:30. [PMID: 20712896 PMCID: PMC2930641 DOI: 10.1186/1748-7188-5-30] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2010] [Accepted: 08/16/2010] [Indexed: 11/24/2022] Open
Abstract
It has been demonstrated that genes in a cell do not act independently. They interact with one another to complete certain biological processes or to implement certain molecular functions. How to incorporate biological pathways or functional groups into the model and identify survival associated gene pathways is still a challenging problem. In this paper, we propose a novel iterative gradient based method for survival analysis with group Lp penalized global AUC summary maximization. Unlike LASSO, Lp (p < 1) (with its special implementation entitled adaptive LASSO) is asymptotically unbiased and has oracle properties [1]. We first extend Lp for individual gene identification to group Lp penalty for pathway selection, and then develop a novel iterative gradient algorithm for penalized global AUC summary maximization (IGGAUCS). This method incorporates the genetic pathways into global AUC summary maximization and identifies survival associated pathways instead of individual genes. The tuning parameters are determined using 10-fold cross validation with training data only. The prediction performance is evaluated using test data. We apply the proposed method to survival outcome analysis with gene expression profile and identify multiple pathways simultaneously. Experimental results with simulation and gene expression data demonstrate that the proposed procedures can be used for identifying important biological pathways that are related to survival phenotype and for building a parsimonious model for predicting the survival times.
|
20
|
Liu Z, Gartenhaus RB, Chen XW, Howell CD, Tan M. Survival prediction and gene identification with penalized global AUC maximization. J Comput Biol 2010; 16:1661-70. [PMID: 19772397 DOI: 10.1089/cmb.2008.0188] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Identifying genes (biomarkers) and predicting the clinical outcomes with censored survival times are important for cancer prognosis and pathogenesis. In this article, we propose a novel method with L(1) penalized global AUC summary maximization (L(1)GAUCS). The L(1)GAUCS method is developed for simultaneous gene (feature) selection and survival prediction. L(1) penalty shrinks coefficients and produces some coefficients that are exactly zero, and therefore selects a small subset of genes (features). It is a well-known fact that many genes are highly correlated in gene expression data and the highly correlated genes may function together. We, therefore, define a correlation measure to identify those genes such that their expression level may be low but they are highly correlated with the downstream highly expressed genes selected with L(1)GAUCS. Partial pathways associated with the correlated genes are identified with DAVID (http://david.abcc.ncifcrf.gov/). Experimental results with chemotherapy and gene expression data demonstrate that the proposed procedures can be used for identifying important genes and pathways that are related to time to death due to cancer and for building a parsimonious model for predicting the survival of future patients. Software is available upon request from the first author.
Affiliation(s)
- Zhenqiu Liu
- Division of Biostatistics, Greenebaum Cancer Center, University of Maryland, Baltimore, Maryland 21201, USA.
|
21
|
Brooks J, Cairns P, Zeleniuch-Jacquotte A. Promoter methylation and the detection of breast cancer. Cancer Causes Control 2010; 20:1539-50. [PMID: 19768562 DOI: 10.1007/s10552-009-9415-y] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2009] [Accepted: 07/29/2009] [Indexed: 12/31/2022]
Abstract
Mammographic screening has been shown to reduce breast cancer mortality in women over the age of 50 years, and to a lesser extent in younger women. The sensitivity of mammography, however, is reduced in some groups of women. There remains a need for a minimally invasive, cost-effective procedure that could be used alongside mammography to improve screening sensitivity. Silencing of tumor suppressor genes through promoter hypermethylation is known to be a frequent and early event in carcinogenesis. Further, changes in methylation patterns observed in tumors are also detectable in the circulation of women with breast cancer. This makes these alterations candidate markers for early tumor detection. In this paper, we review the current literature on promoter hypermethylation changes and breast cancer and discuss issues that remain to be addressed in order for these markers to realize their potential to augment the sensitivity of screening mammography. In general, studies in well-defined populations, including appropriate controls and larger numbers, are needed. Further, focus on the optimization of methods of methylation detection in small amounts of DNA is needed.
Affiliation(s)
- Jennifer Brooks
- Division of Epidemiology, Department of Environmental Medicine, New York University School of Medicine, 650 First Avenue, New York, NY 10016-3240, USA.
|
22
|
Lijmer JG, Leeflang M, Bossuyt PMM. Proposals for a Phased Evaluation of Medical Tests. Med Decis Making 2009; 29:E13-21. [DOI: 10.1177/0272989x09336144] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Background. In drug development, a 4-phase hierarchical model for the clinical evaluation of new pharmaceuticals is well known. Several comparable phased evaluation schemes have been proposed for medical tests. Purpose. To perform a systematic search of the literature, a synthesis, and a critical review of phased evaluation schemes for medical tests. Data Sources. Literature databases of Medline, Web of Science, and Embase. Study Selection and Data Extraction. Two authors separately evaluated potentially eligible papers and independently extracted data. Results. We identified 19 schemes, published between 1978 and 2007. Despite their variability, these models show substantial similarity. Common phases are evaluations of technical efficacy, diagnostic accuracy, diagnostic thinking efficacy, therapeutic efficacy, patient outcome, and societal aspects. Conclusions. The evaluation frameworks can be useful to distinguish between study types, but they cannot be seen as a necessary sequence of evaluations. The evaluation of tests is most likely not a linear but a cyclic and repetitive process.
Affiliation(s)
- Jeroen G. Lijmer
- Department of Psychiatry, Waterland Hospital, Purmerend, the Netherlands, Department of Clinical Epidemiology & Biostatistics, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
- Mariska Leeflang
- Department of Clinical Epidemiology & Biostatistics, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
- Patrick M. M. Bossuyt
- Department of Clinical Epidemiology & Biostatistics, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
|
23
|
Lee CK, Lord SJ, Coates AS, Simes RJ. Molecular biomarkers to individualise treatment: assessing the evidence. Med J Aust 2009; 190:631-6. [DOI: 10.5694/j.1326-5377.2009.tb02592.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2008] [Accepted: 02/16/2009] [Indexed: 11/17/2022]
Affiliation(s)
- Chee K Lee
- NHMRC Clinical Trials Centre, University of Sydney, Sydney, NSW
- Sarah J Lord
- NHMRC Clinical Trials Centre, University of Sydney, Sydney, NSW
- Screening and Test Evaluation Program, University of Sydney, Sydney, NSW
- Alan S Coates
- School of Public Health, University of Sydney, Sydney, NSW
- International Breast Cancer Study Group, Bern, Switzerland
- R John Simes
- NHMRC Clinical Trials Centre, University of Sydney, Sydney, NSW
|
24
|
Kim SY. Effects of sample size on robustness and prediction accuracy of a prognostic gene signature. BMC Bioinformatics 2009; 10:147. [PMID: 19445687 PMCID: PMC2689196 DOI: 10.1186/1471-2105-10-147] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2008] [Accepted: 05/16/2009] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Little overlap between independently developed gene signatures and poor inter-study applicability of gene signatures are two of the major concerns raised in the development of microarray-based prognostic gene signatures. One recent study suggested that thousands of samples are needed to generate a robust prognostic gene signature. RESULTS A data set of 1,372 samples was generated by combining eight breast cancer gene expression data sets produced using the same microarray platform and, using this data set, the effects of varying sample sizes on several performance measures of a prognostic gene signature were investigated. The overlap between independently developed gene signatures increased linearly with more samples, attaining an average overlap of 16.56% with 600 samples. The concordance between outcomes predicted by different gene signatures also increased with more samples, up to 94.61% with 300 samples. The accuracy of outcome prediction also increased with more samples. Finally, analysis using only Estrogen Receptor-positive (ER+) patients attained higher prediction accuracy than analysis using all patients, suggesting that sub-type specific analysis can lead to the development of better prognostic gene signatures. CONCLUSION Increasing sample sizes generated a gene signature with better stability, better concordance in outcome prediction, and better prediction accuracy. However, the degree of performance improvement from the increased sample size differed between the degree of overlap and the degree of concordance in outcome prediction, suggesting that the sample size required for a study should be determined according to the specific aims of the study.
Affiliation(s)
- Seon-Young Kim
- Medical Genomics Research Center, KRIBB, Yuseong-Gu, Daejeon, Republic of Korea.
|
25
|
Regularized F-measure maximization for feature selection and classification. J Biomed Biotechnol 2009; 2009:617946. [PMID: 19421401 PMCID: PMC2674633 DOI: 10.1155/2009/617946] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2008] [Accepted: 03/17/2009] [Indexed: 11/18/2022] Open
Abstract
Receiver Operating Characteristic (ROC) analysis is a common tool for assessing the performance of various classifications. It gained much popularity in medical and other fields including biological markers and diagnostic tests. This is particularly due to the fact that in real-world problems misclassification costs are not known, and thus, ROC curve and related utility functions such as F-measure can be more meaningful performance measures. F-measure combines recall and precision into a global measure. In this paper, we propose a novel method through regularized F-measure maximization. The proposed method assigns different costs to positive and negative samples and does simultaneous feature selection and prediction with L1 penalty. This method is useful especially when the data set is highly unbalanced, or the labels for negative (positive) samples are missing. Our experiments with the benchmark, methylation, and high dimensional microarray data show that the performance of the proposed algorithm is better than or equivalent to that of other popular classifiers in limited experiments.
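To show how the F-measure combines recall and precision, here is a minimal sketch of the standard F-beta formula. It only evaluates the measure from confusion counts; it does not implement the regularized maximization the paper proposes, and the counts are hypothetical.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from confusion counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN);
    beta > 1 weights recall more heavily, beta < 1 weights precision.
    """
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical unbalanced data set: 16 positives among 1,000 samples.
tp, fp, fn = 8, 2, 8
tn = 1000 - tp - fp - fn

accuracy = (tp + tn) / 1000
print(f"accuracy={accuracy:.3f} F1={f_measure(tp, fp, fn):.3f}")
```

In this example accuracy stays at 0.99 even though half of the positives are missed, while F1 drops to about 0.62, which is why the F-measure is often preferred when classes are highly unbalanced.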
|
26
|
Guenther T, Mueller I, Preuss M, Kruse R, Sabel B. A Treatment Outcome Prediction Model of Visual Field Recovery Using Self-Organizing Maps. IEEE Trans Biomed Eng 2009; 56:572-81. [DOI: 10.1109/tbme.2008.2009995] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
27
|
Camp RL, Neumeister V, Rimm DL. A Decade of Tissue Microarrays: Progress in the Discovery and Validation of Cancer Biomarkers. J Clin Oncol 2008; 26:5630-7. [PMID: 18936473 DOI: 10.1200/jco.2008.17.3567] [Citation(s) in RCA: 191] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
This year, 2008, marks the 10-year anniversary of the development of the modern tissue microarray (TMA). During the last decade, the use of TMAs has grown steadily and accounts for a small but increasing percentage of all cancer biomarker studies performed. The growing popularity of TMA-based studies attests to their benefits in the discovery and validation of new biomarkers. This review will focus on these benefits, but also on the faults of TMAs and the challenges of TMA studies that have been overcome in the last decade. We will also discuss the role of TMAs in the latest revolution in cancer treatment, the use of targeted drug therapy.
Affiliation(s)
- Robert L. Camp
- From the Department of Pathology, Yale University School of Medicine, New Haven, CT
- Veronique Neumeister
- From the Department of Pathology, Yale University School of Medicine, New Haven, CT
- David L. Rimm
- From the Department of Pathology, Yale University School of Medicine, New Haven, CT
|
28
|
Menten J, Boelaert M, Lesaffre E. Bayesian latent class models with conditionally dependent diagnostic tests: A case study. Stat Med 2008; 27:4469-88. [DOI: 10.1002/sim.3317] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
|
29
|
Kim SY, Kim YS. A gene sets approach for identifying prognostic gene signatures for outcome prediction. BMC Genomics 2008; 9:177. [PMID: 18416850 PMCID: PMC2364634 DOI: 10.1186/1471-2164-9-177] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2007] [Accepted: 04/16/2008] [Indexed: 11/23/2022] Open
Abstract
Background Gene expression profiling is a promising approach to better estimate patient prognosis; however, there are still unresolved problems, including little overlap among similarly developed gene sets and poor performance of a developed gene set in other datasets. Results We applied a gene sets approach to develop a prognostic gene set from multiple gene expression datasets. By analyzing 12 independent breast cancer gene expression datasets comprising 1,756 tissues with 2,411 pre-defined gene sets including gene ontology categories and pathways, we found many gene sets that were prognostic in most of the analyzed datasets. Those prognostic gene sets were related to biological processes such as cell cycle and proliferation and had additional prognostic values over conventional clinical parameters such as tumor grade, lymph node status, estrogen receptor (ER) status, and tumor size. We then estimated the prediction accuracy of each gene set by performing external validation using six large datasets and identified a gene set with an average prediction accuracy of 67.55%. Conclusion A gene sets approach is an effective method to develop prognostic gene sets to predict patient outcome and to understand the underlying biology of the developed gene set. Using the gene sets approach we identified many prognostic gene sets in breast cancer.
Affiliation(s)
- Seon-Young Kim
- Human Genomics Laboratory, Functional Genomics Research Center, KRIBB, Daejeon 305-806, Korea.
|
30
|
Statistical data processing in clinical proteomics. J Chromatogr B Analyt Technol Biomed Life Sci 2008; 866:77-88. [DOI: 10.1016/j.jchromb.2007.10.042] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2007] [Revised: 10/17/2007] [Accepted: 10/18/2007] [Indexed: 01/12/2023]
|
31
|
Liu Z, Tan M. ROC-Based Utility Function Maximization for Feature Selection and Classification with Applications to High-Dimensional Protease Data. Biometrics 2008; 64:1155-61. [DOI: 10.1111/j.1541-0420.2008.01015.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
32
|
Thorpe JD, Duan X, Forrest R, Lowe K, Brown L, Segal E, Nelson B, Anderson GL, McIntosh M, Urban N. Effects of blood collection conditions on ovarian cancer serum markers. PLoS One 2007; 2:e1281. [PMID: 18060075 PMCID: PMC2093996 DOI: 10.1371/journal.pone.0001281] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2007] [Accepted: 11/09/2007] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Evaluating diagnostic and early detection biomarkers requires comparing serum protein concentrations among biosamples ascertained from subjects with and without cancer. Efforts are generally made to standardize blood processing and storage conditions for cases and controls, but blood sample collection conditions cannot be completely controlled. For example, blood samples from cases are often obtained from persons aware of their diagnoses, and collected after fasting or in surgery, whereas blood samples from some controls may be obtained in different conditions, such as a clinic visit. By measuring the effects of differences in collection conditions on three different markers, we investigated the potential of these effects to bias validation studies. METHODOLOGY AND PRINCIPAL FINDINGS We analyzed serum concentrations of three previously studied putative ovarian cancer serum biomarkers (CA 125, prolactin, and MIF) in healthy women, women with ovarian cancer undergoing gynecologic surgery, women undergoing surgery for benign ovary pathology, and women undergoing surgery with pathologically normal ovaries. For women undergoing surgery, a blood sample was collected either in the clinic 1 to 39 days prior to surgery, or on the day of surgery after anesthesia was administered but prior to the surgical procedure, or both. We found that one marker, prolactin, was dramatically affected by collection conditions, while CA 125 and MIF were unaffected. Prolactin levels were not different between case and control groups after accounting for the conditions of sample collection, suggesting that sample ascertainment could explain some or all of the previously reported results about its potential as a biomarker for ovarian cancer.
CONCLUSIONS Biomarker validation studies should use standardized collection conditions, use multiple control groups, and/or collect samples from cases prior to influence of diagnosis whenever feasible to detect and correct for potential biases associated with sample collection.
Affiliation(s)
- Jason D Thorpe
- Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America.
|
33
|
Georgopoulos AP, Karageorgiou E, Leuthold AC, Lewis SM, Lynch JK, Alonso AA, Aslam Z, Carpenter AF, Georgopoulos A, Hemmy LS, Koutlas IG, Langheim FJP, McCarten JR, McPherson SE, Pardo JV, Pardo PJ, Parry GJ, Rottunda SJ, Segal BM, Sponheim SR, Stanwyck JJ, Stephane M, Westermeyer JJ. Synchronous neural interactions assessed by magnetoencephalography: a functional biomarker for brain disorders. J Neural Eng 2007; 4:349-55. [PMID: 18057502 DOI: 10.1088/1741-2560/4/4/001] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.8] [Indexed: 11/11/2022]
Abstract
We report on a test to assess dynamic brain function at high temporal resolution using magnetoencephalography (MEG). The essence of the test is the measurement of dynamic synchronous neural interactions, an essential aspect of brain function. MEG signals were recorded from 248 axial gradiometers while 142 human subjects fixated a spot of light for 45-60 s. After fitting an autoregressive integrated moving average (ARIMA) model and taking the stationary residuals, all pairwise, zero-lag, partial cross-correlations (PCC(ij)(0)) and their z-transforms (z(ij)(0)) between sensors i and j were calculated, providing estimates of the strength and sign (positive, negative) of direct synchronous coupling at 1 ms temporal resolution. We found that subsets of z(ij)(0) successfully classified individual subjects to their respective groups (multiple sclerosis, Alzheimer's disease, schizophrenia, Sjögren's syndrome, chronic alcoholism, facial pain, healthy controls) and gave excellent external cross-validation results.
|
34
|
DeSouza LV, Grigull J, Ghanny S, Dubé V, Romaschin AD, Colgan TJ, Siu KWM. Endometrial carcinoma biomarker discovery and verification using differentially tagged clinical samples with multidimensional liquid chromatography and tandem mass spectrometry. Mol Cell Proteomics 2007; 6:1170-82. [PMID: 17374602 DOI: 10.1074/mcp.m600378-mcp200] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.8] [Indexed: 11/06/2022] Open
Abstract
The utility of differentially expressed proteins discovered and identified in an earlier study (DeSouza, L., Diehl, G., Rodrigues, M. J., Guo, J., Romaschin, A. D., Colgan, T. J., and Siu, K. W. M. (2005) Search for cancer markers from endometrial tissues using differentially labeled tags iTRAQ and cleavable ICAT with multidimensional liquid chromatography and tandem mass spectrometry. J. Proteome Res. 4, 377-386) to discriminate malignant and benign endometrial tissue samples was verified in a 40-sample iTRAQ (isobaric tags for relative and absolute quantitation) labeling study involving normal proliferative and secretory samples and Types I and II endometrial cancer samples. None of these proteins had the sensitivity and specificity to be used individually to discriminate between normal and cancer samples. However, a panel of pyruvate kinase, chaperonin 10, and alpha1-antitrypsin achieved the best results with a sensitivity, specificity, predictive value, and positive predictive value of 0.95 each in a logistic regression analysis. In addition, three new potential markers were discovered, whereas two other proteins showed promising trends but were not detected in sufficient numbers of samples to permit statistical validation. Differential expressions of some of these candidate biomarkers were independently verified using immunohistochemistry.
Affiliation(s)
- Leroi V DeSouza
- Department of Chemistry, York University, 4700 Keele Street, Toronto, Ontario M2J 1P3, Canada
|
35
|
Wang SJ. Biomarker as a classifier in pharmacogenomics clinical trials: a tribute to 30th anniversary of PSI. Pharm Stat 2007; 6:283-96. [DOI: 10.1002/pst.316] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 01/23/2023]
|