1
Dong Y, Gottardo R. An approach for integrating multimodal omics data into sparse and interpretable models. CELL REPORTS METHODS 2024; 4:100718. [PMID: 38412832 PMCID: PMC10921032 DOI: 10.1016/j.crmeth.2024.100718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 02/06/2024] [Accepted: 02/06/2024] [Indexed: 02/29/2024]
Abstract
Using omics data, a common goal is to identify a concise set of variables that predict a clinical endpoint from an extensive pool. In a recent paper published in Nature Biotechnology, Hédou et al. [1] introduced Stabl, a computational method crafted to identify sparse yet robust signatures linked to endpoints.
Affiliation(s)
- Yixing Dong
- Lausanne University Hospital and University of Lausanne, Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Raphael Gottardo
- Lausanne University Hospital and University of Lausanne, Swiss Institute of Bioinformatics, Lausanne, Switzerland.
2
Rahnenführer J, De Bin R, Benner A, Ambrogi F, Lusa L, Boulesteix AL, Migliavacca E, Binder H, Michiels S, Sauerbrei W, McShane L. Statistical analysis of high-dimensional biomedical data: a gentle introduction to analytical goals, common approaches and challenges. BMC Med 2023; 21:182. [PMID: 37189125 DOI: 10.1186/s12916-023-02858-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/28/2022] [Accepted: 04/03/2023] [Indexed: 05/17/2023] Open
Abstract
BACKGROUND In high-dimensional data (HDD) settings, the number of variables associated with each observation is very large. Prominent examples of HDD in biomedical research include omics data with a large number of variables such as many measurements across the genome, proteome, or metabolome, as well as electronic health records data that have large numbers of variables recorded for each patient. The statistical analysis of such data requires knowledge and experience, sometimes of complex methods adapted to the respective research questions. METHODS Advances in statistical methodology and machine learning methods offer new opportunities for innovative analyses of HDD, but at the same time require a deeper understanding of some fundamental statistical concepts. Topic group TG9 "High-dimensional data" of the STRATOS (STRengthening Analytical Thinking for Observational Studies) initiative provides guidance for the analysis of observational studies, addressing particular statistical challenges and opportunities for the analysis of studies involving HDD. In this overview, we discuss key aspects of HDD analysis to provide a gentle introduction for non-statisticians and for classically trained statisticians with little experience specific to HDD. RESULTS The paper is organized with respect to subtopics that are most relevant for the analysis of HDD, in particular initial data analysis, exploratory data analysis, multiple testing, and prediction. For each subtopic, main analytical goals in HDD settings are outlined. For each of these goals, basic explanations for some commonly used analysis methods are provided. Situations are identified where traditional statistical methods cannot, or should not, be used in the HDD setting, or where adequate analytic tools are still lacking. Many key references are provided. 
CONCLUSIONS This review aims to provide a solid statistical foundation for researchers, including statisticians and non-statisticians, who are new to research with HDD or simply want to better evaluate and understand the results of HDD analyses.
Affiliation(s)
- Axel Benner
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Federico Ambrogi
- Department of Clinical Sciences and Community Health, University of Milan, Milan, Italy
- Scientific Directorate, IRCCS Policlinico San Donato, San Donato Milanese, Italy
- Lara Lusa
- Department of Mathematics, Faculty of Mathematics, Natural Sciences and Information Technology, University of Primorska, Koper, Slovenia
- Institute of Biostatistics and Medical Informatics, University of Ljubljana, Ljubljana, Slovenia
- Anne-Laure Boulesteix
- Institute for Medical Information Processing, Biometry and Epidemiology, Ludwig Maximilian University of Munich, Munich, Germany
- Harald Binder
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
- Stefan Michiels
- Service de Biostatistique et d'Épidémiologie, Gustave Roussy, Université Paris-Saclay, Villejuif, France
- Oncostat U1018, Inserm, Université Paris-Saclay, Labeled Ligue Contre le Cancer, Villejuif, France
- Willi Sauerbrei
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
- Lisa McShane
- Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD, USA.
3
Abstract
Randomization is considered a safeguard against bias and a gold standard in clinical studies. To assess the generalizability of the accuracy of a model, a common approach is to randomly split a master data set into two parts: one for training and the other for testing. In this paper, we demonstrate the limitations of a random split in assessing the generalizability of model accuracy through simulation studies. We generated three simulated data sets for binary or continuous endpoints, each with a large sample size (n = 10,000). In each simulation scenario, we randomly split the data into two parts, one for training and one for testing, and then compared the performance of the model between the training and testing data. All simulations were repeated 1,000 times. When a random split was used, model performance on the training and testing data was similar in terms of the true positive fraction and false positive fraction for binary data and the mean-squared error for continuous data. However, when there is a time drift effect in the data, random split will result in large differences between training and testing data. Because a random split makes the training and testing data similar, assessing the generalizability of the model on such similar data will generate similar results. Generalizability of the accuracy of models is thus best assessed when testing is done in a distinct and independent study.
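The contrast between a random split and a split affected by time drift can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a single continuous endpoint, a hypothetical linear drift term, a simple least-squares line as the model, and a chronological split as one way of exposing the drift. All function names and parameter values are illustrative.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mse(model, xs, ys):
    """Mean-squared error of the fitted line on (xs, ys)."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def simulate(n=10_000, drift=3.0, seed=0):
    """Rows in time order; the endpoint drifts upward with time (assumed form)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        t = i / (n - 1)                      # time index scaled to [0, 1]
        x = rng.gauss(0, 1)                  # predictor
        y = 2.0 * x + drift * t + rng.gauss(0, 1)
        rows.append((x, y))
    return rows


def train_test_mse(train, test):
    """Fit on the training rows, report (training MSE, testing MSE)."""
    tx, ty = zip(*train)
    model = fit_line(tx, ty)
    sx, sy = zip(*test)
    return mse(model, tx, ty), mse(model, sx, sy)


def compare_splits(n=10_000, drift=3.0, seed=0):
    rows = simulate(n, drift, seed)
    half = n // 2
    shuffled = rows[:]
    random.Random(seed + 1).shuffle(shuffled)
    return {
        "random": train_test_mse(shuffled[:half], shuffled[half:]),
        "chronological": train_test_mse(rows[:half], rows[half:]),
    }


if __name__ == "__main__":
    for name, (train_err, test_err) in compare_splits().items():
        print(f"{name:>13}: train MSE {train_err:.2f}, test MSE {test_err:.2f}")
```

Under these assumptions the random split yields nearly identical training and testing errors, because both halves contain the same mix of early and late observations, while the chronological split shows a clearly larger testing error.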
Affiliation(s)
- Zhiheng Xu
- Center for Device Evaluation and Radiological Health (CDRH), FDA, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Arkendra De
- Companion Diagnostics, Agilent Technologies, Carpinteria, California, USA
4
De A. Statistical Considerations and Challenges for Pivotal Clinical Studies of Artificial Intelligence Medical Tests for Widespread Use: Opportunities for Inter-Disciplinary Collaboration. Stat Biopharm Res 2023. [DOI: 10.1080/19466315.2023.2169752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023]
Affiliation(s)
- Arkendra De
- Agilent Technologies, 1005 Mark Avenue, Carpinteria, CA 93013, Tel: 408-553-7111
5
Saeed RF, Awan UA, Saeed S, Mumtaz S, Akhtar N, Aslam S. Targeted Therapy and Personalized Medicine. Cancer Treat Res 2023; 185:177-205. [PMID: 37306910 DOI: 10.1007/978-3-031-27156-4_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Targeted therapy and personalized medicine are emerging disciplines of cancer research intended for treatment and prevention. One of the most significant advancements in modern oncology is the shift from an organ-centric strategy to a personalized strategy guided by deep molecular analysis. This shift in view, which focuses on the tumour's precise molecular changes, has paved the way for individualized treatment. Researchers and clinicians are using targeted therapies to select the best treatment available based on the molecular characterization of malignant cancer. In the treatment of cancer, personalized medicine entails the use of genetic, immunological, and proteomic profiling to provide therapeutic alternatives as well as prognostic information about cancer. In this book, targeted therapies and personalized medicine are covered for specific malignancies, including the latest FDA-approved targeted therapies, along with effective anti-cancer regimens and drug resistance. This will help to enhance our ability to conduct individualized health planning, make early diagnoses, and choose optimal medications for each cancer patient with predictable side effects and outcomes in a quickly evolving era. The capacity of various applications and tools for early diagnosis of cancer has improved, and the growing number of clinical trials that choose specific molecular targets reflects this. Nevertheless, there are several limitations that need to be addressed. Hence, in this chapter, we discuss recent advancements, challenges, and opportunities in personalized medicine for various cancers, with a specific emphasis on targeted therapies in diagnostics and therapeutics.
Affiliation(s)
- Rida Fatima Saeed
- Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan.
- Uzma Azeem Awan
- Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan
- Sara Mumtaz
- Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan
- Nosheen Akhtar
- Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan
- Shaista Aslam
- Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan
6
Grobe N, Scheiber J, Zhang H, Garbe C, Wang X. Omics and Artificial Intelligence in Kidney Diseases. ADVANCES IN KIDNEY DISEASE AND HEALTH 2023; 30:47-52. [PMID: 36723282 DOI: 10.1053/j.akdh.2022.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 10/28/2022] [Accepted: 11/16/2022] [Indexed: 01/20/2023]
Abstract
Omics applications in nephrology may have relevance in the future to improve clinical care of kidney disease patients. In the short term, patients will benefit from specific measurement and computational analyses around biomarkers identified at various omics levels. In the mid and long term, these approaches will need to be integrated into a holistic representation of the kidney and all its influencing factors for individualized patient care. Research demonstrates robust data to justify the application of omics for better understanding, risk stratification, and individualized treatment of kidney disease patients. Despite these advances in the research setting, there is still a lack of evidence showing the combination of omics technologies with artificial intelligence and its application in clinical diagnostics and care of patients with kidney disease.
Affiliation(s)
- Christian Garbe
- Frankfurter Innovationszentrum Biotechnologie, Frankfurt am Main, Germany
7
Diakou I, Papakonstantinou E, Papageorgiou L, Pierouli K, Dragoumani K, Spandidos DA, Bacopoulou F, Chrousos GP, Goulielmos GΝ, Eliopoulos E, Vlachakis D. Multiple sclerosis and computational biology (Review). Biomed Rep 2022; 17:96. [PMID: 36382258 PMCID: PMC9634047 DOI: 10.3892/br.2022.1579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Accepted: 09/27/2022] [Indexed: 12/02/2022] Open
Abstract
Multiple sclerosis (MS) is an autoimmune neurodegenerative disease whose prevalence has increased worldwide. The resultant symptoms may be debilitating and can substantially reduce the quality of life of patients. Computational biology, which involves the use of computational tools to answer biomedical questions, may provide the basis for novel healthcare approaches in the context of MS. The rapid accumulation of health data, ever-increasing computational power, and evolving technology have helped to modernize and refine MS research. From the discovery of novel biomarkers to the optimization of treatment and a number of quality-of-life enhancements for patients, computational biology methods and tools are shaping the field of MS diagnosis, management and treatment. The final goal in such a complex disease would be personalized medicine, i.e., providing healthcare services that are tailored to the individual patient, in accordance with the particular biology of their disease and the environmental factors to which they are subjected. The present review article summarizes the current knowledge on MS, modern computational biology and the impact of modern computational approaches on MS.
Affiliation(s)
- Io Diakou
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece
- Eleni Papakonstantinou
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece
- Louis Papageorgiou
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece
- Katerina Pierouli
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece
- Konstantina Dragoumani
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece
- Demetrios A. Spandidos
- Laboratory of Clinical Virology, School of Medicine, University of Crete, 71003 Heraklion, Greece
- Flora Bacopoulou
- University Research Institute of Maternal and Child Health and Precision Medicine, and UNESCO Chair on Adolescent Health Care, National and Kapodistrian University of Athens, ‘Aghia Sophia’ Children's Hospital, 11527 Athens, Greece
- George P. Chrousos
- University Research Institute of Maternal and Child Health and Precision Medicine, and UNESCO Chair on Adolescent Health Care, National and Kapodistrian University of Athens, ‘Aghia Sophia’ Children's Hospital, 11527 Athens, Greece
- Georges Ν. Goulielmos
- Section of Molecular Pathology and Human Genetics, Department of Internal Medicine, School of Medicine, University of Crete, 71003 Heraklion, Greece
- Elias Eliopoulos
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece
- Dimitrios Vlachakis
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, 11855 Athens, Greece
- University Research Institute of Maternal and Child Health and Precision Medicine, and UNESCO Chair on Adolescent Health Care, National and Kapodistrian University of Athens, ‘Aghia Sophia’ Children's Hospital, 11527 Athens, Greece
- Division of Endocrinology and Metabolism, Center of Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of The Academy of Athens, 11527 Athens, Greece
8
Gharipour M, Nezafati P, Sadeghian L, Eftekhari A, Rothenberg I, Jahanfar S. Precision medicine and metabolic syndrome. ARYA ATHEROSCLEROSIS 2022; 18:1-10. [PMID: 36817343 PMCID: PMC9937665 DOI: 10.22122/arya.2022.26215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2021] [Accepted: 10/09/2021] [Indexed: 02/24/2023]
Abstract
Metabolic syndrome (MetS) is one of the most important health issues around the world and a major risk factor for both type 2 diabetes mellitus (T2DM) and cardiovascular diseases. The etiology of MetS is determined by the interaction between genetic and environmental factors. Effective prevention and treatment of MetS notably decreases the risk of its complications such as diabetes, obesity, hypertension, and dyslipidemia. According to recent genome-wide association studies, multiple genes are involved in the incidence and development of MetS. The presence of particular genes which are responsible for obesity and lipid metabolism, affecting insulin sensitivity and blood pressure, as well as genes associated with inflammation, can increase the risk of MetS. These molecular markers, together with clinical data and findings from proteomic, metabolomic, pharmacokinetic, and other methods, would clarify the etiology and pathophysiology of MetS and facilitate the development of personalized approaches to the management of MetS. The application of personalized medicine based on genomically identified susceptibility would help physicians recommend healthier lifestyles and prescribe medications to improve various aspects of health in patients with MetS. In recent years, personalized medicine by genetic testing has helped physicians determine genetic predisposition to MetS, prevent the disease by behavioral, lifestyle-related, or therapeutic interventions, and detect, diagnose, treat, and manage the disease. Clinically, personalized medicine is providing effective strategies for the prevention and treatment of MetS by reducing the time, cost, and failure rate of pharmaceutical clinical trials. It is also eliminating trial-and-error inefficiencies that inflate health care costs and undermine patient care.
Affiliation(s)
- Mojgan Gharipour
- Isfahan Cardiovascular Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Address for correspondence: Mojgan Gharipour; Isfahan Cardiovascular Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Pouya Nezafati
- Cardiac Rehabilitation Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Ladan Sadeghian
- Isfahan Cardiovascular Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Ava Eftekhari
- Hypertension Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Irwin Rothenberg
- Laboratory Quality Advisor/Technical Writer at COLA Resources Inc., Washington, Columbia, USA
- Shayesteh Jahanfar
- Health Sciences Building, Central Michigan University, Mount Pleasant, MI, USA
9
Dobbin KK, McShane LM. Sample size methods for evaluation of predictive biomarkers. Stat Med 2022; 41:3199-3210. [DOI: 10.1002/sim.9412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2021] [Revised: 03/24/2022] [Accepted: 04/02/2022] [Indexed: 11/09/2022]
Affiliation(s)
- Kevin K. Dobbin
- Department of Epidemiology and Biostatistics, University of Georgia, Athens, Georgia, USA
- Lisa M. McShane
- Biometric Research Program, National Cancer Institute, Bethesda, Maryland, USA
10
Polley MYC, Dignam JJ. Statistical Considerations in the Evaluation of Continuous Biomarkers. J Nucl Med 2021; 62:605-611. [PMID: 33579807 DOI: 10.2967/jnumed.120.251520] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/05/2020] [Accepted: 01/19/2021] [Indexed: 01/02/2023] Open
Abstract
Discovery of biomarkers has been steadily increasing over the past decade. Although a plethora of biomarkers has been reported in the biomedical literature, few have been sufficiently validated for broader clinical applications. One particular challenge that may have hindered the adoption of biomarkers into practice is the lack of reproducible biomarker cut points. In this article, we attempt to identify some common statistical issues related to biomarker cut point identification and provide guidance on proper evaluation, interpretation, and validation of such cut points. First, we illustrate how discretization of a continuous biomarker using sample percentiles results in significant information loss and should be avoided. Second, we review the popular "minimal-P-value" approach for cut point identification and show that this method results in highly unstable P values and unduly increases the chance of significant findings when the biomarker is not associated with outcome. Third, we critically review a common analysis strategy by which the selected biomarker cut point is used to categorize patients into different risk categories and then the difference in survival curves among these risk groups in the same dataset is claimed as the evidence supporting the biomarker's prognostic strength. We show that this method yields an exaggerated P value and overestimates the prognostic impact of the biomarker. We illustrate that the degree of the optimistic bias increases with the number of variables being considered in a risk model. Finally, we discuss methods to appropriately ascertain the additional prognostic contribution of the new biomarker in disease settings where standard prognostic factors already exist. Throughout the article, we use real examples in oncology to highlight relevant methodologic issues, and when appropriate, we use simulations to illustrate more abstract statistical concepts.
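The inflation caused by the "minimal-P-value" approach can be reproduced with a short simulation. The following is a minimal sketch, not the authors' analysis: it assumes a binary outcome rather than survival data, a pooled two-proportion z-test, a biomarker that is unrelated to outcome by construction, and 17 candidate cut points between the 10th and 90th sample percentiles. All names and settings are hypothetical.

```python
import math
import random


def two_prop_p(k1, n1, k2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p = (k1 + k2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (k1 / n1 - k2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))


def cutpoint_p_values(marker, outcome, lo=0.1, hi=0.9, steps=17):
    """P-values from dichotomizing the marker at evenly spaced sample quantiles."""
    order = sorted(range(len(marker)), key=marker.__getitem__)
    y = [outcome[i] for i in order]          # outcomes sorted by marker value
    n, total = len(y), sum(y)
    p_values = []
    for s in range(steps):
        q = lo + (hi - lo) * s / (steps - 1)
        cut = int(round(q * n))              # size of the "low marker" group
        low_events = sum(y[:cut])
        p_values.append(two_prop_p(low_events, cut, total - low_events, n - cut))
    return p_values


def false_positive_rates(n=200, reps=1000, alpha=0.05, seed=0):
    """Share of null data sets declared significant: (minimal-P, median cut)."""
    rng = random.Random(seed)
    hits_min = hits_median = 0
    for _ in range(reps):
        marker = [rng.gauss(0, 1) for _ in range(n)]
        outcome = [1 if rng.random() < 0.5 else 0 for _ in range(n)]  # no association
        p_values = cutpoint_p_values(marker, outcome)
        hits_min += min(p_values) < alpha
        hits_median += p_values[len(p_values) // 2] < alpha  # prespecified median cut
    return hits_min / reps, hits_median / reps


if __name__ == "__main__":
    min_rate, median_rate = false_positive_rates()
    print(f"minimal-P cut point: {min_rate:.3f}, prespecified median: {median_rate:.3f}")
```

With a prespecified cut point the rejection rate stays near the nominal level, while selecting the cut point with the smallest p-value rejects the null far more often, even though no association exists in the simulated data.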
Affiliation(s)
- Mei-Yin C Polley
- Department of Public Health Sciences, University of Chicago, Chicago, Illinois, and NRG Oncology Statistics and Data Management Center, Philadelphia, Pennsylvania
- James J Dignam
- Department of Public Health Sciences, University of Chicago, Chicago, Illinois, and NRG Oncology Statistics and Data Management Center, Philadelphia, Pennsylvania
11
Pires JG, da Silva GF, Weyssow T, Conforte AJ, Pagnoncelli D, da Silva FAB, Carels N. Galaxy and MEAN Stack to Create a User-Friendly Workflow for the Rational Optimization of Cancer Chemotherapy. Front Genet 2021; 12:624259. [PMID: 33679888 PMCID: PMC7935533 DOI: 10.3389/fgene.2021.624259] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/31/2020] [Accepted: 01/22/2021] [Indexed: 12/24/2022] Open
Abstract
One aspect of personalized medicine is aiming at identifying specific targets for therapy considering the gene expression profile of each patient individually. The real-world implementation of this approach is better achieved by user-friendly bioinformatics systems for healthcare professionals. In this report, we present an online platform that endows users with an interface designed using MEAN stack supported by a Galaxy pipeline. This pipeline targets connection hubs in the subnetworks formed by the interactions between the proteins of genes that are up-regulated in tumors. This strategy has been proved to be suitable for the inhibition of tumor growth and metastasis in vitro. Therefore, Perl and Python scripts were enclosed in Galaxy for translating RNA-seq data into protein targets suitable for the chemotherapy of solid tumors. Consequently, we validated the process of target diagnosis by (i) reference to subnetwork entropy, (ii) the critical value of density probability of differential gene expression, and (iii) the inhibition of the most relevant targets according to TCGA and GDC data. Finally, the most relevant targets identified by the pipeline are stored in MongoDB and can be accessed through the aforementioned internet portal designed to be compatible with mobile or small devices through Angular libraries.
Affiliation(s)
- Jorge Guerra Pires
- Plataforma de Modelagem de Sistemas Biológicos, Center for Technology Development in Health (CDTS), Oswaldo Cruz Foundation (FIOCRUZ), Rio de Janeiro, Brazil
- Gilberto Ferreira da Silva
- Plataforma de Modelagem de Sistemas Biológicos, Center for Technology Development in Health (CDTS), Oswaldo Cruz Foundation (FIOCRUZ), Rio de Janeiro, Brazil
- Thomas Weyssow
- Informatic Department, Free University of Brussels (ULB), Brussels, Belgium
- Alessandra Jordano Conforte
- Plataforma de Modelagem de Sistemas Biológicos, Center for Technology Development in Health (CDTS), Oswaldo Cruz Foundation (FIOCRUZ), Rio de Janeiro, Brazil; Laboratório de Modelagem Computacional de Sistemas Biológicos, Scientific Computing Program, FIOCRUZ, Rio de Janeiro, Brazil
- Fabricio Alves Barbosa da Silva
- Laboratório de Modelagem Computacional de Sistemas Biológicos, Scientific Computing Program, FIOCRUZ, Rio de Janeiro, Brazil
- Nicolas Carels
- Plataforma de Modelagem de Sistemas Biológicos, Center for Technology Development in Health (CDTS), Oswaldo Cruz Foundation (FIOCRUZ), Rio de Janeiro, Brazil
12
Identification of robust reference genes for studies of gene expression in FFPE melanoma samples and melanoma cell lines. Melanoma Res 2020; 30:26-38. [PMID: 31567589 PMCID: PMC6940030 DOI: 10.1097/cmr.0000000000000644] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022]
Abstract
There is an urgent need for novel diagnostic melanoma biomarkers that can predict increased risk of metastasis at an early stage. Relative quantification of gene expression is the preferred method for quantitative validation of potential biomarkers. However, this approach relies on robust tissue-specific reference genes. In the melanoma field, this has been an obstacle due to the lack of validated reference genes. Accordingly, we aimed to identify robust reference genes for normalization of gene expression in melanoma. The robustness of 24 candidate reference genes was evaluated across 80 formalin-fixed paraffin-embedded melanomas of different thickness, −/+ ulceration, −/+ reported cases of metastases and of different BRAF mutation status using quantitative real-time PCR. The expression of the same genes and their robustness as normalizers was furthermore evaluated across a number of melanoma cell lines. We show that housekeeping genes like GAPDH do not qualify as stand-alone normalizers of gene expression in melanoma. Instead, we are the first to identify a panel of robust reference genes for normalization of gene expression in melanoma tumors and cultured melanoma cells. We recommend using a geometric mean of the expression of CLTA, MRPL19 and ACTB for normalization of gene expression in melanomas and a geometric mean of the expression of CASC3 and RPS2 for normalization of gene expression in melanoma cell lines. Normalization according to our recommendation will allow for quantitative validation of potential novel melanoma biomarkers by quantitative real-time PCR.
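The recommended normalization can be written down in a few lines. This is a minimal sketch under stated assumptions, not the authors' pipeline: expression values are assumed to be already converted to relative quantities on a linear scale, the gene name `TARGET` and all numbers are made up for illustration, and only the two reference panels come from the abstract.

```python
from statistics import geometric_mean

# Reference panels named in the abstract.
MELANOMA_TISSUE_REFS = ("CLTA", "MRPL19", "ACTB")
MELANOMA_CELL_LINE_REFS = ("CASC3", "RPS2")


def normalize(expression, target, reference_genes):
    """Target quantity divided by the geometric mean of the reference quantities."""
    reference = geometric_mean([expression[gene] for gene in reference_genes])
    return expression[target] / reference


if __name__ == "__main__":
    # Hypothetical relative quantities for one tumor sample (illustrative only).
    sample = {"CLTA": 2.0, "MRPL19": 4.0, "ACTB": 8.0, "TARGET": 16.0}
    print(normalize(sample, "TARGET", MELANOMA_TISSUE_REFS))
```

A geometric rather than arithmetic mean is used so that one highly expressed reference gene does not dominate the normalization factor; on a log scale (for example, Ct values at equal amplification efficiency) it corresponds to a plain average.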
13
Challenges and Opportunities in Clinical Applications of Blood-Based Proteomics in Cancer. Cancers (Basel) 2020; 12:cancers12092428. [PMID: 32867043 PMCID: PMC7564506 DOI: 10.3390/cancers12092428] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 06/19/2020] [Revised: 08/23/2020] [Accepted: 08/25/2020] [Indexed: 12/12/2022] Open
Abstract
Simple Summary: The traditional approach in identifying cancer related protein biomarkers has focused on evaluation of a single peptide/protein in tissue or circulation. At best, this approach has had limited success for clinical applications, since multiple pathological tumor pathways may be involved during initiation or progression of cancer which diminishes the significance of a single candidate protein/peptide. Emerging sensitive proteomic based technologies like liquid chromatography mass spectrometry (LC-MS)-based quantitative proteomics can provide a platform for evaluating serial serum or plasma samples to interrogate secreted products of tumor–host interactions, thereby revealing a more “complete” repertoire of biological variables encompassing heterogeneous tumor biology. However, several challenges need to be met for successful application of serum/plasma based proteomics. These include uniform pre-analyte processing of specimens, sensitive and specific proteomic analytical platforms and adequate attention to study design during discovery phase followed by validation of discovery-level signatures for prognostic, predictive, and diagnostic cancer biomarker applications.
Abstract: Blood is a readily accessible biofluid containing a plethora of important proteins, nucleic acids, and metabolites that can be used as clinical diagnostic tools in diseases, including cancer. Like the on-going efforts for cancer biomarker discovery using the liquid biopsy detection of circulating cell-free and cell-based tumor nucleic acids, the circulatory proteome has been underexplored for clinical cancer biomarker applications. A comprehensive proteome analysis of human serum/plasma with high-quality data and compelling interpretation can potentially provide opportunities for understanding disease mechanisms, although several challenges will have to be met. Serum/plasma proteome biomarkers are present in very low abundance, and there is high complexity involved due to the heterogeneity of cancers, for which there is a compelling need to develop sensitive and specific proteomic technologies and analytical platforms. To date, liquid chromatography mass spectrometry (LC-MS)-based quantitative proteomics has been a dominant analytical workflow to discover new potential cancer biomarkers in serum/plasma. This review will summarize the opportunities of serum proteomics for clinical applications; the challenges in the discovery of novel biomarkers in serum/plasma; and current proteomic strategies in cancer research for the application of serum/plasma proteomics for clinical prognostic, predictive, and diagnostic applications, as well as for monitoring minimal residual disease after treatments. We will highlight some of the recent advances in MS-based proteomics technologies with appropriate sample collection, processing uniformity, study design, and data analysis, focusing on how these integrated workflows can identify novel potential cancer biomarkers for clinical applications.
14
Abstract
The term axial spondyloarthritis (axSpA) encompasses a heterogeneous group of diseases that have variable presentations, extra-articular manifestations and clinical outcomes, and that will respond differently to treatments. The prototypical type of axSpA, ankylosing spondylitis, is thought to be caused by interaction between the genetically primed host immune system and gut microbiota. Currently used biomarkers such as HLA-B27 status, C-reactive protein and erythrocyte sedimentation rate have, at best, moderate diagnostic and predictive value. Improved biomarkers are needed for axSpA to assist with early diagnosis and to better predict treatment responses and long-term outcomes. Advances in a range of 'omics' technologies and statistical approaches, including genomics approaches (such as polygenic risk scores), microbiome profiling and, potentially, transcriptomic, proteomic and metabolomic profiling, are making it possible for more informative biomarker sets to be developed for use in such clinical applications. Future developments in this field will probably involve combinations of biomarkers that require novel statistical approaches to analyse and to produce easy to interpret metrics for clinical application. Large publicly available datasets from well-characterized case-cohort studies that use extensive biological sampling, particularly focusing on early disease and responses to medications, are required to establish successful biomarker discovery and validation programmes.
|
15
|
Gal J, Bailleux C, Chardin D, Pourcher T, Gilhodes J, Jing L, Guigonis JM, Ferrero JM, Milano G, Mograbi B, Brest P, Chateau Y, Humbert O, Chamorey E. Comparison of unsupervised machine-learning methods to identify metabolomic signatures in patients with localized breast cancer. Comput Struct Biotechnol J 2020; 18:1509-1524. [PMID: 32637048 PMCID: PMC7327012 DOI: 10.1016/j.csbj.2020.05.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/11/2020] [Revised: 05/15/2020] [Accepted: 05/16/2020] [Indexed: 02/08/2023] Open
Abstract
Genomics and transcriptomics have led to the widely-used molecular classification of breast cancer (BC). However, heterogeneous biological behaviors persist within breast cancer subtypes. Metabolomics is a rapidly-expanding field of study dedicated to cellular metabolisms affected by the environment. The aim of this study was to compare metabolomic signatures of BC obtained by 5 different unsupervised machine learning (ML) methods. Fifty-two consecutive patients with BC with an indication for adjuvant chemotherapy between 2013 and 2016 were retrospectively included. We performed metabolomic profiling of tumor resection samples using liquid chromatography-mass spectrometry. Here, four hundred and forty-nine identified metabolites were selected for further analysis. Clusters obtained using 5 unsupervised ML methods (PCA k-means, sparse k-means, spectral clustering, SIMLR and k-sparse) were compared in terms of clinical and biological characteristics. With an optimal partitioning parameter k = 3, the five methods identified three prognosis groups of patients (favorable, intermediate, unfavorable) with different clinical and biological profiles. SIMLR and K-sparse methods were the most effective techniques in terms of clustering. In-silico survival analysis revealed a significant difference for 5-year predicted OS between the 3 clusters. Further pathway analysis using the 449 selected metabolites showed significant differences in amino acid and glucose metabolism between BC histologic subtypes. Our results provide proof-of-concept for the use of unsupervised ML metabolomics enabling stratification and personalized management of BC patients. The design of novel computational methods incorporating ML and bioinformatics techniques should make available tools particularly suited to improving the outcome of cancer treatment and reducing cancer-related mortalities.
Affiliation(s)
- Jocelyn Gal
- University Côte d’Azur, Epidemiology and Biostatistics Department, Centre Antoine Lacassagne, Nice F-06189, France
| | - Caroline Bailleux
- University Côte d’Azur, Medical Oncology Department Centre Antoine Lacassagne, Nice F-06189, France
| | - David Chardin
- University Côte d’Azur, Nuclear Medicine Department, Centre Antoine Lacassagne, Nice F-06189, France
- University Côte d’Azur, Commissariat à l’Energie Atomique, Institut de Biosciences et Biotechnologies d'Aix-Marseille, Laboratory Transporters in Imaging and Radiotherapy in Oncology, Faculty of Medicine, Nice F-06100, France
| | - Thierry Pourcher
- University Côte d’Azur, Commissariat à l’Energie Atomique, Institut de Biosciences et Biotechnologies d'Aix-Marseille, Laboratory Transporters in Imaging and Radiotherapy in Oncology, Faculty of Medicine, Nice F-06100, France
| | - Julia Gilhodes
- Department of Biostatistics, Institut Claudius Regaud, IUCT-O Toulouse, France
| | - Lun Jing
- University Côte d’Azur, Commissariat à l’Energie Atomique, Institut de Biosciences et Biotechnologies d'Aix-Marseille, Laboratory Transporters in Imaging and Radiotherapy in Oncology, Faculty of Medicine, Nice F-06100, France
| | - Jean-Marie Guigonis
- University Côte d’Azur, Commissariat à l’Energie Atomique, Institut de Biosciences et Biotechnologies d'Aix-Marseille, Laboratory Transporters in Imaging and Radiotherapy in Oncology, Faculty of Medicine, Nice F-06100, France
| | - Jean-Marc Ferrero
- University Côte d’Azur, Medical Oncology Department Centre Antoine Lacassagne, Nice F-06189, France
| | - Gerard Milano
- University Côte d’Azur, Centre Antoine Lacassagne, Oncopharmacology Unit, Nice F-06189, France
| | - Baharia Mograbi
- University Côte d’Azur, CNRS UMR7284, INSERM U1081, IRCAN TEAM4 Centre Antoine Lacassagne FHU-Oncoage, Nice F-06189, France
| | - Patrick Brest
- University Côte d’Azur, CNRS UMR7284, INSERM U1081, IRCAN TEAM4 Centre Antoine Lacassagne FHU-Oncoage, Nice F-06189, France
| | - Yann Chateau
- University Côte d’Azur, Epidemiology and Biostatistics Department, Centre Antoine Lacassagne, Nice F-06189, France
| | - Olivier Humbert
- University Côte d’Azur, Nuclear Medicine Department, Centre Antoine Lacassagne, Nice F-06189, France
- University Côte d’Azur, Commissariat à l’Energie Atomique, Institut de Biosciences et Biotechnologies d'Aix-Marseille, Laboratory Transporters in Imaging and Radiotherapy in Oncology, Faculty of Medicine, Nice F-06100, France
| | - Emmanuel Chamorey
- University Côte d’Azur, Epidemiology and Biostatistics Department, Centre Antoine Lacassagne, Nice F-06189, France
| |
|
16
|
Varela N, Lanas F, Salazar LA, Zambrano T. The Current State of MicroRNAs as Restenosis Biomarkers. Front Genet 2020; 10:1247. [PMID: 31998354 PMCID: PMC6967329 DOI: 10.3389/fgene.2019.01247] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/14/2019] [Accepted: 11/13/2019] [Indexed: 12/21/2022] Open
Abstract
In-stent restenosis corresponds to the diameter reduction of coronary vessels following percutaneous coronary intervention (PCI), an invasive procedure in which a stent is deployed into the coronary arteries, producing profuse neointimal hyperplasia. Why this process occurs is still unclear, which is partly why it remains a clinically significant problem. Consequently, there is a pressing need to identify useful non-invasive biomarkers to differentiate and follow up subjects at risk of developing restenosis; because of their extraordinary stability in several bodily fluids, microRNAs have received extensive research attention for this task. This review depicts the current understanding, diagnostic potential and clinical challenges of microRNA molecules as possible blood-based restenosis biomarkers.
Affiliation(s)
- Nelson Varela
- Laboratory of Chemical Carcinogenesis and Pharmacogenetics, Department of Basic-Clinical Oncology, Faculty of Medicine, Universidad de Chile, Santiago, Chile
| | - Fernando Lanas
- Department of Internal Medicine, Faculty of Medicine, Universidad de La Frontera, Temuco, Chile; Center of Molecular Biology and Pharmacogenetics, Scientific and Technological Bioresource Nucleus, Universidad de La Frontera, Temuco, Chile
| | - Luis A Salazar
- Center of Molecular Biology and Pharmacogenetics, Scientific and Technological Bioresource Nucleus, Universidad de La Frontera, Temuco, Chile
| | - Tomás Zambrano
- Department of Medical Technology, Faculty of Medicine, Universidad de Chile, Santiago, Chile
| |
|
17
|
Sparano J, O'Neill A, Alpaugh K, Wolff AC, Northfelt DW, Dang CT, Sledge GW, Miller KD. Association of Circulating Tumor Cells With Late Recurrence of Estrogen Receptor-Positive Breast Cancer: A Secondary Analysis of a Randomized Clinical Trial. JAMA Oncol 2019; 4:1700-1706. [PMID: 30054636 DOI: 10.1001/jamaoncol.2018.2574] [Citation(s) in RCA: 132] [Impact Index Per Article: 26.4] [Indexed: 12/12/2022]
Abstract
Importance Late recurrence 5 or more years after diagnosis accounts for at least one-half of all cases of recurrent hormone receptor-positive breast cancer. Objective To determine whether the presence of circulating tumor cells (CTCs) in a peripheral blood sample obtained approximately 5 years after diagnosis was associated with late clinical recurrence of operable human epidermal growth factor receptor 2-negative breast cancer. Design, Setting, and Participants This per-protocol secondary analysis of the Double-Blind Phase III Trial of Doxorubicin and Cyclophosphamide Followed by Paclitaxel With Bevacizumab or Placebo in Patients With Lymph Node Positive and High Risk Lymph Node Negative Breast Cancer enrolled patients from 2007 to 2011 who were without clinical evidence of recurrence between 4.5 and 7.5 years after primary surgical treatment of human epidermal growth factor receptor 2-negative stage II-III breast cancer followed by adjuvant systemic therapy. Patients were enrolled in a subprotocol for secondary analysis from February 25, 2013, to July 29, 2016, after signing consent for the subprotocol. The analysis was performed in April 2018. Interventions A blood sample was obtained for identification and enumeration of CTCs. Main Outcome and Measures The association between a positive CTC assay result (at least 1 CTC per 7.5 mL of blood) and clinical recurrence. Results Among 547 women included in this analysis, the results of the CTC assay were positive for 18 of 353 with hormone receptor-positive disease (5.1% [95% CI, 3.0%-7.9%]); 23 of 353 patients (6.5% [95% CI, 4.2%-9.6%]) had a clinical recurrence. The recurrence rates per person-year of follow-up in the CTC-positive and CTC-negative groups were 21.4% (7 recurrences per 32.7 person-years) and 2.0% (16 recurrences per 796.3 person-years), respectively. 
In multivariate models including clinical covariates, a positive CTC assay result was associated with a 13.1-fold higher risk of recurrence (hazard ratio point estimate, 13.1; 95% CI, 4.7-36.3). Seven of 23 patients (30.4% [95% CI, 13.2%-52.9%]) with recurrence had a positive CTC assay result at a median of 2.8 years (range, 0.1-2.8 years) before clinical recurrence. The CTC assay result was also positive for 8 of 193 patients (4.1% [95% CI, 1.8%-8.0%]) with hormone receptor-negative disease, although only 1 patient (0.5% [95% CI, 0%-2.9%]) experienced disease recurrence (this patient was CTC negative). Conclusions and Relevance A single positive CTC assay result 5 years after diagnosis of hormone receptor-positive breast cancer provided independent prognostic information for late clinical recurrence, which provides proof of concept that liquid-based biomarkers may be used to risk stratify for late recurrence and guide therapy. Trial Registration ClinicalTrials.gov identifier: NCT00433511.
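The person-year rates quoted in this abstract can be re-derived from the counts it reports. The sketch below is only an arithmetic check; the function name is illustrative and not taken from the paper. It also shows that the crude rate ratio is not the same quantity as the covariate-adjusted hazard ratio of 13.1.

```python
def rate_per_person_year(events, person_years):
    """Crude event rate: number of events divided by total follow-up time."""
    return events / person_years


# Counts as reported in the abstract (hormone receptor-positive disease).
ctc_positive_rate = rate_per_person_year(7, 32.7)
ctc_negative_rate = rate_per_person_year(16, 796.3)

print(f"CTC-positive: {ctc_positive_rate:.1%} per person-year")
print(f"CTC-negative: {ctc_negative_rate:.1%} per person-year")

# Unadjusted ratio of the two crude rates. The 13.1 in the abstract is a
# hazard ratio from multivariate models, so the two need not agree.
crude_rate_ratio = ctc_positive_rate / ctc_negative_rate
print(f"Crude rate ratio: {crude_rate_ratio:.1f}")
```

The crude ratio comes out somewhat below the adjusted estimate, which is expected when adjustment for clinical covariates changes the comparison.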
Affiliation(s)
- Joseph Sparano
- Department of Oncology, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York
| | - Anne O'Neill
- Department of Biostatistics & Computational Biology, Dana Farber Cancer Institute, Boston, Massachusetts
| | - Katherine Alpaugh
- Department of Medical Oncology, Fox Chase Cancer Center, Philadelphia, Pennsylvania
| | - Antonio C Wolff
- Department of Oncology, Johns Hopkins University Sidney Kimmel Comprehensive Cancer Center, Baltimore, Maryland
| | - Donald W Northfelt
- Department of Internal Medicine, Division of Hematology/Oncology, Mayo Clinic, Scottsdale, Arizona
| | - Chau T Dang
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York
| | - George W Sledge
- Department of Medicine, Division of Oncology, Stanford Cancer Center, Palo Alto, California
| | - Kathy D Miller
- Department of Medicine, Division of Hematology/Oncology, Indiana University Melvin and Bren Simon Cancer Center, Indianapolis
| |
|
18
|
Krzyszczyk P, Acevedo A, Davidoff EJ, Timmins LM, Marrero-Berrios I, Patel M, White C, Lowe C, Sherba JJ, Hartmanshenn C, O'Neill KM, Balter ML, Fritz ZR, Androulakis IP, Schloss RS, Yarmush ML. The growing role of precision and personalized medicine for cancer treatment. TECHNOLOGY 2018; 6:79-100. [PMID: 30713991 PMCID: PMC6352312 DOI: 10.1142/s2339547818300020] [Citation(s) in RCA: 196] [Impact Index Per Article: 32.7] [Indexed: 05/05/2023]
Abstract
Cancer is a devastating disease that takes the lives of hundreds of thousands of people every year. Due to disease heterogeneity, standard treatments, such as chemotherapy or radiation, are effective in only a subset of the patient population. Tumors can have different underlying genetic causes and may express different proteins in one patient versus another. This inherent variability of cancer lends itself to the growing field of precision and personalized medicine (PPM). There are many ongoing efforts to acquire PPM data in order to characterize molecular differences between tumors. Some PPM products are already available to link these differences to an effective drug. It is clear that PPM cancer treatments can result in immense patient benefits, and companies and regulatory agencies have begun to recognize this. However, broader changes to the healthcare and insurance systems must be addressed if PPM is to become part of standard cancer care.
Affiliation(s)
- Paulina Krzyszczyk
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Alison Acevedo
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Erika J Davidoff
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Lauren M Timmins
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Ileana Marrero-Berrios
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Misaal Patel
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Corina White
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Christopher Lowe
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Joseph J Sherba
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Clara Hartmanshenn
- Department of Chemical & Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, NJ 08854, USA
| | - Kate M O'Neill
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Max L Balter
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Zachary R Fritz
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Ioannis P Androulakis
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Department of Chemical & Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, NJ 08854, USA
| | - Rene S Schloss
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
| | - Martin L Yarmush
- Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Department of Chemical & Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, NJ 08854, USA
| |
|
19
|
Cuzick J. Prognosis vs Treatment Interaction. JNCI Cancer Spectr 2018; 2:pky006. [PMID: 31360838 PMCID: PMC6649762 DOI: 10.1093/jncics/pky006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/02/2018] [Revised: 02/12/2018] [Accepted: 02/21/2018] [Indexed: 11/18/2022] Open
Abstract
There is a somewhat confused belief that a biomarker must show an interaction effect with a treatment before it can be used to determine the need for such a treatment. This is rarely true for well-established clinical markers such as tumor size or regional lymph node involvement. In many cases, this is also not true for biomarkers, especially when considering nontargeted therapies. Here I argue that for nontargeted treatments prognosis is often more important than interaction with treatment, because it is the absolute and not the relative benefit that matters, and when there is no treatment interaction, the same relative benefit translates into a larger absolute benefit for poor prognosis patients.
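The closing argument can be illustrated with a small worked calculation. The baseline risks and the relative risk below are hypothetical numbers chosen for illustration, not values from the article; the point is only that one constant relative benefit yields different absolute benefits.

```python
def absolute_risk_reduction(baseline_risk, relative_risk):
    """Absolute benefit when treatment multiplies the baseline risk by relative_risk."""
    return baseline_risk * (1.0 - relative_risk)


# Hypothetical inputs: the same relative risk (0.70) in both prognostic groups,
# i.e. no treatment-by-marker interaction on the relative scale.
RELATIVE_RISK = 0.70
good_prognosis_arr = absolute_risk_reduction(0.10, RELATIVE_RISK)
poor_prognosis_arr = absolute_risk_reduction(0.40, RELATIVE_RISK)

for label, arr in [("good prognosis", good_prognosis_arr),
                   ("poor prognosis", poor_prognosis_arr)]:
    # Number needed to treat is the reciprocal of the absolute risk reduction.
    print(f"{label}: ARR = {arr:.2f}, NNT = {1.0 / arr:.1f}")
```

Under these assumed numbers the poor-prognosis group gains four times the absolute benefit, which is why a purely prognostic marker can still inform who needs a nontargeted treatment.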
Affiliation(s)
- Jack Cuzick
- Centre for Cancer Prevention, Wolfson Institute of Preventive Medicine, Queen Mary University of London, Charterhouse Square, London, EC1M 6BQ, UK
| |
|
20
|
CD8+ T cell infiltration in breast and colon cancer: A histologic and statistical analysis. PLoS One 2018; 13:e0190158. [PMID: 29320521 PMCID: PMC5761898 DOI: 10.1371/journal.pone.0190158] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 06/13/2017] [Accepted: 12/08/2017] [Indexed: 12/27/2022] Open
Abstract
The prevalence of cytotoxic tumor infiltrating lymphocytes (TILs) has demonstrated prognostic value in multiple tumor types. In particular, CD8 counts (in combination with CD3 and CD45RO) have been shown to be superior to traditional UICC staging in colon cancer patients and higher total CD8 counts have been associated with better survival in breast cancer patients. However, immune infiltrate heterogeneity can lead to potentially significant misrepresentations of marker prevalence in routine histologic sections. We examined step sections of breast and colorectal cancer samples for CD8+ T cell prevalence by standard chromogenic immunohistochemistry to determine marker variability and inform practice of T cell biomarker assessment in formalin-fixed, paraffin-embedded (FFPE) tissue samples. Stained sections were digitally imaged and CD8+ lymphocytes within defined regions of interest (ROI) including the tumor and surrounding stroma were enumerated. Statistical analyses of CD8+ cell count variability using a linear model/ANOVA framework between patients as well as between levels within a patient sample were performed. Our results show that CD8+ T-cell distribution is highly homogeneous within a standard tissue sample in both colorectal and breast carcinomas. As such, cytotoxic T cell prevalence by immunohistochemistry on a single level or even from a subsample of biopsy fragments taken from that level can be considered representative of cytotoxic T cell infiltration for the entire tumor section within the block. These findings support the technical validity of biomarker strategies relying on CD8 immunohistochemistry.
|
21
|
Kang T, Ding W, Zhang L, Ziemek D, Zarringhalam K. A biological network-based regularized artificial neural network model for robust phenotype prediction from gene expression data. BMC Bioinformatics 2017; 18:565. [PMID: 29258445 PMCID: PMC5735940 DOI: 10.1186/s12859-017-1984-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 07/25/2017] [Accepted: 12/05/2017] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Stratification of patient subpopulations that respond favorably to treatment or experience an adverse reaction is an essential step toward development of new personalized therapies and diagnostics. It is currently feasible to generate omic-scale biological measurements for all patients in a study, providing an opportunity for machine learning models to identify molecular markers for disease diagnosis and progression. However, the high variability of genetic background in human populations hampers the reproducibility of omic-scale markers. In this paper, we develop a biological network-based regularized artificial neural network model for prediction of phenotype from transcriptomic measurements in clinical trials. To improve model sparsity and the overall reproducibility of the model, we incorporate regularization for simultaneous shrinkage of gene sets based on active upstream regulatory mechanisms into the model. RESULTS We benchmark our method against various regression, support vector machine and artificial neural network models and demonstrate the ability of our method to predict clinical outcomes using clinical trial data on acute rejection in kidney transplantation and response to Infliximab in ulcerative colitis. We show that integration of prior biological knowledge into the classification, as developed in this paper, significantly improves the robustness and generalizability of predictions to independent datasets. We provide Java code for our algorithm along with a parsed version of the STRING DB database. CONCLUSION In summary, we present a method for prediction of clinical phenotypes using baseline genome-wide expression data that makes use of prior biological knowledge on gene-regulatory interactions in order to increase robustness and reproducibility of omic-scale markers. 
The integrated group-wise regularization method increases the interpretability of biological signatures and gives stable performance estimates across independent test sets.
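The abstract does not state the exact penalty used, so the following is only a generic sketch of group-wise shrinkage in the style of a group lasso, written in Python rather than the authors' Java. The gene names, the regulator groups and the square-root group-size weighting are all assumptions made for illustration.

```python
import math


def group_lasso_penalty(weights, groups, lam=1.0):
    """Sum over groups of sqrt(group size) times the L2 norm of that group's weights.

    weights: dict mapping gene name -> model weight (hypothetical names).
    groups:  dict mapping regulator name -> list of gene names it regulates.
    """
    total = 0.0
    for members in groups.values():
        l2_norm = math.sqrt(sum(weights[gene] ** 2 for gene in members))
        total += math.sqrt(len(members)) * l2_norm
    return lam * total


# Hypothetical example: two upstream regulators, each controlling two genes.
weights = {"g1": 3.0, "g2": 4.0, "g3": 0.0, "g4": 0.0}
groups = {"regulator_A": ["g1", "g2"], "regulator_B": ["g3", "g4"]}

penalty = group_lasso_penalty(weights, groups)
print(f"penalty = {penalty:.4f}")
```

Because the norm is taken per group, a group whose weights are all zero contributes nothing, so minimizing such a penalty tends to switch whole gene sets on or off together rather than individual genes.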
Affiliation(s)
- Tianyu Kang
- Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, 02125 MA USA
| | - Wei Ding
- Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, 02125 MA USA
| | - Luoyan Zhang
- Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, 02125 MA USA
| | - Daniel Ziemek
- Inflammation and Immunology, Pfizer Worldwide Research & Development, Berlin, Germany
| | - Kourosh Zarringhalam
- Department of Mathematics, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, 02125 MA USA
| |
|
22
|
Robles AI, Harris CC. Integration of multiple "OMIC" biomarkers: A precision medicine strategy for lung cancer. Lung Cancer 2017; 107:50-58. [PMID: 27344275 PMCID: PMC5156586 DOI: 10.1016/j.lungcan.2016.06.003] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 03/09/2016] [Revised: 06/07/2016] [Accepted: 06/10/2016] [Indexed: 12/17/2022]
Abstract
More than half of all new lung cancer diagnoses are made in patients with locally advanced or metastatic disease, at which point therapeutic options are scarce. It is anticipated, however, that the widespread use of Low-Dose Computed Tomography (LDCT) screening, will lead to a greater proportion of lung cancers being diagnosed at an early, operable, stage. Still, the overall rate of recurrence for surgically treated Stage I lung cancer patients is up to 30% within 5 years of diagnosis. Thus, the identification and clinical application of biomarkers of early stage lung cancer are a pressing medical need. The integrative analysis of "omic," clinical and epidemiological data for single patients is a core principle of precision medicine. Through rigorous bioinformatics and statistical analyses we have identified biomarkers of early-stage lung cancer based on DNA methylation, expression of mRNA and miRNA, inflammatory cytokines, and urinary metabolites. Beyond a more comprehensive understanding of the molecular taxonomy of lung cancer, these biomarkers can have very practical implications in the context of unmet clinical needs of early stage lung cancer patients: First, current guidelines for LDCT screening broadly include individuals based on age and history of heavy smoking. Tumor-derived circulating biomarkers in the blood and urine associated with lung cancer risk could narrow and prioritize individuals for LDCT screening. Second, a high number of nodules are identified by LDCT, of which fewer than 5% are finally diagnosed as lung cancer. Biomarkers may help discriminate malignant nodules from benign or indolent lesions. Third, the expected rise in the numbers of lung cancer patients diagnosed at an early stage will necessitate new treatment options. 
Circulating, urinary and tissue-based biomarkers that molecularly categorize Stage I patients after tumor resection can help identify high-risk patients who may benefit from adjuvant chemotherapy or innovative immunotherapy regimens.
Affiliation(s)
- Ana I Robles
- Laboratory of Human Carcinogenesis, National Cancer Institute, NIH, Bethesda, MD 20892, USA.
| | - Curtis C Harris
- Laboratory of Human Carcinogenesis, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| |
|
23
|
Quezada H, Guzmán-Ortiz AL, Díaz-Sánchez H, Valle-Rios R, Aguirre-Hernández J. Omics-based biomarkers: current status and potential use in the clinic. ACTA ACUST UNITED AC 2017. [DOI: 10.1016/j.bmhime.2017.11.030] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/20/2022]
|
24
|
Omics-based biomarkers: current status and potential use in the clinic. BOLETIN MEDICO DEL HOSPITAL INFANTIL DE MEXICO 2017; 74:219-226. [DOI: 10.1016/j.bmhimx.2017.03.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 03/01/2017] [Accepted: 03/17/2017] [Indexed: 12/20/2022] Open
|
25
|
Delmar P, Irl C, Tian L. Innovative methods for the identification of predictive biomarker signatures in oncology: Application to bevacizumab. Contemp Clin Trials Commun 2017; 5:107-115. [PMID: 29740627 PMCID: PMC5936698 DOI: 10.1016/j.conctc.2017.01.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/18/2015] [Revised: 12/06/2016] [Accepted: 01/17/2017] [Indexed: 11/26/2022] Open
Abstract
Current methods for subgroup analyses of data collected from randomized clinical trials (RCTs) may lead to false-positives from multiple testing, lack power to detect moderate but clinically meaningful differences, or be too simplistic in characterizing patients who may benefit from treatment. Herein, we present a general procedure based on a set of newly developed statistical methods for the identification and evaluation of complex multivariate predictors of treatment effect. Furthermore, we implemented this procedure to identify a subgroup of patients who may receive the largest benefit from bevacizumab treatment using a panel of 10 biomarkers measured at baseline in patients enrolled on two RCTs investigating bevacizumab in metastatic breast cancer. Data were collected from patients with human epidermal growth factor receptor 2 (HER2)-negative (AVADO) and HER2-positive (AVEREL) metastatic breast cancer. We first developed a classification rule based on an estimated individual scoring system, using data from the AVADO study only. The classification rule takes into consideration a panel of biomarkers, including vascular endothelial growth factor (VEGF)-A. We then classified the patients in the independent AVEREL study into patient groups according to “promising” or “not-promising” treatment benefit based on this rule and conducted a statistical analysis within these subgroups to compute point estimates, confidence intervals, and p-values for treatment effect and its interaction. In the group with promising treatment benefit in the AVEREL study, the estimated hazard ratio of bevacizumab versus placebo for progression-free survival was 0.687 (95% confidence interval [CI]: 0.462–1.024, p = 0.065), while in the not-promising group the hazard ratio (HR) was 1.152 (95% CI: 0.526–2.524, p = 0.723). 
When the median level of VEGF-A from the AVEREL study was used to divide the study population instead, the HR was 0.711 (95% CI: 0.435–1.163, p = 0.174) in the promising group and 0.828 (95% CI: 0.496–1.380, p = 0.468) in the not-promising group. Similar results were obtained with the median VEGF-A levels from the AVADO study (“promising” group: HR = 0.709, 95% CI: 0.444–1.133, p = 0.151; “not-promising” group: HR = 0.851, 95% CI: 0.497–1.458, p = 0.556). Our analysis shows it is feasible to employ statistical methods for empirically constructing and validating a scoring system based on a panel of biomarkers. This scoring system can be used to estimate the treatment effect for individual patients and identify a subgroup of patients who may benefit from treatment. The proposed procedure can provide a general framework to organize many statistical methods (existing or to be developed) into a coherent set of analyses for the development of personalized medicines and has the potential of broad applications.
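The hazard ratios, confidence intervals and p-values quoted in this abstract can be checked against each other. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is a common convention but is not stated in the abstract, and the function name is illustrative.

```python
import math


def p_value_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value implied by a hazard ratio and its 95% CI.

    Assumes the interval is exp(log(hr) +/- z_crit * se), i.e. a Wald interval.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Score-based subgroups in AVEREL, values as reported in the abstract.
p_promising = p_value_from_hr_ci(0.687, 0.462, 1.024)
p_not_promising = p_value_from_hr_ci(1.152, 0.526, 2.524)

print(f"promising group:     p ~ {p_promising:.3f} (reported 0.065)")
print(f"not-promising group: p ~ {p_not_promising:.3f} (reported 0.723)")
```

Both recomputed values land close to the reported ones, with small differences attributable to rounding of the published estimates. In either subgroup the interval includes 1, so neither effect reaches conventional significance on its own.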
Affiliation(s)
- Paul Delmar
- Department of Biostatistics, F. Hoffmann-La Roche Ltd., Basel, Switzerland
| | - Cornelia Irl
- Department of Biostatistics, Genentech Inc., South San Francisco, CA, USA
| | - Lu Tian
- Department of Biomedical Data Science, Stanford University School of Medicine, Palo Alto, CA, USA
| |
|
26
|
Rankin NJ, Preiss D, Welsh P, Sattar N. Applying metabolomics to cardiometabolic intervention studies and trials: past experiences and a roadmap for the future. Int J Epidemiol 2016; 45:1351-1371. [PMID: 27789671 PMCID: PMC5100629 DOI: 10.1093/ije/dyw271] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Accepted: 09/01/2016] [Indexed: 12/22/2022] Open
Abstract
Metabolomics and lipidomics are emerging methods for detailed phenotyping of small molecules in samples. It is hoped that such data will: (i) enhance baseline prediction of patient response to pharmacotherapies (beneficial or adverse); (ii) reveal changes in metabolites shortly after initiation of therapy that may predict patient response, including adverse effects, before routine biomarkers are altered; and( iii) give new insights into mechanisms of drug action, particularly where the results of a trial of a new agent were unexpected, and thus help future drug development. In these ways, metabolomics could enhance research findings from intervention studies. This narrative review provides an overview of metabolomics and lipidomics in early clinical intervention studies for investigation of mechanisms of drug action and prediction of drug response (both desired and undesired). We highlight early examples from drug intervention studies associated with cardiometabolic disease. Despite the strengths of such studies, particularly the use of state-of-the-art technologies and advanced statistical methods, currently published studies in the metabolomics arena are largely underpowered and should be considered as hypothesis-generating. In order for metabolomics to meaningfully improve stratified medicine approaches to patient treatment, there is a need for higher quality studies, with better exploitation of biobanks from randomized clinical trials i.e. with large sample size, adjudicated outcomes, standardized procedures, validation cohorts, comparison witth routine biochemistry and both active and control/placebo arms. On the basis of this review, and based on our research experience using clinically established biomarkers, we propose steps to more speedily advance this area of research towards potential clinical impact.
Affiliation(s)
- Naomi J Rankin
- BHF Glasgow Cardiovascular Research Centre
- Glasgow Polyomics, University of Glasgow, Glasgow, UK
| | - David Preiss
- Clinical Trials Service Unit and Epidemiological Studies Unit, University of Oxford, Oxford, UK
| | - Paul Welsh
- BHF Glasgow Cardiovascular Research Centre
| | | |
|
27
|
Korenkova V, Slyskova J, Novosadova V, Pizzamiglio S, Langerova L, Bjorkman J, Vycital O, Liska V, Levy M, Veskrna K, Vodicka P, Vodickova L, Kubista M, Verderio P. The focus on sample quality: Influence of colon tissue collection on reliability of qPCR data. Sci Rep 2016; 6:29023. [PMID: 27383461 PMCID: PMC4935944 DOI: 10.1038/srep29023] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/27/2016] [Accepted: 06/14/2016] [Indexed: 01/12/2023] Open
Abstract
Successful molecular analyses of human solid tissues require intact biological material with well-preserved nucleic acids, proteins, and other cell structures. Pre-analytical handling, comprising the collection of material at the operating theatre, is among the first critical steps that influence sample quality. The aim of this study was to compare the experimental outcomes obtained from samples collected and stored by the conventional means of snap freezing and by the PAXgene Tissue System (Qiagen). These approaches were evaluated by measuring rRNA and mRNA integrity of the samples (RNA Quality Indicator and Differential Amplification Method) and by gene expression profiling. The collection procedures of the biological material were implemented in two hospitals during colon cancer surgery in order to identify the impact of the collection method on the experimental outcome. Our study shows that pre-analytical sample handling has a significant effect on the quality of RNA and on the variability of qPCR data. The PAXgene collection mode proved easier to implement in the operating room; moreover, the quality of RNA obtained from human colon tissues by this method is superior to that obtained by snap freezing.
Affiliation(s)
- Vlasta Korenkova
- Institute of Biotechnology, BIOCEV Centre, Czech Academy of Sciences, Průmyslová 595, 252 42, Vestec u Prahy, Czech Republic
| | - Jana Slyskova
- Institute of Experimental Medicine, Czech Academy of Sciences, Prague, Czech Republic
| | - Vendula Novosadova
- Institute of Biotechnology, BIOCEV Centre, Czech Academy of Sciences, Průmyslová 595, 252 42, Vestec u Prahy, Czech Republic
| | - Sara Pizzamiglio
- Unit of Medical Statistics, Biometry and Bioinformatics, Fondazione Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Istituto Nazionale dei Tumori, Milan, Italy
| | - Lucie Langerova
- Institute of Biotechnology, BIOCEV Centre, Czech Academy of Sciences, Průmyslová 595, 252 42, Vestec u Prahy, Czech Republic
| | | | - Ondrej Vycital
- Department of Surgery, Teaching Hospital and Medical School Pilsen, Charles University in Prague, Pilsen, Czech Republic; Biomedical Centre, Medical School Pilsen, Charles University in Prague, Pilsen, Czech Republic
| | - Vaclav Liska
- Department of Surgery, Teaching Hospital and Medical School Pilsen, Charles University in Prague, Pilsen, Czech Republic; Biomedical Centre, Medical School Pilsen, Charles University in Prague, Pilsen, Czech Republic
| | - Miroslav Levy
- Surgical Department, Thomayer Hospital, First Faculty of Medicine, Charles University in Prague, Prague, Czech Republic
| | - Karel Veskrna
- Surgical Department, Thomayer Hospital, First Faculty of Medicine, Charles University in Prague, Prague, Czech Republic
| | - Pavel Vodicka
- Institute of Experimental Medicine, Czech Academy of Sciences, Prague, Czech Republic; Biomedical Centre, Medical School Pilsen, Charles University in Prague, Pilsen, Czech Republic; Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University in Prague, Prague, Czech Republic
| | - Ludmila Vodickova
- Institute of Experimental Medicine, Czech Academy of Sciences, Prague, Czech Republic; Biomedical Centre, Medical School Pilsen, Charles University in Prague, Pilsen, Czech Republic; Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University in Prague, Prague, Czech Republic
| | - Mikael Kubista
- Institute of Biotechnology, BIOCEV Centre, Czech Academy of Sciences, Průmyslová 595, 252 42, Vestec u Prahy, Czech Republic; TATAA Biocenter AB, Göteborg, Sweden
| | - Paolo Verderio
- Unit of Medical Statistics, Biometry and Bioinformatics, Fondazione Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Istituto Nazionale dei Tumori, Milan, Italy
| |
|
28
|
Lerner SP, Bajorin DF, Dinney CP, Efstathiou JA, Groshen S, Hahn NM, Hansel D, Kwiatkowski D, O’Donnell M, Rosenberg J, Svatek R, Abrams JS, Al-Ahmadie H, Apolo AB, Bellmunt J, Callahan M, Cha EK, Drake C, Jarow J, Kamat A, Kim W, Knowles M, Mann B, Marchionni L, McConkey D, McShane L, Ramirez N, Sharabi A, Sharpe AH, Solit D, Tangen CM, Amiri AT, Van Allen E, West PJ, Witjes JA, Quale DZ. Summary and Recommendations from the National Cancer Institute's Clinical Trials Planning Meeting on Novel Therapeutics for Non-Muscle Invasive Bladder Cancer. Bladder Cancer 2016; 2:165-202. [PMID: 27376138 PMCID: PMC4927845 DOI: 10.3233/blc-160053] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 02/06/2023]
Abstract
The NCI Bladder Cancer Task Force convened a Clinical Trials Planning Meeting (CTPM) Workshop focused on Novel Therapeutics for Non-Muscle Invasive Bladder Cancer (NMIBC). Meeting attendees comprised a broad, multi-disciplinary group of clinical and research stakeholders, including leaders from NCI, FDA, the National Clinical Trials Network (NCTN), advocacy, and the pharmaceutical and biotech industry. The meeting goals and objectives were to: 1) create a collaborative environment in which the greater bladder research community can pursue future optimally designed novel clinical trials focused on the theme of molecular targeted and immune-based therapies in NMIBC; 2) frame the clinical and translational questions that are of highest priority; and 3) develop two clinical trial designs focusing on immunotherapy and molecular targeted therapy. Despite successful development and implementation of large Phase II and Phase III trials in bladder and upper urinary tract cancers, there are no active and accruing trials in the NMIBC space within the NCTN. Disappointingly, there has been only one new FDA-approved drug (Valrubicin) in any bladder cancer disease state since 1998. Although genomic-based data for bladder cancer are increasingly available, translating these discoveries into practice-changing treatment is still to come. Recently, major progress has been made in defining the genomic characteristics of NMIBC. Aligned with these data is the growing number of targeted therapy agents approved and/or in development in other organ site cancers and the multiple similarities of bladder cancer with molecular subtypes in these other cancers. Additionally, although bladder cancer is one of the more immunogenic tumors, some tumors have the ability to attenuate or eliminate host immune responses.
Two trial concepts emerged from the meeting: a window-of-opportunity trial (Phase 0) testing an FGFR3 inhibitor, and a multi-arm, multi-stage trial testing combinations of BCG or radiotherapy and immunomodulatory agents in patients who recur after induction BCG (BCG failure).
Affiliation(s)
| | - Dean F. Bajorin
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Weill Medical College of Cornell University, New York, NY, USA
| | - Colin P. Dinney
- The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | | | - Susan Groshen
- USC Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
| | - Noah M. Hahn
- Johns Hopkins Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA
| | - Donna Hansel
- University of California, La Jolla, San Diego, CA, USA
| | - David Kwiatkowski
- Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
| | | | - Jonathan Rosenberg
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Weill Medical College of Cornell University, New York, NY, USA
| | - Robert Svatek
- UT Health Science Center San Antonio, San Antonio, TX, USA
| | - Jeffrey S. Abrams
- Cancer Therapy Evaluation Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Andrea B. Apolo
- Genitourinary Malignancies Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Joaquim Bellmunt
- Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
| | - Margaret Callahan
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Weill Medical College of Cornell University, New York, NY, USA
| | - Eugene K. Cha
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Charles Drake
- Johns Hopkins Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA
| | - Jonathan Jarow
- Office of Hematology and Oncology Products, U.S. Food and Drug Administration, Silver Spring, MD, USA
| | - Ashish Kamat
- The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - William Kim
- University of North Carolina Lineberger Comprehensive Cancer Center, Chapel Hill, NC, USA
| | - Margaret Knowles
- Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK
| | - Bhupinder Mann
- Cancer Therapy Evaluation Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Luigi Marchionni
- Johns Hopkins Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA
| | - David McConkey
- The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Lisa McShane
- Biometric Research Branch, Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD, USA
| | - Nilsa Ramirez
- The Research Institute at Nationwide Children’s Hospital, Columbus, OH, USA
| | - Andrew Sharabi
- USC Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
- Johns Hopkins Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD, USA
| | - Arlene H. Sharpe
- Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
| | - David Solit
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Weill Medical College of Cornell University, New York, NY, USA
| | - Catherine M. Tangen
- SWOG Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | | | - Eliezer Van Allen
- Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
| | | | - J. A. Witjes
- Department of Urology, Radboud UMC, Nijmegen, The Netherlands
| | | |
|
29
|
Kim S, Lin CW, Tseng GC. MetaKTSP: a meta-analytic top scoring pair method for robust cross-study validation of omics prediction analysis. Bioinformatics 2016; 32:1966-73. [PMID: 27153719 DOI: 10.1093/bioinformatics/btw115] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 02/18/2015] [Accepted: 02/19/2016] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION Supervised machine learning is widely applied to transcriptomic data to predict disease diagnosis, prognosis or survival. Robust and interpretable classifiers with high accuracy are usually favored for their clinical and translational potential. The top scoring pair (TSP) algorithm is an example that applies a simple rank-based algorithm to identify rank-altered gene pairs for classifier construction. Although many classification methods perform well in cross-validation of a single expression profile, performance usually drops substantially in cross-study validation (i.e. the prediction model is established in the training study and applied to an independent test study) for all machine learning methods, including TSP. The failure of cross-study validation has largely diminished the potential translational and clinical value of the models. The purpose of this article is to develop a meta-analytic top scoring pair (MetaKTSP) framework that combines multiple transcriptomic studies and generates a robust prediction model applicable to independent test studies. RESULTS We proposed two frameworks, by averaging TSP scores or by combining P-values from individual studies, to select the top gene pairs for model construction. We applied the proposed methods to simulated data sets and three large-scale real applications in breast cancer, idiopathic pulmonary fibrosis and pan-cancer methylation. The results showed superior cross-study validation accuracy and biomarker selection for the new meta-analytic framework. In conclusion, combining multiple omics data sets in the public domain increases the robustness and accuracy of the classification model, which will ultimately improve disease understanding and clinical treatment decisions to benefit patients. AVAILABILITY AND IMPLEMENTATION An R package, MetaKTSP, is available online (http://tsenglab.biostat.pitt.edu/software.htm).
CONTACT ctseng@pitt.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
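The score-averaging idea in the abstract above can be illustrated with a minimal Python sketch. It is not the MetaKTSP R package and does not reproduce its API: the function names (`tsp_scores`, `meta_top_pair`, `predict`) are hypothetical, only the single best pair is selected rather than the top K, and averaging signed scores (so that pairs whose direction disagrees between studies cancel out) is an assumption made for illustration.

```python
import numpy as np

def tsp_scores(X, y):
    """Signed top-scoring-pair scores for one study.

    X: (n_samples, n_genes) expression matrix; y: binary labels (0/1).
    Entry [i, j] is P(X_i < X_j | y=1) - P(X_i < X_j | y=0).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # less[s, i, j] is True when gene i is below gene j in sample s
    less = X[:, :, None] < X[:, None, :]
    p1 = less[y == 1].mean(axis=0)
    p0 = less[y == 0].mean(axis=0)
    return p1 - p0

def meta_top_pair(studies):
    """Average signed scores over studies (assumed rule) and pick the best pair."""
    mean_score = np.mean([tsp_scores(X, y) for X, y in studies], axis=0)
    i, j = np.unravel_index(np.argmax(np.abs(mean_score)), mean_score.shape)
    return (int(i), int(j)), float(mean_score[i, j])

def predict(X, pair, score):
    """Predict class 1 when the ordering favoured by class 1 is observed."""
    i, j = pair
    X = np.asarray(X, dtype=float)
    below = X[:, i] < X[:, j]
    return np.where(below, 1, 0) if score > 0 else np.where(below, 0, 1)

# Two toy studies (made-up numbers): gene 0 sits below gene 1 only in class 1.
study_a = (np.array([[1, 2, 5], [1, 3, 0], [4, 2, 5], [5, 1, 0]]), np.array([1, 1, 0, 0]))
study_b = (np.array([[0, 9, 3], [2, 8, 1], [7, 1, 3], [6, 0, 1]]), np.array([1, 1, 0, 0]))
pair, score = meta_top_pair([study_a, study_b])
labels = predict(np.array([[1, 2, 0], [3, 1, 0]]), pair, score)
```

Because the classifier depends only on the within-sample ordering of two genes, it is unaffected by study-specific monotone scaling, which is one reason rank-based pairs are attractive for cross-study use.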
Affiliation(s)
- SungHwan Kim
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA; Department of Statistics, Korea University, Seoul, South Korea
| | - Chien-Wei Lin
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
| | - George C Tseng
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA; Department of Computational and Systems Biology; Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA
| |
|
30
|
Tang R, Pennello G. Validation of Prognostic Marker Tests: Statistical Lessons Learned From Regulatory Experience. Ther Innov Regul Sci 2016; 50:241-252. [DOI: 10.1177/2168479015601721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022]
|