1
Koole SN, Schouten PC, Hauke J, Kluin RJC, Nederlof P, Richters LK, Krebsbach G, Sikorska K, Alkemade M, Opdam M, Schagen van Leeuwen JH, Schreuder HWR, Hermans RHM, de Hingh IHJT, Mom CH, Arts HJG, van Ham M, van Dam P, Vuylsteke P, Sanders J, Horlings HM, van de Vijver KK, Hahnen E, van Driel WJ, Schmutzler R, Sonke GS, Linn SC. Effect of HIPEC according to HRD/BRCAwt genomic profile in stage III ovarian cancer - results from the phase III OVHIPEC trial. Int J Cancer 2022; 151:1394-1404. [PMID: 35583992 DOI: 10.1002/ijc.34124] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/09/2021] [Revised: 04/09/2022] [Accepted: 04/21/2022] [Indexed: 11/07/2022]
Abstract
The addition of hyperthermic intraperitoneal chemotherapy (HIPEC) with cisplatin to interval cytoreductive surgery improves recurrence-free survival (RFS) and overall survival (OS) in patients with stage III ovarian cancer. Homologous recombination deficient (HRD) ovarian tumors are usually more platinum sensitive. Since hyperthermia impairs BRCA1/2 protein function, we hypothesized that HRD tumors respond best to treatment with HIPEC. We analyzed the effect of HIPEC in patients in the OVHIPEC trial, stratified by HRD status and BRCA1/2 mutation (BRCAm) status. Clinical data and tissue samples were collected from patients included in the randomized, phase III OVHIPEC-1 trial. DNA copy number variation (CNV) profiles, HRD-related pathogenic mutations, and BRCA1 promoter hypermethylation were determined. CNV profiles were categorized as HRD or non-HRD, based on a previously validated algorithm-based BRCA1-like classifier. Hazard ratios (HR) and corresponding 99% confidence intervals (CI) for the effect of HIPEC on RFS and OS in the BRCAm, the HRD/BRCAwt and the non-HRD group were estimated using Cox proportional hazard models. DNA was available from 200/245 (82%) patients. Seventeen (9%) tumors carried a pathogenic mutation in BRCA1 and 14 (7%) in BRCA2. Ninety-one (46%) tumors classified as BRCA1-like. The effect of HIPEC on RFS and OS was absent in BRCAm tumors (HR 1.25; 99%CI 0.48-3.29), most pronounced in HRD/BRCAwt tumors (HR 0.44; 99%CI 0.21-0.91), and less evident in non-HRD/BRCAwt tumors (HR 0.82; 99%CI 0.48-1.42); interaction p-value: 0.024. Patients with HRD tumors without pathogenic BRCA1/2 mutation appear to benefit most from treatment with HIPEC, while benefit in patients with BRCA1/2 pathogenic mutations and patients without HRD seems less evident.
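As a quick consistency check on the intervals quoted in this abstract: a Wald-type confidence interval from a Cox model is symmetric around the log hazard ratio, so the point estimate should sit close to the geometric mean of the two bounds. The Python sketch below recovers that midpoint and the implied standard error of log(HR) for each subgroup. The function name and the normal approximation are assumptions made for illustration; this is not the trial's own analysis code.

```python
import math
from statistics import NormalDist


def implied_log_hr_se(lower, upper, level=0.99):
    """Return (geometric-mean HR, SE of log HR) implied by a Wald-type CI."""
    # Two-sided critical value, roughly 2.576 for a 99% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_lo, log_hi = math.log(lower), math.log(upper)
    centre = math.exp((log_lo + log_hi) / 2)
    se = (log_hi - log_lo) / (2 * z)
    return centre, se


# (reported HR, lower bound, upper bound) as quoted in the abstract.
reported = {
    "BRCAm": (1.25, 0.48, 3.29),
    "HRD/BRCAwt": (0.44, 0.21, 0.91),
    "non-HRD/BRCAwt": (0.82, 0.48, 1.42),
}

for group, (hr, lo, hi) in reported.items():
    centre, se = implied_log_hr_se(lo, hi)
    print(f"{group}: reported HR {hr}, CI midpoint {centre:.2f}, SE(log HR) {se:.2f}")
```

Under these assumptions each midpoint lands within about 0.01 of the reported hazard ratio, which is what rounding of the published bounds would be expected to produce.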
Affiliation(s)
- Simone N Koole
  - Department of Gynecology, The Netherlands Cancer Institute, Center of Gynecologic Oncology Amsterdam, Amsterdam, The Netherlands
  - Department of Medical Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Philip C Schouten
  - Department of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Jan Hauke
  - Faculty of Medicine and Center for Familial Breast and Ovarian Cancer and Center for Integrated Oncology (CIO), Cologne, University Hospital Cologne, Cologne, Germany
- Roel J C Kluin
  - Genomics Core Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Petra Nederlof
  - Department of Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Lisa K Richters
  - Faculty of Medicine and Center for Familial Breast and Ovarian Cancer and Center for Integrated Oncology (CIO), Cologne, University Hospital Cologne, Cologne, Germany
- Gabriele Krebsbach
  - Faculty of Medicine and Center for Familial Breast and Ovarian Cancer and Center for Integrated Oncology (CIO), Cologne, University Hospital Cologne, Cologne, Germany
- Karolina Sikorska
  - Department of Biometrics, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Maartje Alkemade
  - Core Facility of Molecular Pathology and Biobanking, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Mark Opdam
  - Core Facility of Molecular Pathology and Biobanking, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Henk W R Schreuder
  - Department of Gynecological Oncology, University Medical Center Utrecht, Utrecht, The Netherlands
- Ralph H M Hermans
  - Department of Gynecology and Obstetrics, Catharina Hospital, Eindhoven, The Netherlands
- Constantijne H Mom
  - Department of Obstetrics and Gynecology, Amsterdam University Medical Center, Center of Gynecologic Oncology Amsterdam, Amsterdam, The Netherlands
- Henriette J G Arts
  - Department of Gynecological Oncology, University Medical Center Groningen, Groningen, The Netherlands
- Maaike van Ham
  - Department of Gynecological Oncology, Radboud University Medical Center, Nijmegen, The Netherlands
- Peter van Dam
  - Department of Gynecologic Oncology, University Hospital Antwerp, Antwerp, Belgium
- Peter Vuylsteke
  - Department of Medical Oncology, UCL Louvain, CHU Namur Sainte-Elisabeth, Namur, Belgium
  - University of Botswana, Gaborone, Botswana
- Joyce Sanders
  - Department of Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Hugo M Horlings
  - Department of Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Eric Hahnen
  - Faculty of Medicine and Center for Familial Breast and Ovarian Cancer and Center for Integrated Oncology (CIO), Cologne, University Hospital Cologne, Cologne, Germany
- Willemien J van Driel
  - Department of Gynecology, The Netherlands Cancer Institute, Center of Gynecologic Oncology Amsterdam, Amsterdam, The Netherlands
- Rita Schmutzler
  - Faculty of Medicine and Center for Familial Breast and Ovarian Cancer and Center for Integrated Oncology (CIO), Cologne, University Hospital Cologne, Cologne, Germany
- Gabe S Sonke
  - Department of Medical Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Sabine C Linn
  - Department of Medical Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
2
Marchais A, Marques Da Costa ME, Job B, Abbas R, Drubay D, Piperno-Neumann S, Fromigué O, Gomez-Brouchet A, Françoise R, Droit R, Lervat C, Entz-Werle N, Pacquement H, Devoldere C, Cupissol D, Bodet D, Gandemer V, Berger MG, Bérard PM, Jimenez M, Vassal G, Geoerger B, Brugieres L, Gaspar N. Immune infiltrate and tumor microenvironment transcriptional programs stratify pediatric osteosarcoma into prognostic groups at diagnosis. Cancer Res 2022; 82:974-985. [DOI: 10.1158/0008-5472.can-20-4189] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/15/2020] [Revised: 07/26/2021] [Accepted: 01/18/2022] [Indexed: 11/16/2022]
3
Coscarelli R, Caroletti GN, Joelsson M, Engström E, Caloiero T. Validation metrics of homogenization techniques on artificially inhomogenized monthly temperature networks in Sweden and Slovenia (1950-2005). Sci Rep 2021; 11:18288. [PMID: 34521908 PMCID: PMC8440641 DOI: 10.1038/s41598-021-97685-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Accepted: 08/20/2021] [Indexed: 11/29/2022] Open
Abstract
To correctly detect climate signals and discard possible instrumentation errors, establishing coherent data records has become increasingly relevant. However, since real measurements can be inhomogeneous, they cannot be used directly to assess homogenization techniques, and the performance of these techniques must be studied on homogeneous datasets subjected to controlled, artificial inhomogeneities. In this paper, considering two European temperature networks over the 1950–2005 period, up to 7 artificial breaks and an average of 107 missing data per station were introduced, in order to determine that mean square error, absolute bias and factor of exceedance can be meaningfully used to validate the best-performing homogenization technique. Three techniques were used: ACMANT and two versions of HOMER, namely the standard, automated setup mode and a manual setup. Results showed that the HOMER techniques performed better regarding the factor of exceedance, while ACMANT was best with regard to absolute error and root mean square error. Regardless of the technique used, it was also established that homogenization quality was meaningfully anti-correlated with the number of breaks. On the other hand, as missing data are almost always replaced in the two HOMER techniques, only ACMANT performance is significantly and negatively affected by the amount of missing data.
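The three validation metrics named in this abstract can be illustrated with a short Python sketch. The abstract does not state the formulas, so the definitions below are assumptions based on common usage: root mean square error, the absolute value of the mean error as "absolute bias", and a factor of exceedance computed as the share of overestimates minus 0.5. The station values are hypothetical.

```python
import math


def rmse(estimated, truth):
    """Root mean square error between homogenized and true values."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimated, truth)) / len(truth))


def absolute_bias(estimated, truth):
    """Absolute value of the mean error (assumed definition)."""
    return abs(sum(e - t for e, t in zip(estimated, truth)) / len(truth))


def factor_of_exceedance(estimated, truth):
    """Share of overestimates minus 0.5; ranges from -0.5 to +0.5 (assumed definition)."""
    over = sum(1 for e, t in zip(estimated, truth) if e > t)
    return over / len(truth) - 0.5


# Hypothetical monthly mean temperatures (degrees C) for one station.
truth = [10.0, 11.0, 12.0, 13.0]        # homogeneous reference series
homogenized = [10.5, 10.5, 12.5, 13.5]  # output of some homogenization technique

print(rmse(homogenized, truth))
print(absolute_bias(homogenized, truth))
print(factor_of_exceedance(homogenized, truth))
```

A factor of exceedance near zero indicates that over- and underestimates are balanced, which is a different property from a small error magnitude; that may be one reason the techniques can rank differently on it than on RMSE.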
Affiliation(s)
- Roberto Coscarelli
  - National Research Council of Italy, Research Institute for Geo-Hydrological Protection (CNR-IRPI), Via Cavour 4/6, 87036, Rende, CS, Italy
- Giulio Nils Caroletti
  - National Research Council of Italy, Research Institute for Geo-Hydrological Protection (CNR-IRPI), Via Cavour 4/6, 87036, Rende, CS, Italy
- Magnus Joelsson
  - Swedish Meteorological and Hydrological Institute (SMHI), Climate Information and Statistics, 601 76, Norrköping, Sweden
- Erik Engström
  - Swedish Meteorological and Hydrological Institute (SMHI), Climate Information and Statistics, 601 76, Norrköping, Sweden
- Tommaso Caloiero
  - National Research Council of Italy, Institute for Agriculture and Forest Systems in the Mediterranean (CNR-ISAFOM), Via Cavour 4/6, 87036, Rende, CS, Italy
4
Dupain C, Masliah‐Planchon J, Gu C, Girard E, Gestraud P, Du Rusquec P, Borcoman E, Bello D, Ricci F, Hescot S, Sablin M, Tresca P, de Moura A, Loirat D, Frelaut M, Vincent‐Salomon A, Lecerf C, Callens C, Antonio S, Franck C, Mariani O, Bièche I, Kamal M, Le Tourneau C, Servois V. Fine-needle aspiration as an alternative to core needle biopsy for tumour molecular profiling in precision oncology: prospective comparative study of next-generation sequencing in cancer patients included in the SHIVA02 trial. Mol Oncol 2021; 15:104-115. [PMID: 32750212 PMCID: PMC7782085 DOI: 10.1002/1878-0261.12776] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/27/2020] [Revised: 07/07/2020] [Accepted: 07/30/2020] [Indexed: 12/15/2022] Open
Abstract
High-throughput molecular profiling of solid tumours using core needle biopsies (CNB) allows the identification of actionable molecular alterations, with a success rate of around 70%. Although several studies have demonstrated the utility of small biopsy specimens for molecular testing, there remains debate as to the sensitivity of the less invasive fine-needle aspiration (FNA) compared to CNB for detecting molecular alterations. We aimed to prospectively evaluate the potential of FNA to detect such alterations in various tumour types as compared to CNB in cancer patients included in the SHIVA02 trial. An in-house amplicon-based targeted sequencing panel (Illumina TSCA 99.3 kb panel covering 87 genes) was used to identify pathogenic variants and gene copy number variations (CNV) in concomitant CNB and FNA samples obtained from 61 patients enrolled in the SHIVA02 trial (NCT03084757). The main tumour types analysed were breast (38%), colon (15%), pancreas (11%), followed by cervix and stomach (7% each). We report 123 molecular alterations (85 variants, 23 amplifications and 15 homozygous deletions), among which 98 (80%) were concordant between CNB and FNA. The remaining discordances were mainly related to deletion status, yet undetected alterations were not exclusively specific to FNA. Comparative analysis of molecular alterations in CNB and FNA showed high concordance in terms of variants as well as CNVs identified. We conclude that FNA could therefore be used in routine diagnostic workflows and clinical trials for tumour molecular profiling, with the advantages of being minimally invasive and preserving tissue material needed for diagnostic, prognostic or theranostic purposes.
Affiliation(s)
- Célia Dupain
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Céline Gu
  - Department of Pathology, Institut Curie, PSL Research University, Paris, France
- Elodie Girard
  - INSERM U900 Research Unit, Institut Curie, Saint‐Cloud, France
- Pauline Du Rusquec
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Edith Borcoman
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Diana Bello
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Francesco Ricci
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Ségolène Hescot
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Marie‐Paule Sablin
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Patricia Tresca
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Alexandre de Moura
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Delphine Loirat
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Maxime Frelaut
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Charlotte Lecerf
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Céline Callens
  - Department of Genetics, Institut Curie, PSL Research University, Paris, France
- Samantha Antonio
  - Department of Genetics, Institut Curie, PSL Research University, Paris, France
- Coralie Franck
  - Department of Genetics, Institut Curie, PSL Research University, Paris, France
- Odette Mariani
  - Department of Pathology, Institut Curie, PSL Research University, Paris, France
- Ivan Bièche
  - Department of Genetics, Institut Curie, PSL Research University, Paris, France
  - INSERM U1016, Faculty of Pharmaceutical and Biological Sciences, Paris Descartes University, Paris, France
- Maud Kamal
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
- Christophe Le Tourneau
  - Department of Drug Development and Innovation (D3i), Institut Curie, Paris & Saint‐Cloud, France
  - INSERM U900 Research Unit, Institut Curie, Saint‐Cloud, France
  - Paris‐Saclay University, Paris, France
- Vincent Servois
  - Department of Radiology, Institut Curie, PSL Research University, Paris & Saint‐Cloud, France
5
Coll J, Domonkos P, Guijarro J, Curley M, Rustemeier E, Aguilar E, Walsh S, Sweeney J. Application of homogenization methods for Ireland's monthly precipitation records: Comparison of break detection results. International Journal of Climatology: A Journal of the Royal Meteorological Society 2020; 40:6169-6188. [PMID: 33281282 PMCID: PMC7687140 DOI: 10.1002/joc.6575] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/21/2018] [Revised: 03/24/2020] [Accepted: 03/25/2020] [Indexed: 06/12/2023]
Abstract
Time series homogenization for 299 of the available precipitation records for the island of Ireland (IENet) was performed. Four modern relative homogenization methods, that is, HOMER, ACMANT, CLIMATOL and AHOPS, were applied to this network of station series, where contiguous intact monthly records range from 30 to 70 years within the base period 1941-2010. Break detection results were compared between homogenization methods, and coincidences with available documentary information (metadata) were analysed. The lowest (highest) number of breaks was detected with HOMER (ACMANT). Large differences in break frequency were found: ACMANT and AHOPS detected 8 times as many breaks as HOMER, while the break frequency with CLIMATOL was intermediate. The ratio of series classified as homogeneous also varies widely between the methods: it is 85% with HOMER, 60% with CLIMATOL, 31% with AHOPS, and only 22% with ACMANT. In a further experiment, all the available time series for Ireland and Northern Ireland (910 series) were used with ACMANT and CLIMATOL to explore the stability of break frequency for the same 299 series examined in the base experiment. While overall break frequency slightly increased (by 6-13%), the break positions notably changed for individual time series. The number of breaks changed for 59% (23%) of the series with ACMANT (CLIMATOL). For the breaks detected coincidentally by at least three methods including ACMANT and CLIMATOL in the base experiment, the second experiment confirmed the break positions in 86-87% of the breaks. The consequences of these results in relation to the reliability of statistical homogenization are discussed. Sometimes markedly different step functions provide comparably good approximations. However, the accuracy of homogenized time series cannot be related directly to the instability of break detection results.
Affiliation(s)
- John Coll
  - Irish Climate Analysis and Research Units, Department of Geography, Maynooth University, Maynooth, Ireland
- Peter Domonkos
  - Centre for Climate Change (C3), Universitat Rovira i Virgili, Tortosa, Spain
- José Guijarro
  - Agencia Estatal de Meteorologia, Delegación Territorial en Illes Balears, Palma, Mallorca, Spain
- Mary Curley
  - Climatology and Observations Division, Met Éireann, Dublin, Ireland
- Elke Rustemeier
  - Department of Hydrometeorology, Deutscher Wetterdienst, Offenbach, Germany
- Enric Aguilar
  - Centre for Climate Change (C3), Universitat Rovira i Virgili, Tarragona, Spain
- Séamus Walsh
  - Climatology and Observations Division, Met Éireann, Dublin, Ireland
- John Sweeney
  - Irish Climate Analysis and Research Units, Department of Geography, Maynooth University, Maynooth, Ireland
6
Khalil AIS, Muzaki SRBM, Chattopadhyay A, Sanyal A. Identification and utilization of copy number information for correcting Hi-C contact map of cancer cell lines. BMC Bioinformatics 2020; 21:506. [PMID: 33160308 PMCID: PMC7648276 DOI: 10.1186/s12859-020-03832-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/14/2019] [Accepted: 10/23/2020] [Indexed: 12/13/2022] Open
Abstract
Background: Hi-C and its variant techniques have been developed to capture the spatial organization of chromatin. Normalization of the Hi-C contact map is essential for accurate modeling and interpretation of high-throughput chromatin conformation capture (3C) experiments. Hi-C correction tools were originally developed to normalize systematic biases of karyotypically normal cell lines. However, a vast majority of available Hi-C datasets are derived from cancer cell lines that carry multi-level DNA copy number variations (CNVs). CNV regions display over- or under-representation of interaction frequencies compared to CN-neutral regions. Therefore, it is necessary to remove CNV-driven bias from chromatin interaction data of cancer cell lines to generate a euploid-equivalent contact map.
Results: We developed the HiCNAtra framework to compute high-resolution CNV profiles from Hi-C or 3C-seq data of cancer cell lines and to correct chromatin contact maps for systematic biases, including CNV-associated bias. First, we introduce a novel ‘entire-fragment’ counting method for better estimation of the read depth (RD) signal from Hi-C reads that recapitulates the whole-genome sequencing (WGS)-derived coverage signal. Second, HiCNAtra employs a multimodal-based hierarchical CNV calling approach, which outperformed the OneD and HiNT tools, to accurately identify CNVs of cancer cell lines. Third, incorporating CNV information with other systematic biases, HiCNAtra simultaneously estimates the contribution of each bias and explicitly corrects the interaction matrix using Poisson regression. HiCNAtra normalization abolishes CNV-induced artifacts from the contact map, generating a heatmap with homogeneous signal. When benchmarked against the OneD, CAIC, and ICE methods using the MCF7 cancer cell line, the HiCNAtra-corrected heatmap achieves the least 1D signal variation without deforming the inherent chromatin interaction signal. Additionally, HiCNAtra-corrected contact frequencies have minimum correlations with each of the systematic bias sources compared to OneD’s explicit method. Visual inspection of CNV profiles and contact maps of cancer cell lines reveals that HiCNAtra is the most robust Hi-C correction tool for ameliorating CNV-induced bias.
Conclusions: HiCNAtra is a Hi-C-based computational tool that provides an analytical and visualization framework for DNA copy number profiling and chromatin contact map correction of karyotypically abnormal cell lines. HiCNAtra is open-source software implemented in MATLAB and is available at https://github.com/AISKhalil/HiCNAtra.
Affiliation(s)
- Ahmed Ibrahim Samir Khalil
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
- Anupam Chattopadhyay
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
- Amartya Sanyal
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
7
Alshawaqfeh M, Al Kawam A, Serpedin E, Datta A. Robust Recurrent CNV Detection in the Presence of Inter-Subject Variability. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1056-1067. [PMID: 30387737 DOI: 10.1109/tcbb.2018.2878560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The study of recurrent copy number variations (CNVs) plays an important role in understanding the onset and evolution of complex diseases such as cancer. Array-based comparative genomic hybridization (aCGH) is a widely used microarray-based technology for identifying CNVs. However, due to high noise levels and inter-sample variability, detecting recurrent CNVs from aCGH data remains challenging. This paper proposes a novel method for identifying recurrent CNVs. In the proposed method, the noisy aCGH data are modeled as the superposition of three matrices: a full-rank matrix of weighted piece-wise generating signals accounting for the clean aCGH data, a Gaussian noise matrix to model the inherent experimentation errors and other sources of error, and a sparse matrix to capture the sparse inter-sample (sample-specific) variations. We demonstrated the ability of our method to accurately separate recurrent CNVs from sample-specific variations and noise in both simulated (artificial) data and real data. The proposed method produced more accurate results than current state-of-the-art methods used in recurrent CNV detection and exhibited robustness to noise and sample-specific variations.
8
Villepelet A, Hugonin S, Atallah S, Job B, Baujat B, Lacau St Guily J, Lacave R. Effects of tobacco abuse on major chromosomal instability in human papilloma virus 16-positive oropharyngeal squamous cell carcinoma. Int J Oncol 2019; 55:527-535. [PMID: 31268157 DOI: 10.3892/ijo.2019.4826] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/12/2019] [Accepted: 05/31/2019] [Indexed: 11/05/2022] Open
Abstract
A substantial number of patients with oropharyngeal squamous cell carcinoma (OPSCC) have two oncogenic risk factors: Human papilloma virus (HPV) infection and tobacco use. These factors can be competitive or synergistic at the chromosomal and genomic levels, with strong prognostic and therapeutic implications. HPV16 has been shown in vitro to be a high‑risk HPV that induces low rates of chromosomal copy number alterations. However, chromosomal instability can be increased by smoking. Evaluating chromosomal instability in HPV‑positive patients according to their smoking status is therefore critical for assessing the prognosis and therapeutic impact. The aim of this study was to assess chromosomal instability in patients with HPV‑positive OPSCC according to smoking status. Chromosomal instability was investigated with array‑based comparative genomic hybridization (aCGH) in 50 patients with OPSCC. Differences in chromosomal alterations were examined according to the HPV and smoking status of the patients. HPV‑positive tumors (24/26 were HPV16‑positive) had fewer genomic aberrations (P=0.0082) and fewer breakpoints (P=0.048) than HPV‑negative tumors. We confirmed the association between HPV‑positive OPSCC and chromosomal losses at 11q. We verified the association between HPV‑negative OPSCC and losses at 3p and 9p and gains at 7q and 11q13. In the patients with OPSCC who were HPV‑positive, the total number of chromosomal aberrations per tumor was significantly higher in the group of patients who were smokers (P=0.003). However, the cytobands did not differ significantly according to the smoking status. On the whole, the data of this study may help to improve the stratification of HPV‑positive OPSCC patients and must be supplemented by next‑generation sequencing studies in order to describe the mutational and transcriptomic profiles of such patients according to smoking status.
Affiliation(s)
- Aude Villepelet
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Department of Otolaryngology-Head and Neck Surgery, Tenon Hospital, 75020 Paris, France
- Sylvain Hugonin
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Tumors Genomic Unit, Tenon Hospital, 75020 Paris, France
- Sarah Atallah
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Department of Otolaryngology-Head and Neck Surgery, Tenon Hospital, 75020 Paris, France
- Bastien Job
  - INSERM US23, Gustave Roussy Cancer Campus, 94805 Villejuif, France
- Bertrand Baujat
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Department of Otolaryngology-Head and Neck Surgery, Tenon Hospital, 75020 Paris, France
- Jean Lacau St Guily
  - Assistance Publique-Hôpitaux de Paris (AP-HP), Department of Otolaryngology-Head and Neck Surgery, Tenon Hospital, 75020 Paris, France
- Roger Lacave
  - Faculty of Medicine, Sorbonne University, GRC10, 75013 Paris, France
9
Developing Gridded Climate Data Sets of Precipitation for Greece Based on Homogenized Time Series. Climate 2019. [DOI: 10.3390/cli7050068] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/17/2022]
Abstract
The creation of realistic gridded precipitation fields improves our understanding of the observed climate and is necessary for validating climate model output for a wide range of applications. The challenge in trying to represent the highly variable nature of precipitation is to overcome the low density of observations in both time and space. Data sets of mean monthly and annual precipitation were developed for Greece in gridded format at a resolution of 30 arcsec (∼800 m), based on data from 1971 to 2000. One hundred and fifty-seven surface stations from two different observation networks were used to cover a satisfactory range of elevations. Station data were homogenized and subjected to quality control so that they represent changes in meteorological conditions rather than changes in the conditions under which the observations were made. The Meteorological Interpolation based on Surface Homogenized Data Basis (MISH) interpolation method was used to develop data sets that reproduce, as closely as possible, the spatial climate patterns over the region of interest. The main geophysical factors considered for the interpolation of mean monthly precipitation fields were elevation, latitude, incoming solar irradiance, Euclidean distance from the coastline, and land-to-sea percentage. Low precipitation interpolation uncertainties, estimated with the cross-validation method, provided confidence in the interpolation method. The resulting high-resolution maps give an overall realistic representation of precipitation, especially in fall and winter, with a clear longitudinal dependence: precipitation decreases from western to eastern continental Greece.
10
Collilieux X, Lebarbier E, Robin S. A factor model approach for the joint segmentation with between‐series correlation. Scand Stat Theory Appl 2018. [DOI: 10.1111/sjos.12368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
Affiliation(s)
- Xavier Collilieux
  - Laboratoire de Recherche en Géodésie (LAREG), l'Institut National de l'information Géographique et forestière (IGN), Université Paris Diderot, Paris, France
- Emilie Lebarbier
  - UMR MIA‐Paris, AgroParisTech, INRA, Université Paris‐Saclay, Paris, France
- Stéphane Robin
  - UMR MIA‐Paris, AgroParisTech, INRA, Université Paris‐Saclay, Paris, France
11
Nagorski J, Allen GI. Genomic region detection via Spatial Convex Clustering. PLoS One 2018; 13:e0203007. [PMID: 30204756 PMCID: PMC6133280 DOI: 10.1371/journal.pone.0203007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/20/2018] [Accepted: 08/13/2018] [Indexed: 12/31/2022] Open
Abstract
Several modern genomic technologies, such as DNA-methylation arrays, measure spatially registered probes that number in the hundreds of thousands across multiple chromosomes. The measured probes are by themselves less interesting scientifically; instead, scientists seek to discover biologically interpretable genomic regions composed of contiguous groups of probes, which may act as biomarkers of disease or serve as a dimension-reducing pre-processing step for downstream analyses. In this paper, we introduce an unsupervised feature learning technique which maps technological units (probes) to biological units (genomic regions) that are common across all subjects. We use ideas from fusion penalties and convex clustering to introduce a method for Spatial Convex Clustering, or SpaCC. Our method is specifically tailored to detecting multi-subject regions of methylation, but we also test our approach on the well-studied problem of detecting segments of copy number variation. We formulate our method as a convex optimization problem, develop a massively parallelizable algorithm to find its solution, and introduce automated approaches for handling missing values and determining tuning parameters. Through simulation studies based on real methylation and copy number variation data, we show that SpaCC exhibits significant performance gains relative to existing methods. Finally, we illustrate SpaCC's advantages as a pre-processing technique that reduces large-scale genomics data into a smaller number of genomic regions through several cancer epigenetics case studies on subtype discovery, network estimation, and epigenome-wide association.
Affiliation(s)
- John Nagorski
  - Department of Statistics, Rice University, Houston, TX, United States of America
- Genevera I. Allen
  - Department of Statistics, Rice University, Houston, TX, United States of America
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX, United States of America
  - Jan and Dan Duncan Neurological Research Institute and Department of Pediatrics-Neurology, Baylor College of Medicine, Houston, TX, United States of America
12
|
Servant N, Varoquaux N, Heard E, Barillot E, Vert JP. Effective normalization for copy number variation in Hi-C data. BMC Bioinformatics 2018; 19:313. [PMID: 30189838 PMCID: PMC6127909 DOI: 10.1186/s12859-018-2256-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 08/22/2017] [Accepted: 06/20/2018] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Normalization is essential to ensure accurate analysis and proper interpretation of sequencing data, and chromosome conformation capture data such as Hi-C have particular challenges. Although several methods have been proposed, the most widely used type of normalization of Hi-C data usually casts estimation of unwanted effects as a matrix balancing problem, relying on the assumption that all genomic regions interact equally with each other. RESULTS In order to explore the effect of copy-number variations on Hi-C data normalization, we first propose a simulation model that predicts the effects of large copy-number changes on a diploid Hi-C contact map. We then show that the standard approaches relying on equal visibility fail to correct for unwanted effects in the presence of copy-number variations. We thus propose a simple extension to matrix balancing methods that models these effects. Our approach can either retain the copy-number variation effects (LOIC) or remove them (CAIC). We show that this leads to better downstream analysis of the three-dimensional organization of rearranged genomes. CONCLUSIONS Taken together, our results highlight the importance of using dedicated methods for the analysis of Hi-C cancer data. Both CAIC and LOIC methods perform well on simulated and real Hi-C data sets, each fulfilling different needs.
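As an illustration of the matrix balancing step mentioned in this abstract, the sketch below rescales a small symmetric contact matrix until every row has the same sum, which is the equal-visibility assumption. It is a generic iterative correction written for this note, not the authors' CAIC or LOIC code; the function name `balance`, the iteration count, and the toy matrix are assumptions, and the input is assumed to be symmetric with strictly positive entries.

```python
def balance(matrix, n_iter=200):
    """Iteratively rescale a symmetric positive matrix to equal row sums.

    Hypothetical helper for illustration only; not taken from any Hi-C library.
    """
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy
    for _ in range(n_iter):
        row_sums = [sum(row) for row in m]
        mean_sum = sum(row_sums) / n
        # Bias of each region relative to the average visibility.
        bias = [s / mean_sum for s in row_sums]
        # Divide each contact by the biases of both interacting regions.
        m = [[m[i][j] / (bias[i] * bias[j]) for j in range(n)] for i in range(n)]
    return m


# Toy contact map (assumed values): region 2 is far more "visible" than region 0.
contacts = [
    [4.0, 1.0, 1.0],
    [1.0, 2.0, 3.0],
    [1.0, 3.0, 9.0],
]
balanced = balance(contacts)
row_sums = [sum(row) for row in balanced]  # all approximately equal
```

A copy-number gain breaks the assumption this procedure relies on, because the amplified region legitimately has more contacts; balancing would then remove real signal along with the bias, which is the failure the abstract describes.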
Affiliation(s)
- Nicolas Servant
  - Institut Curie, PSL Research University, Paris, F-75005 France
  - INSERM, U900, Paris, F-75005 France
  - Mines ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, F-75006 France
- Nelle Varoquaux
  - Department of Statistics, University of California, Berkeley, USA
  - Berkeley Institute for Data Science, Berkeley, USA
- Edith Heard
  - Institut Curie, PSL Research University, CNRS UMR3215, INSERM U934, Paris, F-75005 France
- Emmanuel Barillot
  - Institut Curie, PSL Research University, Paris, F-75005 France
  - INSERM, U900, Paris, F-75005 France
  - Mines ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, F-75006 France
- Jean-Philippe Vert
  - Institut Curie, PSL Research University, Paris, F-75005 France
  - INSERM, U900, Paris, F-75005 France
  - Mines ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, F-75006 France
  - Ecole Normale Supérieure, PSL Research University, Department of Mathematics and Applications, Paris, F-75005 France
13
Baragatti M, Bertin K, Lebarbier E, Meza C. A Bayesian approach for the segmentation of series with a functional effect. Stat Model 2018. [DOI: 10.1177/1471082x18755539] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
In some application fields, series are affected by two different types of effects: abrupt changes (or change-points) and functional effects. We propose here a Bayesian approach that allows us to estimate these two parts. Here, the underlying piecewise-constant part (associated with the abrupt changes) is expressed as the product of a lower triangular matrix by a sparse vector, and the functional part as a linear combination of functions from a large dictionary from which we want to select the relevant ones. This problem can thus lead to a global sparse estimation, and a stochastic search variable selection approach is used to this end. The performance of our proposed method is assessed using simulation experiments. Applications to three real datasets from the fields of geodesy, agronomy and economics are also presented.
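The matrix formulation in this abstract can be made concrete with a small example. The sketch assumes the lower triangular matrix is the all-ones one, so that multiplying it by a sparse vector of jumps amounts to a cumulative sum; the abstract does not state the matrix entries, and the function and variable names below are illustrative.

```python
def piecewise_constant(jumps):
    """Multiply the lower-triangular all-ones matrix by the jump vector."""
    n = len(jumps)
    # T[i][j] = 1 when j <= i, else 0 (assumed form of the triangular matrix).
    T = [[1 if j <= i else 0 for j in range(n)] for i in range(n)]
    return [sum(T[i][j] * jumps[j] for j in range(n)) for i in range(n)]


# Sparse vector: the first entry sets the starting level, and every other
# non-zero entry is the size of an abrupt change at that position.
jumps = [2, 0, 0, -3, 0, 1]
signal = piecewise_constant(jumps)  # [2, 2, 2, -1, -1, 0]
```

Under this assumption, locating change-points is the same task as deciding which entries of the jump vector are non-zero, which is why the abstract can treat segmentation as a sparse estimation problem.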
Affiliation(s)
- Meili Baragatti
  - MISTEA, Montpellier SupAgro, INRA, CNRS, Univ Montpellier, Montpellier, France
- Karine Bertin
  - CIMFAV-Facultad de Ingeniería, Universidad de Valparaíso, Valparaíso, Chile
- Cristian Meza
  - CIMFAV-Facultad de Ingeniería, Universidad de Valparaíso, Valparaíso, Chile
14
Fan Z, Mackey L. Empirical Bayesian analysis of simultaneous changepoints in multiple data sequences. Ann Appl Stat 2017. [DOI: 10.1214/17-aoas1075] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
15
Machné R, Murray DB, Stadler PF. Similarity-Based Segmentation of Multi-Dimensional Signals. Sci Rep 2017; 7:12355. [PMID: 28955039 PMCID: PMC5617875 DOI: 10.1038/s41598-017-12401-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/07/2017] [Accepted: 08/30/2017] [Indexed: 11/25/2022] Open
Abstract
The segmentation of time series and genomic data is a common problem in computational biology. With increasingly complex measurement procedures, individual data points are often not just numbers or simple vectors in which all components are of the same kind. Analysis methods that capitalize on slopes in a single real-valued data track or that make explicit use of the vectorial nature of the data are not applicable in such scenarios. We develop here a framework for segmentation in arbitrary data domains that only requires a minimal notion of similarity. Using unsupervised clustering of (a sample of) the input yields an approximate segmentation algorithm that is efficient enough for genome-wide applications. As a showcase application we segment a time-series of transcriptome sequencing data from budding yeast, in high temporal resolution over ca. 2.5 cycles of the short-period respiratory oscillation. The algorithm is used with a similarity measure focussing on periodic expression profiles across the metabolic cycle rather than coverage per time point.
Affiliation(s)
- Rainer Machné
  - Institute for Synthetic Microbiology, Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Universitätsstraße 1, D-40225, Düsseldorf, Germany
  - Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, Vienna, A-1090, Austria
- Douglas B Murray
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, 997-0017, Japan
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, University Leipzig, Härtelstrasse 16-18, D-04107, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103, Leipzig, Germany
  - Fraunhofer Institute for Cell Therapy and Immunology, Perlickstrasse 1, D-04103, Leipzig, Germany
  - Department of Theoretical Chemistry of the University of Vienna, Währingerstrasse 17, Vienna, A-1090, Austria
  - Center for RNA in Technology and Health, Univ. Copenhagen, Grønnegårdsvej 3, Frederiksberg C, Denmark
  - Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM, 87501, USA
16
Delatola EI, Lebarbier E, Mary-Huard T, Radvanyi F, Robin S, Wong J. SegCorr a statistical procedure for the detection of genomic regions of correlated expression. BMC Bioinformatics 2017; 18:333. [PMID: 28697800 PMCID: PMC5504623 DOI: 10.1186/s12859-017-1742-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/07/2016] [Accepted: 06/26/2017] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND Detecting local correlations in expression between neighboring genes along the genome has proved to be an effective strategy to identify possible causes of transcriptional deregulation in cancer. It has been successfully used to illustrate the role of mechanisms such as copy number variation (CNV) or epigenetic alterations as factors that may significantly alter expression in large chromosomal regions (gene silencing or gene activation). RESULTS The identification of correlated regions requires segmenting the gene expression correlation matrix into regions of homogeneously correlated genes and assessing whether the observed local correlation is significantly higher than the background chromosomal correlation. A unified statistical framework is proposed to achieve these two tasks, where optimal segmentation is efficiently performed using a dynamic programming algorithm, and detection of highly correlated regions is then achieved using an exact test procedure. We also propose a simple and efficient procedure to correct the expression signal for mechanisms already known to impact expression correlation. The performance and robustness of the proposed procedure, called SegCorr, are evaluated on simulated data. The procedure is illustrated on cancer data, where the signal is corrected for correlations caused by copy number variation. It permitted the detection of regions with high correlations linked to epigenetic marks like DNA methylation. CONCLUSIONS SegCorr is a novel method that performs correlation matrix segmentation and applies a test procedure in order to detect highly correlated regions in gene expression.
Affiliation(s)
- Eleni Ioanna Delatola
  - AgroParisTech UMR518, Paris, 75005, France
  - INRA UMR518, Paris, 75005, France
  - Institut Curie, PSL Research University, Cedex 05, Paris, 75248, France
  - CNRS UMR144, Equipe Labellisee par La Ligue Nationale contre le Cancer, Cedex 05, Paris, 75248, France
- Emilie Lebarbier
  - AgroParisTech UMR518, Paris, 75005, France
  - INRA UMR518, Paris, 75005, France
- Tristan Mary-Huard
  - AgroParisTech UMR518, Paris, 75005, France
  - INRA UMR518, Paris, 75005, France
  - INRA, UMR 0320 - UMR 8120 Genetique Quantitative et Evolution-Le Moulon, Gif-sur-Yvette, F-91190, France
- François Radvanyi
  - Institut Curie, PSL Research University, Cedex 05, Paris, 75248, France
  - CNRS UMR144, Equipe Labellisee par La Ligue Nationale contre le Cancer, Cedex 05, Paris, 75248, France
- Stéphane Robin
  - AgroParisTech UMR518, Paris, 75005, France
  - INRA UMR518, Paris, 75005, France
- Jennifer Wong
  - Institut Curie, PSL Research University, Cedex 05, Paris, 75248, France
  - CNRS UMR144, Equipe Labellisee par La Ligue Nationale contre le Cancer, Cedex 05, Paris, 75248, France
  - Molecular Oncology Unit, Department of Biochemistry, Hospital Saint Louis, AP-HP, Cedex 10, Paris, 75475, France
  - Université Paris Diderot, Sorbonne Paris Cité, CNRS UMR7212/INSERM U944, Cedex 10, Paris, 75475, France
17
Chakar S, Lebarbier E, Lévy-Leduc C, Robin S. A robust approach for estimating change-points in the mean of an AR(1) process. Bernoulli 2017. [DOI: 10.3150/15-bej782] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 11/19/2022]
18
Fu J, Beaty TH, Scott AF, Hetmanski J, Parker MM, Wilson JEB, Marazita ML, Mangold E, Albacha-Hejazi H, Murray JC, Bureau A, Carey J, Cristiano S, Ruczinski I, Scharpf RB. Whole exome association of rare deletions in multiplex oral cleft families. Genet Epidemiol 2017; 41:61-69. [PMID: 27910131 PMCID: PMC5154821 DOI: 10.1002/gepi.22010] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/22/2016] [Revised: 09/21/2016] [Accepted: 09/21/2016] [Indexed: 11/11/2022]
Abstract
By sequencing the exomes of distantly related individuals in multiplex families, rare mutational and structural changes to coding DNA can be characterized and their relationship to disease risk can be assessed. Recently, several rare single nucleotide variants (SNVs) were associated with an increased risk of nonsyndromic oral cleft, highlighting the importance of rare sequence variants in oral clefts and illustrating the strength of family-based study designs. However, the extent to which rare deletions in coding regions of the genome occur and contribute to risk of nonsyndromic clefts is not well understood. To identify putative structural variants underlying risk, we developed a pipeline for rare hemizygous deletions in families from whole exome sequencing and statistical inference based on rare variant sharing. Among 56 multiplex families with 115 individuals, we identified 53 regions with one or more rare hemizygous deletions. We found 45 of the 53 regions contained rare deletions occurring in only one family member. Members of the same family shared a rare deletion in only eight regions. We also devised a scalable global test for enrichment of shared rare deletions.
Affiliation(s)
- Jack Fu
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore MD, USA
- Terri H. Beaty
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore MD, USA
- Alan F. Scott
  - Center for Inherited Disease Research and Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore MD, USA
- Jacqueline Hetmanski
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore MD, USA
- Margaret M. Parker
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston MA, USA
- Joan E. Bailey Wilson
  - Inherited Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore MD, USA
- Mary L. Marazita
  - Department of Oral Biology, Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, PA, USA
- Jeffrey C. Murray
  - Department of Pediatrics, School of Medicine, University of Iowa, IA, USA
- Alexandre Bureau
  - Centre de Recherche de l’Institut Universitaire en Santé Mentale de Québec and Département de Médecine Sociale et Préventive, Université Laval, Québec, Canada
- Jacob Carey
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore MD, USA
- Stephen Cristiano
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore MD, USA
- Ingo Ruczinski
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore MD, USA
- Robert B. Scharpf
  - Department of Oncology, Johns Hopkins School of Medicine, Baltimore MD, USA
19
Bertin K, Collilieux X, Lebarbier E, Meza C. Semi-parametric segmentation of multiple series using a DP-Lasso strategy. J Stat Comput Simul 2016. [DOI: 10.1080/00949655.2016.1260726] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/20/2022]
20
Maidstone R, Hocking T, Rigaill G, Fearnhead P. On optimal multiple changepoint algorithms for large data. Stat Comput 2016; 27:519-533. [PMID: 32355427 PMCID: PMC7175693 DOI: 10.1007/s11222-016-9636-3] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 03/06/2015] [Accepted: 02/01/2016] [Indexed: 06/11/2023]
Abstract
Many common approaches to detecting changepoints, for example based on statistical criteria such as penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. We focus on a class of dynamic programming algorithms that can solve the resulting minimisation problem exactly, and thus find the optimal segmentation under the given statistical criteria. The standard implementations of these dynamic programming methods have a computational cost that scales at least quadratically in the length of the time-series. Recently, pruning ideas have been suggested that can speed up the dynamic programming algorithms, whilst still being guaranteed to be optimal, in that they find the true minimum of the cost function. Here we extend these pruning methods, and introduce two new algorithms for segmenting data: FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data. We evaluate the method for detecting copy number variations and observe that FPOP has a computational cost that is even competitive with that of binary segmentation, but can give much more accurate segmentations.
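The "minimising a cost over segmentations" formulation in this abstract can be sketched with the unpruned, quadratic-time dynamic program that pruning methods aim to accelerate. This is not FPOP or SNIP; it is a minimal illustration assuming a squared-error segment cost and a fixed penalty per changepoint, and the function name and toy data are hypothetical.

```python
from itertools import accumulate


def optimal_partitioning(y, penalty):
    """Exact penalised least-squares segmentation in O(n^2) time.

    Returns the sorted start indices of every segment after the first.
    """
    n = len(y)
    # Prefix sums give any segment's squared-error cost in O(1).
    s1 = [0.0] + list(accumulate(y))
    s2 = [0.0] + list(accumulate(v * v for v in y))

    def seg_cost(i, j):
        # Sum of squared deviations from the mean of y[i:j], with j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] * (n + 1)  # best[t]: minimal penalised cost of y[:t]
    best[0] = -penalty      # so that the first segment carries no penalty
    last = [0] * (n + 1)    # last[t]: start index of the final segment of y[:t]
    for t in range(1, n + 1):
        best[t], last[t] = min(
            (best[s] + seg_cost(s, t) + penalty, s) for s in range(t)
        )

    # Walk back through the stored segment starts to recover the changepoints.
    changepoints = []
    t = n
    while t > 0:
        t = last[t]
        if t > 0:
            changepoints.append(t)
    return sorted(changepoints)


# Toy series with a single jump in the mean at index 5.
y = [0.0] * 5 + [10.0] * 5
print(optimal_partitioning(y, penalty=1.0))  # -> [5]
```

The inner `min` scans every candidate start `s` for each `t`, which is the source of the quadratic cost; pruning methods keep the same recursion but discard candidates that can no longer be optimal.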
Affiliation(s)
- Robert Maidstone
  - STOR-i Centre for Doctoral Training, Lancaster University, Lancaster, UK
- Toby Hocking
  - McGill University and Genome Quebec Innovation Center, Quebec, Canada
- Guillem Rigaill
  - Institute of Plant Sciences Paris-Saclay, UMR 9213/UMR1403, CNRS, INRA, Université Paris-Sud, Université d’Evry, Université Paris-Diderot, Sorbonne Paris-Cité, Paris, France
- Paul Fearnhead
  - Department of Mathematics and Statistics, Lancaster University, Lancaster, UK
21
Schouten PC, Vollebergh MA, Opdam M, Jonkers M, Loden M, Wesseling J, Hauptmann M, Linn SC. High XIST and Low 53BP1 Expression Predict Poor Outcome after High-Dose Alkylating Chemotherapy in Patients with a BRCA1-like Breast Cancer. Mol Cancer Ther 2015; 15:190-8. [DOI: 10.1158/1535-7163.mct-15-0470] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 06/08/2015] [Accepted: 11/06/2015] [Indexed: 11/16/2022]
22
Wang T, Chen M, Zhao H. Estimating DNA methylation levels by joint modeling of multiple methylation profiles from microarray data. Biometrics 2015; 72:354-63. [PMID: 26433612 DOI: 10.1111/biom.12422] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/01/2015] [Revised: 06/01/2015] [Accepted: 08/01/2015] [Indexed: 12/29/2022]
Abstract
DNA methylation studies have been revolutionized by the recent development of high throughput array-based platforms. Most of the existing methods analyze microarray methylation data on a probe-by-probe basis, ignoring probe-specific effects and correlations among methylation levels at neighboring genomic locations. These methods can potentially miss functionally relevant findings associated with genomic regions. In this article, we propose a statistical model that allows us to pool information on the same probe across multiple samples to estimate the probe affinity effect, and to borrow strength from the neighboring probe sites to better estimate the methylation values. Using a simulation study, we demonstrate that our method can provide accurate model-based estimates. We further use the proposed method to develop a new procedure for detecting differentially methylated regions, and compare it with a state-of-the-art approach via a data application.
Affiliation(s)
- Tao Wang
  - Department of Biostatistics, Yale University, New Haven, Connecticut, 06520, U.S.A
- Mengjie Chen
  - Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, 27599, U.S.A
- Hongyu Zhao
  - Department of Biostatistics, Yale University, New Haven, Connecticut, 06520, U.S.A
23
Masecchia S, Coco S, Barla A, Verri A, Tonini GP. Genome instability model of metastatic neuroblastoma tumorigenesis by a dictionary learning algorithm. BMC Med Genomics 2015; 8:57. [PMID: 26358114 PMCID: PMC4566396 DOI: 10.1186/s12920-015-0132-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 01/09/2015] [Accepted: 08/28/2015] [Indexed: 12/21/2022] Open
Abstract
Background Metastatic neuroblastoma (NB) occurs in pediatric patients as stage 4S or stage 4 and is characterized by heterogeneous clinical behavior associated with diverse genotypes. Tumors of stage 4 contain several structural copy number aberrations (CNAs) rarely found in stage 4S. To date, NB tumorigenesis has not yet been elucidated, although it is evident that genomic instability plays a critical role in the genesis of the tumor. Here we propose a mathematical approach to decipher genomic data and we provide a new model of NB metastatic tumorigenesis. Method We elucidate NB tumorigenesis using the Enhanced Fused Lasso Latent Feature Model (E-FLLat) to model the array comparative chromosome hybridization (aCGH) data of 190 metastatic NBs (63 stage 4S and 127 stage 4). This model for aCGH segmentation, based on the minimization of functional dictionary learning (DL), combines several penalties tailored to the specificities of aCGH data. In DL, the original signal is approximated by a linear weighted combination of atoms: the elements of the learned dictionary. Results The hierarchical structure for stage 4S shows, at the first level of the oncogenetic tree, several whole-chromosome gains, apart from the unbalanced gains of 17q, 2p and 2q. Conversely, the high CNA complexity found in stage 4 tumors requires two different trees. Both stage 4 oncogenetic trees are markedly divergent, with up to five sublevels, and the 17q gain is the most common event at the first level (2/3 nodes). Moreover, the 11q deletion, one of the major unfavorable markers of disease progression, occurs before 3p loss, indicating that critical chromosome aberrations appear at early stages of tumorigenesis. Finally, we also observed a significant (p = 0.025) association between patient age and chromosome loss in stage 4 cases.
Conclusion These results led us to propose a progressive genome instability model in which NB cells initiate with DNA synthesis uncoupled from cell division, which leads to stage 4S tumors, primarily characterized by numerical aberrations, or stage 4 tumors with high levels of genome instability resulting in complex chromosome rearrangements associated with high tumor aggressiveness and rapid disease progression.
Affiliation(s)
- Simona Coco
  - Lung Cancer Unit; IRCCS A.O.U. San Martino - IST, Genova, Italy
- Annalisa Barla
  - DIBRIS, Università degli Studi di Genova, Genova, Italy
- Gian Paolo Tonini
  - Neuroblastoma Laboratory, Onco/Hematology Laboratory, Department of Woman and Child Health, University of Padua, Pediatric Research Institute, Fondazione Città della Speranza, Padua, Corso Stati Uniti, 4, 35127, Padua, Italy
24
Biesma HD, Schouten PC, Lacle MM, Sanders J, Brugman W, Kerkhoven R, Mandjes I, van der Groep P, van Diest PJ, Linn SC. Copy number profiling by array comparative genomic hybridization identifies frequently occurring BRCA2-like male breast cancer. Genes Chromosomes Cancer 2015; 54:734-44. [DOI: 10.1002/gcc.22284] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 02/16/2015] [Accepted: 06/25/2015] [Indexed: 11/08/2022] Open
Affiliation(s)
- Hedde D. Biesma
  - Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Philip C. Schouten
  - Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Miangela M. Lacle
  - Department of Pathology, University Medical Center Utrecht, The Netherlands
- Joyce Sanders
  - Department of Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Wim Brugman
  - Genomics Core Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Ron Kerkhoven
  - Genomics Core Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Ingrid Mandjes
  - Data Center, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Paul J. van Diest
  - Department of Pathology, University Medical Center Utrecht, The Netherlands
- Sabine C. Linn
  - Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
  - Department of Pathology, University Medical Center Utrecht, The Netherlands
  - Department of Medical Oncology, Netherlands Cancer Institute, Amsterdam, The Netherlands
25
Schouten PC, Grigoriadis A, Kuilman T, Mirza H, Watkins JA, Cooke SA, van Dyk E, Severson TM, Rueda OM, Hoogstraat M, Verhagen CVM, Natrajan R, Chin SF, Lips EH, Kruizinga J, Velds A, Nieuwland M, Kerkhoven RM, Krijgsman O, Vens C, Peeper D, Nederlof PM, Caldas C, Tutt AN, Wessels LF, Linn SC. Robust BRCA1-like classification of copy number profiles of samples repeated across different datasets and platforms. Mol Oncol 2015; 9:1274-86. [PMID: 25825120 PMCID: PMC5528812 DOI: 10.1016/j.molonc.2015.03.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 11/26/2014] [Revised: 03/01/2015] [Accepted: 03/11/2015] [Indexed: 11/30/2022] Open
Abstract
Breast cancers with BRCA1 germline mutation have a characteristic DNA copy number (CN) pattern. We developed a test that assigns CN profiles to be 'BRCA1-like' or 'non-BRCA1-like', which refers to resembling a BRCA1-mutated tumor or resembling a tumor without a BRCA1 mutation, respectively. Approximately one third of the BRCA1-like breast cancers have a BRCA1 mutation, one third have hypermethylation of the BRCA1 promoter and one third have an unknown reason for being BRCA1-like. This classification is indicative of patients' response to high dose alkylating and platinum containing chemotherapy regimens, which target the inability of BRCA1 deficient cells to repair DNA double strand breaks. We investigated whether this classification can be reliably obtained with next generation sequencing and copy number platforms other than the bacterial artificial chromosome (BAC) array Comparative Genomic Hybridization (aCGH) on which it was originally developed. We investigated samples from 230 breast cancer patients for which a CN profile had been generated on two to five platforms, comprising low coverage CN sequencing, CN extraction from targeted sequencing panels (CopywriteR), Affymetrix SNP6.0, 135K/720K oligonucleotide aCGH, Affymetrix Oncoscan FFPE (MIP) technology, 3K BAC and 32K BAC aCGH. Pairwise comparison of genomic position-mapped profiles from the original aCGH platform and other platforms revealed concordance. For most cases, biological differences between samples exceeded the differences between platforms within one sample. We observed the same classification across different platforms in over 80% of the patients and kappa values of at least 0.36. Differential classification could be attributed to CN profiles that were not strongly associated with one class.
In conclusion, we have shown that the genomic regions that define our BRCA1-like classifier are robustly measured by different CN profiling technologies, providing the possibility to retro- and prospectively investigate BRCA1-like classification across a wide range of CN platforms.
Collapse
Affiliation(s)
- Philip C Schouten
- Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Anita Grigoriadis
- Breakthrough Breast Cancer Research Unit, Department of Research Oncology, Guy's Hospital, King's College London School of Medicine, London, United Kingdom
| | - Thomas Kuilman
- Division of Molecular Oncology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Hasan Mirza
- Breakthrough Breast Cancer Research Unit, Department of Research Oncology, Guy's Hospital, King's College London School of Medicine, London, United Kingdom
| | - Johnathan A Watkins
- Breakthrough Breast Cancer Research Unit, Department of Research Oncology, Guy's Hospital, King's College London School of Medicine, London, United Kingdom
| | - Saskia A Cooke
- Breakthrough Breast Cancer Research Unit, Department of Research Oncology, Guy's Hospital, King's College London School of Medicine, London, United Kingdom
| | - Ewald van Dyk
- Department of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Tesa M Severson
- Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Oscar M Rueda
- Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre, Cambridge, UK
| | - Marlous Hoogstraat
- Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands; Department of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, The Netherlands; Department of Medical Oncology, University Medical Center Utrecht, Utrecht, The Netherlands; Netherlands Center for Personalized Cancer Treatment, Utrecht, The Netherlands
| | - Caroline V M Verhagen
- Division of Biological Stress Response, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Rachael Natrajan
- The Breakthrough Breast Cancer Research Centre, The Institute of Cancer Research, London, UK
| | - Suet-Feung Chin
- Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre, Cambridge, UK
| | - Esther H Lips
- Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Janneke Kruizinga
- Genomics Core Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Arno Velds
- Genomics Core Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Marja Nieuwland
- Genomics Core Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Ron M Kerkhoven
- Genomics Core Facility, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Oscar Krijgsman
- Division of Molecular Oncology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Conchita Vens
- Division of Biological Stress Response, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Daniel Peeper
- Division of Molecular Oncology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Petra M Nederlof
- Department of Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Carlos Caldas
- Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre, Cambridge, UK; Department of Oncology, University of Cambridge, Addenbrooke's Hospital, Cambridge, UK; Cambridge Experimental Cancer Medicine Centre and NIHR Cambridge Biomedical, Research Centre, Cambridge University Hospitals NHS, Cambridge, UK
| | - Andrew N Tutt
- Breakthrough Breast Cancer Research Unit, Department of Research Oncology, Guy's Hospital, King's College London School of Medicine, London, United Kingdom
| | - Lodewyk F Wessels
- Department of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, The Netherlands; Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Sabine C Linn
- Department of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands; Department of Pathology, University Medical Center Utrecht, Utrecht, The Netherlands; Division of Medical Oncology, Netherlands Cancer Institute, Amsterdam, The Netherlands.
|
26
|
Pérez-Zanón N, Sigró J, Domonkos P, Ashcroft L. Comparison of HOMER and ACMANT homogenization methods using a central Pyrenees temperature dataset. ADVANCES IN SCIENCE AND RESEARCH 2015. [DOI: 10.5194/asr-12-111-2015] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
Abstract
The aim of this research is to compare the results of two modern multiple break point homogenization methods, namely ACMANT and HOMER, over a Pyrenees temperature dataset in order to detect differences between their outputs which can affect future studies. Both methods are applied to a dataset of 44 monthly maximum and minimum temperature series located around the central Pyrenees and covering the 1910–2013 period. The results indicate that the automatic method ACMANT produces credible results. While HOMER detects more breaks supported by metadata, this method is also more dependent on user skill and thus sensitive to subjective errors.
|
27
|
Du Y, Murani E, Ponsuksili S, Wimmers K. biomvRhsmm: genomic segmentation with hidden semi-Markov model. BIOMED RESEARCH INTERNATIONAL 2014; 2014:910390. [PMID: 24995333 PMCID: PMC4065698 DOI: 10.1155/2014/910390] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/19/2013] [Revised: 03/03/2014] [Accepted: 03/21/2014] [Indexed: 11/25/2022]
Abstract
High-throughput technologies like tiling array and next-generation sequencing (NGS) generate continuous homogeneous segments or signal peaks in the genome that represent transcripts and transcript variants (transcript mapping and quantification), regions of deletion and amplification (copy number variation), or regions characterized by particular common features like chromatin state or DNA methylation ratio (epigenetic modifications). However, the volume and output of data produced by these technologies present challenges in analysis. Here, a hidden semi-Markov model (HSMM) is implemented and tailored to handle multiple genomic profiles, to better facilitate genome annotation by assisting in the detection of transcripts, regulatory regions, and copy number variation by holistic microarray or NGS. With support for various data distributions, instead of limiting itself to one specific application, the proposed hidden semi-Markov model is designed to allow modeling options to accommodate different types of genomic data and to serve as a general segmentation engine. By incorporating genomic positions into the sojourn distribution of HSMM, with optional prior learning using annotation or previous studies, the modeling output is more biologically sensible. The proposed model has been compared with several other state-of-the-art segmentation models through simulation benchmarking, which shows that our efficient implementation achieves comparable or better sensitivity and specificity in genomic segmentation.
Affiliation(s)
- Yang Du
- Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, 18196 Dummerstorf, Germany
| | - Eduard Murani
- Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, 18196 Dummerstorf, Germany
| | - Siriluck Ponsuksili
- Research Group Functional Genomics, Leibniz Institute for Farm Animal Biology, 18196 Dummerstorf, Germany
| | - Klaus Wimmers
- Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, 18196 Dummerstorf, Germany
|
28
|
Zhou X, Liu J, Wan X, Yu W. Piecewise-constant and low-rank approximation for identification of recurrent copy number variations. Bioinformatics 2014; 30:1943-9. [PMID: 24642062 DOI: 10.1093/bioinformatics/btu131] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022]
Abstract
MOTIVATION: The post-genome era sees an urgent need for novel approaches to extracting useful information from the huge amount of genetic data. The identification of recurrent copy number variations (CNVs) from array-based comparative genomic hybridization (aCGH) data can help understand complex diseases, such as cancer. Most of the previous computational methods focused on single-sample analysis or statistical testing based on the results of single-sample analysis. Finding recurrent CNVs from multi-sample data remains a challenging topic worth further study. RESULTS: We present a general and robust method to identify recurrent CNVs from multi-sample aCGH profiles. We express the raw dataset as a matrix and demonstrate that recurrent CNVs will form a low-rank matrix. Hence, we formulate the problem as a matrix recovery problem, where we aim to find a piecewise-constant and low-rank approximation (PLA) to the input matrix. We propose a convex formulation for matrix recovery and an efficient algorithm to globally solve the problem. We demonstrate the advantages of PLA compared with alternative methods using synthesized datasets and two breast cancer datasets. The experimental results show that PLA can successfully reconstruct the recurrent CNV patterns from raw data and achieve better performance compared with alternative methods under a wide range of scenarios. AVAILABILITY AND IMPLEMENTATION: The MATLAB code is available at http://bioinformatics.ust.hk/pla.zip.
Affiliation(s)
- Xiaowei Zhou
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon and Department of Computer Science and Institute of Theoretical and Computational Study, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
| | - Jiming Liu
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon and Department of Computer Science and Institute of Theoretical and Computational Study, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
| | - Xiang Wan
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon and Department of Computer Science and Institute of Theoretical and Computational Study, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
| | - Weichuan Yu
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon and Department of Computer Science and Institute of Theoretical and Computational Study, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
|
29
|
Cleynen A, Koskas M, Lebarbier E, Rigaill G, Robin S. Segmentor3IsBack: an R package for the fast and exact segmentation of Seq-data. Algorithms Mol Biol 2014; 9:6. [PMID: 24612691 PMCID: PMC3977952 DOI: 10.1186/1748-7188-9-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 05/13/2013] [Accepted: 03/03/2014] [Indexed: 11/10/2022]
Abstract
Background: Change point problems arise in many genomic analyses such as the detection of copy number variations or the detection of transcribed regions. The expanding Next Generation Sequencing technologies now allow change points to be located at nucleotide resolution. Results: Because its complexity is almost linear in the sequence length when the maximal number of segments is constant, and because its performance has been acknowledged for microarrays, we propose to use the Pruned Dynamic Programming (PDP) algorithm for Seq-experiment outputs. This requires the adaptation of the algorithm to the negative binomial distribution with which we model the data. We show that if the dispersion in the signal is known, the PDP algorithm can be used, and we provide an estimator for this dispersion. We describe a compression framework which reduces the time complexity without modifying the accuracy of the segmentation. We propose to estimate the number of segments via a penalized likelihood criterion. We illustrate the performance of the proposed methodology on RNA-Seq data. Conclusions: We illustrate the results of our approach on a real dataset and show its good performance. Our algorithm is available as an R package on the CRAN repository.
|
30
|
Younkin SG, Scharpf RB, Schwender H, Parker MM, Scott AF, Marazita ML, Beaty TH, Ruczinski I. A genome-wide study of de novo deletions identifies a candidate locus for non-syndromic isolated cleft lip/palate risk. BMC Genet 2014; 15:24. [PMID: 24528994 PMCID: PMC3929298 DOI: 10.1186/1471-2156-15-24] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 05/13/2013] [Accepted: 01/31/2014] [Indexed: 01/25/2023]
Abstract
Background: Copy number variants (CNVs) may play an important part in the development of common birth defects such as oral clefts, and individual patients with multiple birth defects (including clefts) have been shown to carry small and large chromosomal deletions. In this paper we investigate de novo deletions defined as DNA segments missing in an oral cleft proband but present in both unaffected parents. We compare de novo deletion frequencies in children of European ancestry with an isolated, non-syndromic oral cleft to frequencies in children of European ancestry from randomly sampled trios. Results: We identified a genome-wide significant 62 kilobase (kb) non-coding region on chromosome 7p14.1 where de novo deletions occur more frequently among oral cleft cases than controls. We also observed wider de novo deletions among cleft lip and palate (CLP) cases than seen among cleft palate (CP) and cleft lip (CL) cases. Conclusions: This study presents a region where de novo deletions appear to be involved in the etiology of oral clefts, although the underlying biological mechanisms are still unknown. Larger de novo deletions are more likely to interfere with normal craniofacial development and may result in more severe clefts. Study protocol and sample DNA source can severely affect estimates of de novo deletion frequencies. Follow-up studies are needed to further validate these findings and to potentially identify additional structural variants underlying oral clefts.
Affiliation(s)
- Samuel G Younkin
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore MD, USA.
|
31
|
Subramanian A, Shackney S, Schwartz R. Novel multisample scheme for inferring phylogenetic markers from whole genome tumor profiles. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:1422-1431. [PMID: 24407301 PMCID: PMC3830698 DOI: 10.1109/tcbb.2013.33] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
Computational cancer phylogenetics seeks to enumerate the temporal sequences of aberrations in tumor evolution, thereby delineating the evolution of possible tumor progression pathways, molecular subtypes, and mechanisms of action. We previously developed a pipeline for constructing phylogenies describing evolution between major recurring cell types computationally inferred from whole-genome tumor profiles. The accuracy and detail of the phylogenies, however, depend on the identification of accurate, high-resolution molecular markers of progression, i.e., reproducible regions of aberration that robustly differentiate different subtypes and stages of progression. Here, we present a novel hidden Markov model (HMM) scheme for the problem of inferring such phylogenetically significant markers through joint segmentation and calling of multisample tumor data. Our method classifies sets of genome-wide DNA copy number measurements into a partitioning of samples into normal (diploid) or amplified at each probe. It differs from other similar HMM methods in its design specifically for the needs of tumor phylogenetics, by seeking to identify robust markers of progression conserved across a set of copy number profiles. We show an analysis of our method in comparison to other methods on both synthetic and real tumor data, which confirms its effectiveness for tumor phylogeny inference and suggests avenues for future advances.
Affiliation(s)
- Ayshwarya Subramanian
- Graduate student at the Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, 15213.
|
32
|
Efficiencies of Inhomogeneity-Detection Algorithms: Comparison of Different Detection Methods and Efficiency Measures. ACTA ACUST UNITED AC 2013. [DOI: 10.1155/2013/390945] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
Abstract
Efficiency evaluations for change point Detection methods used in nine major Objective Homogenization Methods (DOHMs) are presented. The evaluations are conducted using ten different simulated datasets and four efficiency measures: detection skill, skill of linear trend estimation, sum of squared error, and a combined efficiency measure. Test datasets applied have a diverse set of inhomogeneity (IH) characteristics and include one dataset that is similar to the monthly benchmark temperature dataset of the European benchmarking effort known by the acronym COST HOME. The performance of DOHMs is highly dependent on the characteristics of test datasets and efficiency measures. Measures of skills differ markedly according to the frequency and mean duration of inhomogeneities and vary with the ratio of IH-magnitudes and background noise. The study focuses on cases when high quality relative time series (i.e., the difference between a candidate and reference series) can be created, but the frequency and intensity of inhomogeneities are high. Results show that in these cases the Caussinus-Mestre method is the most effective, although appreciably good results can also be achieved by the use of several other DOHMs, such as the Multiple Analysis of Series for Homogenisation, Bayes method, Multiple Linear Regression, and the Standard Normal Homogeneity Test.
|
33
|
Genomic instability: a stronger prognostic marker than proliferation for early stage luminal breast carcinomas. PLoS One 2013; 8:e76496. [PMID: 24143191 PMCID: PMC3797106 DOI: 10.1371/journal.pone.0076496] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 02/25/2013] [Accepted: 08/27/2013] [Indexed: 01/15/2023]
Abstract
BACKGROUND: The accurate prognosis definition to tailor treatment for early luminal invasive breast carcinoma patients remains challenging. MATERIALS AND METHODS: Two hundred fourteen early luminal breast carcinomas were genotyped with single nucleotide polymorphism (SNP) arrays to determine the number of chromosomal breakpoints as a marker of genomic instability. Proliferation was assessed by KI67 (immunohistochemistry) and genomic grade index (transcriptomic analysis). IHC3 (IHC4 score for HER2 negative tumors) was also determined. RESULTS: In the training set (109 cases), the optimal cut-off was 34 breakpoints with a specificity of 0.94 and a sensitivity of 0.57 (area under the curve (AUC): 0.81 [0.71; 0.91]). In the validation set (105 cases), the outcome of patients with > 34 breakpoints (11 events / 22 patients) was poorer (logrank test p < 0.001; relative risk (RR): 3.7 [1.73; 7.92]) than that of patients with < 34 breakpoints (19 events / 83 patients). Whereas genomic grade and KI67 had a significant prognostic value in univariate analysis, in contrast to IHC3, which failed to have a statistically significant prognostic value in this series, the number of breakpoints remained the only significant parameter predictive of outcome in multivariate analysis (RR: 3.47; confidence interval (CI): [1.29; 9.31]; p = 0.014). CONCLUSION: Genomic instability, defined herein as a high number of chromosomal breakpoints, in early stage luminal breast carcinoma is a stronger prognostic marker than proliferation.
|
34
|
Schouten PC, van Dyk E, Braaf LM, Mulder L, Lips EH, de Ronde JJ, Holtman L, Wesseling J, Hauptmann M, Wessels LFA, Linn SC, Nederlof PM. Platform comparisons for identification of breast cancers with a BRCA-like copy number profile. Breast Cancer Res Treat 2013; 139:317-27. [PMID: 23670131 DOI: 10.1007/s10549-013-2558-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 01/11/2013] [Accepted: 04/29/2013] [Indexed: 12/28/2022]
Abstract
Previously, we employed bacterial artificial chromosome (BAC) array comparative genomic hybridization (aCGH) profiles from BRCA1 and -2 mutation carriers and sporadic tumours to construct classifiers that identify tumour samples most likely to harbour BRCA1 and -2 mutations, designated 'BRCA1 and -2-like' tumours, respectively. The classifiers are used in clinical genetics to evaluate unclassified variants, and patients for which no good quality germline DNA is available. Furthermore, we have shown that breast cancer patients with BRCA-like tumour aCGH profiles benefit substantially from platinum-based chemotherapy, potentially due to their inability to repair DNA double strand breaks (DSB), providing a further important clinical application for the classifiers. The BAC array technology has been replaced with oligonucleotide arrays. To continue clinical use of existing classifiers, we mapped oligonucleotide aCGH data to the BAC domain, such that the oligonucleotide profiles can be employed as in the BAC classifier. We demonstrate that segmented profiles derived from oligonucleotide aCGH show high correlation with BAC aCGH profiles. Furthermore, we trained a support vector machine score to objectify aCGH profile quality. Using the mapped oligonucleotide aCGH data, we show equivalence in classification of biologically relevant cases between BAC and oligonucleotide data. Furthermore, the predicted benefit of DSB inducing chemotherapy due to a homologous recombination defect is retained. We conclude that oligonucleotide aCGH data can be mapped to and used in the previously developed and validated BAC aCGH classifiers. Our findings suggest that it is possible to map copy number data from any other technology in a similar way.
Affiliation(s)
- Philip C Schouten
- Division of Molecular Pathology, Netherlands Cancer Institute, Plesmanlaan 121, 1066CX, Amsterdam, The Netherlands
|
35
|
Zhou X, Yang C, Wan X, Zhao H, Yu W. Multisample aCGH data analysis via total variation and spectral regularization. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:230-235. [PMID: 23702561 PMCID: PMC3715577 DOI: 10.1109/tcbb.2012.166] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 06/02/2023]
Abstract
DNA copy number variation (CNV) accounts for a large proportion of genetic variation. One commonly used approach to detecting CNVs is array-based comparative genomic hybridization (aCGH). Although many methods have been proposed to analyze aCGH data, it is not clear how to combine information from multiple samples to improve CNV detection. In this paper, we propose to use a matrix to approximate the multisample aCGH data and minimize the total variation of each sample as well as the nuclear norm of the whole matrix. In this way, we can make use of the smoothness property of each sample and the correlation among multiple samples simultaneously in a convex optimization framework. We also developed an efficient and scalable algorithm to handle large-scale data. Experiments demonstrate that the proposed method outperforms the state-of-the-art techniques under a wide range of scenarios and it is capable of processing large data sets with millions of probes.
Affiliation(s)
- Xiaowei Zhou
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, China.
|
36
|
Abstract
Osteosarcoma, the most frequent primary bone tumor, is a malignant mesenchymal sarcoma with a peak incidence in young children and adolescents. Left untreated, it progresses relentlessly to local and systemic disease, ultimately leading to death within months. Genomically, osteosarcomas are aneuploid with chaotic karyotypes, lacking the pathognomonic genetic rearrangements characteristic of most sarcomas. The familial genetics of osteosarcoma helped in elucidating some of the etiological molecular disruptions, such as the tumor suppressor genes RB1 in retinoblastoma and TP53 in Li-Fraumeni, and RECQL4 involved in DNA repair/replication in Rothmund-Thomson syndrome. Genomic profiling approaches such as array comparative genomic hybridization (aCGH) have provided additional insights concerning the mechanisms responsible for generating complex osteosarcoma genomes. This chapter provides a brief introduction to the clinical features of conventional osteosarcoma, the predominant subtypes, and a general overview of materials and analytical methods of osteosarcoma aCGH, followed by a more detailed literature overview of aCGH studies and a discussion of emerging genes, molecular mechanisms, and their clinical implications, as well as more recent application of integrative genomics in osteosarcoma. aCGH is helping elucidate genomic events leading to tumor development and evolution as well as identification of prognostic markers and therapeutic targets in osteosarcoma.
|
37
|
Scharpf RB, Beaty TH, Schwender H, Younkin SG, Scott AF, Ruczinski I. Fast detection of de novo copy number variants from SNP arrays for case-parent trios. BMC Bioinformatics 2012; 13:330. [PMID: 23234608 PMCID: PMC3576329 DOI: 10.1186/1471-2105-13-330] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 12/15/2011] [Accepted: 12/07/2012] [Indexed: 11/10/2022]
Abstract
Background: In studies of case-parent trios, we define copy number variants (CNVs) in the offspring that differ from the parental copy numbers as de novo and of interest for their potential functional role in disease. Among the leading array-based methods for discovery of de novo CNVs in case-parent trios is the joint hidden Markov model (HMM) implemented in the PennCNV software. However, the computational demands of the joint HMM are substantial and the extent to which false positive identifications occur in case-parent trios has not been well described. We evaluate these issues in a study of oral cleft case-parent trios. Results: Our analysis of the oral cleft trios reveals that genomic waves represent a substantial source of false positive identifications in the joint HMM, despite a wave-correction implementation in PennCNV. In addition, the noise of low-level summaries of relative copy number (log R ratios) is strongly associated with batch and correlated with the frequency of de novo CNV calls. Exploiting the trio design, we propose a univariate statistic for relative copy number referred to as the minimum distance that can reduce technical variation from probe effects and genomic waves. We use circular binary segmentation to segment the minimum distance and maximum a posteriori estimation to infer de novo CNVs from the segmented genome. Compared to PennCNV on simulated data, MinimumDistance identifies fewer false positives on average and is comparable to PennCNV with respect to false negatives. Genomic waves contribute to discordance of PennCNV and MinimumDistance for high coverage de novo calls, while highly concordant calls on chromosome 22 were validated by quantitative PCR. Computationally, MinimumDistance provides a nearly 8-fold increase in speed relative to the joint HMM in a study of oral cleft trios.
Conclusions: Our results indicate that batch effects and genomic waves are important considerations for case-parent studies of de novo CNV, and that the minimum distance is an effective statistic for reducing technical variation contributing to false de novo discoveries. Coupled with segmentation and maximum a posteriori estimation, our algorithm compares favorably to the joint HMM with MinimumDistance being much faster.
Affiliation(s)
- Robert B Scharpf
- Department of Oncology, Johns Hopkins University, Baltimore, MD, USA.
|
38
|
Tarabichi M, Detours V, Konopka T. Piecewise polynomial representations of genomic tracks. PLoS One 2012; 7:e48941. [PMID: 23166601 PMCID: PMC3499510 DOI: 10.1371/journal.pone.0048941] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/13/2012] [Accepted: 10/01/2012] [Indexed: 01/17/2023]
Abstract
Genomic data from micro-array and sequencing projects consist of associations of measured values to chromosomal coordinates. These associations can be thought of as functions in one dimension and can thus be stored, analyzed, and interpreted as piecewise-polynomial curves. We present a general framework for building piecewise polynomial representations of genome-scale signals and illustrate some of its applications via examples. We show that piecewise constant segmentation, a typical step in copy-number analyses, can be carried out within this framework for both array and (DNA) sequencing data offering advantages over existing methods in each case. Higher-order polynomial curves can be used, for example, to detect trends and/or discontinuities in transcription levels from RNA-seq data. We give a concrete application of piecewise linear functions to diagnose and quantify alignment quality at exon borders (splice sites). Our software (source and object code) for building piecewise polynomial models is available at http://sourceforge.net/projects/locsmoc/.
Affiliation(s)
| | - Vincent Detours
- IRIBHM, Université Libre de Bruxelles, Brussels, Belgium
- Welbio, Université Libre de Bruxelles, Brussels, Belgium
| | - Tomasz Konopka
- IRIBHM, Université Libre de Bruxelles, Brussels, Belgium
|
39
|
Nilsen G, Liestøl K, Van Loo P, Moen Vollan HK, Eide MB, Rueda OM, Chin SF, Russell R, Baumbusch LO, Caldas C, Børresen-Dale AL, Lingjaerde OC. Copynumber: Efficient algorithms for single- and multi-track copy number segmentation. BMC Genomics 2012; 13:591. [PMID: 23442169 PMCID: PMC3582591 DOI: 10.1186/1471-2164-13-591] [Citation(s) in RCA: 207] [Impact Index Per Article: 17.3] [Received: 05/16/2012] [Accepted: 10/15/2012] [Indexed: 12/15/2022]
Abstract
Background: Cancer progression is associated with genomic instability and an accumulation of gains and losses of DNA. The growing variety of tools for measuring genomic copy numbers, including various types of array-CGH, SNP arrays and high-throughput sequencing, calls for a coherent framework offering unified and consistent handling of single- and multi-track segmentation problems. In addition, there is a demand for highly computationally efficient segmentation algorithms, due to the emergence of very high density scans of copy number. Results: A comprehensive Bioconductor package for copy number analysis is presented. The package offers a unified framework for single sample, multi-sample and multi-track segmentation and is based on statistically sound penalized least squares principles. Conditional on the number of breakpoints, the estimates are optimal in the least squares sense. A novel and computationally highly efficient algorithm is proposed that utilizes vector-based operations in R. Three case studies are presented. Conclusions: The R package copynumber is a software suite for segmentation of single- and multi-track copy number data using algorithms based on coherent least squares principles.
Affiliation(s)
- Gro Nilsen
- Biomedical Informatics, Dept of Informatics, University of Oslo, Oslo, Norway
|
40
|
Killick R, Fearnhead P, Eckley IA. Optimal Detection of Changepoints With a Linear Computational Cost. J Am Stat Assoc 2012. [DOI: 10.1080/01621459.2012.737745] [Citation(s) in RCA: 460] [Impact Index Per Article: 38.3] [Indexed: 10/27/2022]
|
41
|
Rigaill GJ, Cadot S, Kluin RJ, Xue Z, Bernards R, Majewski IJ, Wessels LF. A regression model for estimating DNA copy number applied to capture sequencing data. Bioinformatics 2012; 28:2357-65. [DOI: 10.1093/bioinformatics/bts448] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
|
42
|
Rottenberg S, Vollebergh MA, de Hoon B, de Ronde J, Schouten PC, Kersbergen A, Zander SAL, Pajic M, Jaspers JE, Jonkers M, Lodén M, Sol W, van der Burg E, Wesseling J, Gillet JP, Gottesman MM, Gribnau J, Wessels L, Linn SC, Jonkers J, Borst P. Impact of intertumoral heterogeneity on predicting chemotherapy response of BRCA1-deficient mammary tumors. Cancer Res 2012; 72:2350-61. [PMID: 22396490 DOI: 10.1158/0008-5472.can-11-4201] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 12/21/2022]
Abstract
The lack of markers to predict chemotherapy responses in patients poses a major handicap in cancer treatment. We searched for gene expression patterns that correlate with docetaxel or cisplatin response in a mouse model for breast cancer associated with BRCA1 deficiency. Array-based expression profiling did not identify a single marker gene predicting docetaxel response, despite an increase in Abcb1 (P-glycoprotein) expression that was sufficient to explain resistance in several poor responders. Intertumoral heterogeneity explained the inability to identify a predictive gene expression signature for docetaxel. To address this problem, we used a novel algorithm designed to detect differential gene expression in a subgroup of the poor responders that could identify tumors with increased Abcb1 transcript levels. In contrast, standard analytical tools, such as significance analysis of microarrays, detected a marker only if it correlated with response in a substantial fraction of tumors. For example, low expression of the Xist gene correlated with cisplatin hypersensitivity in most tumors, and it also predicted long recurrence-free survival of HER2-negative, stage III breast cancer patients treated with intensive platinum-based chemotherapy. Our findings may prove useful for selecting patients with high-risk breast cancer who could benefit from platinum-based therapy.
Affiliation(s)
- Sven Rottenberg
- Division of Molecular Biology, The Netherlands Cancer Institute, Amsterdam, The Netherlands.
|
43
|
Single-cell copy number variation detection. Genome Biol 2011; 12:R80. [PMID: 21854607 PMCID: PMC3245619 DOI: 10.1186/gb-2011-12-8-r80] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 06/06/2011] [Revised: 08/09/2011] [Accepted: 08/19/2011] [Indexed: 12/15/2022]
Abstract
Detection of chromosomal aberrations from a single cell by array comparative genomic hybridization (single-cell array CGH), instead of from a population of cells, is an emerging technique. However, such detection is challenging because of the genome artifacts and the DNA amplification process inherent to the single cell approach. Current normalization algorithms result in inaccurate aberration detection for single-cell data. We propose a normalization method based on channel, genome composition and recurrent genome artifact corrections. We demonstrate that the proposed channel clone normalization significantly improves the copy number variation detection in both simulated and real single-cell array CGH data.
|