1
De Falco A, Caruso F, Su XD, Iavarone A, Ceccarelli M. A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data. Nat Commun 2023; 14:1074. [PMID: 36841879 PMCID: PMC9968345 DOI: 10.1038/s41467-023-36790-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 12/09/2021] [Accepted: 02/16/2023] [Indexed: 02/27/2023] Open
Abstract
Single-cell RNA sequencing is the reference technology to characterize the composition of the tumor microenvironment and to study tumor heterogeneity at high resolution. Here we report Single CEll Variational ANeuploidy analysis (SCEVAN), a fast variational algorithm for the deconvolution of the clonal substructure of tumors from single-cell RNA-seq data. It uses a multichannel segmentation algorithm exploiting the assumption that all the cells in a given copy number clone share the same breakpoints. Thus, the smoothed expression profile of every individual cell constitutes part of the evidence of the copy number profile in each subclone. SCEVAN can automatically and accurately discriminate between malignant and non-malignant cells, resulting in a practical framework to analyze tumors and their microenvironment. We apply SCEVAN to datasets encompassing 106 samples and 93,322 cells from different tumor types and technologies. We demonstrate its application to characterize the intratumor heterogeneity and geographic evolution of malignant brain tumors.
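The shared-breakpoint assumption described in this abstract can be illustrated with a minimal sketch. This is not the SCEVAN variational algorithm; it is a plain least-squares search for a single split, and the function name `best_shared_breakpoint` and the cells-by-genes layout of the input matrix are assumptions made for illustration. It shows only how every cell contributes evidence for one common breakpoint.

```python
import numpy as np

def best_shared_breakpoint(X):
    """Find one split index shared by all rows (cells) of X.

    X is a (cells x genes) matrix of smoothed expression values
    (hypothetical layout). Returns (k, gain), where columns [0:k] and
    [k:] form the two segments and gain is the total reduction in
    squared error over all cells obtained by splitting at k.
    """
    X = np.asarray(X, dtype=float)
    n_cells, n_genes = X.shape
    csum = np.cumsum(X, axis=1)   # running sums per cell
    total = csum[:, -1]
    best_k, best_gain = None, 0.0
    for k in range(1, n_genes):
        left_mean = csum[:, k - 1] / k
        right_mean = (total - csum[:, k - 1]) / (n_genes - k)
        # Per-cell drop in squared error for a split at k, summed over
        # cells, so that every cell votes on the same breakpoint.
        gain = np.sum(k * (n_genes - k) / n_genes * (left_mean - right_mean) ** 2)
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain

# Three toy cells whose level changes at the same position.
X = [
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 2, 2, 2, 2],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
]
print(best_shared_breakpoint(X))
```

Summing the gain over cells is the design choice that matters here: a weak shift in any single cell can still produce a clear joint breakpoint when many cells agree on its position.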
Affiliation(s)
- Antonio De Falco
- Department of Electrical Engineering and Information Technology (DIETI), University of Naples 'Federico II', 80128, Naples, Italy; BIOGEM Institute of Molecular Biology and Genetics, 83031, Ariano Irpino, Italy
- Francesca Caruso
- Department of Electrical Engineering and Information Technology (DIETI), University of Naples 'Federico II', 80128, Naples, Italy; BIOGEM Institute of Molecular Biology and Genetics, 83031, Ariano Irpino, Italy
- Xiao-Dong Su
- Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Peking University, 5 Yiheyuan Road, Haidian District, 100871, Beijing, China
- Antonio Iavarone
- Sylvester Comprehensive Cancer Center, University of Miami, Miller School of Medicine, Miami, FL, USA; Department of Neurological Surgery, University of Miami, Miller School of Medicine, Miami, FL, USA
- Michele Ceccarelli
- Department of Electrical Engineering and Information Technology (DIETI), University of Naples 'Federico II', 80128, Naples, Italy; BIOGEM Institute of Molecular Biology and Genetics, 83031, Ariano Irpino, Italy
2
Balagué-Dobón L, Cáceres A, González JR. Fully exploiting SNP arrays: a systematic review on the tools to extract underlying genomic structure. Brief Bioinform 2022; 23:6535682. [PMID: 35211719 PMCID: PMC8921734 DOI: 10.1093/bib/bbac043] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/19/2021] [Revised: 01/25/2022] [Accepted: 01/28/2022] [Indexed: 12/12/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) are the most abundant type of genomic variation and the most accessible to genotype in large cohorts. However, they individually explain a small proportion of phenotypic differences between individuals. Ancestry, collective SNP effects, structural variants, somatic mutations or even differences in historic recombination can potentially explain a high percentage of genomic divergence. These genetic differences can be infrequent or laborious to characterize; however, many of them leave distinctive marks on the SNPs across the genome allowing their study in large population samples. Consequently, several methods have been developed over the last decade to detect and analyze different genomic structures using SNP arrays, to complement genome-wide association studies and determine the contribution of these structures to explain the phenotypic differences between individuals. We present an up-to-date collection of available bioinformatics tools that can be used to extract relevant genomic information from SNP array data including population structure and ancestry; polygenic risk scores; identity-by-descent fragments; linkage disequilibrium; heritability and structural variants such as inversions, copy number variants, genetic mosaicisms and recombination histories. From a systematic review of recently published applications of the methods, we describe the main characteristics of R packages, command-line tools and desktop applications, both free and commercial, to help make the most of a large amount of publicly available SNP data.
3
Gordeeva V, Sharova E, Arapidi G. Progress in Methods for Copy Number Variation Profiling. Int J Mol Sci 2022; 23:ijms23042143. [PMID: 35216262 PMCID: PMC8879278 DOI: 10.3390/ijms23042143] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/17/2022] [Revised: 02/09/2022] [Accepted: 02/11/2022] [Indexed: 02/04/2023] Open
Abstract
Copy number variations (CNVs) are the predominant class of structural genomic variations involved in the processes of evolutionary adaptation, genomic disorders, and disease progression. Compared with single-nucleotide variants, there have been challenges associated with the detection of CNVs owing to their diverse sizes. However, the field has seen significant progress in the past 20–30 years. This has been made possible due to the rapid development of molecular diagnostic methods which ensure a more detailed view of the genome structure, further complemented by recent advances in computational methods. Here, we review the major approaches that have been used to routinely detect CNVs, ranging from cytogenetics to the latest sequencing technologies, and then cover their specific features.
Affiliation(s)
- Veronika Gordeeva
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia; (E.S.); (G.A.)
- Moscow Institute of Physics and Technology, National Research University, Moscow Oblast, 141701 Moscow, Russia
- Elena Sharova
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia; (E.S.); (G.A.)
- Georgij Arapidi
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia; (E.S.); (G.A.)
- Moscow Institute of Physics and Technology, National Research University, Moscow Oblast, 141701 Moscow, Russia
- Shemyakin–Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
4
Silvestri G, Canedo-Ribeiro C, Serrano-Albal M, Labrecque R, Blondin P, Larmer SG, Marras G, Tutt DA, Handyside AH, Farré M, Sinclair KD, Griffin DK. Preimplantation Genetic Testing for Aneuploidy Improves Live Birth Rates with In Vitro Produced Bovine Embryos: A Blind Retrospective Study. Cells 2021; 10:cells10092284. [PMID: 34571932 PMCID: PMC8465548 DOI: 10.3390/cells10092284] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/09/2021] [Revised: 08/29/2021] [Accepted: 08/30/2021] [Indexed: 12/31/2022] Open
Abstract
Approximately one million in vitro produced (IVP) cattle embryos are transferred worldwide each year as a way to improve the rates of genetic gain. The most advanced programmes also apply genomic selection at the embryonic stage by SNP genotyping and the calculation of genomic estimated breeding values (GEBVs). However, a high proportion of cattle embryos fail to establish a pregnancy. Here, we demonstrate that further interrogation of the SNP data collected for GEBVs can effectively remove aneuploid embryos from the pool, improving live births per embryo transfer (ET). Using three preimplantation genetic testing for aneuploidy (PGT-A) approaches, we assessed 1713 cattle blastocysts in a blind, retrospective analysis. Our findings indicate aneuploid embryos have a 5.8% chance of establishing a pregnancy and a 5.0% chance of giving rise to a live birth. This compares to 59.6% and 46.7% for euploid embryos (p < 0.0001). PGT-A improved overall pregnancy and live birth rates by 7.5% and 5.8%, respectively (p < 0.0001). More detailed analyses revealed donor, chromosome, stage, grade, and sex-specific rates of error. Notably, we discovered a significantly higher incidence of aneuploidy in XY embryos and, as in humans, detected a preponderance of maternal meiosis I errors. Our data strongly support the use of PGT-A in cattle IVP programmes.
Affiliation(s)
- Giuseppe Silvestri
- School of Biosciences, University of Kent, Canterbury CT2 7NH, UK; (G.S.); (C.C.-R.); (M.S.-A.); (A.H.H.); (M.F.)
- Carla Canedo-Ribeiro
- School of Biosciences, University of Kent, Canterbury CT2 7NH, UK; (G.S.); (C.C.-R.); (M.S.-A.); (A.H.H.); (M.F.)
- María Serrano-Albal
- School of Biosciences, University of Kent, Canterbury CT2 7NH, UK; (G.S.); (C.C.-R.); (M.S.-A.); (A.H.H.); (M.F.)
- Remi Labrecque
- L’Alliance Boviteq Inc., Saint-Hyacinthe, QC J2T 5H1, Canada; (R.L.); (P.B.); (S.G.L.); (G.M.)
- Patrick Blondin
- L’Alliance Boviteq Inc., Saint-Hyacinthe, QC J2T 5H1, Canada; (R.L.); (P.B.); (S.G.L.); (G.M.)
- Steven G. Larmer
- L’Alliance Boviteq Inc., Saint-Hyacinthe, QC J2T 5H1, Canada; (R.L.); (P.B.); (S.G.L.); (G.M.)
- Gabriele Marras
- L’Alliance Boviteq Inc., Saint-Hyacinthe, QC J2T 5H1, Canada; (R.L.); (P.B.); (S.G.L.); (G.M.)
- Desmond A.R. Tutt
- School of Biosciences, University of Nottingham, Nottingham LE12 5RD, UK; (D.A.R.T.); (K.D.S.)
- Alan H. Handyside
- School of Biosciences, University of Kent, Canterbury CT2 7NH, UK; (G.S.); (C.C.-R.); (M.S.-A.); (A.H.H.); (M.F.)
- Marta Farré
- School of Biosciences, University of Kent, Canterbury CT2 7NH, UK; (G.S.); (C.C.-R.); (M.S.-A.); (A.H.H.); (M.F.)
- Kevin D. Sinclair
- School of Biosciences, University of Nottingham, Nottingham LE12 5RD, UK; (D.A.R.T.); (K.D.S.)
- Darren K. Griffin
- School of Biosciences, University of Kent, Canterbury CT2 7NH, UK; (G.S.); (C.C.-R.); (M.S.-A.); (A.H.H.); (M.F.)
5
Cho SB. Set-Wise Differential Interaction Between Copy Number Alterations and Gene Expressions of Lower-Grade Glioma Reveals Prognosis-Associated Pathways. Entropy 2020; 22:e22121434. [PMID: 33353229 PMCID: PMC7765960 DOI: 10.3390/e22121434] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/22/2020] [Revised: 11/30/2020] [Accepted: 12/16/2020] [Indexed: 12/22/2022]
Abstract
The integrative analysis of copy number alteration (CNA) and gene expression (GE) is an essential part of cancer research considering the impact of CNAs on cancer progression and prognosis. In this research, an integrative analysis was performed with generalized differentially coexpressed gene sets (gdCoxS), which is a modification of dCoxS. In gdCoxS, set-wise interaction is measured using the correlation of sample-wise distances with Renyi’s relative entropy, which requires an estimation of sample density based on omics profiles. To capture correlations between the variables, multivariate density estimation with covariance was applied. In the simulation study, gdCoxS had greater power than dCoxS, which does not explicitly use the correlations in the density estimation. In the analysis of the lower-grade glioma data of The Cancer Genome Atlas program (TCGA-LGG), gdCoxS identified 577 pairs of pathway CNAs and GEs that showed significant changes of interaction between the survival and non-survival groups, while other benchmark methods detected fewer such pathways. The biological implications of the significant pathways were consistent with previous reports on TCGA-LGG. Taken together, gdCoxS is a useful method for an integrative analysis of CNAs and GEs.
Affiliation(s)
- Seong Beom Cho
- Department of Biomedical Informatics, College of Medicine, Gachon University, Seongnam-Daero 1342, Korea
6
Technologies for Pharmacogenomics: A Review. Genes (Basel) 2020; 11:genes11121456. [PMID: 33291630 PMCID: PMC7761897 DOI: 10.3390/genes11121456] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 11/13/2020] [Revised: 11/30/2020] [Accepted: 12/02/2020] [Indexed: 12/11/2022] Open
Abstract
The continuous development of new genotyping technologies requires awareness of their potential advantages and limitations concerning utility for pharmacogenomics (PGx). In this review, we provide an overview of technologies that can be applied in PGx research and clinical practice. Most commonly used are single nucleotide variant (SNV) panels which contain a pre-selected panel of genetic variants. SNV panels offer a short turnaround time and straightforward interpretation, making them suitable for clinical practice. However, they are limited in their ability to assess rare and structural variants. Next-generation sequencing (NGS) and long-read sequencing are promising technologies for the field of PGx research. Both NGS and long-read sequencing often provide more data and more options with regard to deciphering structural and rare variants compared to SNV panels, in particular with regard to the number of variants that can be identified, as well as the option for haplotype phasing. Nonetheless, while useful for research, not all sequencing data can be applied to clinical practice yet. Ultimately, selecting the right technology is not a matter of fact but a matter of choosing the right technique for the right problem.
7
Analysis of bovine blastocysts indicates ovarian stimulation does not induce chromosome errors, nor discordance between inner-cell mass and trophectoderm lineages. Theriogenology 2020; 161:108-119. [PMID: 33307428 PMCID: PMC7837012 DOI: 10.1016/j.theriogenology.2020.11.021] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/05/2020] [Revised: 11/24/2020] [Accepted: 11/27/2020] [Indexed: 01/08/2023]
Abstract
Contemporary systems for oocyte retrieval and culture of both cattle and human embryos are suboptimal with respect to pregnancy outcomes following transfer. In humans, chromosome abnormalities are the leading cause of early pregnancy loss in assisted reproduction. Consequently, pre-implantation genetic testing for aneuploidy (PGT-A) is widespread and there is considerable interest in its application to identify suitable cattle IVP embryos for transfer. Here we report on the nature and extent of chromosomal abnormalities following transvaginal follicular aspiration (OPU) and IVP in cattle. Nine sexually mature Holstein heifers underwent nine sequential cycles of OPU-IVP (six non-stimulated and three stimulated cycles), generating 459 blastocysts from 783 oocytes. We adopted a SNP-array approach normally employed in genomic evaluations but reanalysed (Turner et al., 2019; Theriogenology 125: 249) to detect levels of meiotic aneuploidy. Specifically, we asked whether ovarian stimulation increased the level of aneuploidy in either trophectoderm (TE) or inner-cell mass (ICM) lineages of blastocysts generated from OPU-IVP cycles. The proportion of Day 8 blastocysts of inseminated was greater (P < 0.001) for stimulated than non-stimulated cycles (0.712 ± 0.0288 vs. 0.466 ± 0.0360), but the overall proportion aneuploidy was similar for both groups (0.241 ± 0.0231). Most abnormalities consisted of meiotic trisomies. Twenty in vivo derived blastocysts recovered from the same donors were all euploid, thus indicating that 24 h of maturation is primarily responsible for aneuploidy induction. Chromosomal errors in OPU-IVP blastocysts decreased (P < 0.001) proportionately as stage/grade improved (from 0.373 for expanded Grade 2 to 0.128 for hatching Grade 1 blastocysts). Importantly, there was a high degree of concordance in the incidence of aneuploidy between TE and ICM lineages. Proportionately, 0.94 were "perfectly concordant" (i.e. identical result in both); 0.01 were imperfectly concordant (differing abnormalities detected); 0.05 were discordant; of which 0.03 detected a potentially lethal TE abnormality (false positives), leaving only 0.02 false negatives. These data support the use of TE biopsies for PGT-A in embryos undergoing genomic evaluation in cattle breeding. Finally, we report chromosome-specific errors and a high degree of variability in the incidence of aneuploidy between donors, suggesting a genetic contribution that merits further investigation.
8
Shin SJ, Wu Y, Hao N. A backward procedure for change-point detection with applications to copy number variation detection. Can J Stat 2020. [DOI: 10.1002/cjs.11535] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Affiliation(s)
- Seung Jun Shin
- Department of Statistics, Korea University, Seoul, South Korea
- Yichao Wu
- Department of Mathematics, Statistics, and Computer Science, The University of Illinois at Chicago, Chicago, IL, U.S.A
- Ning Hao
- Department of Mathematics, The University of Arizona, Tucson, AZ, U.S.A
9
Gu Z, Mullighan CG. ShinyCNV: a Shiny/R application to view and annotate DNA copy number variations. Bioinformatics 2019; 35:126-129. [PMID: 30561549 DOI: 10.1093/bioinformatics/bty546] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/09/2018] [Accepted: 06/28/2018] [Indexed: 11/12/2022] Open
Abstract
Motivation: Single nucleotide polymorphism (SNP) array is the most widely used platform to assess somatic copy number variations (CNVs) in cancer studies. Many SNP data-based CNV callers are available; however, the false positive rates from automated calling are commonly high, and reported breakpoints can be inaccurate. Manual review of each reported CNV by visualizing the SNP data is important, but is challenging for users lacking computational experience. To address this, we present a Shiny/R application, ShinyCNV, an interactive graphical user interface to view and annotate CNVs. Results: With this application, normalized SNP data, which includes log R ratio (LRR) and B allele frequency, can be plotted against the reported CNVs, and users can visually check the reliability of CNVs per se or adjust the incorrectly assigned breakpoints. Further, the interactive LRR spectrum panel within ShinyCNV can facilitate the process to identify commonly affected CNV regions from a group of samples, and to visually check if important focal gains/losses are missing from reported CNVs. ShinyCNV is designed to be intuitive for cancer researchers and can be easily installed for either personal use or deployed on servers to provide online service. Availability and implementation: ShinyCNV and the tutorial are freely available from https://github.com/gzhmat/ShinyCNV. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhaohui Gu
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Charles G Mullighan
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
10
Belhadj S, Quintana I, Mur P, Munoz-Torres PM, Alonso MH, Navarro M, Terradas M, Piñol V, Brunet J, Moreno V, Lázaro C, Capellá G, Valle L. NTHL1 biallelic mutations seldom cause colorectal cancer, serrated polyposis or a multi-tumor phenotype, in absence of colorectal adenomas. Sci Rep 2019; 9:9020. [PMID: 31227763 PMCID: PMC6588610 DOI: 10.1038/s41598-019-45281-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 09/13/2018] [Accepted: 05/30/2019] [Indexed: 12/20/2022] Open
Abstract
The cancer-predisposing syndrome caused by biallelic mutations in NTHL1 may not be a solely colorectal cancer (CRC) and polyposis syndrome but rather a multi-tumor recessive disease. The presence of ≤10 adenomas in several mutation carriers suggests a possible causal role of NTHL1 in hereditary or early-onset nonpolyposis CRC. The involvement of NTHL1 in serrated/hyperplastic polyposis remains unexplored. The aim of our study is to elucidate the role of NTHL1 in the predisposition to personal or familial history of multiple tumor types, familial/early-onset nonpolyposis CRC, and serrated polyposis. NTHL1 mutational screening was performed in 312 cancer patients with personal or family history of multiple tumor types, 488 with hereditary nonpolyposis CRC, and 96 with serrated/hyperplastic polyposis. While no biallelic mutation carriers were identified in patients with personal and/or family history of multiple tumor types or with serrated polyposis, one was identified among the 488 nonpolyposis CRC patients. The carrier of c.268C>T (p.Q90*) and 550-1G>A was diagnosed with CRC and meningioma at ages 37 and 45 respectively, being reclassified as attenuated adenomatous polyposis after the cumulative detection of 26 adenomas. Our findings suggest that biallelic mutations in NTHL1 rarely cause CRC, a personal/familial multi-tumor history, or serrated polyposis, in absence of adenomas.
Affiliation(s)
- Sami Belhadj
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain
- Isabel Quintana
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain
- Pilar Mur
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Pau M Munoz-Torres
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain
- M Henar Alonso
- Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Unit of Biomarkers and Susceptibility, Cancer Prevention and Control Program, Catalan Institute of Oncology, IDIBELL, Barcelona, Spain; Centro de Investigación Biomédica en Red de Epidemiologia y Salud Pública (CIBERESP), Madrid, Spain; Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
- Matilde Navarro
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Mariona Terradas
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain
- Virginia Piñol
- Gastroenterology Unit, Hospital Universitario de Girona Dr Josep Trueta, 17007, Girona, Spain; School of Medicine, University of Girona, 17071, Girona, Spain
- Joan Brunet
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain; Hereditary Cancer Program, Catalan Institute of Oncology, IDIBGi, 17007, Girona, Spain
- Victor Moreno
- Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Unit of Biomarkers and Susceptibility, Cancer Prevention and Control Program, Catalan Institute of Oncology, IDIBELL, Barcelona, Spain; Centro de Investigación Biomédica en Red de Epidemiologia y Salud Pública (CIBERESP), Madrid, Spain; Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
- Conxi Lázaro
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Gabriel Capellá
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Laura Valle
- Hereditary Cancer Program, Catalan Institute of Oncology, IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL, 08908 Hospitalet de Llobregat, Barcelona, Spain; Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
11
Wang X, Lebarbier E, Aubert J, Robin S. Variational Inference for Coupled Hidden Markov Models Applied to the Joint Detection of Copy Number Variations. Int J Biostat 2019; 15:/j/ijb.ahead-of-print/ijb-2018-0023/ijb-2018-0023.xml. [PMID: 30779702 DOI: 10.1515/ijb-2018-0023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/22/2018] [Accepted: 11/21/2018] [Indexed: 02/04/2023]
Abstract
Hidden Markov models provide a natural statistical framework for the detection of copy number variations (CNV) in genomics. In this context, we define a hidden Markov process that underlies all individuals jointly in order to detect and classify genomic regions in different states (typically deletion, normal or amplification). Structural variations from different individuals may be dependent. This is the case in agronomy, where varietal selection programs exist and species share a common phylogenetic past. We propose to take these dependencies into account in the HMM. When dealing with a large number of series, maximum likelihood inference (performed classically using the EM algorithm) becomes intractable. We thus propose an approximate inference algorithm based on a variational approach (VEM), implemented in the CHMM R package. A simulation study is performed to assess the performance of the proposed method and an application to the detection of structural variations in plant genomes is presented.
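As background for the three-state framing in this abstract (deletion, normal, amplification), the following is a minimal sketch of Viterbi decoding for a single series with Gaussian emissions. It is not the coupled model or the variational EM of the CHMM R package; the function name `viterbi_cnv` and the parameters `means`, `sd` and `stay` are hypothetical values chosen for illustration and would normally be estimated from data.

```python
import numpy as np

def viterbi_cnv(signal, means=(-1.0, 0.0, 1.0), sd=0.3, stay=0.98):
    """Most likely state path for one series.

    States: 0 = deletion, 1 = normal, 2 = amplification.
    means, sd and stay are illustrative assumptions, not fitted values.
    """
    signal = np.asarray(signal, dtype=float)
    means = np.asarray(means, dtype=float)
    n, n_states = len(signal), len(means)

    # Transition matrix: stay with high probability, otherwise switch
    # uniformly to one of the other states.
    trans = np.full((n_states, n_states), (1.0 - stay) / (n_states - 1))
    np.fill_diagonal(trans, stay)
    log_trans = np.log(trans)

    # Gaussian log-likelihood up to an additive constant.
    log_emit = -0.5 * ((signal[:, None] - means[None, :]) / sd) ** 2

    score = np.empty((n, n_states))
    back = np.zeros((n, n_states), dtype=int)
    score[0] = np.log(1.0 / n_states) + log_emit[0]
    for t in range(1, n):
        cand = score[t - 1][:, None] + log_trans   # cand[i, j]: state i -> j
        back[t] = np.argmax(cand, axis=0)
        score[t] = cand[back[t], np.arange(n_states)] + log_emit[t]

    # Trace the best path backwards from the best final state.
    path = np.empty(n, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

toy = [0.0] * 5 + [-1.0] * 5 + [0.0] * 5 + [1.0] * 5
print(viterbi_cnv(toy))
```

Exact decoding of this kind is cheap for one series. The abstract's point is that modelling many dependent series jointly makes exact inference intractable, which is what motivates the variational approximation.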
Affiliation(s)
- Xiaoqiang Wang
- School of Mathematics and Statistics, Shandong University (Weihai), Weihai, Shandong, China
- Emilie Lebarbier
- UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France
- Julie Aubert
- UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France
- Stéphane Robin
- UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France
12
Ruan J, Liu Z, Sun M, Wang Y, Yue J, Yu G. DBS: a fast and informative segmentation algorithm for DNA copy number analysis. BMC Bioinformatics 2019; 20:1. [PMID: 30606105 PMCID: PMC6318921 DOI: 10.1186/s12859-018-2565-8] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Received: 07/10/2017] [Accepted: 12/07/2018] [Indexed: 12/02/2022] Open
Abstract
Background: Genome-wide DNA copy number changes are the hallmark events in the initiation and progression of cancers. Quantitative analysis of somatic copy number alterations (CNAs) has broad applications in cancer research. With the increasing capacity of high-throughput sequencing technologies, fast and efficient segmentation algorithms are required when characterizing high-density CNA data. Results: A fast and informative segmentation algorithm, DBS (Deviation Binary Segmentation), is developed and discussed. The DBS method is based on the least absolute error principles and is inspired by the segmentation method rooted in the circular binary segmentation procedure. DBS uses point-by-point model calculation to ensure the accuracy of segmentation and combines a binary search algorithm with heuristics derived from the Central Limit Theorem. The DBS algorithm is very efficient, with a computational complexity of O(n*log n), and is faster than its predecessors. Moreover, DBS measures the change-point amplitude of mean values of two adjacent segments at a breakpoint, where the significant degree of change-point amplitude is determined by the weighted average deviation at breakpoints. Accordingly, using the constructed binary tree of significant degree, DBS informs whether the results of segmentation are over- or under-segmented. Conclusion: DBS is implemented in a platform-independent and open-source Java application (ToolSeg), including a graphical user interface and simulation data generation, as well as various segmentation methods in the native Java language.
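For readers unfamiliar with the family of methods this abstract refers to, a minimal sketch of plain recursive binary segmentation follows. It is not the DBS algorithm: it uses a simple scaled mean-difference statistic rather than the deviation-based criterion described above, and the names `binary_segmentation`, `min_stat` and `min_size` are hypothetical. It only illustrates the idea of splitting at the strongest change point and recursing into both halves.

```python
import numpy as np

def _best_split(x, min_size):
    """Return (k, stat) for the strongest mean shift in x, or (None, 0.0)."""
    n = len(x)
    if n < 2 * min_size:
        return None, 0.0
    k = np.arange(min_size, n - min_size + 1)      # candidate split points
    csum = np.cumsum(x)
    left_mean = csum[k - 1] / k
    right_mean = (csum[-1] - csum[k - 1]) / (n - k)
    # Mean difference scaled by the effective size of the two segments.
    stat = np.abs(left_mean - right_mean) * np.sqrt(k * (n - k) / n)
    i = int(np.argmax(stat))
    return int(k[i]), float(stat[i])

def binary_segmentation(x, min_stat, min_size=3):
    """Recursively split x wherever the split statistic reaches min_stat.

    min_stat is a user-chosen threshold (an assumption of this sketch);
    a real caller would calibrate it against the noise level.
    Returns the sorted list of breakpoint indices.
    """
    x = np.asarray(x, dtype=float)
    breakpoints = []
    stack = [(0, len(x))]
    while stack:
        start, end = stack.pop()
        k, stat = _best_split(x[start:end], min_size)
        if k is None or stat < min_stat:
            continue                                # segment is homogeneous
        breakpoints.append(start + k)
        stack.append((start, start + k))
        stack.append((start + k, end))
    return sorted(breakpoints)

profile = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
print(binary_segmentation(profile, min_stat=1.0))
```

The explicit stack stands in for recursion, and each accepted split corresponds to one node of a binary tree of segments, which is the kind of structure the abstract describes DBS using to judge over- or under-segmentation.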
Affiliation(s)
- Jun Ruan
- School of Information Engineering, Wuhan University of Technology, Wuhan, Hubei, 430070, China
- Zhen Liu
- School of Information Engineering, Wuhan University of Technology, Wuhan, Hubei, 430070, China
- Ming Sun
- School of Information Engineering, Wuhan University of Technology, Wuhan, Hubei, 430070, China
- Yue Wang
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, 22203, USA
- Junqiu Yue
- Department of Pathology, Hubei Cancer Hospital, Wuhan, Hubei, 430079, China
- Guoqiang Yu
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, 22203, USA
13
Pitea A, Kondofersky I, Sass S, Theis FJ, Mueller NS, Unger K. Copy number aberrations from Affymetrix SNP 6.0 genotyping data - how accurate are commonly used prediction approaches? Brief Bioinform 2018; 21:272-281. [PMID: 30351397 DOI: 10.1093/bib/bby096] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/21/2018] [Revised: 08/11/2018] [Accepted: 08/14/2018] [Indexed: 01/08/2023] Open
Abstract
Copy number aberrations (CNAs) are known to strongly affect oncogenes and tumour suppressor genes. Given the critical role CNAs play in cancer research, it is essential to accurately identify CNAs from tumour genomes. One particular challenge in finding CNAs is the effect of confounding variables. To address this issue, we assessed how commonly used CNA identification algorithms perform on SNP 6.0 genotyping data in the presence of confounding variables. We simulated realistic synthetic data with varying levels of three confounding variables: the tumour purity, the length of a copy number region and the CNA burden (the percentage of CNAs present in a profiled genome). We then evaluated the performance of OncoSNP, ASCAT, GenoCNA, GISTIC and CGHcall. Furthermore, we implemented and assessed CGHcall*, an adjusted version of CGHcall accounting for high CNA burden. Our analysis on synthetic data indicates that tumour purity and the CNA burden strongly influence the performance of all the algorithms. No algorithm can correctly find lost and gained genomic regions across all tumour purities. The length of CNA regions influenced the performance of ASCAT, CGHcall and GISTIC, whereas OncoSNP, GenoCNA and CGHcall* showed little sensitivity to it. Overall, CGHcall* and OncoSNP showed reasonable performance, particularly in samples with high tumour purity. Our analysis on the HapMap data revealed a good overlap between CGHcall, CGHcall* and GenoCNA results and experimentally validated data. Our exploratory analysis on the TCGA HNSCC data revealed plausible results of CGHcall, CGHcall* and GISTIC in consensus HNSCC CNA regions. Code is available at https://github.com/adspit/PASCAL.
Affiliation(s)
- Adriana Pitea
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.,Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, Neuherberg, Germany
| | - Ivan Kondofersky
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.,Department of Mathematics, Technical University of Munich, Garching, Germany
| | - Steffen Sass
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.,Department of Mathematics, Technical University of Munich, Garching, Germany
| | - Nikola S Mueller
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
| | - Kristian Unger
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, Neuherberg, Germany.,Clinical Cooperation Group Personalized Radiotherapy in Head and Neck Cancer, Helmholtz Zentrum München, Neuherberg, Germany
| |
|
14
|
Johnson J, Bessette DC, Saunus JM, Smart CE, Song S, Johnston RL, Cocciardi S, Rozali EN, Johnstone CN, Vargas AC, Kazakoff SH, BioBank VC, Khanna KK, Lakhani SR, Chenevix-Trench G, Simpson PT, Nones K, Waddell N, Al-Ejeh F. Characterization of a novel breast cancer cell line derived from a metastatic bone lesion of a breast cancer patient. Breast Cancer Res Treat 2018; 170:179-188. [DOI: 10.1007/s10549-018-4719-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 09/14/2017] [Accepted: 02/15/2018] [Indexed: 02/03/2023]
|
15
|
Wei Z, Shu C, Zhang C, Huang J, Cai H. A short review of variants calling for single-cell-sequencing data with applications. Int J Biochem Cell Biol 2017; 92:218-226. [PMID: 28951246 DOI: 10.1016/j.biocel.2017.09.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/01/2017] [Revised: 09/19/2017] [Accepted: 09/23/2017] [Indexed: 11/16/2022]
Abstract
The field of single-cell sequencing is rapidly expanding, and many techniques have been developed in the past decade. With this technology, biologists can study not only the heterogeneity between two adjacent cells in the same tissue or organ, but also the evolutionary relationships and degenerative processes in a single cell. Calling variants is the main purpose in analyzing single-cell sequencing (SCS) data. Currently, some popular methods used for bulk-cell-sequencing data analysis are applied directly to SCS data. However, SCS requires an extra step of genome amplification to accumulate enough material for sequencing. The amplification introduces large biases and thus poses challenges for bulk-cell-sequencing methods. This paper aims to bridge the gap by providing guidance both for developing specialized analysis methods and for using currently available tools for SCS. We first introduce two popular genome amplification methods and compare their capabilities. Then we introduce a few popular models for calling single-nucleotide polymorphisms and copy-number variations. Finally, breakthrough applications of SCS are summarized to demonstrate its potential in researching cell evolution.
Affiliation(s)
- Zhuohui Wei
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
| | - Chang Shu
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
| | - Changsheng Zhang
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
| | - Jingying Huang
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China
| | - Hongmin Cai
- School of Computer Science & Engineering, South China University of Technology, Guangzhou, China.
| |
|
16
|
Silva GO, Siegel MB, Mose LE, Parker JS, Sun W, Perou CM, Chen M. SynthEx: a synthetic-normal-based DNA sequencing tool for copy number alteration detection and tumor heterogeneity profiling. Genome Biol 2017; 18:66. [PMID: 28390427 PMCID: PMC5385048 DOI: 10.1186/s13059-017-1193-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 11/15/2016] [Accepted: 03/16/2017] [Indexed: 01/22/2023]
Abstract
Changes in the quantity of genetic material, known as somatic copy number alterations (CNAs), can drive tumorigenesis. Many methods exist for assessing CNAs using microarrays, but considerable technical issues limit current CNA calling based upon DNA sequencing. We present SynthEx, a novel tool for detecting CNAs from whole exome and genome sequencing. SynthEx utilizes a “synthetic-normal” strategy to overcome technical and financial issues. In terms of accuracy and precision, SynthEx is highly comparable to array-based methods and outperforms sequencing-based CNA detection tools. SynthEx robustly identifies CNAs using sequencing data without the additional costs associated with matched normal specimens.
Affiliation(s)
- Grace O Silva
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC, 27599, USA.,Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Marni B Siegel
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Lisle E Mose
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Joel S Parker
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Wei Sun
- Public Health Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
| | - Charles M Perou
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC, 27599, USA.,Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Mengjie Chen
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, 900 East 57th Street, KCBD 3220A, Chicago, IL, 60637, USA.
| |
|
17
|
Whole genome sequencing discriminates hepatocellular carcinoma with intrahepatic metastasis from multi-centric tumors. J Hepatol 2017; 66:363-373. [PMID: 27742377 DOI: 10.1016/j.jhep.2016.09.021] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Received: 04/20/2016] [Revised: 09/01/2016] [Accepted: 09/21/2016] [Indexed: 12/12/2022]
Abstract
BACKGROUND & AIMS: Patients with hepatocellular carcinoma (HCC) have a high risk of multi-centric (MC) tumor occurrence due to a strong carcinogenic background in the liver. In addition, they have a high risk of intrahepatic metastasis (IM). Liver tumors with IM or MC are profoundly different in their development and clinical outcome. However, clinically or pathologically discriminating between IM and MC can be challenging. This study investigated whether IM or MC could be diagnosed at the molecular level. METHODS: We performed whole genome and RNA sequencing analyses of 49 tumors including two extra-hepatic metastases, and one nodule-in-nodule tumor from 23 HCC patients. RESULTS: Sequencing-based molecular diagnosis using somatic single nucleotide variation information showed higher sensitivity compared to previous techniques due to the inclusion of a larger number of mutation events. This proved useful in cases that showed inconsistent clinical diagnoses. In addition, whole genome sequencing offered advantages in profiling of other genetic alterations, such as structural variations, copy number alterations, and variant allele frequencies, and helped to confirm the IM/MC diagnosis. Divergent alterations between IM tumors with sorafenib treatment, long time-intervals, or tumor-in-tumor nodules indicated high intra-tumor heterogeneity, evolution, and clonal switching of liver cancers. CONCLUSIONS: It is important to analyze the differences between IM tumors, in addition to IM/MC diagnosis, before selecting a therapeutic strategy for multiple tumors in the liver. LAY SUMMARY: Whole genome sequencing of multiple liver tumors enabled the accurate diagnosis of multi-centric occurrence and intrahepatic metastasis using somatic single nucleotide variation information. In addition, genetic discrepancies between tumors help us to understand the physical changes during recurrence and cancer spread.
|
18
|
Shen R, Seshan VE. FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res 2016; 44:e131. [PMID: 27270079 PMCID: PMC5027494 DOI: 10.1093/nar/gkw520] [Citation(s) in RCA: 729] [Impact Index Per Article: 91.1] [Received: 02/18/2016] [Accepted: 05/29/2016] [Indexed: 12/02/2022]
Abstract
Allele-specific copy number analysis (ASCN) from next generation sequencing (NGS) data can greatly extend the utility of NGS beyond the identification of mutations to precisely annotate the genome for the detection of homozygous/heterozygous deletions, copy-neutral loss-of-heterozygosity (LOH), and allele-specific gains/amplifications. In addition, as targeted gene panels are increasingly used in clinical sequencing studies for the detection of ‘actionable’ mutations and copy number alterations to guide treatment decisions, accurate integer copy number calls adjusted for tumor purity, ploidy and clonal heterogeneity are greatly needed to more reliably interpret NGS-based cancer gene copy number data in the context of clinical sequencing. We developed FACETS, an ASCN tool and open-source software with a broad application to whole genome, whole-exome, as well as targeted panel sequencing platforms. It is a fully integrated stand-alone pipeline that includes sequencing BAM file post-processing, joint segmentation of total- and allele-specific read counts, and integer copy number calls corrected for tumor purity, ploidy and clonal heterogeneity, with comprehensive output and integrated visualization. We demonstrate the application of FACETS using The Cancer Genome Atlas (TCGA) whole-exome sequencing of lung adenocarcinoma samples. We also demonstrate its application to a clinical sequencing platform based on a targeted gene panel.
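The role of purity and ploidy correction can be illustrated with a short Python sketch. It uses a commonly used two-component mixture model (tumor cells plus diploid normal cells) and is not necessarily the exact formulation implemented in FACETS; the function names and the `normal_cn` parameter are assumptions made for this example.

```python
import math


def expected_log_ratio(tumor_cn, purity, ploidy, normal_cn=2):
    """Expected log2 ratio of a segment in an impure tumor sample.

    Assumes the sample is a mixture of tumor cells (fraction `purity`,
    average copy number `ploidy`) and normal cells with `normal_cn` copies.
    Undefined when purity == 1 and tumor_cn == 0 (log of zero).
    """
    segment_signal = purity * tumor_cn + (1 - purity) * normal_cn
    average_signal = purity * ploidy + (1 - purity) * normal_cn
    return math.log2(segment_signal / average_signal)


def integer_copy_number(log_ratio, purity, ploidy, normal_cn=2):
    """Invert the mixture model and round to the nearest integer copy number."""
    average_signal = purity * ploidy + (1 - purity) * normal_cn
    raw = (2 ** log_ratio * average_signal - (1 - purity) * normal_cn) / purity
    return max(0, round(raw))
```

Under this model a single-copy loss in a 100% pure diploid tumor gives a log2 ratio of -1, whereas the same loss at 50% purity gives only about -0.415, which is why uncorrected log ratios understate copy number changes in impure samples.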
Affiliation(s)
- Ronglai Shen
- Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
| | - Venkatraman E Seshan
- Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
| |
|
19
|
Richardson S, Tseng GC, Sun W. Statistical Methods in Integrative Genomics. ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION 2016; 3:181-209. [PMID: 27482531 PMCID: PMC4963036 DOI: 10.1146/annurev-statistics-041715-033506] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Indexed: 05/28/2023]
Abstract
Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions.
Affiliation(s)
- Sylvia Richardson
- MRC Biostatistics Unit, Cambridge Institute of Public Health, University of Cambridge, CB2 0SR, United Kingdom
| | - George C. Tseng
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261
| | - Wei Sun
- Department of Biostatistics, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 27516
| |
|
20
|
Danecek P, McCarthy SA, Durbin R. A Method for Checking Genomic Integrity in Cultured Cell Lines from SNP Genotyping Data. PLoS One 2016; 11:e0155014. [PMID: 27176002 PMCID: PMC4866717 DOI: 10.1371/journal.pone.0155014] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 11/27/2015] [Accepted: 04/22/2016] [Indexed: 12/16/2022]
Abstract
Genomic screening for chromosomal abnormalities is an important part of quality control when establishing and maintaining stem cell lines. We present a new method for sensitive detection of copy number alterations, aneuploidy, and contamination in cell lines using genome-wide SNP genotyping data. In contrast to other methods designed for identifying copy number variations in a single sample or in a sample composed of a mixture of normal and tumor cells, this new method is tailored for determining differences between cell lines and the starting material from which they were derived, which allows us to distinguish between normal and novel copy number variation. We implemented the method in the freely available BCFtools package and present results based on induced pluripotent stem cell lines obtained in the HipSci project.
Affiliation(s)
- Petr Danecek
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, United Kingdom
| | - Shane A. McCarthy
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, United Kingdom
| | | | - Richard Durbin
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, United Kingdom
| |
|
21
|
Abstract
Pancreatic cancer is the fourth leading cause of cancer death in our society, with a mortality that virtually parallels its incidence, a median survival of <12 months even with maximal therapy, and a 5-year survival rate of <5%. The diversity of clinical outcomes and the molecular heterogeneity of histopathologically similar cancer types, incomplete knowledge of the genomic aberrations that drive carcinogenesis and the lack of therapeutics that specifically target most known genomic aberrations necessitate large-scale detailed analysis of cancer genomes to identify novel potential therapeutic strategies. As part of the International Cancer Genome Consortium (ICGC), the Australian Pancreatic Cancer Genome Initiative (APGI) used exomic sequencing and copy number analysis to define genomic aberrations that characterize a large, clinically focused, prospectively accrued cohort of patients with pancreatic cancer. The cohort consisted of early (clinical stages I and II) non-pre-treated patients with pancreatic ductal adenocarcinoma who underwent operative resection with curative intent. We devised approaches to adjust for low epithelial content in primary tumours and to define the genomic landscape of pancreatic cancer to identify novel candidate driver genes and mechanisms. We aim to develop stratified, molecular phenotype-guided therapeutic strategies using existing therapeutics that are either rescued, repurposed, in development, or are known to be effective in an undefined subgroup of pancreatic cancer patients. These are then tested in primary patient-derived xenografts and cell lines from the above deeply characterized cohort. In addition, we return information to treating clinicians that influences patient care and are launching a clinical trial called IMPaCT (Individualized Molecular Pancreatic Cancer Therapy). This umbrella design trial randomizes patients with metastatic disease to either standard first-line therapy with gemcitabine, or a molecular phenotype-guided approach using next-generation sequencing strategies to screen for actionable mutations defined through the ICGC effort.
|
22
|
Abstract
Genotyping microarrays are an important and widely-used tool in genetics. I present argyle, an R package for analysis of genotyping array data tailored to Illumina arrays. The goal of the argyle package is to provide simple, expressive tools for nonexpert users to perform quality checks and exploratory analyses of genotyping data. To these ends, the package consists of a suite of quality-control functions, normalization procedures, and utilities for visually and statistically summarizing such data. Format-conversion tools allow interoperability with popular software packages for analysis of genetic data including PLINK, R/qtl and DOQTL. Detailed vignettes demonstrating common use cases are included as supporting information. argyle bridges the gap between the low-level tasks of quality control and high-level tasks of genetic analysis. It is freely available at https://github.com/andrewparkermorgan/argyle and has been submitted to Bioconductor.
|
23
|
Walker LC, Wiggins GAR, Pearson JF. The Role of Constitutional Copy Number Variants in Breast Cancer. Microarrays (Basel) 2015; 4:407-23. [PMID: 27600231 PMCID: PMC4996380 DOI: 10.3390/microarrays4030407] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/30/2015] [Revised: 08/26/2015] [Accepted: 09/01/2015] [Indexed: 01/16/2023]
Abstract
Constitutional copy number variants (CNVs) include inherited and de novo deviations from a diploid state at a defined genomic region. These variants contribute significantly to genetic variation and disease in humans, including breast cancer susceptibility. Identification of genetic risk factors for breast cancer in recent years has been dominated by the use of genome-wide technologies, such as single nucleotide polymorphism (SNP)-arrays, with a significant focus on single nucleotide variants. To date, these large datasets have been underutilised for generating genome-wide CNV profiles despite offering a massive resource for assessing the contribution of these structural variants to breast cancer risk. Technical challenges remain in determining the location and distribution of CNVs across the human genome due to the accuracy of computational prediction algorithms and resolution of the array data. Moreover, better methods are required for interpreting the functional effect of newly discovered CNVs. In this review, we explore current and future application of SNP array technology to assess rare and common CNVs in association with breast cancer risk in humans.
Affiliation(s)
- Logan C Walker
- Mackenzie Cancer Research Group, Department of Pathology, University of Otago, Christchurch 8140, New Zealand.
| | - George A R Wiggins
- Mackenzie Cancer Research Group, Department of Pathology, University of Otago, Christchurch 8140, New Zealand.
| | - John F Pearson
- Biostatistics and Computational Biology Unit, University of Otago, Christchurch 8140, New Zealand.
| |
|
24
|
Chien J, Sicotte H, Fan JB, Humphray S, Cunningham JM, Kalli KR, Oberg AL, Hart SN, Li Y, Davila JI, Baheti S, Wang C, Dietmann S, Atkinson EJ, Asmann YW, Bell DA, Ota T, Tarabishy Y, Kuang R, Bibikova M, Cheetham RK, Grocock RJ, Swisher EM, Peden J, Bentley D, Kocher JPA, Kaufmann SH, Hartmann LC, Shridhar V, Goode EL. TP53 mutations, tetraploidy and homologous recombination repair defects in early stage high-grade serous ovarian cancer. Nucleic Acids Res 2015; 43:6945-58. [PMID: 25916844 PMCID: PMC4538798 DOI: 10.1093/nar/gkv111] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 11/04/2014] [Revised: 01/23/2015] [Accepted: 02/02/2015] [Indexed: 12/30/2022]
Abstract
To determine early somatic changes in high-grade serous ovarian cancer (HGSOC), we performed whole genome sequencing on a rare collection of 16 low stage HGSOCs. The majority showed extensive structural alterations (one had an ultramutated profile), exhibited high levels of p53 immunoreactivity, and harboured a TP53 mutation, deletion or inactivation. BRCA1 and BRCA2 mutations were observed in two tumors, with nine showing evidence of a homologous recombination (HR) defect. Combined analysis with The Cancer Genome Atlas (TCGA) indicated that low and late stage HGSOCs have similar mutation and copy number profiles. We also found evidence that deleterious TP53 mutations are the earliest events, followed by deletions or loss of heterozygosity (LOH) of chromosomes carrying TP53, BRCA1 or BRCA2. Inactivation of HR appears to be an early event, as 62.5% of tumours showed a LOH pattern suggestive of HR defects. Three tumours with the highest ploidy had little genome-wide LOH, yet one of these had a homozygous somatic frame-shift BRCA2 mutation, suggesting that some carcinomas begin as tetraploid then descend into diploidy accompanied by genome-wide LOH. Lastly, we found evidence that structural variants (SV) cluster in HGSOC, but are absent in one ultramutated tumor, providing insights into the pathogenesis of low stage HGSOC.
Affiliation(s)
- Jeremy Chien
- Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS 66160, USA
| | - Hugues Sicotte
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | | | - Sean Humphray
- Illumina Cambridge Ltd, Little Chesterford, Essex CB10 1, UK
| | - Julie M Cunningham
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
| | | | - Ann L Oberg
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Steven N Hart
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Ying Li
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Jaime I Davila
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Saurabh Baheti
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Chen Wang
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| | - Sabine Dietmann
- Wellcome Trust, Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge CB2 1TN, UK
| | | | - Yan W Asmann
- Department of Health Sciences Research, Mayo Clinic, Jacksonville, FL 32224, USA
| | - Debra A Bell
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
| | - Takayo Ota
- Department of Internal Medicine, Rinku General Medical Center, Izumi-sano, 598-8577, Japan
| | - Yaman Tarabishy
- Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Rui Kuang
- Department of Biomedical Informatics and Computational Biology, University of Minnesota, Minneapolis, MN 55414, USA
| | | | | | | | - Elizabeth M Swisher
- Department of Obstetrics and Gynecology, University of Washington, Seattle, WA 98109, USA
| | - John Peden
- Illumina Cambridge Ltd, Little Chesterford, Essex CB10 1, UK
| | - David Bentley
- Illumina Cambridge Ltd, Little Chesterford, Essex CB10 1, UK
| | | | | | - Lynn C Hartmann
- Department of Oncology, Mayo Clinic, Rochester, MN 55905, USA
| | - Viji Shridhar
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
| | - Ellen L Goode
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
| |
|
25
|
Xia H, Liu Y, Wang M, Li A. Identification of Genomic Aberrations in Cancer Subclones from Heterogeneous Tumor Samples. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:679-685. [PMID: 26357278 DOI: 10.1109/tcbb.2014.2366114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Tumor samples are usually heterogeneous, containing an admixture of more than one kind of tumor subclone. Studies of genomic aberrations from heterogeneous tumor data are hindered by the mixed signal of tumor subclone cells. Most of the existing algorithms cannot distinguish contributions of different subclones from the measured single nucleotide polymorphism (SNP) array signals, which may cause erroneous estimation of genomic aberrations. Here, we introduce a computational method, Cancer Heterogeneity Analysis from SNP-array Experiments (CHASE), to automatically detect subclone proportions and genomic aberrations from heterogeneous tumor samples. Our method is based on a hidden Markov model (HMM) and incorporates the EM algorithm to build a statistical model of the mixed signal of multiple tumor subclones. We tested the proposed approach on simulated datasets and two real datasets, and the results show that the proposed method can efficiently estimate tumor subclone proportions and recover the genomic aberrations.
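As a rough illustration of how an HMM assigns copy number states along the genome, here is a minimal Viterbi decoder in Python. It is a generic sketch, not the CHASE model: it ignores subclone mixing and EM parameter estimation, and the state means, `sigma`, and `stay_prob` values are illustrative assumptions.

```python
import math

# Illustrative log2-ratio means for each hidden state (assumed values)
STATE_MEANS = {"loss": -1.0, "neutral": 0.0, "gain": 0.58}


def log_emit(state, x, sigma=0.2):
    """Gaussian log-likelihood of observation x, up to an additive constant."""
    return -((x - STATE_MEANS[state]) ** 2) / (2 * sigma ** 2)


def viterbi(observations, stay_prob=0.98):
    """Most likely state path for a non-empty list of log2 ratios."""
    states = list(STATE_MEANS)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1 - stay_prob) / (len(states) - 1))
    # Uniform start distribution: only the emission term matters initially
    scores = {s: log_emit(s, observations[0]) for s in states}
    back = []
    for x in observations[1:]:
        prev, scores, pointers = scores, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + (log_stay if p == s else log_switch))
            step = log_stay if best == s else log_switch
            scores[s] = prev[best] + step + log_emit(s, x)
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

The high `stay_prob` penalizes state changes, so isolated noisy probes do not create spurious segments while sustained shifts in the signal still do.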
|
26
|
Wang W, Wang W, Sun W, Crowley JJ, Szatkiewicz JP. Allele-specific copy-number discovery from whole-genome and whole-exome sequencing. Nucleic Acids Res 2015; 43:e90. [PMID: 25883151 PMCID: PMC4538801 DOI: 10.1093/nar/gkv319] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 10/07/2014] [Accepted: 03/27/2015] [Indexed: 11/14/2022]
Abstract
Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol are freely available at https://sourceforge.net/projects/asgenseng/.
Affiliation(s)
- WeiBo Wang
- Department of Computer Science, University of North Carolina at Chapel Hill, NC 27599-3175, USA
| | - Wei Wang
- Department of Computer Science, University of California, Los Angeles, CA 90095, USA
| | - Wei Sun
- Department of Biostatistics, University of North Carolina at Chapel Hill, NC 27599-7400, USA
| | - James J Crowley
- Department of Genetics, University of North Carolina at Chapel Hill, NC 27599-7264, USA
| | - Jin P Szatkiewicz
- Department of Genetics, University of North Carolina at Chapel Hill, NC 27599-7264, USA
| |
|
27
|
Waddell N, Pajic M, Patch AM, Chang DK, Kassahn KS, Bailey P, Johns AL, Miller D, Nones K, Quek K, Quinn MCJ, Robertson AJ, Fadlullah MZH, Bruxner TJC, Christ AN, Harliwong I, Idrisoglu S, Manning S, Nourse C, Nourbakhsh E, Wani S, Wilson PJ, Markham E, Cloonan N, Anderson MJ, Fink JL, Holmes O, Kazakoff SH, Leonard C, Newell F, Poudel B, Song S, Taylor D, Waddell N, Wood S, Xu Q, Wu J, Pinese M, Cowley MJ, Lee HC, Jones MD, Nagrial AM, Humphris J, Chantrill LA, Chin V, Steinmann AM, Mawson A, Humphrey ES, Colvin EK, Chou A, Scarlett CJ, Pinho AV, Giry-Laterriere M, Rooman I, Samra JS, Kench JG, Pettitt JA, Merrett ND, Toon C, Epari K, Nguyen NQ, Barbour A, Zeps N, Jamieson NB, Graham JS, Niclou SP, Bjerkvig R, Grützmann R, Aust D, Hruban RH, Maitra A, Iacobuzio-Donahue CA, Wolfgang CL, Morgan RA, Lawlor RT, Corbo V, Bassi C, Falconi M, Zamboni G, Tortora G, Tempero MA, Gill AJ, Eshleman JR, Pilarsky C, Scarpa A, Musgrove EA, Pearson JV, Biankin AV, Grimmond SM. Whole genomes redefine the mutational landscape of pancreatic cancer. Nature 2015; 518:495-501. [PMID: 25719666 PMCID: PMC4523082 DOI: 10.1038/nature14169] [Citation(s) in RCA: 1792] [Impact Index Per Article: 199.1] [Received: 05/24/2014] [Accepted: 12/18/2014] [Indexed: 12/13/2022]
Abstract
Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded.
Affiliation(s)
- Nicola Waddell
- [1] Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia [2] QIMR Berghofer Medical Research Institute, Herston Road, Brisbane 4006, Australia
| | - Marina Pajic
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, New South Wales 2010, Australia
| | - Ann-Marie Patch
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - David K Chang
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] Department of Surgery, Bankstown Hospital, Eldridge Road, Bankstown, Sydney, New South Wales 2200, Australia [3] South Western Sydney Clinical School, Faculty of Medicine, University of New South Wales, Liverpool, New South Wales 2170, Australia [4] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK
| | - Karin S Kassahn
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Peter Bailey
- [1] Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia [2] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK
| | - Amber L Johns
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - David Miller
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Katia Nones
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Kelly Quek
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Michael C J Quinn
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Alan J Robertson
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Muhammad Z H Fadlullah
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Tim J C Bruxner
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Angelika N Christ
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Ivon Harliwong
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Senel Idrisoglu
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Suzanne Manning
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Craig Nourse
- [1] Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia [2] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK
| | - Ehsan Nourbakhsh
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Shivangi Wani
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Peter J Wilson
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Emma Markham
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Nicole Cloonan
- [1] Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia [2] QIMR Berghofer Medical Research Institute, Herston Road, Brisbane 4006, Australia
| | - Matthew J Anderson
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - J Lynn Fink
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Oliver Holmes
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Stephen H Kazakoff
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Conrad Leonard
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Felicity Newell
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Barsha Poudel
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Sarah Song
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Darrin Taylor
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Nick Waddell
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Scott Wood
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Qinying Xu
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia
| | - Jianmin Wu
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Mark Pinese
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Mark J Cowley
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Hong C Lee
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Marc D Jones
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK
| | - Adnan M Nagrial
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Jeremy Humphris
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Lorraine A Chantrill
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Venessa Chin
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Angela M Steinmann
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Amanda Mawson
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Emily S Humphrey
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Emily K Colvin
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Angela Chou
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] Department of Anatomical Pathology, St Vincent's Hospital, Sydney, New South Wales 2010, Australia
| | - Christopher J Scarlett
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] School of Environmental & Life Sciences, University of Newcastle, Ourimbah, New South Wales 2258, Australia
| | - Andreia V Pinho
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Marc Giry-Laterriere
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Ilse Rooman
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Jaswinder S Samra
- [1] Department of Surgery, Royal North Shore Hospital, St Leonards, Sydney, New South Wales 2065, Australia [2] University of Sydney, Sydney, New South Wales 2006, Australia
| | - James G Kench
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] University of Sydney, Sydney, New South Wales 2006, Australia [3] Tissue Pathology and Diagnostic Oncology, Royal Prince Alfred Hospital, Camperdown, New South Wales 2050, Australia
| | - Jessica A Pettitt
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Neil D Merrett
- [1] Department of Surgery, Bankstown Hospital, Eldridge Road, Bankstown, Sydney, New South Wales 2200, Australia [2] School of Medicine, University of Western Sydney, Penrith, New South Wales 2175, Australia
| | - Christopher Toon
- The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia
| | - Krishna Epari
- Department of Surgery, Fremantle Hospital, Alma Street, Fremantle, Western Australia 6160, Australia
| | - Nam Q Nguyen
- Department of Gastroenterology, Royal Adelaide Hospital, North Terrace, Adelaide, South Australia 5000, Australia
| | - Andrew Barbour
- Department of Surgery, Princess Alexandra Hospital, Ipswich Rd, Woolloongabba, Queensland 4102, Australia
| | - Nikolajs Zeps
- [1] School of Surgery M507, University of Western Australia, 35 Stirling Highway, Nedlands 6009, Australia [2] St John of God Pathology, 12 Salvado Rd, Subiaco, Western Australia 6008, Australia [3] Bendat Family Comprehensive Cancer Centre, St John of God Subiaco Hospital, Subiaco, Western Australia 6008, Australia
| | - Nigel B Jamieson
- [1] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK [2] Academic Unit of Surgery, School of Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow Royal Infirmary, Glasgow G4 0SF, UK [3] West of Scotland Pancreatic Unit, Glasgow Royal Infirmary, Glasgow G31 2ER, UK
| | - Janet S Graham
- [1] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK [2] Department of Medical Oncology, Beatson West of Scotland Cancer Centre, 1053 Great Western Road, Glasgow G12 0YN, UK
| | - Simone P Niclou
- Norlux Neuro-Oncology Laboratory, CRP-Santé Luxembourg, 84 Val Fleuri, L-1526, Luxembourg
| | - Rolf Bjerkvig
- Norlux Neuro-Oncology, Department of Biomedicine, University of Bergen, Jonas Lies vei 91, N-5019 Bergen, Norway
| | - Robert Grützmann
- Departments of Surgery and Pathology, TU Dresden, Fetscherstr. 74, 01307 Dresden, Germany
| | - Daniela Aust
- Departments of Surgery and Pathology, TU Dresden, Fetscherstr. 74, 01307 Dresden, Germany
| | - Ralph H Hruban
- Department of Pathology, The Sol Goldman Pancreatic Cancer Research Center, the Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA
| | - Anirban Maitra
- Departments of Pathology and Translational Molecular Pathology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Christine A Iacobuzio-Donahue
- The David M. Rubenstein Pancreatic Cancer Research Center and Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
| | - Christopher L Wolfgang
- Department of Surgery, The Sol Goldman Pancreatic Cancer Research Center, the Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA
| | - Richard A Morgan
- Department of Pathology, The Sol Goldman Pancreatic Cancer Research Center, the Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA
| | - Rita T Lawlor
- [1] ARC-NET Centre for Applied Research on Cancer, University and Hospital Trust of Verona, Verona 37134, Italy [2] Department of Pathology and Diagnostics, University of Verona, Verona 37134, Italy
| | - Vincenzo Corbo
- ARC-NET Centre for Applied Research on Cancer, University and Hospital Trust of Verona, Verona 37134, Italy
| | - Claudio Bassi
- Department of Surgery and Oncology, Pancreas Institute, University and Hospital Trust of Verona, Verona 37134, Italy
| | - Massimo Falconi
- [1] Department of Surgery and Oncology, Pancreas Institute, University and Hospital Trust of Verona, Verona 37134, Italy [2] Departments of Surgery and Pathology, Ospedale Sacro Cuore Don Calabria Negrar, Verona 37024, Italy
| | - Giuseppe Zamboni
- [1] Department of Pathology and Diagnostics, University of Verona, Verona 37134, Italy [2] Departments of Surgery and Pathology, Ospedale Sacro Cuore Don Calabria Negrar, Verona 37024, Italy
| | - Giampaolo Tortora
- Department of Oncology, University and Hospital Trust of Verona, Verona 37134, Italy
| | - Margaret A Tempero
- Division of Hematology and Oncology, University of California, San Francisco, California 94122, USA
| | - Anthony J Gill
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] University of Sydney, Sydney, New South Wales 2006, Australia
| | - James R Eshleman
- Department of Pathology, The Sol Goldman Pancreatic Cancer Research Center, the Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA
| | - Christian Pilarsky
- Departments of Surgery and Pathology, TU Dresden, Fetscherstr. 74, 01307 Dresden, Germany
| | - Aldo Scarpa
- [1] ARC-NET Centre for Applied Research on Cancer, University and Hospital Trust of Verona, Verona 37134, Italy [2] Department of Pathology and Diagnostics, University of Verona, Verona 37134, Italy
| | - Elizabeth A Musgrove
- Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK
| | - John V Pearson
- [1] Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia [2] QIMR Berghofer Medical Research Institute, Herston Road, Brisbane 4006, Australia
| | - Andrew V Biankin
- [1] The Kinghorn Cancer Centre, Cancer Division, Garvan Institute of Medical Research, University of New South Wales, 384 Victoria St, Darlinghurst, Sydney, New South Wales 2010, Australia [2] Department of Surgery, Bankstown Hospital, Eldridge Road, Bankstown, Sydney, New South Wales 2200, Australia [3] South Western Sydney Clinical School, Faculty of Medicine, University of New South Wales, Liverpool, New South Wales 2170, Australia [4] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK
| | - Sean M Grimmond
- [1] Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Brisbane, Queensland 4072, Australia [2] Wolfson Wohl Cancer Research Centre, Institute of Cancer Sciences, University of Glasgow, Garscube Estate, Switchback Road, Bearsden, Glasgow G61 1BD, UK
|
28
|
Hu YJ, Lin DY, Sun W, Zeng D. A Likelihood-Based Framework for Association Analysis of Allele-Specific Copy Numbers. J Am Stat Assoc 2015; 109:1533-1545. [PMID: 25663726 DOI: 10.1080/01621459.2014.908777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
Copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) co-exist throughout the human genome and jointly contribute to phenotypic variations. Thus, it is desirable to consider both types of variants, as characterized by allele-specific copy numbers (ASCNs), in association studies of complex human diseases. Current SNP genotyping technologies capture the CNV and SNP information simultaneously via fluorescent intensity measurements. The common practice of calling ASCNs from the intensity measurements and then using the ASCN calls in downstream association analysis has important limitations. First, the association tests are prone to false-positive findings when differential measurement errors between cases and controls arise from differences in DNA quality or handling. Second, the uncertainties in the ASCN calls are ignored. We present a general framework for the integrated analysis of CNVs and SNPs, including the analysis of total copy numbers as a special case. Our approach combines the ASCN calling and the association analysis into a single step while allowing for differential measurement errors. We construct likelihood functions that properly account for case-control sampling and measurement errors. We establish the asymptotic properties of the maximum likelihood estimators and develop EM algorithms to implement the corresponding inference procedures. The advantages of the proposed methods over the existing ones are demonstrated through realistic simulation studies and an application to a genome-wide association study of schizophrenia. Extensions to next-generation sequencing data are discussed.
|
29
|
Seiser EL, Innocenti F. Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays. Cancer Inform 2015; 13:77-83. [PMID: 25657572 PMCID: PMC4310714 DOI: 10.4137/cin.s16345] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 08/13/2014] [Revised: 11/18/2014] [Accepted: 11/21/2014] [Indexed: 12/24/2022] Open
Abstract
Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.
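To make the HMM idea in this abstract concrete, the sketch below decodes a short series of log R ratio values with the Viterbi algorithm. It is a toy under stated assumptions, not the model used by QuantiSNP, PennCNV, or GenoCN: it has three states rather than six, it uses only the log R ratio (no B allele frequency), and the state means, standard deviation, and transition probability are invented for illustration.

```python
import math

# All parameters below are illustrative assumptions, not values from any tool.
STATES = ["loss", "neutral", "gain"]
MEANS = {"loss": -0.6, "neutral": 0.0, "gain": 0.4}  # assumed mean log R ratio per state
SIGMA = 0.2   # assumed shared emission standard deviation
STAY = 0.98   # assumed probability of remaining in the same state


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution, used as the emission model."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def log_trans(prev, cur):
    """Log transition probability: STAY on the diagonal, the rest split evenly."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1 - STAY) / (len(STATES) - 1))


def viterbi(lrr):
    """Return the most probable state path for a list of log R ratio values."""
    # Initialise with a uniform prior over states.
    score = {s: math.log(1 / len(STATES)) + log_gauss(lrr[0], MEANS[s], SIGMA)
             for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in lrr[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, s))
            pointers[s] = best_prev
            new_score[s] = (score[best_prev] + log_trans(best_prev, s)
                            + log_gauss(x, MEANS[s], SIGMA))
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    print(viterbi([0.0, 0.05, -0.02, -0.6, -0.55, -0.65, 0.01, 0.0]))
```

With these assumed parameters, a run of three probes near -0.6 is called as a loss, while a single outlying probe is absorbed into the neutral state, because one observation does not outweigh the cost of two state changes. How transition and emission probabilities are set is exactly where the abstract says the real tools differ.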
Affiliation(s)
- Eric L Seiser
- Center for Pharmacogenomics and Individualized Therapy, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Federico Innocenti
- Center for Pharmacogenomics and Individualized Therapy, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; UNC Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
30
|
Yang S, Cui X, Fang Z. BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations. BMC Bioinformatics 2014; 15:74. [PMID: 24629125 PMCID: PMC4003822 DOI: 10.1186/1471-2105-15-74] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2013] [Accepted: 03/10/2014] [Indexed: 11/17/2022] Open
Abstract
Background Accurate genotype calling is a pre-requisite of a successful Genome-Wide Association Study (GWAS). Although most genotyping algorithms can achieve an accuracy rate greater than 99% for genotyping DNA samples without copy number alterations (CNAs), almost none of these algorithms is designed for genotyping tumor samples that are known to have large regions of CNAs. Results This study aims to develop a statistical method that can accurately genotype tumor samples with CNAs. The proposed method adds a Bayesian layer to a cluster regression model and is termed a Bayesian Cluster Regression-based genotyping algorithm (BCRgt). We demonstrate that high concordance rates with HapMap calls can be achieved without using reference/training samples, when CNAs do not exist. By adding a training step, we have obtained higher genotyping concordance rates, without requiring large sample sizes. When CNAs exist in the samples, accuracy can be dramatically improved in regions with DNA copy loss and slightly improved in regions with copy number gain, compared with the Bayesian Robust Linear Model with Mahalanobis distance classifier (BRLMM). Conclusions In conclusion, we have demonstrated that BCRgt can provide accurate genotyping calls for tumor samples with CNAs.
|
31
|
Didion JP, Buus RJ, Naghashfar Z, Threadgill DW, Morse HC, de Villena FPM. SNP array profiling of mouse cell lines identifies their strains of origin and reveals cross-contamination and widespread aneuploidy. BMC Genomics 2014; 15:847. [PMID: 25277546 PMCID: PMC4198738 DOI: 10.1186/1471-2164-15-847] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 06/24/2014] [Accepted: 09/29/2014] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Misidentified and contaminated cell lines have plagued the biological research community for decades. Some repositories and journals have heeded calls for mandatory authentication of human cell lines, yet misidentification of mouse cell lines has received little publicity despite their importance in sponsored research. Short tandem repeat (STR) profiling is the standard authentication method, but it may fail to distinguish cell lines derived from the same inbred strain of mice. Additionally, STR profiling does not reveal karyotypic changes that occur in some high-passage lines and may have functional consequences. Single nucleotide polymorphism (SNP) profiling has been suggested as a more accurate and versatile alternative to STR profiling; however, a high-throughput method for SNP-based authentication of mouse cell lines has not been described. RESULTS We have developed computational methods (Cell Line Authentication by SNP Profiling, CLASP) for cell line authentication and copy number analysis based on a cost-efficient SNP array, and we provide a reference database of commonly used mouse strains and cell lines. We show that CLASP readily discriminates among cell lines of diverse taxonomic origins, including multiple cell lines derived from a single inbred strain, intercross or wild caught mouse. CLASP is also capable of detecting contaminants present at concentrations as low as 5%. Of the 99 cell lines we tested, 15 exhibited substantial divergence from the reported genetic background. In all cases, we were able to distinguish whether the authentication failure was due to misidentification (one cell line, Ba/F3), the presence of multiple strain backgrounds (five cell lines), contamination by other cells and/or the presence of aneuploid chromosomes (nine cell lines). CONCLUSIONS Misidentification and contamination of mouse cell lines is potentially as widespread as it is in human cell culture. This may have substantial implications for studies that are dependent on the expected background of their cell cultures. Laboratories can mitigate these risks by regular authentication of their cell cultures. Our results demonstrate that SNP array profiling is an effective method to combat cell line misidentification.
Affiliation(s)
- John P Didion
- Department of Genetics, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
- Carolina Center for Genome Science, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
| | - Ryan J Buus
- Department of Genetics, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
- Carolina Center for Genome Science, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
| | - Zohreh Naghashfar
- Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Twinbrook I, Room 1421, 5640 Fishers Lane, Rockville, MD 20852 USA
| | - David W Threadgill
- Department of Veterinary Pathobiology, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX 77843 USA
- Department of Molecular and Cellular Medicine, College of Medicine, Texas A&M University, College Station, TX 77843 USA
| | - Herbert C Morse
- Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Twinbrook I, Room 1421, 5640 Fishers Lane, Rockville, MD 20852 USA
| | - Fernando Pardo-Manuel de Villena
- Department of Genetics, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
- Carolina Center for Genome Science, University of North Carolina at Chapel Hill, CB 7295, Chapel Hill, NC 27599-7264 USA
|
32
|
Liu B, Morrison CD, Johnson CS, Trump DL, Qin M, Conroy JC, Wang J, Liu S. Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges. Oncotarget 2014; 4:1868-81. [PMID: 24240121 PMCID: PMC3875755 DOI: 10.18632/oncotarget.1537] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Indexed: 12/27/2022] Open
Abstract
Accurate detection of somatic copy number variations (CNVs) is an essential part of cancer genome analysis, and plays an important role in oncotarget identifications. Next generation sequencing (NGS) holds the promise to revolutionize somatic CNV detection. In this review, we provide an overview of current analytic tools used for CNV detection in NGS-based cancer studies. We summarize the NGS data types used for CNV detection, decipher the principles for data preprocessing, segmentation, and interpretation, and discuss the challenges in somatic CNV detection. This review aims to provide a guide to the analytic tools used in NGS-based cancer CNV studies, and to discuss the important factors that researchers need to consider when analyzing NGS data for somatic CNV detections.
Affiliation(s)
- Biao Liu
- Center for Personalized Medicine, Roswell Park Cancer Institute, Buffalo, NY
|
33
|
Pierre-Jean M, Rigaill G, Neuvial P. Performance evaluation of DNA copy number segmentation methods. Brief Bioinform 2014; 16:600-15. [PMID: 25202135 PMCID: PMC4501247 DOI: 10.1093/bib/bbu026] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 03/12/2014] [Accepted: 06/10/2014] [Indexed: 11/13/2022] Open
Abstract
A number of bioinformatic or biostatistical methods are available for analyzing DNA copy number profiles measured from microarray or sequencing technologies. In the absence of rich enough gold standard data sets, the performance of these methods is generally assessed using unrealistic simulation studies, or based on small real data analyses. To make an objective and reproducible performance assessment, we have designed and implemented a framework to generate realistic DNA copy number profiles of cancer samples with known truth. These profiles are generated by resampling publicly available SNP microarray data from genomic regions with known copy-number state. The original data have been extracted from dilution series of tumor cell lines with matched blood samples at several concentrations. Therefore, the signal-to-noise ratio of the generated profiles can be controlled through the (known) percentage of tumor cells in the sample. This article describes this framework and its application to a comparison study between methods for segmenting DNA copy number profiles from SNP microarrays. This study indicates that no single method is uniformly better than all others. It also helps identify pros and cons of the compared methods as a function of biologically informative parameters, such as the fraction of tumor cells in the sample and the proportion of heterozygous markers. This comparison study may be reproduced using the open source and cross-platform R package jointseg, which implements the proposed data generation and evaluation framework: http://r-forge.r-project.org/R/?group_id=1562.
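The link between tumor fraction and signal strength mentioned in this abstract can be illustrated with the usual idealized mixture formula. The sketch below is not the resampling procedure of jointseg; it only assumes that a bulk sample is a mix of tumor cells carrying a given copy number and diploid normal cells, and the function name is hypothetical.

```python
import math


def expected_log_ratio(tumor_copies, tumor_fraction, normal_copies=2):
    """Idealized log2 copy-number ratio for a tumor/normal cell mixture.

    Assumption: the measured signal is proportional to the average copy
    number across all cells in the sample, relative to a diploid reference.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must lie in [0, 1]")
    mixed = tumor_fraction * tumor_copies + (1 - tumor_fraction) * normal_copies
    if mixed == 0:
        return float("-inf")  # homozygous deletion in a pure tumor sample
    return math.log2(mixed / normal_copies)


if __name__ == "__main__":
    # A one-copy loss becomes harder to see as the tumor fraction drops.
    for fraction in (1.0, 0.7, 0.5, 0.3):
        print(fraction, round(expected_log_ratio(1, fraction), 3))
```

Under this assumption a single-copy loss gives a log2 ratio of -1 in a pure tumor sample and shrinks toward 0 as normal cells are added, which is why lower tumor fractions make breakpoints harder for any segmentation method to detect.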
|
34
|
Amarasinghe KC, Li J, Hunter SM, Ryland GL, Cowin PA, Campbell IG, Halgamuge SK. Inferring copy number and genotype in tumour exome data. BMC Genomics 2014; 15:732. [PMID: 25167919 PMCID: PMC4162913 DOI: 10.1186/1471-2164-15-732] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 03/12/2014] [Accepted: 08/18/2014] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Using whole exome sequencing to predict aberrations in tumours is a cost-effective alternative to whole genome sequencing; however, it is predominantly used for variant detection and infrequently utilised for detection of somatic copy number variation. RESULTS We propose a new method to infer copy number and genotypes using whole exome data from paired tumour/normal samples. Our algorithm uses two Hidden Markov Models to predict copy number and genotypes and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method explicitly detects chromosome arm-level events, which are commonly found in tumour samples. The methods are combined into a package named ADTEx (Aberration Detection in Tumour Exome). We applied our algorithm to a cohort of 17 in-house generated and 18 TCGA paired ovarian cancer/normal exomes and evaluated the performance by comparing against the copy number variations and genotypes predicted using Affymetrix SNP 6.0 data of the same samples. Further, we carried out a comparison study to show that ADTEx outperformed its competitors in terms of precision and F-measure. CONCLUSIONS Our proposed method, ADTEx, uses both depth of coverage ratios and B allele frequencies calculated from whole exome sequencing data to predict copy number variations along with their genotypes. ADTEx is implemented as a user-friendly software package using Python and the R statistical language. Source code and sample data are freely available under GNU license (GPLv3) at http://adtex.sourceforge.net/.
Affiliation(s)
- Saman K Halgamuge
  - Optimisation and Pattern Recognition group, Mechanical Engineering Department, Melbourne School of Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia.
|
35
|
Xia R, Vattathil S, Scheet P. Identification of allelic imbalance with a statistical model for subtle genomic mosaicism. PLoS Comput Biol 2014; 10:e1003765. [PMID: 25166618 PMCID: PMC4148184 DOI: 10.1371/journal.pcbi.1003765] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/07/2014] [Accepted: 05/22/2014] [Indexed: 11/18/2022] Open
Abstract
Genetic heterogeneity in a mixed sample of tumor and normal DNA can confound characterization of the tumor genome. Numerous computational methods have been proposed to detect aberrations in DNA samples from tumor and normal tissue mixtures. Most of these require tumor purities to be at least 10-15%. Here, we present a statistical model to capture information, contained in the individual's germline haplotypes, about expected patterns in the B allele frequencies from SNP microarrays while fully modeling their magnitude, the first such model for SNP microarray data. Our model consists of a pair of hidden Markov models--one for the germline and one for the tumor genome--which, conditional on the observed array data and patterns of population haplotype variation, have a dependence structure induced by the relative imbalance of an individual's inherited haplotypes. Together, these hidden Markov models offer a powerful approach for dealing with mixtures of DNA where the main component represents the germline, thus suggesting natural applications for the characterization of primary clones when stromal contamination is extremely high, and for identifying lesions in rare subclones of a tumor when tumor purity is sufficient to characterize the primary lesions. Our joint model for germline haplotypes and acquired DNA aberration is flexible, allowing a large number of chromosomal alterations, including balanced and imbalanced losses and gains, copy-neutral loss-of-heterozygosity (LOH) and tetraploidy. We found our model (which we term J-LOH) to be superior for localizing rare aberrations in a simulated 3% mixture sample. More generally, our model provides a framework for full integration of the germline and tumor genomes to deal more effectively with missing or uncertain features, and thus extract maximal information from difficult scenarios where existing methods fail.
Affiliation(s)
- Rui Xia
  - Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
  - Division of Biostatistics, The University of Texas School of Public Health, Houston, Texas, United States of America
- Selina Vattathil
  - Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
  - Human & Molecular Genetics Program, The University of Texas Graduate School of Biomedical Sciences, Houston, Texas, United States of America
- Paul Scheet
  - Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
  - Division of Biostatistics, The University of Texas School of Public Health, Houston, Texas, United States of America
  - Human & Molecular Genetics Program, The University of Texas Graduate School of Biomedical Sciences, Houston, Texas, United States of America
|
36
|
Fischer A, Vázquez-García I, Illingworth CJR, Mustonen V. High-definition reconstruction of clonal composition in cancer. Cell Rep 2014; 7:1740-1752. [PMID: 24882004 PMCID: PMC4062932 DOI: 10.1016/j.celrep.2014.04.055] [Citation(s) in RCA: 117] [Impact Index Per Article: 11.7] [Received: 12/19/2013] [Revised: 03/26/2014] [Accepted: 04/24/2014] [Indexed: 01/08/2023] Open
Abstract
The extensive genetic heterogeneity of cancers can greatly affect therapy success due to the existence of subclonal mutations conferring resistance. However, the characterization of subclones in mixed-cell populations is computationally challenging due to the short length of sequence reads that are generated by current sequencing technologies. Here, we report cloneHD, a probabilistic algorithm for the performance of subclone reconstruction from data generated by high-throughput DNA sequencing: read depth, B-allele counts at germline heterozygous loci, and somatic mutation counts. The algorithm can exploit the added information present in correlated longitudinal or multiregion samples and takes into account correlations along genomes caused by events such as copy-number changes. We apply cloneHD to two case studies: a breast cancer sample and time-resolved samples of chronic lymphocytic leukemia, where we demonstrate that monitoring the response of a patient to therapy regimens is feasible. Our work provides new opportunities for tracking cancer development.
Affiliation(s)
- Andrej Fischer
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
- Ignacio Vázquez-García
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
  - Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK
- Ville Mustonen
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
|
37
|
Lin YJ, Chen YT, Hsu SN, Peng CH, Tang CY, Yen TC, Hsieh WP. HaplotypeCN: copy number haplotype inference with Hidden Markov Model and localized haplotype clustering. PLoS One 2014; 9:e96841. [PMID: 24849202 PMCID: PMC4029584 DOI: 10.1371/journal.pone.0096841] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/29/2013] [Accepted: 04/11/2014] [Indexed: 11/18/2022] Open
Abstract
Copy number variation (CNV) has been reported to be associated with disease and various cancers. Hence, identifying the accurate position and the type of CNV is currently a critical issue. There are many tools for detecting CNV regions, constructing haplotype phases on CNV regions, or estimating the numerical copy numbers. However, none of them can do all three tasks at the same time. This paper presents a method based on a Hidden Markov Model to detect parent-specific copy number change on both chromosomes with signals from SNP arrays. A haplotype tree is constructed with dynamic branch merging to model the transition of the copy number status of the two alleles assessed at each SNP locus. The emission models are constructed for the genotypes formed with the two haplotypes. The proposed method can provide the segmentation points of the CNV regions as well as the haplotype phasing for the allelic status on each chromosome. The estimated copy numbers are provided as fractional numbers, which can accommodate the somatic mutation in cancer specimens that usually consist of heterogeneous cell populations. The algorithm is evaluated on simulated data and the previously published regions of CNV of the 270 HapMap individuals. The results were compared with five popular methods: PennCNV, genoCN, COKGEN, QuantiSNP and cnvHap. The application to oral cancer samples demonstrates how the proposed method can facilitate clinical association studies. The proposed algorithm exhibits sensitivity for CNV regions comparable to that of the best algorithm in our genome-wide study and demonstrates the highest detection rate in SNP-dense regions. In addition, we provide better haplotype phasing accuracy than similar approaches. The clinical association carried out with our fractional estimate of copy numbers in the cancer samples provides better detection power than that with integer copy number states.
Affiliation(s)
- Yen-Jen Lin
  - Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Yu-Tin Chen
  - Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Shu-Ni Hsu
  - Institute of Statistics, National Tsing Hua University, Hsinchu, Taiwan
- Chien-Hua Peng
  - Department of Resource Center for Clinical Research, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Chuan-Yi Tang
  - Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
  - Department of Computer Science and Information Engineering, Providence University, Taichung, Taiwan
- Tzu-Chen Yen
  - Head and Neck Oncology Group, Chang Gung Memorial Hospital, Taoyuan, Taiwan
  - Nuclear Medicine and Molecular Imaging Center, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Wen-Ping Hsieh
  - Institute of Statistics, National Tsing Hua University, Hsinchu, Taiwan
|
38
|
Trp53 haploinsufficiency modifies EGFR-driven peripheral nerve sheath tumorigenesis. Am J Pathol 2014; 184:2082-98. [PMID: 24832557 DOI: 10.1016/j.ajpath.2014.04.006] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 05/20/2013] [Revised: 03/11/2014] [Accepted: 04/01/2014] [Indexed: 12/21/2022]
Abstract
Malignant peripheral nerve sheath tumors (MPNSTs) are genetically diverse, aggressive sarcomas that occur sporadically or in association with neurofibromatosis type 1 syndrome. Reduced TP53 gene expression and amplification/overexpression of the epidermal growth factor receptor (EGFR) gene occur in MPNST formation. We focused on determining the cooperativity between reduced TP53 expression and EGFR overexpression for Schwann cell transformation in vitro (immortalized human Schwann cells) and MPNST formation in vivo (transgenic mice). Human gene copy number alteration data, microarray expression data, and TMA analysis indicate that TP53 haploinsufficiency and increased EGFR expression co-occur in human MPNST samples. Concurrent modulation of EGFR and TP53 expression in HSC1λ cells significantly increased proliferation and anchorage-independent growth in vitro. Transgenic mice heterozygous for a Trp53-null allele and overexpressing EGFR in Schwann cells had a significant increase in neurofibroma and grade 3 PNST (MPNST) formation compared with single transgenic controls. Histological analysis of tumors identified a significant increase in pAkt expression in grade 3 PNSTs compared with neurofibromas. Array comparative genome hybridization analysis of grade 3 PNSTs identified recurrent focal regions of chromosomal gains with significant enrichment in genes involved in extracellular signal-regulated kinase 5 signaling. Collectively, altered p53 expression cooperates with overexpression of EGFR in Schwann cells to enhance in vitro oncogenic properties and tumorigenesis and progression in vivo.
|
39
|
Li Y, Xie X. Deconvolving tumor purity and ploidy by integrating copy number alterations and loss of heterozygosity. Bioinformatics 2014; 30:2121-9. [PMID: 24695406 DOI: 10.1093/bioinformatics/btu174] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 12/18/2022]
Abstract
MOTIVATION Next-generation sequencing (NGS) has revolutionized the study of cancer genomes. However, the reads obtained from NGS of tumor samples often consist of a mixture of normal and tumor cells, which themselves can be of multiple clonal types. A prominent problem in the analysis of cancer genome sequencing data is deconvolving the mixture to identify the reads associated with tumor cells or a particular subclone of tumor cells. Solving the problem is, however, challenging because of the so-called 'identifiability problem', where different combinations of tumor purity and ploidy often explain the sequencing data equally well. RESULTS We propose a new model to resolve the identifiability problem by integrating two types of sequencing information (somatic copy number alterations and loss of heterozygosity) within a unified probabilistic framework. We derive algorithms to solve our model, and implement them in a software package called PyLOH. We benchmark the performance of PyLOH using both simulated data and 12 breast cancer sequencing datasets and show that PyLOH outperforms existing methods in resolving the identifiability problem and estimating tumor purity. AVAILABILITY AND IMPLEMENTATION The PyLOH package is written in Python and is publicly available at https://github.com/uci-cbcl/PyLOH. CONTACT xhx@ics.uci.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
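The identifiability problem mentioned in this abstract can be illustrated with a small sketch. It is not taken from PyLOH; it assumes a simple textbook mixture model in which read depth is proportional to the average copy number of tumor cells (fraction `purity`) and diploid normal cells. The function names and the example copy numbers are hypothetical.

```python
def expected_depth(copy_numbers, purity):
    """Expected read depth per segment, up to a constant factor.

    Assumption: tumor cells (fraction `purity`) are mixed with diploid
    normal cells, and depth is proportional to the average copy number.
    """
    return [purity * c + 2.0 * (1.0 - purity) for c in copy_numbers]


def normalise(depths):
    """Scale depths by their mean, mimicking library-size normalisation."""
    mean = sum(depths) / len(depths)
    return [d / mean for d in depths]


# Hypothetical tumor A: copy numbers 1, 2, 3 at 80% purity.
purity_a = 0.8
profile_a = normalise(expected_depth([1, 2, 3], purity_a))

# Hypothetical tumor B: the same genome doubled, at purity p / (2 - p).
purity_b = purity_a / (2.0 - purity_a)
profile_b = normalise(expected_depth([2, 4, 6], purity_b))

# Both (purity, ploidy) combinations give the same normalised depth profile.
print(profile_a)
print(profile_b)
```

Under this model, depth alone cannot distinguish the two scenarios, which is why the abstract describes bringing in a second data type.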
Affiliation(s)
- Yi Li
  - Department of Computer Science, Institute for Genomics and Bioinformatics and Center for Machine Learning and Intelligent Systems, University of California, Irvine, CA 92697, USA
- Xiaohui Xie
  - Department of Computer Science, Institute for Genomics and Bioinformatics and Center for Machine Learning and Intelligent Systems, University of California, Irvine, CA 92697, USA
|
40
|
Genome-wide identification of somatic aberrations from paired normal-tumor samples. PLoS One 2014; 9:e87212. [PMID: 24498045 PMCID: PMC3907544 DOI: 10.1371/journal.pone.0087212] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/30/2013] [Accepted: 12/26/2013] [Indexed: 12/13/2022] Open
Abstract
Genomic copy number alteration and allelic imbalance are distinct features of cancer cells, and recent advances in genotyping technology have greatly boosted cancer genome research. However, the complicated nature of tumors usually hampers the dissection of SNP array data. In this study, we describe a bioinformatic tool, named GIANT, for genome-wide identification of somatic aberrations from paired normal-tumor samples measured with SNP arrays. By efficiently incorporating genotype information of the matched normal sample, it accurately detects different types of aberrations in the cancer genome, even for aneuploid tumor samples with severe normal cell contamination. Furthermore, it allows for discovery of recurrent aberrations with critical biological properties in tumorigenesis by using a statistical significance test. We demonstrate the superior performance of the proposed method on various datasets including tumor replicate pairs, simulated SNP arrays and dilution series of normal-cancer cell lines. Results show that GIANT has the potential to detect genomic aberrations even when the cancer cell proportion is as low as 5-10%. Application to a large number of paired tumor samples delivers a genome-wide profile of the statistical significance of the various aberrations, including amplification, deletion and LOH. We believe that GIANT represents a powerful bioinformatic tool for interpreting complex genomic aberrations, and thus assisting both academic study and the clinical treatment of cancer.
|
41
|
Vandeweyer G, Kooy RF. Detection and interpretation of genomic structural variation in health and disease. Expert Rev Mol Diagn 2014; 13:61-82. [DOI: 10.1586/erm.12.119] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/04/2023]
|
42
|
Li M, Wen Y, Fu W. A Single-Array-Based Method for Detecting Copy Number Variants Using Affymetrix High Density SNP Arrays and its Application to Breast Cancer. Cancer Inform 2014; 13:95-103. [PMID: 26279618 PMCID: PMC4519351 DOI: 10.4137/cin.s15203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2014] [Revised: 06/03/2015] [Accepted: 06/04/2015] [Indexed: 11/06/2022] Open
Abstract
Cumulative evidence has shown that structural variations, due to insertions, deletions, and inversions of DNA, may contribute considerably to the development of complex human diseases, such as breast cancer. High-throughput genotyping technologies, such as Affymetrix high density single-nucleotide polymorphism (SNP) arrays, have produced large amounts of genetic data for genome-wide SNP genotype calling and copy number estimation. Meanwhile, there is a great need for accurate and efficient statistical methods to detect copy number variants. In this article, we introduce a hidden-Markov-model (HMM)-based method, referred to as the PICR-CNV, for copy number inference. The proposed method first estimates copy number abundance for each single SNP on a single array based on the raw fluorescence values, and then standardizes the estimated copy number abundance to achieve equal footing among multiple arrays. This method requires no between-array normalization, and thus, maintains data integrity and independence of samples among individual subjects. In addition to our efforts to apply new statistical technology to raw fluorescence values, the HMM has been applied to the standardized copy number abundance in order to reduce experimental noise. Through simulations, we show our refined method is able to infer copy number variants accurately. Application of the proposed method to a breast cancer dataset helps to identify genomic regions significantly associated with the disease.
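As a rough illustration of how a hidden Markov model can smooth noisy per-SNP copy number values, the sketch below runs Viterbi decoding over three states. It is not the PICR-CNV implementation: the state names, emission means, standard deviation, and self-transition probability are all assumed values chosen for the example.

```python
import math

STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -1.0, "neutral": 0.0, "gain": 1.0}  # assumed emission means
SIGMA = 0.5                                           # assumed emission s.d.
STAY = 0.98                                           # assumed self-transition probability


def log_emission(x, state):
    """Gaussian log-density of x under a state (constant terms dropped)."""
    return -0.5 * ((x - MEANS[state]) / SIGMA) ** 2


def log_transition(prev, cur):
    """Log-probability of moving from prev to cur; switches share 1 - STAY."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1.0 - STAY) / (len(STATES) - 1))


def viterbi(values):
    """Most likely state path for a sequence of standardised values."""
    score = {s: log_emission(values[0], s) for s in STATES}  # uniform prior
    back = []
    for x in values[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            pointers[cur] = best
            new_score[cur] = score[best] + log_transition(best, cur) + log_emission(x, cur)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical standardised values: a single outlier at index 2, then a gained region.
noisy = [0.1, -0.2, 0.9, 0.0, 1.1, 0.8, 1.2, 0.9, 1.0, 1.1, -0.1, 0.2, 0.0, -0.3]
path = viterbi(noisy)
print(path)
```

With these assumed parameters, the isolated high value at index 2 is absorbed into the neutral state, while the sustained run from index 4 to 9 is called as a gain.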
Affiliation(s)
- Ming Li
  - Division of Biostatistics, Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Yalu Wen
  - Department of Epidemiology and Biostatistics, Michigan State University, East Lansing MI, USA
- Wenjiang Fu
  - Department of Epidemiology and Biostatistics, Michigan State University, East Lansing MI, USA
  - Department of Mathematics, University of Houston, Houston, TX, USA
|
43
|
Nakachi I, Rice JL, Coldren CD, Edwards MG, Stearman RS, Glidewell SC, Varella-Garcia M, Franklin WA, Keith RL, Lewis MT, Gao B, Merrick DT, Miller YE, Geraci MW. Application of SNP microarrays to the genome-wide analysis of chromosomal instability in premalignant airway lesions. Cancer Prev Res (Phila) 2013; 7:255-65. [PMID: 24346345 DOI: 10.1158/1940-6207.capr-12-0485] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 12/21/2022]
Abstract
Chromosomal instability is central to the process of carcinogenesis. The genome-wide detection of somatic chromosomal alterations (SCA) in small premalignant lesions remains challenging because sample heterogeneity dilutes the aberrant cell information. To overcome this hurdle, we focused on the B allele frequency data from single-nucleotide polymorphism microarrays (SNP arrays). The difference of allelic fractions between paired tumor and normal samples from the same patient (delta-θ) provides a simple but sensitive detection of SCA in the affected tissue. We applied the delta-θ approach to small, heterogeneous clinical specimens, including endobronchial biopsies and brushings. Regions identified by delta-θ were validated by FISH and quantitative PCR in heterogeneous samples. Distinctive genomic variations were successfully detected across the whole genome in all invasive cancer cases (6 of 6), carcinoma in situ (3 of 3), and high-grade dysplasia (severe or moderate; 3 of 11). Not only well-described SCAs in lung squamous cell carcinoma, but also several novel chromosomal alterations were frequently found across the preinvasive dysplastic cases. Within these novel regions, losses of putative tumor suppressors (RNF20 and SSBP2) and an amplification of RASGRP3 gene with oncogenic activity were observed. Widespread sampling of the airway during bronchoscopy demonstrated that field cancerization reflected by SCAs at multiple sites was detectable. SNP arrays combined with delta-θ analysis can detect SCAs in heterogeneous clinical sample and expand our ability to assess genomic instability in the airway epithelium as a biomarker of lung cancer risk.
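The delta-θ idea described here, the per-SNP difference in allelic fraction between a paired tumor and normal sample, can be sketched in a few lines. This is only an illustration: restricting to SNPs that look heterozygous in the normal sample, and the `het_range` and `threshold` values, are assumptions made for the example rather than details given in the abstract.

```python
def delta_theta(tumor_baf, normal_baf):
    """Per-SNP difference in B allele frequency, tumor minus normal."""
    return [t - n for t, n in zip(tumor_baf, normal_baf)]


def flag_imbalance(tumor_baf, normal_baf, het_range=(0.4, 0.6), threshold=0.1):
    """Indices of SNPs, heterozygous in the normal, whose |delta| exceeds threshold.

    `het_range` and `threshold` are illustrative values, not from the paper.
    """
    flagged = []
    for i, d in enumerate(delta_theta(tumor_baf, normal_baf)):
        is_het = het_range[0] <= normal_baf[i] <= het_range[1]
        if is_het and abs(d) > threshold:
            flagged.append(i)
    return flagged


# Hypothetical B allele frequencies for six SNPs from one patient.
normal = [0.50, 0.02, 0.49, 0.51, 0.98, 0.50]
tumor = [0.52, 0.01, 0.68, 0.33, 0.99, 0.47]
hits = flag_imbalance(tumor, normal)
print(hits)
```

Homozygous SNPs (indices 1 and 4) carry no allelic-imbalance signal, so the sketch skips them; only the two heterozygous SNPs with a clear shift are flagged.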
Affiliation(s)
- Ichiro Nakachi
  - University of Colorado, Anschutz Medical Campus, 12700, East 19th Avenue, RC2 9th Floor, Aurora, CO 80045.
|
44
|
Holt C, Losic B, Pai D, Zhao Z, Trinh Q, Syam S, Arshadi N, Jang GH, Ali J, Beck T, McPherson J, Muthuswamy LB. WaveCNV: allele-specific copy number alterations in primary tumors and xenograft models from next-generation sequencing. Bioinformatics 2013; 30:768-74. [PMID: 24192544 PMCID: PMC3957071 DOI: 10.1093/bioinformatics/btt611] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 12/21/2022]
Abstract
Motivation: Copy number variations (CNVs) are a major source of genomic variability and are especially significant in cancer. Until recently, microarray technologies have been used to characterize CNVs in genomes. However, advances in next-generation sequencing technology offer significant opportunities to deduce copy number directly from genome sequencing data. Unfortunately, cancer genomes differ from normal genomes in several aspects that make them far less amenable to copy number detection. For example, cancer genomes are often aneuploid and an admixture of diploid/non-tumor cell fractions. Also, patient-derived xenograft models can be laden with mouse contamination that strongly affects accurate assignment of copy number. Hence, there is a need to develop analytical tools that can take into account cancer-specific parameters for detecting CNVs directly from genome sequencing data. Results: We have developed WaveCNV, a software package to identify copy number alterations by detecting breakpoints of CNVs using translation-invariant discrete wavelet transforms and assign digitized copy numbers to each event using next-generation sequencing data. We also assign alleles specifying the chromosomal ratio following duplication/loss. We verified copy number calls using both microarray (correlation coefficient 0.97) and quantitative polymerase chain reaction (correlation coefficient 0.94) and found them to be highly concordant. We demonstrate its utility in pancreatic primary and xenograft sequencing data. Availability and implementation: Source code and executables are available at https://github.com/WaveCNV. The segmentation algorithm is implemented in MATLAB, and copy number assignment is implemented in Perl. Contact: lakshmi.muthuswamy@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
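To give a feel for wavelet-style breakpoint detection, the sketch below computes a Haar-like detail value at every position (the difference between the means of the windows to the right and left) and reports where it is largest. This is a simplified stand-in, not WaveCNV's MATLAB segmentation; the function names, the window `scale`, and the synthetic depth ratios are hypothetical.

```python
def haar_detail(signal, scale):
    """Mean of the next `scale` points minus mean of the previous `scale` points.

    Evaluated at every position without downsampling, so shifting the
    input shifts the output by the same amount.
    """
    n = len(signal)
    detail = [0.0] * n
    for i in range(scale, n - scale + 1):
        left = sum(signal[i - scale:i]) / scale
        right = sum(signal[i:i + scale]) / scale
        detail[i] = right - left
    return detail


def strongest_breakpoint(signal, scale=5):
    """Index at which the absolute detail value is largest."""
    detail = haar_detail(signal, scale)
    return max(range(len(signal)), key=lambda i: abs(detail[i]))


# Hypothetical depth ratios: 20 bins near 1.0, then 20 bins near 1.5.
ripple = [0.03, -0.02, 0.01, -0.03, 0.02]
depth = [1.0 + ripple[i % 5] for i in range(20)]
depth += [1.5 + ripple[i % 5] for i in range(20)]
bp = strongest_breakpoint(depth)
print(bp)
```

A real pipeline would look at several scales and apply a noise-aware threshold rather than taking a single maximum; the single-scale version is kept here for brevity.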
Affiliation(s)
- Carson Holt
  - Ontario Institute for Cancer Research, Toronto, ON, M5G 0A3, Canada and Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 2M9, Canada
|
45
|
Chen GK, Chang X, Curtis C, Wang K. Precise inference of copy number alterations in tumor samples from SNP arrays. Bioinformatics 2013; 29:2964-70. [PMID: 24021380 DOI: 10.1093/bioinformatics/btt521] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/24/2022]
Abstract
MOTIVATION The accurate detection of copy number alterations (CNAs) in human genomes is important for understanding susceptibility to cancer and mechanisms of tumor progression. CNA detection in tumors from single nucleotide polymorphism (SNP) genotyping arrays is a challenging problem due to phenomena such as aneuploidy, stromal contamination, genomic waves and intra-tumor heterogeneity, issues that leading methods do not optimally address. RESULTS Here we introduce methods and software (PennCNV-tumor) for fast and accurate CNA detection using signal intensity data from SNP genotyping arrays. We estimate stromal contamination by applying a maximum likelihood approach over multiple discrete genomic intervals. By conditioning on signal intensity across the genome, our method accounts for both aneuploidy and genomic waves. Finally, our method uses a hidden Markov model to integrate multiple sources of information, including total and allele-specific signal intensity at each SNP, as well as physical maps to make posterior inferences of CNAs. Using real data from cancer cell-lines and patient tumors, we demonstrate substantial improvements in accuracy and computational efficiency compared with existing methods.
Affiliation(s)
- Gary K Chen
  - Department of Preventive Medicine, Zilkha Neurogenetic Institute and Department of Psychiatry, University of Southern California, Los Angeles, CA 90089, USA
|
46
|
|
47
|
Evaluating the repair of DNA derived from formalin-fixed paraffin-embedded tissues prior to genomic profiling by SNP-CGH analysis. Lab Invest 2013; 93:701-10. [PMID: 23568031 DOI: 10.1038/labinvest.2013.54] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 12/22/2022] Open
Abstract
Pathology archives contain vast resources of clinical material in the form of formalin-fixed paraffin-embedded (FFPE) tissue samples. Owing to the methods of tissue fixation and storage, the integrity of DNA and RNA available from FFPE tissue is compromised, which means obtaining informative data regarding epigenetic, genomic, and expression alterations can be challenging. Here, we have investigated the utility of repairing damaged DNA derived from FFPE tumors prior to single-nucleotide polymorphism (SNP) arrays for whole-genome DNA copy number analysis. DNA was extracted from FFPE samples spanning five decades, involving tumor material obtained from surgical specimens and postmortems. Various aspects of the protocol were assessed, including the method of DNA extraction, the role of Quality Control quantitative PCR (qPCR) in predicting sample success, and the effect of DNA restoration on assay performance, data quality, and the prediction of copy number aberrations (CNAs). DNA that had undergone the repair process yielded higher SNP call rates, reduced log R ratio variance, and improved calling of CNAs compared with matched FFPE DNA not subjected to repair. Reproducible mapping of genomic break points and detection of focal CNAs representing high-level gains and homozygous deletions (HD) were possible, even on autopsy material obtained in 1974. For example, DNA amplifications at the ERBB2 and EGFR gene loci and a HD mapping to 13q14.2 were validated using immunohistochemistry, in situ hybridization, and qPCR. The power of SNP arrays lies in the detection of allele-specific aberrations; however, this aspect of the analysis remains challenging, particularly in the distinction between loss of heterozygosity (LOH) and copy neutral LOH.
In summary, attempting to repair DNA that is damaged during fixation and storage may be a useful pretreatment step for genomic studies of large archival FFPE cohorts with long-term follow-up or for understanding rare cancer types, where fresh frozen material is scarce.
|
48
|
Baugher JD, Baugher BD, Shirley MD, Pevsner J. Sensitive and specific detection of mosaic chromosomal abnormalities using the Parent-of-Origin-based Detection (POD) method. BMC Genomics 2013; 14:367. [PMID: 23724825 PMCID: PMC3680018 DOI: 10.1186/1471-2164-14-367] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 09/03/2012] [Accepted: 05/14/2013] [Indexed: 11/25/2022] Open
Abstract
Background Mosaic somatic alterations are present in all multi-cellular organisms, but the physiological effects of low-level mosaicism are largely unknown. Most mosaic alterations remain undetectable with current analytical approaches, although the presence of such alterations is increasingly implicated as causative for disease. Results Here, we present the Parent-of-Origin-based Detection (POD) method for chromosomal abnormality detection in trio-based SNP microarray data. Our software implementation, triPOD, was benchmarked using a simulated dataset, outperformed comparable software for sensitivity of abnormality detection, and displayed substantial improvement in the detection of low-level mosaicism while maintaining comparable specificity. Examples of low-level mosaic abnormalities from a large autism dataset demonstrate the benefits of the increased sensitivity provided by triPOD. The triPOD analyses showed robustness across multiple types of Illumina microarray chips. Two large, clinically-relevant datasets were characterized and compared. Conclusions Our method and software provide a significant advancement in the ability to detect low-level mosaic abnormalities, thereby opening new avenues for research into the implications of mosaicism in pathogenic and non-pathogenic processes.
Affiliation(s)
- Joseph D Baugher
- Program in Biochemistry, Cellular and Molecular Biology, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
|
49
|
Forward genetic screen for malignant peripheral nerve sheath tumor formation identifies new genes and pathways driving tumorigenesis. Nat Genet 2013; 45:756-66. [PMID: 23685747 PMCID: PMC3695033 DOI: 10.1038/ng.2641] [Citation(s) in RCA: 117] [Impact Index Per Article: 10.6] [Received: 03/19/2013] [Accepted: 04/25/2013] [Indexed: 12/27/2022]
Abstract
Malignant peripheral nerve sheath tumors (MPNSTs) are sarcomas of Schwann cell-lineage origin that occur sporadically or in association with the inherited syndrome Neurofibromatosis Type 1. To identify genetic drivers of MPNST development, we utilized the Sleeping Beauty (SB) transposon-based somatic mutagenesis system in mice with somatic loss of tumor protein p53 (Trp53) function and/or overexpression of epidermal growth factor receptor (EGFR). Common insertion site (CIS) analysis of 269 neurofibromas and 106 MPNSTs identified 695 and 87 sites with a statistically significant number of recurrent transposon insertions, respectively. Comparison to human data sets revealed novel and known driver genes for MPNST formation at these sites. Pairwise co-occurrence analysis of CIS-associated genes identified many cooperating mutations that are enriched in Wnt/CTNNB1, PI3K/Akt/mTor, and growth factor receptor signaling pathways. Lastly, we identified several novel proto-oncogenes including forkhead box R2 (Foxr2), which we functionally validated as a proto-oncogene involved in MPNST maintenance.
|
50
|
Oros KK, Arcand SL, Bayani J, Squire JA, Mes-Masson AM, Tonin PN, Greenwood CM. Analysis of genomic abnormalities in tumors: a review of available methods for Illumina two-color SNP genotyping and evaluation of performance. Cancer Genet 2013; 206:103-15. [DOI: 10.1016/j.cancergen.2013.03.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 09/20/2012] [Revised: 03/12/2013] [Accepted: 03/13/2013] [Indexed: 10/26/2022]
|