151
|
Next-Generation Sequencing Approaches in Cancer: Where Have They Brought Us and Where Will They Take Us? Cancers (Basel) 2015; 7:1925-58. [PMID: 26404381 PMCID: PMC4586802 DOI: 10.3390/cancers7030869] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 07/08/2015] [Accepted: 09/15/2015] [Indexed: 12/20/2022] Open
Abstract
Next-generation sequencing (NGS) technologies and data have revolutionized cancer research and are increasingly being deployed to guide clinicians in treatment decision-making. NGS technologies have allowed us to take an “omics” approach to cancer in order to reveal genomic, transcriptomic, and epigenomic landscapes of individual malignancies. Integrative multi-platform analyses are increasingly used in large-scale projects that aim to fully characterize individual tumours as well as general cancer types and subtypes. In this review, we examine how NGS technologies in particular have contributed to “omics” approaches in cancer research, allowing for large-scale integrative analyses that consider hundreds of tumour samples. These types of studies have provided us with an unprecedented wealth of information, providing the background knowledge needed to make small-scale (including “N of 1”) studies informative and relevant. We also take a look at emerging opportunities provided by NGS and state-of-the-art third-generation sequencing technologies, particularly in the context of translational research. Cancer research and care are currently poised to experience significant progress catalyzed by accessible sequencing technologies that will benefit both clinical- and research-based efforts.
|
152
|
Buisine N, Ruan X, Bilesimo P, Grimaldi A, Alfama G, Ariyaratne P, Mulawadi F, Chen J, Sung WK, Liu ET, Demeneix BA, Ruan Y, Sachs LM. Xenopus tropicalis Genome Re-Scaffolding and Re-Annotation Reach the Resolution Required for In Vivo ChIA-PET Analysis. PLoS One 2015; 10:e0137526. [PMID: 26348928 PMCID: PMC4562602 DOI: 10.1371/journal.pone.0137526] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 01/20/2015] [Accepted: 08/19/2015] [Indexed: 12/11/2022] Open
Abstract
Genome-wide functional analyses require high-resolution genome assembly and annotation. We applied ChIA-PET to analyze gene regulatory networks, including 3D chromosome interactions, underlying thyroid hormone (TH) signaling in the frog Xenopus tropicalis. As the available versions of the Xenopus tropicalis assembly and annotation lacked the resolution required for ChIA-PET, we improved genome assembly version 4.1 and its annotations using data derived from paired-end tag (PET) sequencing technologies and approaches (e.g., DNA-PET [gPET], RNA-PET). Large-insert (~10 kb, ~17 kb) paired-end DNA-PET with high-throughput sequencing not only significantly improved genome assembly quality but also strongly reduced genome “fragmentation”, reducing total scaffold numbers by ~60%. Next, RNA-PET technology, designed and developed for the detection of full-length transcripts and fusion mRNA in whole-transcriptome studies (ENCODE consortia), was applied to capture the 5' and 3' ends of transcripts. These amendments in assembly and annotation were essential prerequisites for the ChIA-PET analysis of TH transcription regulation. Their application revealed complex regulatory configurations of target genes and the structures of the regulatory networks underlying physiological responses. Our work allowed us to improve the quality of Xenopus tropicalis genomic resources, reaching the standard required for ChIA-PET analysis of transcriptional networks. We consider that the proposed workflow offers useful conceptual and methodological guidance and can readily be applied to other non-conventional models that have low-resolution genome data.
Affiliation(s)
- Nicolas Buisine
  - UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
- Xiaoan Ruan
  - The Jackson Laboratory of Genomic Medicine, Farmington, Connecticut, United States of America
  - Department of Genetics and Developmental Biology, University of Connecticut, Farmington, Connecticut, United States of America
  - Genome Institute of Singapore, Singapore, Singapore
- Patrice Bilesimo
  - UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
  - Watchfrog S.A.S., Evry, France
- Alexis Grimaldi
  - UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
- Gladys Alfama
  - UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
- Jieqi Chen
  - Genome Institute of Singapore, Singapore, Singapore
- Edison T. Liu
  - The Jackson Laboratory of Genomic Medicine, Farmington, Connecticut, United States of America
  - Department of Genetics and Developmental Biology, University of Connecticut, Farmington, Connecticut, United States of America
  - Genome Institute of Singapore, Singapore, Singapore
- Yijun Ruan
  - The Jackson Laboratory of Genomic Medicine, Farmington, Connecticut, United States of America
  - Department of Genetics and Developmental Biology, University of Connecticut, Farmington, Connecticut, United States of America
  - Genome Institute of Singapore, Singapore, Singapore
  - * E-mail: (YR); (LMS)
- Laurent M. Sachs
  - UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
  - * E-mail: (YR); (LMS)
|
153
|
CNV-CH: A Convex Hull Based Segmentation Approach to Detect Copy Number Variations (CNV) Using Next-Generation Sequencing Data. PLoS One 2015; 10:e0135895. [PMID: 26291322 PMCID: PMC4546278 DOI: 10.1371/journal.pone.0135895] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/04/2014] [Accepted: 07/28/2015] [Indexed: 11/19/2022] Open
Abstract
Copy number variation (CNV) is a form of structural alteration in the mammalian DNA sequence that is associated with many complex neurological diseases as well as cancer. The development of next-generation sequencing (NGS) technology provides a new dimension for detecting genomic locations with copy number variations. Here we develop an algorithm for detecting CNVs based on depth-of-coverage data generated by NGS technology. In this work, we use a novel way to represent the read count data as two-dimensional geometrical points. A key aspect of detecting regions with CNVs is to devise a proper segmentation algorithm that distinguishes genomic locations having a significant difference in read count data. We have designed a new segmentation approach in this context, applying a convex hull algorithm to the geometrical representation of the read count data. To our knowledge, most algorithms have used a single distribution model of read count data; in our approach, we consider the read count data to follow two different distribution models independently, which adds to the robustness of CNV detection. In addition, our algorithm calls CNVs based on a multiple-sample analysis approach, resulting in a low false discovery rate with high precision.
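To make the geometric idea concrete, here is a minimal sketch of a convex hull computed over read-count points. The abstract does not say how read counts are mapped to two-dimensional points or how the hull is turned into segment boundaries, so the mapping used here, (bin index, read count), the `counts` values, and the function names are illustrative assumptions, and the hull routine is the standard Andrew monotone chain rather than the authors' implementation.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical per-bin read counts with an elevated stretch in bins 4-6.
counts = [10, 11, 9, 10, 30, 31, 29, 10, 11]
points = [(i, c) for i, c in enumerate(counts)]  # assumed 2D representation
hull = convex_hull(points)
```

In this toy profile the hull vertices include the bins of the elevated stretch, while interior bins such as bin 3 are dropped. Any rule for converting hull vertices into CNV segments would have to come from the paper itself.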
|
154
|
Duan J, Wan M, Deng HW, Wang YP. A Sparse Model Based Detection of Copy Number Variations From Exome Sequencing Data. IEEE Trans Biomed Eng 2015; 63:496-505. [PMID: 26258935 DOI: 10.1109/tbme.2015.2464674] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
Abstract
GOAL: Whole-exome sequencing provides a more cost-effective way than whole-genome sequencing for detecting genetic variants, such as copy number variations (CNVs). Although a number of approaches have been proposed to detect CNVs from whole-genome sequencing, a direct adoption of these approaches to whole-exome sequencing will often fail because exons are separately located along a genome. Therefore, an appropriate method is needed to target the specific features of exome sequencing data. METHODS: In this paper, a novel sparse model based method is proposed to discover CNVs from multiple exome sequencing data. First, exome sequencing data are represented with a penalized matrix approximation, and technical variability and random sequencing errors are assumed to follow a generalized Gaussian distribution. Second, an iteratively reweighted least squares algorithm is used to estimate the solution. RESULTS: The method is tested and validated on both synthetic and real data, and compared with other approaches including CoNIFER, XHMM, and cn.MOPS. The tests demonstrate that the proposed method outperforms the other approaches. CONCLUSION: The proposed sparse model can detect CNVs from exome sequencing data with high power and precision. SIGNIFICANCE: The sparse model can target the specific features of exome sequencing data. The software code is freely available at http://www.tulane.edu/ wyp/software/Exon_CNV.m.
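The combination of a generalized Gaussian error model with iteratively reweighted least squares (IRLS) can be illustrated on a much smaller problem than the paper's penalized matrix approximation. The sketch below is a toy, not the authors' method: it estimates a single location parameter by minimizing the sum of |r|^p residuals, which is the maximum-likelihood criterion under generalized Gaussian noise with shape p. The function name `irls_location`, the data, and the `eps` safeguard are assumptions made for illustration.

```python
def irls_location(values, p=1.0, iterations=100, eps=1e-8):
    """Minimise sum(|v - mu|**p) over mu by iteratively reweighted least squares.

    Each pass solves a weighted least-squares problem whose weights,
    |residual|**(p - 2), come from the previous estimate. The eps floor
    avoids division by zero when a residual vanishes.
    """
    mu = sum(values) / len(values)  # ordinary least-squares start (p = 2)
    for _ in range(iterations):
        weights = [max(abs(v - mu), eps) ** (p - 2) for v in values]
        mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mu


# Hypothetical normalised read depths with one gross outlier.
depths = [1.0, 2.0, 3.0, 4.0, 100.0]
gaussian_fit = irls_location(depths, p=2.0)  # plain mean, pulled by the outlier
robust_fit = irls_location(depths, p=1.0)    # Laplace-like errors, near the median
```

With p = 2 the weights are all one and the estimate is the arithmetic mean; with p = 1 the outlier is strongly down-weighted and the estimate settles near the median. Shape values below 2 therefore give heavier-tailed, more outlier-tolerant fits.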
|
155
|
Gao G, Smith DI. Mate-Pair Sequencing as a Powerful Clinical Tool for the Characterization of Cancers with a DNA Viral Etiology. Viruses 2015; 7:4507-28. [PMID: 26262638 PMCID: PMC4576192 DOI: 10.3390/v7082831] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 05/20/2015] [Revised: 07/16/2015] [Accepted: 07/29/2015] [Indexed: 01/18/2023] Open
Abstract
DNA viruses are known to be associated with a variety of different cancers. Human papillomaviruses (HPVs) are a family of viruses, and several of their sub-types are classified as high-risk HPVs because they are associated with the development of a number of different cancers. Almost all cervical cancers appear to be driven by HPV infection, and HPV is also found in most cancers of the anus and at least half the cancers of the vulva, penis and vagina, and is increasingly found in one sub-type of head and neck cancer, namely oropharyngeal squamous cell carcinoma. Our understanding of HPV's role in cancer development comes from extensive studies of cervical cancer, and it has simply been assumed that HPV plays an identical role in the development of all other cancers arising in the presence of HPV sequences, although this has not been proven. Most invasive cervical cancers have the HPV genome integrated into one or more sites within the human genome. One powerful tool that examines all the sites of HPV integration in a cancer, and also provides a comprehensive view of genomic alterations in that cancer, is next-generation sequencing of mate-pair libraries produced from the isolated DNA. We will describe how this technology can provide important information about the genomic organization within an individual cancer genome, and how it has demonstrated that HPV's role in oropharyngeal squamous cell carcinoma is distinct from that in cervical cancer. We will also describe why the sequencing of mate-pair libraries could be a powerful clinical tool for the management of patients with cancers of DNA viral etiology and how this could quickly transform the care of these patients.
Affiliation(s)
- Ge Gao
  - Division of Experimental Pathology, Mayo Clinic, Rochester, MN 55905, USA.
- David I Smith
  - Division of Experimental Pathology, Mayo Clinic, Rochester, MN 55905, USA.
|
156
|
Fong KM, Daniels M, Goh F, Yang IA, Bowman RV. The current and future roles of genomics. Lung Cancer 2015. [DOI: 10.1183/2312508x.10009614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
157
|
Zhang G, Wang J, Yang J, Li W, Deng Y, Li J, Huang J, Hu S, Zhang B. Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling. BMC Genomics 2015; 16:581. [PMID: 26242175 PMCID: PMC4524363 DOI: 10.1186/s12864-015-1796-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/17/2014] [Accepted: 07/23/2015] [Indexed: 12/30/2022] Open
Abstract
Background To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq™ Exome Enrichment Kit makes up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Results Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3 % across the four samples, whereas the concordance of co-detected variant loci reached 99 %. Sanger sequencing validation revealed that the validation rate of concordant single nucleotide polymorphisms (SNPs) (91.5 %) was higher than that of SNPs specific to TargetSeq-Proton (60.0 %) or specific to SureSelect-HiSeq (88.3 %). With regard to 1-bp small insertions and deletions (InDels), the Sanger validation rates of concordant variants (100.0 %) and SureSelect-HiSeq-specific variants (89.6 %) were higher than that of TargetSeq-Proton-specific variants (15.8 %).
Conclusions In the sequencing of exonic regions, combining the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. However, for platform-specific variants, the accuracy of variant calling by HiSeq 2000 was higher than that of Ion Proton, specifically for InDel detection. Moreover, the variant calling software also influences the detection of SNPs and, specifically, InDels in Ion Proton exome sequencing. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1796-6) contains supplementary material, which is available to authorized users.
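The two headline metrics in this abstract, the rate of shared variant loci and the concordance of co-detected loci, can be sketched as set operations over two call sets. The abstract does not define the denominators, so this sketch assumes that the shared rate is the number of loci called by both platforms divided by the number called by either, and that concordance is the fraction of shared loci with identical genotype calls. The function name, the dictionary layout, and the toy calls are hypothetical.

```python
def compare_call_sets(calls_a, calls_b):
    """Compare two variant call sets keyed by (chromosome, position).

    Values are genotype strings. Returns the shared-locus rate
    (shared / union, an assumed definition) and the genotype
    concordance among shared loci.
    """
    loci_a, loci_b = set(calls_a), set(calls_b)
    shared = loci_a & loci_b
    union = loci_a | loci_b
    concordant = {locus for locus in shared if calls_a[locus] == calls_b[locus]}
    return {
        "shared_rate": len(shared) / len(union) if union else 0.0,
        "concordance": len(concordant) / len(shared) if shared else 0.0,
        "only_a": len(loci_a - loci_b),
        "only_b": len(loci_b - loci_a),
    }


# Toy call sets standing in for the two platforms.
proton_calls = {("chr1", 100): "A/G", ("chr1", 200): "C/T",
                ("chr2", 50): "G/G", ("chr3", 10): "T/C"}
hiseq_calls = {("chr1", 100): "A/G", ("chr1", 200): "C/C",
               ("chr2", 50): "G/G", ("chr4", 5): "A/A"}
summary = compare_call_sets(proton_calls, hiseq_calls)
```

Here three of five distinct loci are shared (rate 0.6) and two of those three agree in genotype. Platform-specific loci, counted in `only_a` and `only_b`, are the ones the study followed up with Sanger validation.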
Affiliation(s)
- Guoqiang Zhang
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Jianfeng Wang
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Jin Yang
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Wenjie Li
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Yutian Deng
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Jing Li
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Jun Huang
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Songnian Hu
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- Bing Zhang
  - Core Genomic Facility and CAS Key Laboratory of Genome Sciences & Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
|
158
|
Matsui A, Ihara T, Suda H, Mikami H, Semba K. Gene amplification: mechanisms and involvement in cancer. Biomol Concepts 2015; 4:567-82. [PMID: 25436757 DOI: 10.1515/bmc-2013-0026] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Received: 08/12/2013] [Accepted: 09/02/2013] [Indexed: 11/15/2022] Open
Abstract
Gene amplification was recognized as a physiological process during the development of Drosophila melanogaster. Intriguingly, mammalian cells use this mechanism to overexpress particular genes for survival under stress, such as during exposure to cytotoxic drugs. One well-known example is the amplification of the dihydrofolate reductase gene observed in methotrexate-resistant cells. Four models have been proposed for the generation of amplifications: extrareplication and recombination, the breakage-fusion-bridge cycle, double rolling-circle replication, and replication fork stalling and template switching. Gene amplification is a typical genetic alteration in cancer, and historically many oncogenes have been identified in the amplified regions. In this regard, novel cancer-associated genes may remain to be identified in the amplified regions. Recent comprehensive approaches have further revealed that co-amplified genes also contribute to tumorigenesis in concert with known oncogenes in the same amplicons. Considering that cancer develops through the alteration of multiple genes, gene amplification is an effective acceleration machinery to promote tumorigenesis. Identification of cancer-associated genes could provide novel and effective therapeutic targets.
|
159
|
Mittal VK, McDonald JF. Integrated sequence and expression analysis of ovarian cancer structural variants underscores the importance of gene fusion regulation. BMC Med Genomics 2015; 8:40. [PMID: 26177635 PMCID: PMC4504069 DOI: 10.1186/s12920-015-0118-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 11/04/2014] [Accepted: 07/09/2015] [Indexed: 12/25/2022] Open
Abstract
Background Genomic rearrangements or structural variants (SVs) are one of the most common classes of mutations in cancer. Methods An integrated DNA sequencing and transcriptional profiling (RNA sequence and microarray gene expression data) analysis was performed on six ovarian cancer patient samples. Matched sets of control (whole blood) samples from these same patients were used to distinguish cancer SVs of germline origin from those arising somatically in the cancer cell lineage. Results We detected 10,034 ovarian cancer SVs (5518 germline derived; 4516 somatically derived) at base-pair level resolution. Only 11 % of these variants were shown to have the potential to form gene fusions and, of these, less than 20 % were detected at the transcriptional level. Conclusions Collectively our results are consistent with the view that gene fusions and other SVs can be significant factors in the onset and progression of ovarian cancer. The results further indicate that it may not only be the occurrence of these variants in cancer but their regulation that contributes to their biological and clinical significance. Electronic supplementary material The online version of this article (doi:10.1186/s12920-015-0118-9) contains supplementary material, which is available to authorized users.
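The counts reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; because 11 % and 20 % are rounded (and the latter is an upper bound), the derived counts are approximations rather than values reported by the authors, and the variable names are illustrative.

```python
total_svs = 10_034
germline_svs = 5_518
somatic_svs = 4_516

# The two origin classes should account for every detected SV.
assert germline_svs + somatic_svs == total_svs

somatic_fraction = somatic_svs / total_svs      # about 0.45
fusion_capable = round(0.11 * total_svs)        # about 11 % of all SVs
expressed_upper_bound = 0.20 * fusion_capable   # "less than 20 %" of those
```

Under these assumptions roughly 1,100 SVs could form gene fusions, and fewer than about 221 of them were detected at the transcriptional level, which is around 2 % of all SVs.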
Affiliation(s)
- Vinay K Mittal
  - Integrated Cancer Research Center, School of Biology, and Parker H. Petit Institute of Bioengineering and Biosciences, Georgia Institute of Technology, 315 Ferst Dr., Atlanta, GA, 30332, USA.
- John F McDonald
  - Integrated Cancer Research Center, School of Biology, and Parker H. Petit Institute of Bioengineering and Biosciences, Georgia Institute of Technology, 315 Ferst Dr., Atlanta, GA, 30332, USA.
|
160
|
Microplitis demolitor Bracovirus Proviral Loci and Clustered Replication Genes Exhibit Distinct DNA Amplification Patterns during Replication. J Virol 2015; 89:9511-23. [PMID: 26157119 DOI: 10.1128/jvi.01388-15] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 05/29/2015] [Accepted: 07/01/2015] [Indexed: 01/25/2023] Open
Abstract
Polydnaviruses are large, double-stranded DNA viruses that are beneficial symbionts of parasitoid wasps. Polydnaviruses in the genus Bracovirus (BVs) persist in wasps as proviruses, and their genomes consist of two functional components referred to as proviral segments and nudivirus-like genes. Prior studies established that the DNA domains where proviral segments reside are amplified during replication and that segments within amplified loci are circularized before packaging into nucleocapsids. One DNA domain where nudivirus-like genes are located is also amplified but never packaged into virions. We recently sequenced the genome of the braconid Microplitis demolitor, which carries M. demolitor bracovirus (MdBV). Here, we took advantage of this resource to characterize the DNAs that are amplified during MdBV replication using a combination of Illumina and Pacific Biosciences sequencing approaches. The results showed that specific nucleotide sites identify the boundaries of amplification for proviral loci. Surprisingly, however, amplification of loci 3, 4, 6, and 8 produced head-to-tail concatemeric intermediates; loci 1, 2, and 5 produced head-to-head/tail-to-tail concatemers; and locus 7 yielded no identified concatemers. Sequence differences at amplification junctions correlated with the types of amplification intermediates the loci produced, while concatemer processing gave rise to the circularized DNAs that are packaged into nucleocapsids. The MdBV nudivirus-like gene cluster was also amplified, albeit more weakly than most proviral loci and with nondiscrete boundaries. Overall, the MdBV genome exhibited three patterns of DNA amplification during replication. Our data also suggest that PacBio sequencing could be useful in studying the replication intermediates produced by other DNA viruses. IMPORTANCE Polydnaviruses are of fundamental interest because they provide a novel example of viruses evolving into beneficial symbionts.
All polydnaviruses are associated with insects called parasitoid wasps, which are of additional applied interest because many are biological control agents of pest insects. Polydnaviruses in the genus Bracovirus (BVs) evolved ~100 million years ago from an ancestor related to the baculovirus-nudivirus lineage but have also established many novelties due to their symbiotic lifestyle. These include the fact that BVs are transmitted only vertically as proviruses and produce replication-defective virions that package only a portion of the viral genome. Here, we studied Microplitis demolitor bracovirus (MdBV) and report that its genome exhibits three distinct patterns of DNA amplification during replication. We also identify several previously unknown features of BV genomes that correlate with these different amplification patterns.
|
161
|
Tattini L, D'Aurizio R, Magi A. Detection of Genomic Structural Variants from Next-Generation Sequencing Data. Front Bioeng Biotechnol 2015; 3:92. [PMID: 26161383 PMCID: PMC4479793 DOI: 10.3389/fbioe.2015.00092] [Citation(s) in RCA: 169] [Impact Index Per Article: 16.9] [Received: 12/10/2014] [Accepted: 06/10/2015] [Indexed: 01/16/2023] Open
Abstract
Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review, we describe and summarize the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures, and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events.
Affiliation(s)
- Lorenzo Tattini
  - Department of Neurosciences, Psychology, Pharmacology and Child Health, University of Florence, Florence, Italy
- Romina D'Aurizio
  - Laboratory of Integrative Systems Medicine (LISM), Institute of Informatics and Telematics and Institute of Clinical Physiology, National Research Council, Pisa, Italy
- Alberto Magi
  - Department of Clinical and Experimental Medicine, University of Florence, Florence, Italy
|
162
|
Li W, Mills AA. Architects of the genome: CHD dysfunction in cancer, developmental disorders and neurological syndromes. Epigenomics 2015; 6:381-95. [PMID: 25333848 DOI: 10.2217/epi.14.31] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 01/09/2023] Open
Abstract
Chromatin is vital to normal cells, and its deregulation contributes to a spectrum of human ailments. An emerging concept is that aberrant chromatin regulation culminates in gene expression programs that set the stage for the seemingly diverse pathologies of cancer, developmental disorders and neurological syndromes. However, the mechanisms responsible for such common etiology have been elusive. Recent evidence has implicated lesions affecting chromatin-remodeling proteins in cancer, developmental disorders and neurological syndromes, suggesting a common source for these different pathologies. Here, we focus on the chromodomain helicase DNA binding chromatin-remodeling family and the recent evidence for its deregulation in diverse pathological conditions, providing a new perspective on the underlying mechanisms and their implications for these prevalent human diseases.
Affiliation(s)
- Wangzhi Li
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
|
163
|
Abstract
Structural chromosome rearrangements may result in the exchange of coding or regulatory DNA sequences between genes. Many such gene fusions are strong driver mutations in neoplasia and have provided fundamental insights into the disease mechanisms that are involved in tumorigenesis. The close association between the type of gene fusion and the tumour phenotype makes gene fusions ideal for diagnostic purposes, enabling the subclassification of otherwise seemingly identical disease entities. In addition, many gene fusions add important information for risk stratification, and increasing numbers of chimeric proteins encoded by the gene fusions serve as specific targets for treatment, resulting in dramatically improved patient outcomes. In this Timeline article, we describe the spectrum of gene fusions in cancer and how the methods to identify them have evolved, and also discuss conceptual implications of current, sequencing-based approaches for detection.
Affiliation(s)
- Fredrik Mertens
  - Department of Clinical Genetics, Lund University and Skåne University Hospital, SE-221 85 Lund, Sweden
- Bertil Johansson
  - Department of Clinical Genetics, Lund University and Skåne University Hospital, SE-221 85 Lund, Sweden
- Thoas Fioretos
  - Department of Clinical Genetics, Lund University and Skåne University Hospital, SE-221 85 Lund, Sweden
- Felix Mitelman
  - Department of Clinical Genetics, Lund University and Skåne University Hospital, SE-221 85 Lund, Sweden
|
164
|
Ju YS, Tubio JMC, Mifsud W, Fu B, Davies HR, Ramakrishna M, Li Y, Yates L, Gundem G, Tarpey PS, Behjati S, Papaemmanuil E, Martin S, Fullam A, Gerstung M, Nangalia J, Green AR, Caldas C, Borg Å, Tutt A, Lee MTM, van't Veer LJ, Tan BKT, Aparicio S, Span PN, Martens JWM, Knappskog S, Vincent-Salomon A, Børresen-Dale AL, Eyfjörd JE, Myklebost O, Flanagan AM, Foster C, Neal DE, Cooper C, Eeles R, Bova SG, Lakhani SR, Desmedt C, Thomas G, Richardson AL, Purdie CA, Thompson AM, McDermott U, Yang F, Nik-Zainal S, Campbell PJ, Stratton MR. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells. Genome Res 2015; 25:814-24. [PMID: 25963125 PMCID: PMC4448678 DOI: 10.1101/gr.190470.115] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 02/03/2015] [Accepted: 04/14/2015] [Indexed: 12/11/2022]
Abstract
Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells.
Affiliation(s)
- Young Seok Ju
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Jose M C Tubio
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- William Mifsud
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Beiyuan Fu
  - Cytogenetics Facility, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Helen R Davies
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Manasa Ramakrishna
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Yilong Li
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Lucy Yates
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Gunes Gundem
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Patrick S Tarpey
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Sam Behjati
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Elli Papaemmanuil
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Sancha Martin
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Anthony Fullam
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Moritz Gerstung
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Jyoti Nangalia
  - Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom; Cambridge University Hospitals NHS Foundation Trust, Cambridge CB2 0QQ, United Kingdom; Department of Haematology, University of Cambridge, Cambridge CB2 0XY, United Kingdom
- Anthony R Green
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge CB2 0QQ, United Kingdom; Department of Haematology, University of Cambridge, Cambridge CB2 0XY, United Kingdom
- Carlos Caldas
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge CB2 0QQ, United Kingdom; Cancer Research UK (CRUK) Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, United Kingdom
- Åke Borg
  - BioCare, Strategic Cancer Research Program, SE-223 81 Lund, Sweden; CREATE Health, Strategic Centre for Translational Cancer Research, SE-221 00 Lund, Sweden; Department of Oncology and Pathology, Lund University Cancer Center, SE-221 85 Lund, Sweden
| | - Andrew Tutt
- Breakthrough Breast Cancer Research Unit, Research Oncology, King's College London, Guy's Hospital, London SE1 9RT, United Kingdom
| | - Ming Ta Michael Lee
- Laboratory for International Alliance on Genomic Research, RIKEN Center for Integrative Medical Sciences, 230-0045 Yokohama, Japan; National Center for Genome Medicine, Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
| | - Laura J van't Veer
- Department of Laboratory Medicine, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, California 94158, USA; Netherlands Cancer Institute, 1066 CX Amsterdam, Netherlands
| | - Benita K T Tan
- Department of General Surgery, Singapore General Hospital, Singapore 169608
| | - Samuel Aparicio
- Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver V5Z 1L3, Canada
| | - Paul N Span
- Department of Radiation Oncology and Department of Laboratory Medicine, Radboud University Medical Center, 6525 HP Nijmegen, Netherlands
| | - John W M Martens
- Department of Medical Oncology, Erasmus MC Cancer Institute, Erasmus University Medical Center, 3015 CE Rotterdam, Netherlands
| | - Stian Knappskog
- Section of Oncology, Department of Clinical Science, University of Bergen, N-5020 Bergen, Norway; Department of Oncology, Haukeland University Hospital, 5021 Bergen, Norway
| | - Anne Vincent-Salomon
- Institut Curie, INSERM U934 and Department of Tumor Biology, 75248 Paris cédex 05, France
| | - Anne-Lise Børresen-Dale
- Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Montebello, 0310 Oslo, Norway; The K.G. Jebsen Center for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, 0450 Oslo, Norway
| | - Adrienne M Flanagan
- Royal National Orthopaedic Hospital, Middlesex HA7 4LP, United Kingdom; UCL Cancer Institute, University College London, London WC1E 6DD, United Kingdom
| | - Christopher Foster
- University of Liverpool and HCA Pathology Laboratories, London WC1E 6JA, United Kingdom
| | - David E Neal
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge CB2 0RE, United Kingdom; Department of Surgical Oncology, University of Cambridge, Addenbrooke's Hospital, Cambridge CB2 0QQ, United Kingdom
| | - Colin Cooper
- Institute of Cancer Research, Sutton, London SM2 5NG, United Kingdom; Department of Biological Sciences and School of Medicine, University of East Anglia, Norwich NR4 7TJ, United Kingdom
| | - Rosalind Eeles
- Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton SM2 5NG, United Kingdom; Royal Marsden NHS Foundation Trust, London SW3 6JJ and Sutton SM2 5PT, United Kingdom
| | - Sunil R Lakhani
- University of Queensland, School of Medicine, Brisbane, QLD 4006, Australia; Pathology Queensland, Royal Brisbane and Women's Hospital, Brisbane, QLD 4029, Australia; University of Queensland, UQ Centre for Clinical Research, Brisbane, QLD 4029, Australia
| | - Christine Desmedt
- Breast Cancer Translational Research Laboratory, Université Libre de Bruxelles, Institut Jules Bordet, 1000 Brussels, Belgium
| | - Gilles Thomas
- Université Lyon 1, Institut National du Cancer (INCa)-Synergie, 69008 Lyon, France
| | - Andrea L Richardson
- Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA; Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
| | - Colin A Purdie
- Department of Pathology, Ninewells Hospital and Medical School, Dundee DD1 9SY, United Kingdom
| | - Alastair M Thompson
- Department of Surgical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
| | - Ultan McDermott
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Fengtang Yang
- Cytogenetics Facility, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Serena Nik-Zainal
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Peter J Campbell
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Michael R Stratton
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
|
165
|
Abstract
With the advent of next-generation sequencing technologies, we have witnessed the rapid discovery of new patterns of somatic structural variation in cancer genomes, together with attempts to work out their underlying mechanisms. Some of these mechanisms are associated with particular cancer types, and in some cases are the main cause of the structural mutations that drive the oncogenic process. This review provides an overview of the patterns of somatic structural variation and chromosomal structures that characterize cancer genomes, their causal mechanisms and their impact on oncogenesis.
|
166
|
Dixon-McIver A. Emerging technologies in paediatric leukaemia. Transl Pediatr 2015; 4:116-24. [PMID: 26835367 PMCID: PMC4729090 DOI: 10.3978/j.issn.2224-4336.2015.03.02] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
Genetic changes, in particular chromosomal aberrations, are a hallmark of acute lymphoblastic leukaemia (ALL), and their accurate detection is important in ensuring assignment to the appropriate drug protocol. Our ability to detect these genetic changes has been somewhat limited in the past by the need to analyse mitotically active cells by conventional G-banded metaphase analysis and by mutational analysis of individual genes. Advances in technology include high-resolution, microarray-based techniques that permit examination of the whole genome. Here we review the currently available methodology and discuss how the technology is being integrated into the diagnostic setting.
|
167
|
Cooper CS, Eeles R, Wedge DC, Van Loo P, Gundem G, Alexandrov LB, Kremeyer B, Butler A, Lynch AG, Camacho N, Massie CE, Kay J, Luxton HJ, Edwards S, Kote-Jarai ZS, Dennis N, Merson S, Leongamornlert D, Zamora J, Corbishley C, Thomas S, Nik-Zainal S, O'Meara S, Matthews L, Clark J, Hurst R, Mithen R, Bristow RG, Boutros PC, Fraser M, Cooke S, Raine K, Jones D, Menzies A, Stebbings L, Hinton J, Teague J, McLaren S, Mudie L, Hardy C, Anderson E, Joseph O, Goody V, Robinson B, Maddison M, Gamble S, Greenman C, Berney D, Hazell S, Livni N, Fisher C, Ogden C, Kumar P, Thompson A, Woodhouse C, Nicol D, Mayer E, Dudderidge T, Shah NC, Gnanapragasam V, Voet T, Campbell P, Futreal A, Easton D, Warren AY, Foster CS, Stratton MR, Whitaker HC, McDermott U, Brewer DS, Neal DE. Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue. Nat Genet 2015; 47:367-372. [PMID: 25730763 PMCID: PMC4380509 DOI: 10.1038/ng.3221] [Citation(s) in RCA: 337] [Impact Index Per Article: 33.7] [Received: 09/29/2014] [Accepted: 01/21/2015] [Indexed: 01/12/2023]
Abstract
Genome-wide DNA sequencing was used to decrypt the phylogeny of multiple samples from distinct areas of cancer and morphologically normal tissue taken from the prostates of three men. Mutations were present at high levels in morphologically normal tissue distant from the cancer, reflecting clonal expansions, and the underlying mutational processes at work in morphologically normal tissue were also at work in cancer. Our observations demonstrate the existence of ongoing abnormal mutational processes, consistent with field effects, underlying carcinogenesis. This mechanism gives rise to extensive branching evolution and cancer clone mixing, as exemplified by the coexistence of multiple cancer lineages harboring distinct ERG fusions within a single cancer nodule. Subsets of mutations were shared either by morphologically normal and malignant tissues or between different ERG lineages, indicating earlier or separate clonal cell expansions. Our observations inform on the origin of multifocal disease and have implications for prostate cancer therapy in individual cases.
Affiliation(s)
- Colin S Cooper
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
- Department of Biological Sciences, University of East Anglia, Norwich, UK
- Norwich Medical School, University of East Anglia, Norwich, UK
| | - Rosalind Eeles
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - David C Wedge
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Peter Van Loo
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
- Human Genome Laboratory, Department of Human Genetics, VIB and KU Leuven, Leuven, Belgium
- Cancer Research UK London Research Institute, London, UK
| | - Gunes Gundem
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Barbara Kremeyer
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Adam Butler
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Andrew G Lynch
- Statistics and Computational Biology Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
| | - Niedzica Camacho
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
| | - Charlie E Massie
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
| | - Jonathan Kay
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
| | - Hayley J Luxton
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
| | - Sandra Edwards
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
| | - Zsofia Kote-Jarai
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
| | - Nening Dennis
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Sue Merson
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
| | - Jorge Zamora
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Sarah Thomas
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Sarah O'Meara
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Lucy Matthews
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
| | - Jeremy Clark
- Norwich Medical School, University of East Anglia, Norwich, UK
| | - Rachel Hurst
- Norwich Medical School, University of East Anglia, Norwich, UK
| | - Richard Mithen
- Institute of Food Research, Norwich Research Park, Norwich, UK
| | - Robert G Bristow
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Department of Radiation Oncology, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre-University Health Network, Toronto, Canada
| | - Paul C Boutros
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Informatics and Bio-Computing, Ontario Institute for Cancer Research, Toronto, Canada
- Department of Pharmacology & Toxicology, University of Toronto, Toronto, Canada
| | - Michael Fraser
- Department of Radiation Oncology, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre-University Health Network, Toronto, Canada
| | - Susanna Cooke
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Keiran Raine
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - David Jones
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Andrew Menzies
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Lucy Stebbings
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Jon Hinton
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Jon Teague
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Stuart McLaren
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Laura Mudie
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Claire Hardy
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Olivia Joseph
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Victoria Goody
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Ben Robinson
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Mark Maddison
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Stephen Gamble
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Dan Berney
- Department of Molecular Oncology, Barts Cancer Centre, Barts and the London School of Medicine and Dentistry, London, UK
| | - Steven Hazell
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Naomi Livni
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Cyril Fisher
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Pardeep Kumar
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Alan Thompson
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - David Nicol
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Erik Mayer
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Tim Dudderidge
- Royal Marsden NHS Foundation Trust, London and Sutton, UK
| | - Nimish C Shah
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
| | - Vincent Gnanapragasam
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
| | - Thierry Voet
- Laboratory of Reproductive Genomics, Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Peter Campbell
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Andrew Futreal
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Douglas Easton
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
| | - Anne Y Warren
- Department of Histopathology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
| | - Hayley C Whitaker
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
| | - Ultan McDermott
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK
| | - Daniel S Brewer
- Division of Genetics and Epidemiology, The Institute Of Cancer Research, London, UK
- Norwich Medical School, University of East Anglia, Norwich, UK
- The Genome Analysis Centre, Norwich, UK
| | - David E Neal
- Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, Cambridge, UK
- Department of Surgical Oncology, University of Cambridge, Addenbrooke's Hospital, Cambridge, UK
|
168
|
Nakagawa H, Wardell CP, Furuta M, Taniguchi H, Fujimoto A. Cancer whole-genome sequencing: present and future. Oncogene 2015; 34:5943-50. [PMID: 25823020 DOI: 10.1038/onc.2015.90] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Received: 01/18/2015] [Revised: 02/27/2015] [Accepted: 02/27/2015] [Indexed: 12/26/2022]
Abstract
Recent explosive advances in next-generation sequencing technology and computational approaches to massive data enable us to analyze a number of cancer genome profiles by whole-genome sequencing (WGS). To explore cancer genomic alterations and their diversity comprehensively, global and local cancer genome-sequencing projects, including ICGC and TCGA, have been analyzing many types of cancer genomes mainly by exome sequencing. However, there is limited information on somatic mutations in non-coding regions, including untranslated regions, introns, regulatory elements and non-coding RNAs. Rearrangements, which sometimes produce fusion genes, and pathogen detection in cancer genomes also remain largely unexplored. WGS approaches can detect these unexplored mutations, as well as coding mutations and somatic copy number alterations, and help us to better understand the whole landscape of cancer genomes and elucidate functions of these unexplored genomic regions. Analysis of cancer genomes using the present WGS platforms is still primitive and there are substantial improvements to be made in sequencing technologies, informatics and computer resources. Given the extreme diversity of cancer genomes and phenotypes, it is also necessary to analyze much more WGS data and integrate these with multi-omics data, functional data and clinical-pathological data in a large number of sample sets to interpret them more fully and efficiently.
Affiliation(s)
- H Nakagawa
- Laboratory for Genome Sequencing Analysis, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - C P Wardell
- Laboratory for Genome Sequencing Analysis, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - M Furuta
- Laboratory for Genome Sequencing Analysis, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - H Taniguchi
- Laboratory for Genome Sequencing Analysis, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - A Fujimoto
- Laboratory for Genome Sequencing Analysis, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
|
169
|
Wang M, Beck CR, English AC, Meng Q, Buhay C, Han Y, Doddapaneni HV, Yu F, Boerwinkle E, Lupski JR, Muzny DM, Gibbs RA. PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations. BMC Genomics 2015; 16:214. [PMID: 25887218 PMCID: PMC4376517 DOI: 10.1186/s12864-015-1370-2] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 09/23/2014] [Accepted: 02/20/2015] [Indexed: 11/24/2022] Open
Abstract
Background: Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high. Results: We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki–Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants. Conclusions: The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation, could also benefit from this technology.
Affiliation(s)
- Min Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Christine R Beck
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Adam C English
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Qingchang Meng
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Christian Buhay
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Yi Han
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Harsha V Doddapaneni
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Fuli Yu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Eric Boerwinkle
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA; Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
| | - James R Lupski
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Donna M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.
|
170
|
Priyadarshana WJRM, Sofronov G. Multiple Break-Points Detection in Array CGH Data via the Cross-Entropy Method. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:487-498. [PMID: 26357234 DOI: 10.1109/tcbb.2014.2361639] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 06/05/2023]
Abstract
Array comparative genome hybridization (aCGH) is a widely used methodology to detect copy number variations of a genome in high resolution. Knowing the number of break-points and their corresponding locations in genomic sequences serves different biological needs. Primarily, it helps to identify disease-causing genes that have functional importance in characterizing genome-wide diseases. For human autosomes the normal copy number is two, whereas at the sites of oncogenes it increases (gain of DNA) and at the tumour suppressor genes it decreases (loss of DNA). The majority of the current detection methods are deterministic in their set-up and use dynamic programming or different smoothing techniques to obtain the estimates of copy number variations. These approaches limit the search space of the problem due to different assumptions considered in the methods and do not represent the true nature of the uncertainty associated with the unknown break-points in genomic sequences. We propose the Cross-Entropy method, which is a model-based stochastic optimization technique as an exact search method, to estimate both the number and locations of the break-points in aCGH data. We model the continuous scale log-ratio data obtained by the aCGH technique as a multiple break-point problem. The proposed methodology is compared with well-established publicly available methods using both artificially generated data and real data. Results show that the proposed procedure is an effective way of estimating the number and especially the locations of break-points with a high level of precision. Availability: The methods described in this article are implemented in the new R package breakpoint and it is available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=breakpoint.
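The general idea of a cross-entropy search for a break-point in log-ratio data can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce the R package breakpoint: it assumes a single break-point, a piecewise-constant least-squares cost, and a Gaussian sampling distribution over positions, and all function names and parameters (`segment_cost`, `total_cost`, `ce_single_break`, `elite_frac`, and so on) are hypothetical.

```python
import math
import random


def segment_cost(values, start, end):
    """Sum of squared deviations from the mean over values[start:end]."""
    seg = values[start:end]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def total_cost(values, breaks):
    """Cost of a piecewise-constant fit with the given sorted break-points."""
    edges = [0] + list(breaks) + [len(values)]
    return sum(segment_cost(values, a, b) for a, b in zip(edges, edges[1:]))


def ce_single_break(values, n_samples=200, elite_frac=0.1, n_iter=30, seed=0):
    """Cross-entropy style search for one break-point (illustrative only)."""
    rng = random.Random(seed)
    n = len(values)
    # Assumed initial sampling distribution: Gaussian centred on the midpoint.
    mu, sigma = n / 2.0, n / 4.0
    best = None
    for _ in range(n_iter):
        # Sample candidate positions, clipped so both segments are non-empty.
        cands = [min(n - 1, max(1, round(rng.gauss(mu, sigma))))
                 for _ in range(n_samples)]
        cands.sort(key=lambda b: total_cost(values, [b]))
        elite = cands[: max(2, int(elite_frac * n_samples))]
        # Remember the lowest-cost candidate seen in any iteration.
        if best is None or total_cost(values, [elite[0]]) < total_cost(values, [best]):
            best = elite[0]
        # Refit the sampling distribution to the elite candidates.
        mu = sum(elite) / len(elite)
        sigma = math.sqrt(sum((b - mu) ** 2 for b in elite) / len(elite))
        if sigma < 0.5:  # sampling distribution has collapsed; stop early
            break
    return best


if __name__ == "__main__":
    # Toy log-ratio profile: 30 normal probes followed by 20 probes with a gain.
    profile = [0.0] * 30 + [0.58] * 20
    print(ce_single_break(profile))
```

Extending the sketch to an unknown number of break-points, as the abstract describes, would require sampling vectors of positions and a criterion for choosing their number; that part is left out here.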
|
171
|
ICRmax: an optimized approach to detect tumor-specific interchromosomal rearrangements for clinical application. Genomics 2015; 105:265-72. [PMID: 25666663 DOI: 10.1016/j.ygeno.2015.01.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/26/2014] [Revised: 01/23/2015] [Accepted: 01/26/2015] [Indexed: 01/26/2023]
Abstract
Somatically acquired chromosomal rearrangements occur at early stages during tumorigenesis and can be used to indirectly detect tumor cells, serving as highly sensitive and tumor-specific biomarkers. Advances in high-throughput sequencing have allowed the genome-wide identification of patient-specific chromosomal rearrangements to be used as personalized biomarkers to efficiently assess response to treatment, detect residual disease and monitor disease recurrence. However, sequencing and data processing costs still represent major obstacles for the widespread application of personalized biomarkers in oncology. We developed a computational pipeline (ICRmax) for the cost-effective identification of a minimal set of tumor-specific interchromosomal rearrangements (ICRs). We examined ICRmax performance on sequencing data from rectal tumors and simulated data achieving an average accuracy of 68% for ICR identification. ICRmax identifies ICRs from low-coverage sequenced tumors, eliminates the need to sequence a matched normal tissue and significantly reduces the costs that limit the utilization of personalized biomarkers in the clinical setting.
|
172
|
Brynildsrud O, Snipen LG, Bohlin J. CNOGpro: detection and quantification of CNVs in prokaryotic whole-genome sequencing data. Bioinformatics 2015; 31:1708-15. [DOI: 10.1093/bioinformatics/btv070] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 04/22/2014] [Accepted: 01/28/2015] [Indexed: 01/22/2023] Open
|
173
|
Abstract
Background: Next generation sequencing (NGS) technologies have made it possible to exhaustively detect structural variations (SVs) in genomes. Although various methods for detecting SVs have been developed, the global structure of chromosomes, i.e., how segments in a reference genome are extracted and ordered in an unknown target genome, cannot be inferred by detecting only individual SVs. Results: Here, we formulate the problem of inferring the global structure of chromosomes from SVs as an optimization problem on a bidirected graph. This problem takes into account the aberrant adjacencies of genomic regions, the copy numbers, and the number and length of chromosomes. Although the problem is NP-complete, we propose its polynomial-time solvable variation by restricting instances of the problem using a biologically meaningful condition, which we call the weakly connected constraint. We also explain how to obtain experimental data that satisfies the weakly connected constraint. Conclusion: Our results establish a theoretical foundation for the development of practical computational tools that could be used to infer the global structure of chromosomes based on SVs. The computational complexity of the inference can be reduced by detecting the segments of the reference genome at the ends of the chromosomes of the target genome and also the segments that are known to exist in the target genome.
|
174
|
de Pagter MS, Kloosterman WP. The Diverse Effects of Complex Chromosome Rearrangements and Chromothripsis in Cancer Development. Recent Results Cancer Res 2015; 200:165-193. [PMID: 26376877 DOI: 10.1007/978-3-319-20291-4_8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]
Abstract
In recent years, enormous progress has been made with respect to the identification of somatic mutations that contribute to cancer development. Mutation types range from small substitutions to large structural genomic rearrangements, including complex reshuffling of the genome. Sets of mutations in individual cancer genomes may show specific signatures, which can be provoked by both exogenous and endogenous forces. One of the most remarkable mutation patterns observed in human cancers involves massive rearrangement of just a few chromosomal regions. This phenomenon has been termed chromothripsis and appears widespread in a multitude of cancer types. Chromothripsis provides a way for cancer to rapidly evolve through a one-off massive change in genome structure as opposed to a gradual process of mutation and selection. This chapter focuses on the origin, prevalence and impact of chromothripsis and related complex genomic rearrangements during cancer development.
Affiliation(s)
- Mirjam S de Pagter
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, 3584 CG, Utrecht, The Netherlands
| | - Wigard P Kloosterman
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, 3584 CG, Utrecht, The Netherlands.
|
175
|
Saber A, van der Wekken A, Hiltermann TJ, Kok K, van den Berg A, Groen HJ. Genomic aberrations guiding treatment of non-small cell lung cancer patients. ACTA ACUST UNITED AC 2015. [DOI: 10.1016/j.ctrc.2015.03.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 12/17/2022]
|
176
|
Oleksiewicz U, Tomczak K, Woropaj J, Markowska M, Stępniak P, Shah PK. Computational characterisation of cancer molecular profiles derived using next generation sequencing. Contemp Oncol (Pozn) 2015; 19:A78-91. [PMID: 25691827 PMCID: PMC4322529 DOI: 10.5114/wo.2014.47137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023] Open
Abstract
Our current understanding of cancer genetics is grounded in the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from the raw reads generated by sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets.
Affiliation(s)
- Urszula Oleksiewicz
- Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland ; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland ; These authors contributed equally to this paper
| | - Katarzyna Tomczak
- Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland ; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland ; Postgraduate School of Molecular Medicine, Medical University of Warsaw, Warsaw ; These authors contributed equally to this paper
| | - Jakub Woropaj
- Poznan University of Economics, Poznań, Poland ; These authors contributed equally to this paper
| | | | | | - Parantu K Shah
- Institute for Applied Cancer Science, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
|
177
|
A comprehensive transcriptional portrait of human cancer cell lines. Nat Biotechnol 2014; 33:306-12. [PMID: 25485619 DOI: 10.1038/nbt.3080] [Citation(s) in RCA: 476] [Impact Index Per Article: 43.3] [Received: 04/11/2014] [Accepted: 10/15/2014] [Indexed: 12/17/2022]
Abstract
Tumor-derived cell lines have served as vital models to advance our understanding of oncogene function and therapeutic responses. Although substantial effort has been made to define the genomic constitution of cancer cell line panels, the transcriptome remains understudied. Here we describe RNA sequencing and single-nucleotide polymorphism (SNP) array analysis of 675 human cancer cell lines. We report comprehensive analyses of transcriptome features including gene expression, mutations, gene fusions and expression of non-human sequences. Of the 2,200 gene fusions catalogued, 1,435 consist of genes not previously found in fusions, providing many leads for further investigation. We combine multiple genome and transcriptome features in a pathway-based approach to enhance prediction of response to targeted therapeutics. Our results provide a valuable resource for studies that use cancer cell lines.
|
178
|
Hofmeister W, Nilsson D, Topa A, Anderlid BM, Darki F, Matsson H, Tapia Páez I, Klingberg T, Samuelsson L, Wirta V, Vezzi F, Kere J, Nordenskjöld M, Syk Lundberg E, Lindstrand A. CTNND2-a candidate gene for reading problems and mild intellectual disability. J Med Genet 2014; 52:111-22. [PMID: 25473103 DOI: 10.1136/jmedgenet-2014-102757] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 11/03/2022]
Abstract
BACKGROUND Cytogenetically visible chromosomal translocations are highly informative as they can pinpoint strong effect genes even in complex genetic disorders. METHODS AND RESULTS Here, we report a mother and daughter, both with borderline intelligence and learning problems within the dyslexia spectrum, and two apparently balanced reciprocal translocations: t(1;8)(p22;q24) and t(5;18)(p15;q11). By low coverage mate-pair whole-genome sequencing, we were able to pinpoint the genomic breakpoints to 2 kb intervals. By direct sequencing, we then located the chromosome 5p breakpoint to intron 9 of CTNND2. An additional case with a 163 kb microdeletion exclusively involving CTNND2 was identified with genome-wide array comparative genomic hybridisation. This microdeletion at 5p15.2 is also present in mosaic state in the patient's mother but absent from the healthy siblings. We then investigated the effect of CTNND2 polymorphisms on normal variability and identified a polymorphism (rs2561622) with significant effect on phonological ability and white matter volume in the left frontal lobe, close to cortical regions previously associated with phonological processing. Finally, given the potential role of CTNND2 in neuron motility, we used morpholino knockdown in zebrafish embryos to assess its effects on neuronal migration in vivo. Analysis of the zebrafish forebrain revealed a subpopulation of neurons misplaced between the diencephalon and telencephalon. CONCLUSIONS Taken together, our human genetic and in vivo data suggest that defective migration of subpopulations of neuronal cells due to haploinsufficiency of CTNND2 contributes to the cognitive dysfunction in our patients.
Affiliation(s)
- Wolfgang Hofmeister
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Daniel Nilsson
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden Science for Life Laboratory, Karolinska Institutet Science Park, Solna, Sweden
| | - Alexandra Topa
- Department of Clinical Genetics, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Britt-Marie Anderlid
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
| | - Fahimeh Darki
- Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden
| | - Hans Matsson
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden
| | - Isabel Tapia Páez
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden
| | - Torkel Klingberg
- Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden
| | - Lena Samuelsson
- Department of Clinical Genetics, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Valtteri Wirta
- SciLifeLab, School of Biotechnology, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Francesco Vezzi
- SciLifeLab, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
| | - Juha Kere
- Department of Biosciences and Nutrition, Center for Innovative Medicine, Karolinska Institutet, Huddinge, Sweden Molecular Neurology Research Program, University of Helsinki, and Folkhälsan Institute of Genetics, Helsinki, Finland
| | - Magnus Nordenskjöld
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
| | - Elisabeth Syk Lundberg
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
| | - Anna Lindstrand
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
|
179
|
Vogt N, Gibaud A, Lemoine F, de la Grange P, Debatisse M, Malfoy B. Amplicon rearrangements during the extrachromosomal and intrachromosomal amplification process in a glioma. Nucleic Acids Res 2014; 42:13194-205. [PMID: 25378339 PMCID: PMC4245956 DOI: 10.1093/nar/gku1101] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 09/19/2014] [Revised: 10/22/2014] [Accepted: 10/23/2014] [Indexed: 01/18/2023] Open
Abstract
The mechanisms of gene amplification in tumour cells are poorly understood and the relationship between extrachromosomal DNA molecules, named double minutes (dmins), and intrachromosomal homogeneously staining regions (hsr) is not documented at nucleotide resolution. Using fluorescent in situ hybridization and whole genome sequencing, we studied a xenografted human oligodendroglioma where the co-amplification of the EGFR and MYC loci was present in the form of dmins at early passages and of an hsr at later passages. The amplified regions underwent multiple rearrangements and deletions during the formation of the dmins and their transformation into hsr. In both forms of amplification, non-homologous end-joining and microhomology-mediated end-joining rather than replication repair mechanisms prevailed in fusions. Small fragments, some of a few tens of base pairs, were associated in contigs. They came from clusters of breakpoints localized hundreds of kilobases apart in the amplified regions. The characteristics of some pairs of junctions suggest that at least some fragments were not fused randomly but could result from the concomitant repair of neighbouring breakpoints during the interaction of remote DNA sequences. This characterization at nucleotide resolution of the transition between extra- and intrachromosome amplifications highlights a hitherto uncharacterized organization of the amplified regions suggesting the involvement of new mechanisms in their formation.
Affiliation(s)
- Nicolas Vogt
- Institut Curie, Centre de Recherche, F-75248 Paris, France CNRS, UMR3244, F-75248 Paris, France UPMC, F-75248 Paris, France
| | - Anne Gibaud
- Institut Curie, Centre de Recherche, F-75248 Paris, France CNRS, UMR3244, F-75248 Paris, France UPMC, F-75248 Paris, France
| | | | | | - Michelle Debatisse
- Institut Curie, Centre de Recherche, F-75248 Paris, France CNRS, UMR3244, F-75248 Paris, France UPMC, F-75248 Paris, France
| | - Bernard Malfoy
- Institut Curie, Centre de Recherche, F-75248 Paris, France CNRS, UMR3244, F-75248 Paris, France UPMC, F-75248 Paris, France
|
180
|
Toward product attribute control: developments from genome sequencing. Curr Opin Biotechnol 2014; 30:40-4. [DOI: 10.1016/j.copbio.2014.05.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/24/2014] [Accepted: 05/02/2014] [Indexed: 01/16/2023]
|
181
|
Gonçalves da Silva A, Barendse W, Kijas JW, Barris WC, McWilliam S, Bunch RJ, McCullough R, Harrison B, Hoelzel AR, England PR. SNP discovery in nonmodel organisms: strand bias and base-substitution errors reduce conversion rates. Mol Ecol Resour 2014; 15:723-36. [PMID: 25388640 DOI: 10.1111/1755-0998.12343] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 07/06/2014] [Revised: 10/31/2014] [Accepted: 11/03/2014] [Indexed: 11/28/2022]
Abstract
Single nucleotide polymorphisms (SNPs) have become the marker of choice for genetic studies in organisms of conservation, commercial or biological interest. Most SNP discovery projects in nonmodel organisms apply a strategy for identifying putative SNPs based on filtering rules that account for random sequencing errors. Here, we analyse data used to develop 4723 novel SNPs for the commercially important deep-sea fish, orange roughy (Hoplostethus atlanticus), to assess the impact of not accounting for systematic sequencing errors when filtering identified polymorphisms during SNP discovery. We used SAMtools to identify polymorphisms in a Velvet assembly of genomic DNA sequence data from seven individuals. The resulting set of polymorphisms was filtered to minimize 'bycatch', that is, polymorphisms caused by sequencing or assembly error. An Illumina Infinium SNP chip was used to genotype a final set of 7714 polymorphisms across 1734 individuals. Five predictors were examined for their effect on the probability of obtaining an assayable SNP: depth of coverage, number of reads that support a variant, polymorphism type (e.g. A/C), strand bias and Illumina SNP probe design score. Our results indicate that filtering out systematic sequencing errors could substantially improve the efficiency of SNP discovery. We show that BLASTX can be used as an efficient tool to identify single-copy genomic regions in the absence of a reference genome. The results have implications for research aiming to identify assayable SNPs and build SNP genotyping assays for nonmodel organisms.
Affiliation(s)
- Anders Gonçalves da Silva
- CSIRO Oceans and Atmosphere, GPO Box 1538, Hobart, Tas., 7001, Australia.,School of Biological Sciences, Monash University, 18 Innovation Walk, Clayton, Vic, 3800, Australia
| | - William Barendse
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St Lucia, Qld, 4067, Australia
| | - James W Kijas
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St Lucia, Qld, 4067, Australia
| | - Wes C Barris
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St Lucia, Qld, 4067, Australia
| | - Sean McWilliam
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St Lucia, Qld, 4067, Australia
| | - Rowan J Bunch
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St Lucia, Qld, 4067, Australia
| | - Russell McCullough
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St Lucia, Qld, 4067, Australia
| | - Blair Harrison
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St Lucia, Qld, 4067, Australia
| | - A Rus Hoelzel
- School of Biological and Biomedical Sciences, Durham University, Durham, DH1 3LE, UK
|
182
|
Frenkel-Morgenstern M, Gorohovski A, Vucenovic D, Maestre L, Valencia A. ChiTaRS 2.1--an improved database of the chimeric transcripts and RNA-seq data with novel sense-antisense chimeric RNA transcripts. Nucleic Acids Res 2014; 43:D68-75. [PMID: 25414346 PMCID: PMC4383979 DOI: 10.1093/nar/gku1199] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 12/20/2022] Open
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS 2.1 database of chimeric transcripts and RNA-Seq data (http://chitars.bioinfo.cnio.es/) is the second version of the ChiTaRS database and includes improvements in content and functionality. Chimeras from eight organisms have been collated, including novel sense–antisense (SAS) chimeras resulting from the slippage of the sense and anti-sense intragenic regions. The new database version collects more than 29 000 chimeric transcripts and indicates the expression and tissue specificity for 333 entries confirmed by RNA-seq reads mapping to the chimeric junction sites. The user interface allows for rapid and easy analysis of the evolutionary conservation of fusions, literature references and experimental data supporting fusions in different organisms. More than 1428 cancer breakpoints have been automatically collected from public databases and manually verified to identify their correct cross-references, genomic sequences and junction sites. As a result, the ChiTaRS 2.1 collection of chimeras from eight organisms and human cancer breakpoints extends our understanding of the evolution of chimeric transcripts in eukaryotes as well as their functional role in carcinogenic processes.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Alessandro Gorohovski
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Dunja Vucenovic
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Lorena Maestre
- Monoclonal Antibodies Unit, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Alfonso Valencia
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain.
|
183
|
Abstract
Genomic instability is a hallmark of cancer that leads to an increase in genetic alterations, thus enabling the acquisition of additional capabilities required for tumorigenesis and progression. Substantial heterogeneity in the amount and type of instability (nucleotide, microsatellite, or chromosomal) exists both within and between cancer types, with epithelial tumors typically displaying a greater degree of instability than hematological cancers. While high-throughput sequencing studies offer a comprehensive record of the genetic alterations within a tumor, detecting the rate of instability or cell-to-cell variability using this and most other available methods remains a challenge. Here, we discuss the different levels of genomic instability occurring in human cancers and touch on the current methods and limitations of detecting instability. We have applied one such approach to the surveying of public tumor data to provide a cursory view of genome instability across numerous tumor types.
Affiliation(s)
- Larissa Pikor
- Department of Integrative Oncology, BC Cancer Research Centre, 675 West 10th Ave, Vancouver, BC, Canada, V5Z 1L3,
|
184
|
Gillet-Markowska A, Richard H, Fischer G, Lafontaine I. Ulysses: accurate detection of low-frequency structural variations in large insert-size sequencing libraries. Bioinformatics 2014; 31:801-8. [PMID: 25380961 DOI: 10.1093/bioinformatics/btu730] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
MOTIVATION The detection of structural variations (SVs) in short-range Paired-End (PE) libraries remains challenging because SV breakpoints can involve large dispersed repeated sequences, or carry inherent complexity that is hard to resolve with classical PE sequencing data. In contrast, large insert-size sequencing libraries (Mate-Pair libraries) provide higher physical coverage of the genome and give access to repeat-containing regions. They can thus theoretically overcome previous limitations as they are becoming routinely accessible. Nevertheless, broad insert size distributions and high rates of chimeric sequences are usually associated with this type of library, which makes the accurate annotation of SVs challenging. RESULTS Here, we present Ulysses, a tool that achieves drastically higher detection accuracy than existing tools, both on simulated and real mate-pair sequencing datasets from the 1000 Human Genome project. Ulysses achieves high specificity over the complete spectrum of variants by assessing, in a principled manner, the statistical significance of each possible variant (duplications, deletions, translocations, insertions and inversions) against an explicit model for the generation of experimental noise. This statistical model proves particularly useful for the detection of low frequency variants. SV detection performed on a large insert Mate-Pair library from a breast cancer sample revealed a high level of somatic duplications in the tumor and, to a lesser extent, in the blood sample as well. Altogether, these results show that Ulysses is a valuable tool for the characterization of somatic mosaicism in human tissues and in cancer genomes.
Affiliation(s)
- Alexandre Gillet-Markowska
- Sorbonne Universités, UPMC University Paris 06, UMR 7238, Biologie Computationnelle et Quantitative and CNRS, UMR7238, Laboratory of Computational and Quantitative Biology, F-75005 Paris, France
| | - Hugues Richard
- Sorbonne Universités, UPMC University Paris 06, UMR 7238, Biologie Computationnelle et Quantitative and CNRS, UMR7238, Laboratory of Computational and Quantitative Biology, F-75005 Paris, France
| | - Gilles Fischer
- Sorbonne Universités, UPMC University Paris 06, UMR 7238, Biologie Computationnelle et Quantitative and CNRS, UMR7238, Laboratory of Computational and Quantitative Biology, F-75005 Paris, France
| | - Ingrid Lafontaine
- Sorbonne Universités, UPMC University Paris 06, UMR 7238, Biologie Computationnelle et Quantitative and CNRS, UMR7238, Laboratory of Computational and Quantitative Biology, F-75005 Paris, France
|
185
|
Detection of MYD88 L265P Mutation by Real-Time Allele-Specific Oligonucleotide Polymerase Chain Reaction. Appl Immunohistochem Mol Morphol 2014; 22:768-73. [DOI: 10.1097/pai.0000000000000020] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 12/22/2022]
|
186
|
Somatic mutations, genome mosaicism, cancer and aging. Curr Opin Genet Dev 2014; 26:141-9. [PMID: 25282114 DOI: 10.1016/j.gde.2014.04.002] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.0] [Received: 01/06/2014] [Revised: 04/04/2014] [Accepted: 04/11/2014] [Indexed: 01/11/2023]
Abstract
Genomes are inherently unstable due to the need for DNA sequence variation in the germ line to fuel evolution through natural selection. In somatic tissues mutations accumulate during development and aging, generating genome mosaics. There is little information about the possible causal role of increased somatic mutation loads in late-life disease and aging, with the exception of cancer. Characterizing somatic mutations and their functional consequences in normal tissues remains a formidable challenge due to their low, individual abundance. Here, I will briefly review our current knowledge of somatic mutations in animals and humans in relation to aging, how they arise and lead to genome mosaicism, the technology to study somatic mutations and how they possibly could cause non-clonal disease.
|
187
|
Wright CM, Yang IA, Bowman RV, Fong KM. The potential of genome-wide analyses to improve non-small-cell lung cancer care. Lung Cancer Manag 2014. [DOI: 10.2217/lmt.14.31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open
Abstract
SUMMARY Genomic technologies have revolutionized the way we study and understand cancer. The advent of next-generation sequencing technology in particular is now starting to change the clinical management of non-small-cell lung cancer. These technologies have helped us to refine prognostication and identify new driver mutations that can allow subselection of patients for therapeutic intervention. However, several limitations and challenges must be overcome before these technologies are widely accepted in diagnostic laboratories. It will be important for clinicians and diagnostic laboratories to consider sample type, analytical platform, cost, data security and ethics, and the bioinformatics challenges associated with 'big data', before widespread integration to the clinic. If these challenges can be overcome, then genomics has the potential to change clinical management of lung cancer.
Affiliation(s)
- Casey M Wright
- Asbestos Diseases Research Institute, Sydney, NSW, Australia
| | - Ian A Yang
- Department of Thoracic Medicine, The Prince Charles Hospital, 627 Rode Road, Chermside, QLD 4032, Australia
- University of Queensland Thoracic Research Centre, School of Medicine, The University of Queensland, Brisbane, QLD, Australia
| | - Rayleen V Bowman
- Department of Thoracic Medicine, The Prince Charles Hospital, 627 Rode Road, Chermside, QLD 4032, Australia
- University of Queensland Thoracic Research Centre, School of Medicine, The University of Queensland, Brisbane, QLD, Australia
| | - Kwun M Fong
- Department of Thoracic Medicine, The Prince Charles Hospital, 627 Rode Road, Chermside, QLD 4032, Australia
- University of Queensland Thoracic Research Centre, School of Medicine, The University of Queensland, Brisbane, QLD, Australia
|
188
|
Chinen Y, Sakamoto N, Nagoshi H, Taki T, Maegawa S, Tatekawa S, Tsukamoto T, Mizutani S, Shimura Y, Yamamoto-Sugitani M, Kobayashi T, Matsumoto Y, Horiike S, Kuroda J, Taniwaki M. 8q24 amplified segments involve novel fusion genes between NSMCE2 and long noncoding RNAs in acute myelogenous leukemia. J Hematol Oncol 2014; 7:68. [PMID: 25245984 PMCID: PMC4176872 DOI: 10.1186/s13045-014-0068-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 09/02/2014] [Accepted: 09/11/2014] [Indexed: 11/22/2022] Open
Abstract
The pathogenetic roles of 8q24 amplified segments in leukemic cells with double minute chromosomes remain to be verified. Through comprehensive molecular analyses of 8q24 amplicons in leukemic cells from an acute myelogenous leukemia (AML) patient and in the AML-derived cell line HL60, we identified two novel fusion genes between NSMCE2 and long noncoding RNAs (lncRNAs), namely, PVT1-NSMCE2 and BF104016-NSMCE2. Our study suggests that 8q24 amplicons are associated with the emergence of aberrant chimeric genes between NSMCE2 and oncogenic lncRNAs, and also implies that chimeric genes involving lncRNAs potentially possess as-yet-unknown oncogenic functional roles.
|
189
|
Wu J, Lee WP, Ward A, Walker JA, Konkel MK, Batzer MA, Marth GT. Tangram: a comprehensive toolbox for mobile element insertion detection. BMC Genomics 2014; 15:795. [PMID: 25228379 PMCID: PMC4180832 DOI: 10.1186/1471-2164-15-795] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 02/28/2014] [Accepted: 09/03/2014] [Indexed: 11/10/2022] Open
Abstract
Background Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution. Although most of these elements are now fixed in the population, some MEs, including ALU, L1, SVA and HERV-K elements, are still actively duplicating. Mobile element insertions (MEIs) have been associated with human genetic disorders, including Crohn’s disease, hemophilia, and various types of cancer, motivating the need for accurate MEI detection methods. To comprehensively identify and accurately characterize these variants in whole genome next-generation sequencing (NGS) data, a computationally efficient detection and genotyping method is required. Current computational tools are unable to call MEI polymorphisms with sufficiently high sensitivity and specificity, or call individual genotypes with sufficiently high accuracy. Results Here we report Tangram, a computationally efficient MEI detection program that integrates read-pair (RP) and split-read (SR) mapping signals to detect MEI events. By utilizing SR mapping in its primary detection module, a feature unique to this software, Tangram is able to pinpoint MEI breakpoints with single-nucleotide precision. To understand the role of MEI events in disease, it is essential to produce accurate individual genotypes in clinical samples. Tangram is able to determine sample genotypes with very high accuracy. Using simulations and experimental datasets, we demonstrate that Tangram has superior sensitivity, specificity, breakpoint resolution and genotyping accuracy, when compared to other, recently developed MEI detection methods. Conclusions Tangram serves as the primary MEI detection tool in the 1000 Genomes Project, and is implemented as a highly portable, memory-efficient, easy-to-use C++ computer program, built under an open-source development model.
Affiliation(s)
| | | | | | | | | | | | - Gabor T Marth
- Department of Human Genetics and USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, Utah, USA.
|
190
|
Drabovich AP, Martínez-Morillo E, Diamandis EP. Toward an integrated pipeline for protein biomarker development. Biochim Biophys Acta Proteins Proteom 2014; 1854:677-86. [PMID: 25218201 DOI: 10.1016/j.bbapap.2014.09.006] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 05/29/2014] [Revised: 08/08/2014] [Accepted: 09/04/2014] [Indexed: 01/06/2023]
Abstract
Protein biomarker development is a multidisciplinary task involving basic, translational and clinical research. Integration of multidisciplinary efforts in a single pipeline is challenging, but crucial to facilitate rational discovery of protein biomarkers and alleviate existing disappointments in the field. In this review, we discuss in detail the individual phases of the biomarker development pipeline, such as biomarker candidate identification, verification and validation. We focus on mass spectrometry as a principal technique for protein identification and quantification, and discuss complementary -omics approaches for selection of biomarker candidates. Proteomic samples, protein-based clinical laboratory tests and limitations of biomarker development are reviewed in detail, and a critical assessment of all phases of the biomarker development pipeline is provided. This article is part of a Special Issue entitled: Medical Proteomics.
Affiliation(s)
- Andrei P Drabovich
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada.
- Eleftherios P Diamandis
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada; Department of Clinical Biochemistry, University Health Network, Toronto, ON, Canada; Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Toronto, ON, Canada.
|
191
|
Liu B, Morrison CD, Johnson CS, Trump DL, Qin M, Conroy JC, Wang J, Liu S. Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges. Oncotarget 2014; 4:1868-81. [PMID: 24240121 PMCID: PMC3875755 DOI: 10.18632/oncotarget.1537] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Indexed: 12/27/2022] Open
Abstract
Accurate detection of somatic copy number variations (CNVs) is an essential part of cancer genome analysis, and plays an important role in oncotarget identifications. Next generation sequencing (NGS) holds the promise to revolutionize somatic CNV detection. In this review, we provide an overview of current analytic tools used for CNV detection in NGS-based cancer studies. We summarize the NGS data types used for CNV detection, decipher the principles for data preprocessing, segmentation, and interpretation, and discuss the challenges in somatic CNV detection. This review aims to provide a guide to the analytic tools used in NGS-based cancer CNV studies, and to discuss the important factors that researchers need to consider when analyzing NGS data for somatic CNV detections.
Affiliation(s)
- Biao Liu
- Center for Personalized Medicine, Roswell Park Cancer Institute, Buffalo, NY
|
192
|
Kadalayil L, Rafiq S, Rose-Zerilli MJJ, Pengelly RJ, Parker H, Oscier D, Strefford JC, Tapper WJ, Gibson J, Ennis S, Collins A. Exome sequence read depth methods for identifying copy number changes. Brief Bioinform 2014; 16:380-92. [PMID: 25169955 DOI: 10.1093/bib/bbu027] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Received: 04/20/2014] [Accepted: 07/10/2014] [Indexed: 01/04/2023] Open
Abstract
Copy number variants (CNVs) play important roles in a number of human diseases and in pharmacogenetics. Powerful methods exist for CNV detection in whole genome sequencing (WGS) data, but such data are costly to obtain. Many disease-causal CNVs span or are found in genome coding regions (exons), which makes CNV detection using whole exome sequencing (WES) data attractive. If reliably validated against WGS-based CNVs, exome-derived CNVs have potential applications in a clinical setting. Several algorithms have been developed to exploit exome data for CNV detection, and comparisons have been made to find the most suitable methods for particular data samples. The results are not consistent across studies. Here, we review some of the exome CNV detection methods based on depth of coverage profiles and examine their performance to identify problems contributing to discrepancies in published results. We also present a streamlined strategy that uses a single metric, the likelihood ratio, to compare exome methods, and we demonstrate its utility with the VarScan 2 and eXome Hidden Markov Model (XHMM) programs, using paired normal and tumour exome data from chronic lymphocytic leukaemia patients. We use array-based somatic CNV (SCNV) calls as a reference standard to compute prevalence-independent statistics, such as sensitivity, specificity and likelihood ratio, for validation of the exome-derived SCNVs. We also account for factors known to influence the performance of exome read depth methods, such as CNV size and frequency, while comparing our findings with published results.
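The prevalence-independent statistics named in this abstract can be illustrated with a short sketch. It assumes that exome-derived calls have already been compared region by region against a reference standard and tallied into a 2x2 table; the `Confusion` class, the function names and the example counts are hypothetical and are not taken from the paper. The likelihood ratios use the standard diagnostic-test definitions, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity, which may differ in detail from how the authors computed theirs.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of calls compared against a reference standard (hypothetical layout)."""
    tp: int  # reference CNV, called by the exome method
    fp: int  # no reference CNV, but called by the exome method
    fn: int  # reference CNV, missed by the exome method
    tn: int  # no reference CNV, and not called


def sensitivity(c: Confusion) -> float:
    # Fraction of reference CNVs that the method recovers.
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Fraction of reference-negative regions the method leaves uncalled.
    return c.tn / (c.tn + c.fp)


def positive_likelihood_ratio(c: Confusion) -> float:
    # LR+ = sensitivity / (1 - specificity); infinite when there are no false positives.
    false_positive_rate = 1.0 - specificity(c)
    if false_positive_rate == 0.0:
        return float("inf")
    return sensitivity(c) / false_positive_rate


def negative_likelihood_ratio(c: Confusion) -> float:
    # LR- = (1 - sensitivity) / specificity; infinite when specificity is zero.
    spec = specificity(c)
    if spec == 0.0:
        return float("inf")
    return (1.0 - sensitivity(c)) / spec


if __name__ == "__main__":
    # Made-up counts for two hypothetical methods scored on the same regions.
    for name, counts in {
        "method_a": Confusion(tp=80, fp=10, fn=20, tn=90),
        "method_b": Confusion(tp=60, fp=2, fn=40, tn=98),
    }.items():
        print(name, round(positive_likelihood_ratio(counts), 2),
              round(negative_likelihood_ratio(counts), 2))
```

Because neither ratio depends on how common CNVs are in the sample set, a single number such as LR+ can be compared across methods and cohorts, which is the appeal of the single-metric strategy described above.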
|
193
|
Duan J, Zhang JG, Wan M, Deng HW, Wang YP. Population clustering based on copy number variations detected from next generation sequencing data. J Bioinform Comput Biol 2014; 12:1450021. [PMID: 25152046 DOI: 10.1142/s0219720014500218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Copy number variations (CNVs) can be used as significant biomarkers, and next generation sequencing (NGS) provides high-resolution detection of these CNVs. However, how to extract features from CNVs and apply them to genomic studies such as population clustering has become a major challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into the source matrix and weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulation data and two real data sets from the 1000 Genomes Project. The results on simulation data demonstrate that the proposed method can recover the true common CNVs with high quality. The results of the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results of the second real data analysis show that the proposed method can be applied to whole-genome data with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering.
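The decomposition described in this abstract can be sketched on toy data. The example below is only an illustration under stated assumptions: it uses the standard Lee-Seung multiplicative updates for the squared-error objective, which may not be the variant the authors used; the feature matrix is laid out as CNV features (rows) by samples (columns); and the function names (`nmf`, `assign_clusters`) and the toy values are hypothetical.

```python
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def nmf(v, rank, iterations=1000, seed=0, eps=1e-12):
    """Factor v (features x samples) into source (features x rank) and
    weight (rank x samples) using Lee-Seung multiplicative updates."""
    rng = random.Random(seed)
    n_features, n_samples = len(v), len(v[0])
    source = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_features)]
    weight = [[rng.random() + 0.1 for _ in range(n_samples)] for _ in range(rank)]
    for _ in range(iterations):
        # weight <- weight * (source^T v) / (source^T source weight)
        st = transpose(source)
        num = matmul(st, v)
        den = matmul(matmul(st, source), weight)
        weight = [[weight[k][j] * num[k][j] / (den[k][j] + eps)
                   for j in range(n_samples)] for k in range(rank)]
        # source <- source * (v weight^T) / (source weight weight^T)
        wt = transpose(weight)
        num = matmul(v, wt)
        den = matmul(source, matmul(weight, wt))
        source = [[source[i][k] * num[i][k] / (den[i][k] + eps)
                   for k in range(rank)] for i in range(n_features)]
    return source, weight


def assign_clusters(source, weight):
    """Label each sample with the component that contributes most to it.
    Weights are rescaled by the column sums of source so that the arbitrary
    scaling between the two factors does not affect the comparison."""
    rank = len(weight)
    scale = [sum(row[k] for row in source) for k in range(rank)]
    labels = []
    for j in range(len(weight[0])):
        contrib = [weight[k][j] * scale[k] for k in range(rank)]
        labels.append(contrib.index(max(contrib)))
    return labels


# Toy feature matrix: 4 CNV features x 6 samples. Samples 0-2 share one
# CNV pattern and samples 3-5 share another (values are made up).
toy = [
    [2.0, 4.0, 3.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 1.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 2.0, 0.5],
    [0.0, 0.0, 0.0, 3.0, 6.0, 1.5],
]

if __name__ == "__main__":
    src, wgt = nmf(toy, rank=2)
    print(assign_clusters(src, wgt))
```

In this layout each column of the source matrix plays the role of a group-specific CNV pattern and each column of the weight matrix says how strongly a sample expresses each pattern, so taking the dominant component per sample yields a cluster label. Which label number a group receives depends on the random initialisation.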
Affiliation(s)
- Junbo Duan
- Department of Biomedical Engineering, Xi'an Jiaotong University, Xi'an, P. R. China
|
194
|
Milavec M, Dobnik D, Yang L, Zhang D, Gruden K, Zel J. GMO quantification: valuable experience and insights for the future. Anal Bioanal Chem 2014; 406:6485-97. [PMID: 25182968 DOI: 10.1007/s00216-014-8077-0] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 05/30/2014] [Revised: 07/23/2014] [Accepted: 07/28/2014] [Indexed: 11/30/2022]
Abstract
Cultivation and marketing of genetically modified organisms (GMOs) have been unevenly adopted worldwide. To facilitate international trade and to provide information to consumers, labelling requirements have been set up in many countries. Quantitative real-time polymerase chain reaction (qPCR) is currently the method of choice for detection, identification and quantification of GMOs. This has been critically assessed and the requirements for the method performance have been set. Nevertheless, there are challenges that should still be highlighted, such as measuring the quantity and quality of DNA, and determining the qPCR efficiency, possible sequence mismatches, characteristics of taxon-specific genes and appropriate units of measurement, as these remain potential sources of measurement uncertainty. To overcome these problems and to cope with the continuous increase in the number and variety of GMOs, new approaches are needed. Statistical strategies of quantification have already been proposed and expanded with the development of digital PCR. The first attempts have been made to use new generation sequencing also for quantitative purposes, although accurate quantification of the contents of GMOs using this technology is still a challenge for the future, and especially for mixed samples. New approaches are needed also for the quantification of stacks, and for potential quantification of organisms produced by new plant breeding techniques.
Affiliation(s)
- Mojca Milavec
- Department of Biotechnology and Systems Biology, National Institute of Biology (NIB), Večna pot 111, 1000, Ljubljana, Slovenia
|
195
|
Lighten J, van Oosterhout C, Bentzen P. Critical review of NGS analyses for de novo genotyping multigene families. Mol Ecol 2014; 23:3957-72. [DOI: 10.1111/mec.12843] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Received: 03/25/2014] [Revised: 06/08/2014] [Accepted: 06/17/2014] [Indexed: 01/16/2023]
Affiliation(s)
- Jackie Lighten
- Department of Biology; Marine Gene Probe Laboratory; Dalhousie University; Halifax Nova Scotia Canada
- Cock van Oosterhout
- School of Environmental Sciences; University of East Anglia; Norwich Research Park; Norwich UK
- Paul Bentzen
- Department of Biology; Marine Gene Probe Laboratory; Dalhousie University; Halifax Nova Scotia Canada
|
196
|
L'Abbate A, Macchia G, D'Addabbo P, Lonoce A, Tolomeo D, Trombetta D, Kok K, Bartenhagen C, Whelan CW, Palumbo O, Severgnini M, Cifola I, Dugas M, Carella M, De Bellis G, Rocchi M, Carbone L, Storlazzi CT. Genomic organization and evolution of double minutes/homogeneously staining regions with MYC amplification in human cancer. Nucleic Acids Res 2014; 42:9131-45. [PMID: 25034695 PMCID: PMC4132716 DOI: 10.1093/nar/gku590] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Indexed: 01/04/2023] Open
Abstract
The mechanism for generating double minutes chromosomes (dmin) and homogeneously staining regions (hsr) in cancer is still poorly understood. Through an integrated approach combining next-generation sequencing, single nucleotide polymorphism array, fluorescent in situ hybridization and polymerase chain reaction-based techniques, we inferred the fine structure of MYC-containing dmin/hsr amplicons harboring sequences from several different chromosomes in seven tumor cell lines, and characterized an unprecedented number of hsr insertion sites. Local chromosome shattering involving a single-step catastrophic event (chromothripsis) was recently proposed to explain clustered chromosomal rearrangements and genomic amplifications in cancer. Our bioinformatics analyses based on the listed criteria to define chromothripsis led us to exclude it as the driving force underlying amplicon genesis in our samples. Instead, the finding of coexisting heterogeneous amplicons, differing in their complexity and chromosome content, in cell lines derived from the same tumor indicated the occurrence of a multi-step evolutionary process in the genesis of dmin/hsr. Our integrated approach allowed us to gather a complete view of the complex chromosome rearrangements occurring within MYC amplicons, suggesting that more than one model may be invoked to explain the origin of dmin/hsr in cancer. Finally, we identified PVT1 as a target of fusion events, confirming its role as breakpoint hotspot in MYC amplification.
Affiliation(s)
- Gemma Macchia
- Department of Biology, University of Bari, Bari, Italy
- Angelo Lonoce
- Department of Biology, University of Bari, Bari, Italy
- Doron Tolomeo
- Department of Biology, University of Bari, Bari, Italy
- Domenico Trombetta
- Laboratory of Oncology, IRCCS Casa Sollievo della Sofferenza Hospital, San Giovanni Rotondo, Italy
- Klaas Kok
- Department of Genetics, University of Groningen, Groningen, The Netherlands
- Orazio Palumbo
- Medical Genetics Unit, IRCCS Casa Sollievo della Sofferenza Hospital, San Giovanni Rotondo, Italy
- Marco Severgnini
- Institute for Biomedical Technologies, National Research Council, Milan, Italy
- Ingrid Cifola
- Institute for Biomedical Technologies, National Research Council, Milan, Italy
- Martin Dugas
- Institute of Medical Informatics, University of Münster, Münster, Germany
- Massimo Carella
- Medical Genetics Unit, IRCCS Casa Sollievo della Sofferenza Hospital, San Giovanni Rotondo, Italy
- Gianluca De Bellis
- Institute for Biomedical Technologies, National Research Council, Milan, Italy
- Lucia Carbone
- National Primate Research Center, Beaverton, Oregon, USA
|
197
|
Abstract
High-throughput DNA sequencing has revolutionized the study of cancer genomics with numerous discoveries that are relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations, including single-nucleotide variants, insertions and deletions, copy-number aberrations, structural variants and gene fusions. Additional computational techniques have proved useful for defining the mutations, genes and molecular networks that drive diverse cancer phenotypes and that determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic and epigenomic alterations in cancer, and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application.
|
198
|
Quek K, Nones K, Patch AM, Fink JL, Newell F, Cloonan N, Miller D, Fadlullah MZH, Kassahn K, Christ AN, Bruxner TJC, Manning S, Harliwong I, Idrisoglu S, Nourse C, Nourbakhsh E, Wani S, Steptoe A, Anderson M, Holmes O, Leonard C, Taylor D, Wood S, Xu Q, Wilson P, Biankin AV, Pearson JV, Waddell N, Grimmond SM. A workflow to increase verification rate of chromosomal structural rearrangements using high-throughput next-generation sequencing. Biotechniques 2014; 57:31-8. [PMID: 25005691 DOI: 10.2144/000114189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2014] [Accepted: 06/12/2014] [Indexed: 11/23/2022] Open
Abstract
Somatic rearrangements, which are commonly found in human cancer genomes, contribute to the progression and maintenance of cancers. Conventionally, the verification of somatic rearrangements comprises many manual steps and Sanger sequencing. This is labor intensive when verifying a large number of rearrangements in a large cohort. To increase the verification throughput, we devised a high-throughput workflow that utilizes benchtop next-generation sequencing and in-house bioinformatics tools to link the laboratory processes. In the proposed workflow, primers are automatically designed. PCR and an optional gel electrophoresis step to confirm the somatic nature of the rearrangements are performed. PCR products of somatic events are pooled for Ion Torrent PGM and/or Illumina MiSeq sequencing; the resulting sequence reads are assembled into consensus contigs by a consensus assembler, and an automated BLAT search is used to resolve the breakpoints to base level. We compared sequences and breakpoints of verified somatic rearrangements between the conventional and high-throughput workflows. The results showed that next-generation sequencing methods are comparable to conventional Sanger sequencing. The identified breakpoints obtained from next-generation sequencing methods were highly accurate and reproducible. Furthermore, the proposed workflow allows hundreds of events to be processed in a shorter time frame compared with the conventional workflow.
Affiliation(s)
- Kelly Quek
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Katia Nones
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Ann-Marie Patch
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- J Lynn Fink
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Felicity Newell
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Nicole Cloonan
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- David Miller
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Muhammad Z H Fadlullah
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Karin Kassahn
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Angelika N Christ
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Timothy J C Bruxner
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Suzanne Manning
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Ivon Harliwong
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Senel Idrisoglu
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Craig Nourse
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Ehsan Nourbakhsh
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Shivangi Wani
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Anita Steptoe
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Matthew Anderson
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Oliver Holmes
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Conrad Leonard
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Darrin Taylor
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Scott Wood
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Qinying Xu
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Peter Wilson
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Andrew V Biankin
- The Kinghorn Cancer Centre, Cancer Research Program, Garvan Institute of Medical Research, Sydney, NSW, Australia; Department of Surgery, Bankstown Hospital, Sydney, NSW, Australia; South Western Sydney Clinical School, Faculty of Medicine, University of NSW, Liverpool, NSW, Australia; Wolfson Wohl Cancer Research Centre, Institute for Cancer Sciences, University of Glasgow, Glasgow, Scotland, United Kingdom
- John V Pearson
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Nic Waddell
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia
- Sean M Grimmond
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD, Australia; Wolfson Wohl Cancer Research Centre, Institute for Cancer Sciences, University of Glasgow, Glasgow, Scotland, United Kingdom
|
199
|
Daniels MG, Bowman RV, Yang IA, Govindan R, Fong KM. An emerging place for lung cancer genomics in 2013. J Thorac Dis 2014; 5 Suppl 5:S491-7. [PMID: 24163742 DOI: 10.3978/j.issn.2072-1439.2013.10.06] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 10/10/2013] [Accepted: 10/12/2013] [Indexed: 12/11/2022]
Abstract
Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole-genome and whole-exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation is critical, especially for the differentiation between driver and passenger mutations.
Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could see many of these discoveries translated to the clinic at a rapid pace. We look forward to these advances making a difference for the many patients we treat in the Asia-Pacific region and around the world.
Affiliation(s)
- Marissa G Daniels
- The University of Queensland and the Prince Charles Hospital Thoracic Research Centre, the Prince Charles Hospital, Chermside 4032, Australia
|
200
|
Abstract
Differences between plant genomes range from single nucleotide polymorphisms to large-scale duplications, deletions and rearrangements. The large polymorphisms are termed structural variants (SVs). SVs have received significant attention in human genetics and were found to be responsible for various chronic diseases. However, little effort has been directed towards understanding the role of SVs in plants. Many recent advances in plant genetics have resulted from improvements in high-resolution technologies for measuring SVs, including microarray-based techniques, and more recently, high-throughput DNA sequencing. In this review we describe recent reports of SV in plants and describe the genomic technologies currently used to measure these SVs.
|