Reference Citation Analysis

For: Zhang Q, Ding L, Larson DE, Koboldt DC, McLellan MD, Chen K, Shi X, Kraja A, Mardis ER, Wilson RK, Borecki IB, Province MA. CMDS: a population-based method for identifying recurrent DNA copy number aberrations in cancer from high-resolution data. Bioinformatics 2009;26:464-9. [PMID: 20031968 DOI: 10.1093/bioinformatics/btp708] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Indexed: 01/08/2023]

Cited by Other Article(s)

Sinha R, Pal RK, De RK. A novel method addressing NGS-based mappability bias for sensitive detection of DNA alterations. J Bioinform Comput Biol 2024;22:2450009. [PMID: 39030667 DOI: 10.1142/s0219720024500094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/21/2024]

Sinha R, Pal RK, De RK. ENLIGHTENMENT: A Scalable Annotated Database of Genomics and NGS-Based Nucleotide Level Profiles. IEEE/ACM Trans Comput Biol Bioinform 2024;21:155-168. [PMID: 38055361 DOI: 10.1109/tcbb.2023.3340067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]

Sinha R, Pal RK, De RK. GenSeg and MR-GenSeg: A Novel Segmentation Algorithm and its Parallel MapReduce Based Approach for Identifying Genomic Regions With Copy Number Variations. IEEE/ACM Trans Comput Biol Bioinform 2022;19:443-454. [PMID: 32750860 DOI: 10.1109/tcbb.2020.3000661] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]

Yuan X, Xu X, Zhao H, Duan J. ERINS: Novel Sequence Insertion Detection by Constructing an Extended Reference. IEEE/ACM Trans Comput Biol Bioinform 2021;18:1893-1901. [PMID: 31751246 DOI: 10.1109/tcbb.2019.2954315] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]

Qin F, Luo X, Cai G, Xiao F. Shall genomic correlation structure be considered in copy number variants detection? Brief Bioinform 2021;22:6295811. [PMID: 34114005 DOI: 10.1093/bib/bbab215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2021] [Revised: 04/16/2021] [Accepted: 05/17/2021] [Indexed: 11/14/2022] Open

Yan C, He J, Luo J, Wang J, Zhang G, Luo H. SIns: A Novel Insertion Detection Approach Based on Soft-Clipped Reads. Front Genet 2021;12:665812. [PMID: 33995493 PMCID: PMC8120196 DOI: 10.3389/fgene.2021.665812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2021] [Accepted: 04/06/2021] [Indexed: 11/13/2022] Open

Statistical Considerations on NGS Data for Inferring Copy Number Variations. Methods Mol Biol 2021;2243:27-58. [PMID: 33606251 DOI: 10.1007/978-1-0716-1103-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]

Alshawaqfeh M, Al Kawam A, Serpedin E, Datta A. Robust Recurrent CNV Detection in the Presence of Inter-Subject Variability. IEEE/ACM Trans Comput Biol Bioinform 2020;17:1056-1067. [PMID: 30387737 DOI: 10.1109/tcbb.2018.2878560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]

Xi J, Li A, Wang M. HetRCNA: A Novel Method to Identify Recurrent Copy Number Alternations from Heterogeneous Tumor Samples Based on Matrix Decomposition Framework. IEEE/ACM Trans Comput Biol Bioinform 2020;17:422-434. [PMID: 29994262 DOI: 10.1109/tcbb.2018.2846599] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 06/08/2023]

Yang X, Han G, Cai H, Song Y. Recovering Hidden Diagonal Structures via Non-Negative Matrix Factorization with Multiple Constraints. IEEE/ACM Trans Comput Biol Bioinform 2019;16:1760-1772. [PMID: 28371782 DOI: 10.1109/tcbb.2017.2690282] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/07/2023]

Zhang Z, Cheng H, Hong X, Di Narzo AF, Franzen O, Peng S, Ruusalepp A, Kovacic JC, Bjorkegren JLM, Wang X, Hao K. EnsembleCNV: an ensemble machine learning algorithm to identify and genotype copy number variation using SNP array data. Nucleic Acids Res 2019;47:e39. [PMID: 30722045 PMCID: PMC6468244 DOI: 10.1093/nar/gkz068] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/04/2018] [Revised: 12/17/2018] [Accepted: 01/25/2019] [Indexed: 12/30/2022] Open

Affiliation(s)

Zhongyang Zhang: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Haoxiang Cheng: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Xiumei Hong: Center on the Early Life Origins of Disease, Department of Population, Family and Reproductive Health, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD 21205, USA
Antonio F Di Narzo: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Oscar Franzen: Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Karolinska Universitetssjukhuset, Huddinge, Sweden
Shouneng Peng: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Arno Ruusalepp: Department of Cardiac Surgery, Tartu University Hospital, Tartu, Estonia
Jason C Kovacic: Cardiovascular Research Center, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Johan L M Bjorkegren: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Integrated Cardio Metabolic Centre, Department of Medicine, Karolinska Institutet, Karolinska Universitetssjukhuset, Huddinge, Sweden
Xiaobin Wang: Center on the Early Life Origins of Disease, Department of Population, Family and Reproductive Health, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD 21205, USA; Division of General Pediatrics & Adolescent Medicine, Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
Ke Hao: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; The Tenth People's Hospital, Tongji University, Shanghai 200072, China; College of Environmental Science and Engineering, Tongji University, Shanghai 200092, China

Tran HV, Kiemer AK, Helms V. Copy Number Alterations in Tumor Genomes Deleting Antineoplastic Drug Targets Partially Compensated by Complementary Amplifications. Cancer Genomics Proteomics 2018;15:365-378. [PMID: 30194077 PMCID: PMC6199575 DOI: 10.21873/cgp.20095] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/29/2018] [Revised: 07/14/2018] [Accepted: 07/17/2018] [Indexed: 01/06/2023] Open

Yuan X, Zhang J, Yang L, Bai J, Fan P. Detection of Significant Copy Number Variations From Multiple Samples in Next-Generation Sequencing Data. IEEE Trans Nanobioscience 2018;17:12-20. [PMID: 29570071 DOI: 10.1109/tnb.2017.2783910] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/10/2022]

Dembélé D. Analysis of high-throughput biological data using their rank values. Stat Methods Med Res 2018;28:2276-2291. [PMID: 29560792 DOI: 10.1177/0962280218764187] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]

Malekpour SA, Pezeshk H, Sadeghi M. MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples. Sci Rep 2018;8:4009. [PMID: 29507384 PMCID: PMC5838159 DOI: 10.1038/s41598-018-22323-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/12/2017] [Accepted: 02/16/2018] [Indexed: 01/23/2023] Open

A Total-variation Constrained Permutation Model for Revealing Common Copy Number Patterns. Sci Rep 2017;7:9666. [PMID: 28851906 PMCID: PMC5575355 DOI: 10.1038/s41598-017-09139-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/14/2016] [Accepted: 07/24/2017] [Indexed: 01/20/2023] Open

Delatola EI, Lebarbier E, Mary-Huard T, Radvanyi F, Robin S, Wong J. SegCorr a statistical procedure for the detection of genomic regions of correlated expression. BMC Bioinformatics 2017;18:333. [PMID: 28697800 PMCID: PMC5504623 DOI: 10.1186/s12859-017-1742-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/07/2016] [Accepted: 06/26/2017] [Indexed: 01/27/2023] Open

Ji T, Chen J. Statistical models for DNA copy number variation detection using read-depth data from next generation sequencing experiments. Aust N Z J Stat 2016. [DOI: 10.1111/anzs.12175] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/24/2023]

Malekpour SA, Pezeshk H, Sadeghi M. PSE-HMM: genome-wide CNV detection from NGS data using an HMM with Position-Specific Emission probabilities. BMC Bioinformatics 2016;18:30. [PMID: 27809781 PMCID: PMC5445519 DOI: 10.1186/s12859-016-1296-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/22/2016] [Accepted: 10/20/2016] [Indexed: 11/23/2022] Open

Malekpour SA, Pezeshk H, Sadeghi M. MGP-HMM: Detecting genome-wide CNVs using an HMM for modeling mate pair insertion sizes and read counts. Math Biosci 2016;279:53-62. [PMID: 27424951 DOI: 10.1016/j.mbs.2016.07.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/26/2016] [Revised: 06/12/2016] [Accepted: 07/10/2016] [Indexed: 01/02/2023]

Xi J, Li A. Discovering Recurrent Copy Number Aberrations in Complex Patterns via Non-Negative Sparse Singular Value Decomposition. IEEE/ACM Trans Comput Biol Bioinform 2016;13:656-668. [PMID: 26372614 DOI: 10.1109/tcbb.2015.2474404] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 06/05/2023]

Thangam M, Gopal RK. CRCDA--Comprehensive resources for cancer NGS data analysis. Database (Oxford) 2015;2015:bav092. [PMID: 26450948 PMCID: PMC4597977 DOI: 10.1093/database/bav092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/27/2015] [Accepted: 08/31/2015] [Indexed: 12/24/2022]

Abstract

Next-generation sequencing (NGS) innovations have set a landmark in life science and changed the direction of research in clinical oncology through their capacity to help diagnose and treat cancer. The aim of our portal, Comprehensive Resources for Cancer NGS Data Analysis (CRCDA), is to provide a collection of NGS tools and pipelines grouped into diverse classes, together with cancer pathways, databases and literature information from PubMed. The literature data were restricted to the 18 most common cancer types worldwide, such as breast cancer and colon cancer. For convenience, the NGS cancer tools have been categorized into cancer genomics, cancer transcriptomics, cancer epigenomics, quality control and visualization. Pipelines for variant detection, quality control and data analysis are listed to provide out-of-the-box solutions for NGS data analysis, which may help researchers overcome challenges in selecting and configuring individual tools for analysing exome, whole genome and transcriptome data. An extensive search page was developed that can be queried by (i) type of data [literature, gene data and sequence read archive (SRA) data] and (ii) type of cancer (selected based on global incidence and accessibility of data). For each category of analysis a variety of tools is available, and the biggest challenge is finding and using the right tool for the right application. The objective of this work is to collect the tools in each category from their various sources and to arrange the tools and other data in a simple, user-friendly manner so that biologists and oncologists can find information more easily. To the best of our knowledge, we have collected and presented a comprehensive package of most of the resources available in cancer for NGS data analysis. Given these factors, we believe that this website will be a useful resource to the NGS research community working on cancer.

Database URL: http://bioinfo.au-kbc.org.in/ngs/ngshome.html.

Wang X, Li X, Cheng Y, Sun X, Sun X, Self S, Kooperberg C, Dai JY. Copy number alterations detected by whole-exome and whole-genome sequencing of esophageal adenocarcinoma. Hum Genomics 2015;9:22. [PMID: 26374103 PMCID: PMC4570720 DOI: 10.1186/s40246-015-0044-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 06/18/2015] [Accepted: 08/25/2015] [Indexed: 02/08/2023] Open

Abstract

Background

Esophageal adenocarcinoma (EA) is among the leading causes of cancer mortality, especially in developed countries. A high level of somatic copy number alterations (CNAs) accumulates over the decades in the progression from Barrett’s esophagus, the precursor lesion, to EA. Accurate identification of somatic CNAs is essential to understand cancer development. Many studies have been conducted for the detection of CNA in EA using microarrays. Next-generation sequencing (NGS) technologies are believed to have advantages in sensitivity and accuracy to detect CNA, yet no NGS-based CNA detection in EA has been reported.

Results

In this study, we analyzed whole-exome (WES) and whole-genome sequencing (WGS) data for detecting CNA from a published large-scale genomic study of EA. Two specific comparisons were conducted. First, the recurrent CNAs based on WGS and WES data from 145 EA samples were compared to those found in five previous microarray-based studies. We found that the majority of the previously identified regions were also detected in this study. Interestingly, some novel amplifications and deletions were discovered using the NGS data. In particular, SKI and PRKCZ, detected in a deletion region, are involved in the transforming growth factor-β pathway, suggesting the potential utility of novel biomarkers for EA. Second, we compared CNAs detected in WGS and WES data from the same 15 EA samples. No large-scale CNA was identified statistically more frequently by WES or WGS, while more focal-scale CNAs were detected by WGS than by WES.

Conclusions

Our results suggest that NGS can replace microarrays to detect CNA in EA. WGS is superior to WES in that it can offer finer resolution for the detection, though if the interest is in recurrent CNAs, WES can be preferable to WGS for its cost-effectiveness.

Electronic supplementary material

The online version of this article (doi:10.1186/s40246-015-0044-0) contains supplementary material, which is available to authorized users.

Masecchia S, Coco S, Barla A, Verri A, Tonini GP. Genome instability model of metastatic neuroblastoma tumorigenesis by a dictionary learning algorithm. BMC Med Genomics 2015;8:57. [PMID: 26358114 PMCID: PMC4566396 DOI: 10.1186/s12920-015-0132-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 01/09/2015] [Accepted: 08/28/2015] [Indexed: 12/21/2022] Open

Abstract

Background

Metastatic neuroblastoma (NB) occurs in pediatric patients as stage 4S or stage 4 and is characterized by heterogeneous clinical behavior associated with diverse genotypes. Tumors of stage 4 contain several structural copy number aberrations (CNAs) rarely found in stage 4S. To date, NB tumorigenesis has not been elucidated, although it is evident that genomic instability plays a critical role in the genesis of the tumor. Here we propose a mathematical approach to decipher genomic data and we provide a new model of NB metastatic tumorigenesis.

Method

We elucidate NB tumorigenesis using Enhanced Fused Lasso Latent Feature Model (E-FLLat) modeling the array comparative chromosome hybridization (aCGH) data of 190 metastatic NBs (63 stage 4S and 127 stage 4). This model for aCGH segmentation, based on the minimization of functional dictionary learning (DL), combines several penalties tailored to the specificities of aCGH data. In DL, the original signal is approximated by a linear weighted combination of atoms: the elements of the learned dictionary.
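The last sentence of the Method paragraph, approximating a signal by a weighted combination of atoms, can be illustrated with a minimal sketch. This is not the E-FLLat procedure: the dictionary below is fixed by hand rather than learned, the atom shapes, weights and noise level are hypothetical toy values, and ordinary least squares stands in for the penalized fit described in the abstract.

```python
import numpy as np

# Hypothetical toy dictionary: each column is one "atom", a piecewise-constant
# pattern over 8 probes. These values are illustrative and not from the paper.
atoms = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0],   # pattern covering the first half
    [0, 0, 0, 0, 1, 1, 1, 1],   # pattern covering the second half
    [0, 0, 1, 1, 1, 1, 0, 0],   # pattern covering the middle
], dtype=float).T                # shape: (probes, atoms) = (8, 3)

# Assumed ground-truth weights used only to simulate one sample's profile.
true_weights = np.array([0.5, 0.0, -1.0])
rng = np.random.default_rng(0)
signal = atoms @ true_weights + rng.normal(scale=0.01, size=atoms.shape[0])

# Ordinary least squares as a stand-in for the penalized fit: find the
# weights whose linear combination of atoms best matches the signal.
weights, *_ = np.linalg.lstsq(atoms, signal, rcond=None)

# The approximation is the weighted sum of the atoms.
approximation = atoms @ weights
max_abs_error = float(np.max(np.abs(signal - approximation)))
```

In a full dictionary learning setting the atoms themselves would be estimated jointly with the weights, typically by alternating between the two under the chosen penalties.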

Results

The hierarchical structures for stage 4S show, at the first level of the oncogenetic tree, several whole-chromosome gains, except for the unbalanced gains of 17q, 2p and 2q. Conversely, the high CNA complexity found in stage 4 tumors requires two different trees. Both stage 4 oncogenetic trees are markedly diverged, with up to five sublevels, and the 17q gain is the most common event at the first level (2/3 nodes). Moreover, the 11q deletion, one of the major unfavorable markers of disease progression, occurs before 3p loss, indicating that critical chromosome aberrations appear at early stages of tumorigenesis. Finally, we also observed a significant (p = 0.025) association between patient age and chromosome loss in stage 4 cases.

Conclusion

These results led us to propose a genome instability progressive model in which NB cells initiate with a DNA synthesis uncoupled from cell division, that leads to stage 4S tumors, primarily characterized by numerical aberrations, or stage 4 tumors with high levels of genome instability resulting in complex chromosome rearrangements associated with high tumor aggressiveness and rapid disease progression.

Electronic supplementary material

The online version of this article (doi:10.1186/s12920-015-0132-y) contains supplementary material, which is available to authorized users.

CNV-CH: A Convex Hull Based Segmentation Approach to Detect Copy Number Variations (CNV) Using Next-Generation Sequencing Data. PLoS One 2015;10:e0135895. [PMID: 26291322 PMCID: PMC4546278 DOI: 10.1371/journal.pone.0135895] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/04/2014] [Accepted: 07/28/2015] [Indexed: 11/19/2022] Open

Glusman G, Severson A, Dhankani V, Robinson M, Farrah T, Mauldin DE, Stittrich AB, Ament SA, Roach JC, Brunkow ME, Bodian DL, Vockley JG, Shmulevich I, Niederhuber JE, Hood L. Identification of copy number variants in whole-genome data using Reference Coverage Profiles. Front Genet 2015;6:45. [PMID: 25741365 PMCID: PMC4330915 DOI: 10.3389/fgene.2015.00045] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 11/30/2014] [Accepted: 01/30/2015] [Indexed: 12/20/2022] Open

Oleksiewicz U, Tomczak K, Woropaj J, Markowska M, Stępniak P, Shah PK. Computational characterisation of cancer molecular profiles derived using next generation sequencing. Contemp Oncol (Pozn) 2015;19:A78-91. [PMID: 25691827 PMCID: PMC4322529 DOI: 10.5114/wo.2014.47137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023] Open

Expanding the computational toolbox for mining cancer genomes. Nat Rev Genet 2014;15:556-70. [PMID: 25001846 DOI: 10.1038/nrg3767] [Citation(s) in RCA: 146] [Impact Index Per Article: 14.6] [Indexed: 12/17/2022]

Zhou X, Liu J, Wan X, Yu W. Piecewise-constant and low-rank approximation for identification of recurrent copy number variations. Bioinformatics 2014;30:1943-9. [PMID: 24642062 DOI: 10.1093/bioinformatics/btu131] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022] Open

Raphael BJ, Dobson JR, Oesper L, Vandin F. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine. Genome Med 2014;6:5. [PMID: 24479672 PMCID: PMC3978567 DOI: 10.1186/gm524] [Citation(s) in RCA: 131] [Impact Index Per Article: 13.1] [Indexed: 02/07/2023] Open

Evans P, Kong Y, Krauthammer M. Computational analysis in cancer exome sequencing. Methods Mol Biol 2014;1176:219-227. [PMID: 25030931 DOI: 10.1007/978-1-4939-0992-6_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/03/2023]

Zhao M, Wang Q, Wang Q, Jia P, Zhao Z. Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives. BMC Bioinformatics 2013;14 Suppl 11:S1. [PMID: 24564169 PMCID: PMC3846878 DOI: 10.1186/1471-2105-14-s11-s1] [Citation(s) in RCA: 333] [Impact Index Per Article: 30.3] [Indexed: 01/11/2023] Open

Zhou X, Yang C, Wan X, Zhao H, Yu W. Multisample aCGH data analysis via total variation and spectral regularization. IEEE/ACM Trans Comput Biol Bioinform 2013;10:230-235. [PMID: 23702561 PMCID: PMC3715577 DOI: 10.1109/tcbb.2012.166] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 06/02/2023]

Mardis ER. Next-generation sequencing platforms. Annu Rev Anal Chem (Palo Alto Calif) 2013;6:287-303. [PMID: 23560931 DOI: 10.1146/annurev-anchem-062012-092628] [Citation(s) in RCA: 351] [Impact Index Per Article: 31.9] [Indexed: 05/08/2023]

Rueda OM, Diaz-Uriarte R, Caldas C. Finding common regions of alteration in copy number data. Methods Mol Biol 2013;973:339-53. [PMID: 23412800 DOI: 10.1007/978-1-62703-281-0_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2023]

Comparative analysis of methods for identifying recurrent copy number alterations in cancer. PLoS One 2012;7:e52516. [PMID: 23285074 PMCID: PMC3527554 DOI: 10.1371/journal.pone.0052516] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 08/07/2012] [Accepted: 11/14/2012] [Indexed: 11/19/2022] Open

Xuan J, Yu Y, Qing T, Guo L, Shi L. Next-generation sequencing in the clinic: promises and challenges. Cancer Lett 2012;340:284-95. [PMID: 23174106 DOI: 10.1016/j.canlet.2012.11.025] [Citation(s) in RCA: 198] [Impact Index Per Article: 16.5] [Received: 08/31/2012] [Revised: 11/13/2012] [Accepted: 11/13/2012] [Indexed: 02/06/2023]

Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma. Nat Genet 2012;44:1006-14. [PMID: 22842228 PMCID: PMC3432702 DOI: 10.1038/ng.2359] [Citation(s) in RCA: 893] [Impact Index Per Article: 74.4] [Received: 03/13/2012] [Accepted: 06/28/2012] [Indexed: 02/06/2023]

Yuan X, Yu G, Hou X, Shih IM, Clarke R, Zhang J, Hoffman EP, Wang RR, Zhang Z, Wang Y. Genome-wide identification of significant aberrations in cancer genome. BMC Genomics 2012;13:342. [PMID: 22839576 PMCID: PMC3428679 DOI: 10.1186/1471-2164-13-342] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 02/01/2012] [Accepted: 07/27/2012] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Somatic Copy Number Alterations (CNAs) in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC), a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1) exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2) performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3) iteratively detecting Significant Copy Number Aberrations (SCAs) and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme.

RESULTS

We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS) on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma). When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC) or tumor suppressor genes (e.g., CDKN2A/B). Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies.

CONCLUSIONS

Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes. Open-source and platform-independent SAIC software is implemented using C++, together with R scripts for data formatting and Perl scripts for user interfacing, and it is easy to install and efficient to use. The source code and documentation are freely available at http://www.cbil.ece.vt.edu/software.htm.

Yuan X, Zhang J, Yang L, Zhang S, Chen B, Geng Y, Wang Y. TAGCNA: a method to identify significant consensus events of copy number alterations in cancer. PLoS One 2012;7:e41082. [PMID: 22815924 PMCID: PMC3399811 DOI: 10.1371/journal.pone.0041082] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/03/2012] [Accepted: 06/17/2012] [Indexed: 01/20/2023] Open

Yeo CWS, Ng FSL, Chai C, Tan JMM, Koh GRH, Chong YK, Koh LWH, Foong CSF, Sandanaraj E, Holbrook JD, Ang BT, Takahashi R, Tang C, Lim KL. Parkin pathway activation mitigates glioma cell proliferation and predicts patient survival. Cancer Res 2012;72:2543-53. [PMID: 22431710 DOI: 10.1158/0008-5472.can-11-3060] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Indexed: 12/19/2022]

VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res 2012;22:568-76. [PMID: 22300766 DOI: 10.1101/gr.129684.111] [Citation(s) in RCA: 3350] [Impact Index Per Article: 279.2] [Indexed: 11/25/2022]

Wong G, Leckie C, Kowalczyk A. FSR: feature set reduction for scalable and accurate multi-class cancer subtype classification based on copy number. Bioinformatics 2011;28:151-9. [PMID: 22110244 DOI: 10.1093/bioinformatics/btr644] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/08/2023]

Abstract

MOTIVATION

Feature selection is a key concept in machine learning for microarray datasets, where the number of features, represented by probesets, is typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions with regard to patient treatment options.

RESULTS

We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer.

AVAILABILITY

FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.

Park C, Ahn J, Yoon Y, Park S. A multi-sample based method for identifying common CNVs in normal human genomic structure using high-resolution aCGH data. PLoS One 2011;6:e26975. [PMID: 22073121 PMCID: PMC3205051 DOI: 10.1371/journal.pone.0026975] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/21/2011] [Accepted: 10/07/2011] [Indexed: 01/08/2023] Open

Abstract

BACKGROUND

It is difficult to identify copy number variations (CNV) in normal human genomic data due to noise and non-linear relationships between different genomic regions and signal intensity. A high-resolution array comparative genomic hybridization (aCGH) containing 42 million probes, which is very large compared to previous arrays, was recently published. Most existing CNV detection algorithms do not work well because of noise associated with the large amount of input data and because most of the current methods were not designed to analyze normal human samples. Normal human genome analysis often requires a joint approach across multiple samples. However, the majority of existing methods can only identify CNVs from a single sample.

METHODOLOGY AND PRINCIPAL FINDINGS

We developed a multi-sample-based genomic variations detector (MGVD) that uses segmentation to identify common breakpoints across multiple samples and a k-means-based clustering strategy. Unlike previous methods, MGVD simultaneously considers multiple samples with different genomic intensities and identifies CNVs and CNV zones (CNVZs); CNVZ is a more precise measure of the location of a genomic variant than the CNV region (CNVR).

CONCLUSIONS AND SIGNIFICANCE

We designed a specialized algorithm to detect common CNVs from extremely high-resolution multi-sample aCGH data. MGVD showed high sensitivity and a low false discovery rate for a simulated data set, and outperformed most current methods when real, high-resolution HapMap datasets were analyzed. MGVD also had the fastest runtime of the algorithms evaluated when actual, high-resolution aCGH data were analyzed. The CNVZs identified by MGVD can be used in association studies for revealing relationships between phenotypes and genomic aberrations. Our algorithm was developed in standard C++ using the STL library and is available for Linux and MS Windows. It is freely available at: http://embio.yonsei.ac.kr/~Park/mgvd.php.

Scharpf RB, Irizarry RA, Ritchie ME, Carvalho B, Ruczinski I. Using the R Package crlmm for Genotyping and Copy Number Estimation. J Stat Softw 2011;40:1-32. [PMID: 22523482 PMCID: PMC3329223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023] Open

Ritz A, Paris PL, Ittmann MM, Collins C, Raphael BJ. Detection of recurrent rearrangement breakpoints from copy number data. BMC Bioinformatics 2011;12:114. [PMID: 21510904 PMCID: PMC3112242 DOI: 10.1186/1471-2105-12-114] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 08/30/2010] [Accepted: 04/21/2011] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Copy number variants (CNVs), including deletions, amplifications, and other rearrangements, are common in human and cancer genomes. Copy number data from array comparative genome hybridization (aCGH) and next-generation DNA sequencing are widely used to measure copy number variants. Comparison of copy number data from multiple individuals reveals recurrent variants. Typically, the interior of a recurrent CNV is examined for genes or other loci associated with a phenotype. However, in some cases, such as gene truncations and fusion genes, the target of the variant lies at the boundary of the variant.

RESULTS

We introduce Neighborhood Breakpoint Conservation (NBC), an algorithm for identifying rearrangement breakpoints that are highly conserved at the same locus in multiple individuals. NBC detects recurrent breakpoints at varying levels of resolution, including breakpoints whose location is exactly conserved and breakpoints whose location varies within a gene. NBC also identifies pairs of recurrent breakpoints such as those that result from fusion genes. We apply NBC to aCGH data from 36 primary prostate tumors and identify 12 novel rearrangements, one of which is the well-known TMPRSS2-ERG fusion gene. We also apply NBC to 227 glioblastoma tumors and predict 93 novel rearrangements which we further classify as gene truncations, germline structural variants, and fusion genes. A number of these variants involve the protein phosphatase PTPN12 suggesting that deregulation of PTPN12, via a variety of rearrangements, is common in glioblastoma.

CONCLUSIONS

We demonstrate that NBC is useful for detection of recurrent breakpoints resulting from copy number variants or other structural variants, and in particular identifies recurrent breakpoints that result in gene truncations or fusion genes. Software is available at http://cs.brown.edu/people/braphael/software.html.

Wartman LD, Larson DE, Xiang Z, Ding L, Chen K, Lin L, Cahan P, Klco JM, Welch JS, Li C, Payton JE, Uy GL, Varghese N, Ries RE, Hoock M, Koboldt DC, McLellan MD, Schmidt H, Fulton RS, Abbott RM, Cook L, McGrath SD, Fan X, Dukes AF, Vickery T, Kalicki J, Lamprecht TL, Graubert TA, Tomasson MH, Mardis ER, Wilson RK, Ley TJ. Sequencing a mouse acute promyelocytic leukemia genome reveals genetic events relevant for disease progression. J Clin Invest 2011;121:1445-55. [PMID: 21436584 DOI: 10.1172/jci45284] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.2] [Received: 09/30/2010] [Accepted: 01/19/2011] [Indexed: 01/12/2023] Open

Ding L, Wendl MC, Koboldt DC, Mardis ER. Analysis of next-generation genomic data in cancer: accomplishments and challenges. Hum Mol Genet 2010;19:R188-96. [PMID: 20843826 DOI: 10.1093/hmg/ddq391] [Citation(s) in RCA: 111] [Impact Index Per Article: 7.9] [Indexed: 01/05/2023] Open

Yuan X, Zhang J, Wang Y. Probability theory-based SNP association study method for identifying susceptibility loci and genetic disease models in human case-control data. IEEE Trans Nanobioscience 2010;9:232-41. [PMID: 20840904 DOI: 10.1109/tnb.2010.2070805] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022]

Koboldt DC, Ding L, Mardis ER, Wilson RK. Challenges of sequencing human genomes. Brief Bioinform 2010;11:484-98. [PMID: 20519329 DOI: 10.1093/bib/bbq016] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.0] [Indexed: 12/16/2022] Open