1
Zhou Q, Ghezelji M, Hari A, Ford MKB, Holley C, Mirabello L, Chanock S, Sahinalp SC, Numanagić I. Geny: A Genotyping Tool for Allelic Decomposition of Killer Cell Immunoglobulin-Like Receptor Genes. bioRxiv 2024:2024.02.27.582413. [PMID: 38529502 PMCID: PMC10962708 DOI: 10.1101/2024.02.27.582413] [Indexed: 03/27/2024]
Abstract
Accurate genotyping of Killer cell Immunoglobulin-like Receptor (KIR) genes plays a pivotal role in enhancing our understanding of innate immune responses, disease correlations, and the advancement of personalized medicine. However, due to the high variability of the KIR region and high level of sequence similarity among different KIR genes, the currently available genotyping methods are unable to accurately infer copy numbers, genotypes and haplotypes of individual KIR genes from next-generation sequencing data. Here we introduce Geny, a new computational tool for precise genotyping of KIR genes. Geny utilizes available KIR haplotype databases and proposes a novel combination of expectation-maximization filtering schemes and integer linear programming-based combinatorial optimization models to resolve ambiguous reads, provide accurate copy number estimation and estimate the haplotype of each copy for the genes within the KIR region. We evaluated Geny on a large set of simulated short-read datasets covering the known validated KIR region assemblies and a set of Illumina short-read samples sequenced from 25 validated samples from the Human Pangenome Reference Consortium collection and showed that it outperforms the existing genotyping tools in terms of accuracy, precision and recall. We envision Geny becoming a valuable resource for understanding immune system response and consequently advancing the field of patient-centric medicine.
2
Shugg T, Ly RC, Osei W, Rowe EJ, Granfield CA, Lynnes TC, Medeiros EB, Hodge JC, Breman AM, Schneider BP, Sahinalp SC, Numanagić I, Salisbury BA, Bray SM, Ratcliff R, Skaar TC. Computational pharmacogenotype extraction from clinical next-generation sequencing. Front Oncol 2023; 13:1199741. [PMID: 37469403 PMCID: PMC10352904 DOI: 10.3389/fonc.2023.1199741] [Received: 04/03/2023] [Accepted: 05/22/2023] [Indexed: 07/21/2023]
Abstract
Background: Next-generation sequencing (NGS), including whole genome sequencing (WGS) and whole exome sequencing (WES), is increasingly being used for clinical care. While NGS data have the potential to be repurposed to support clinical pharmacogenomics (PGx), current computational approaches have not been widely validated using clinical data. In this study, we assessed the accuracy of the Aldy computational method to extract PGx genotypes from WGS and WES data for 14 and 13 major pharmacogenes, respectively.
Methods: Germline DNA was isolated from whole blood samples collected for 264 patients seen at our institutional molecular solid tumor board. DNA was used for panel-based genotyping within our institutional Clinical Laboratory Improvement Amendments- (CLIA-) certified PGx laboratory. DNA was also sent to other CLIA-certified commercial laboratories for clinical WGS or WES. Aldy v3.3 and v4.4 were used to extract PGx genotypes from these NGS data, and results were compared to the panel-based genotyping reference standard that contained 45 star allele-defining variants within CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP3A4, CYP3A5, CYP4F2, DPYD, G6PD, NUDT15, SLCO1B1, TPMT, and VKORC1.
Results: Mean WGS read depth was >30x for all variant regions except for G6PD (average read depth was 29 reads), and mean WES read depth was >30x for all variant regions. For 94 patients with WGS, Aldy v3.3 diplotype calls were concordant with those from the genotyping reference standard in 99.5% of cases when excluding diplotypes with additional major star alleles not tested by targeted genotyping, ambiguous phasing, and CYP2D6 hybrid alleles. Aldy v3.3 identified 15 additional clinically actionable star alleles not covered by genotyping within CYP2B6, CYP2C19, DPYD, SLCO1B1, and NUDT15. Within the WGS cohort, Aldy v4.4 diplotype calls were concordant with those from genotyping in 99.7% of cases. When excluding patients with CYP2D6 copy number variation, all Aldy v4.4 diplotype calls except for one CYP3A4 diplotype call were concordant with genotyping for 161 patients in the WES cohort.
Conclusion: Aldy v3.3 and v4.4 called diplotypes for major pharmacogenes from clinical WES and WGS data with >99% accuracy. These findings support the use of Aldy to repurpose clinical NGS data to inform clinical PGx.
Affiliation(s)
- Tyler Shugg
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- Reynold C. Ly
- Division of Diagnostic Genetics and Genomics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Wilberforce Osei
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- Elizabeth J. Rowe
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- Caitlin A. Granfield
- Division of Diagnostic Genetics and Genomics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Ty C. Lynnes
- Division of Diagnostic Genetics and Genomics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Elizabeth B. Medeiros
- Division of Diagnostic Genetics and Genomics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Jennelle C. Hodge
- Division of Diagnostic Genetics and Genomics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Amy M. Breman
- Division of Diagnostic Genetics and Genomics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Bryan P. Schneider
- Division of Hematology/Oncology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- S. Cenk Sahinalp
- Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, United States
- Ibrahim Numanagić
- Department of Computer Science, University of Victoria, Victoria, BC, Canada
- Todd C. Skaar
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
3
Liu Y, Li XC, Rashidi Mehrabadi F, Schäffer AA, Pratt D, Crawford DR, Malikić S, Molloy EK, Gopalan V, Mount SM, Ruppin E, Aldape KD, Sahinalp SC. Single-cell methylation sequencing data reveal succinct metastatic migration histories and tumor progression models. Genome Res 2023; 33:1089-1100. [PMID: 37316351 PMCID: PMC10538489 DOI: 10.1101/gr.277608.122] [Received: 01/12/2023] [Accepted: 06/06/2023] [Indexed: 06/16/2023]
Abstract
Recent studies exploring the impact of methylation in tumor evolution suggest that although the methylation status of many of the CpG sites are preserved across distinct lineages, others are altered as the cancer progresses. Because changes in methylation status of a CpG site may be retained in mitosis, they could be used to infer the progression history of a tumor via single-cell lineage tree reconstruction. In this work, we introduce the first principled distance-based computational method, Sgootr, for inferring a tumor's single-cell methylation lineage tree and for jointly identifying lineage-informative CpG sites that harbor changes in methylation status that are retained along the lineage. We apply Sgootr on single-cell bisulfite-treated whole-genome sequencing data of multiregionally sampled tumor cells from nine metastatic colorectal cancer patients, as well as multiregionally sampled single-cell reduced-representation bisulfite sequencing data from a glioblastoma patient. We show that the tumor lineages constructed reveal a simple model underlying tumor progression and metastatic seeding. A comparison of Sgootr against alternative approaches shows that Sgootr can construct lineage trees with fewer migration events and with more in concordance with the sequential-progression model of tumor evolution, with a running time a fraction of that used in prior studies. Lineage-informative CpG sites identified by Sgootr are in inter-CpG island (CGI) regions, as opposed to intra-CGIs, which have been the main regions of interest in genomic methylation-related analyses.
Affiliation(s)
- Yuelin Liu
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Department of Computer Science, University of Maryland, College Park, Maryland 20742, USA
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland 20742, USA
- Xuan Cindy Li
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Program in Computational Biology, Bioinformatics, and Genomics, University of Maryland, College Park, Maryland 20742, USA
- Farid Rashidi Mehrabadi
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Department of Computer Science, Indiana University, Bloomington, Indiana 47408, USA
- Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Alejandro A Schäffer
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Drew Pratt
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- David R Crawford
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Program in Computational Biology, Bioinformatics, and Genomics, University of Maryland, College Park, Maryland 20742, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland 20742, USA
- Salem Malikić
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Erin K Molloy
- Department of Computer Science, University of Maryland, College Park, Maryland 20742, USA
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland 20742, USA
- Vishaka Gopalan
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Stephen M Mount
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland 20742, USA
- Eytan Ruppin
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Kenneth D Aldape
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- S Cenk Sahinalp
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
4
Li XC, Liu Y, Mehrabadi FR, Schäffer AA, Pratt D, Crawford DR, Malikić S, Molloy EK, Gopalan V, Mount SM, Ruppin E, Aldape K, Sahinalp SC. Abstract 127: Single-cell methylation sequencing data reveals succinct metastatic migration histories and tumor progression models. Cancer Res 2023. [DOI: 10.1158/1538-7445.am2023-127] [Indexed: 04/07/2023]
Abstract
Recent studies exploring the impact of methylation in tumor evolution suggest that while the methylation status of many of the CpG sites are preserved across distinct lineages, others are altered as the cancer progresses. Since changes in methylation status of a CpG site may be retained in mitosis, they could be used to infer the progression history of a tumor via single-cell lineage tree reconstruction. In this work, we introduce the first principled distance-based computational method, Sgootr, for inferring a tumor's single-cell methylation lineage tree and jointly identifying lineage-informative CpG sites which harbor changes in methylation status that are retained along the lineage. We apply Sgootr on the single-cell bisulfite-treated whole genome sequencing data of multiregionally-sampled tumor cells from 9 metastatic colorectal cancer patients made available by Bian et al., as well as multiregionally-sampled single-cell reduced-representation bisulfite sequencing data from a glioblastoma patient made available by Chaligne et al. We demonstrate that the tumor lineages constructed reveal a simple model underlying colorectal tumor progression and metastatic seeding. A comparison of Sgootr against alternative approaches shows that Sgootr can construct lineage trees with fewer migration events and more in concordance with the sequential-progression model of tumor evolution, in time a fraction of that used in prior studies. Interestingly, lineage-informative CpG sites identified by Sgootr are in inter-CpG island (CGI) regions, as opposed to CGI's, which have been the main regions of interest in genomic methylation-related analyses. Sgootr is implemented as a Snakemake workflow, available at https://github.com/liuy0421/Sgootr.
Citation Format: Xuan C. Li, Yuelin Liu, Farid Rashidi Mehrabadi, Alejandro A. Schäffer, Drew Pratt, David R. Crawford, Salem Malikić, Erin K. Molloy, Vishaka Gopalan, Stephen M. Mount, Eytan Ruppin, Kenneth Aldape, S. Cenk Sahinalp. Single-cell methylation sequencing data reveals succinct metastatic migration histories and tumor progression models [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 127.
Affiliation(s)
- Xuan C. Li
- National Cancer Institute, Bethesda, MD
- Yuelin Liu
- National Cancer Institute, Bethesda, MD
- Drew Pratt
- National Cancer Institute, Bethesda, MD
5
Gerstung M, Jolly C, Leshchiner I, Dentro SC, Gonzalez S, Rosebrock D, Mitchell TJ, Rubanova Y, Anur P, Yu K, Tarabichi M, Deshwar A, Wintersinger J, Kleinheinz K, Vázquez-García I, Haase K, Jerman L, Sengupta S, Macintyre G, Malikic S, Donmez N, Livitz DG, Cmero M, Demeulemeester J, Schumacher S, Fan Y, Yao X, Lee J, Schlesner M, Boutros PC, Bowtell DD, Zhu H, Getz G, Imielinski M, Beroukhim R, Sahinalp SC, Ji Y, Peifer M, Markowetz F, Mustonen V, Yuan K, Wang W, Morris QD, Spellman PT, Wedge DC, Van Loo P, Tarabichi M, Wintersinger J, Deshwar AG, Yu K, Gonzalez S, Rubanova Y, Macintyre G, Adams DJ, Anur P, Beroukhim R, Boutros PC, Bowtell DD, Campbell PJ, Cao S, Christie EL, Cmero M, Cun Y, Dawson KJ, Demeulemeester J, Donmez N, Drews RM, Eils R, Fan Y, Fittall M, Garsed DW, Getz G, Ha G, Imielinski M, Jerman L, Ji Y, Kleinheinz K, Lee J, Lee-Six H, Livitz DG, Malikic S, Markowetz F, Martincorena I, Mitchell TJ, Mustonen V, Oesper L, Peifer M, Peto M, Raphael BJ, Rosebrock D, Sahinalp SC, Salcedo A, Schlesner M, Schumacher S, Sengupta S, Shi R, Shin SJ, Spiro O, Pitkänen E, Pivot X, Piñeiro-Yáñez E, Planko L, Plass C, Polak P, Pons T, Popescu I, Potapova O, Prasad A, Stein LD, Preston SR, Prinz M, Pritchard AL, Prokopec SD, Provenzano E, Puente XS, Puig S, Puiggròs M, Pulido-Tamayo S, Pupo GM, Vázquez-García I, Purdie CA, Quinn MC, Rabionet R, Rader JS, Radlwimmer B, Radovic P, Raeder B, Raine KM, Ramakrishna M, Ramakrishnan K, Vembu S, Ramalingam S, Raphael BJ, Rathmell WK, Rausch T, Reifenberger G, Reimand J, Reis-Filho J, Reuter V, Reyes-Salazar I, Reyna MA, Wheeler DA, Reynolds SM, Rheinbay E, Riazalhosseini Y, Richardson AL, Richter J, Ringel M, Ringnér M, Rino Y, Rippe K, Roach J, Yang TP, Roberts LR, Roberts ND, Roberts SA, Robertson AG, Robertson AJ, Rodriguez JB, Rodriguez-Martin B, Rodríguez-González FG, Roehrl MHA, Rohde M, Yao X, Rokutan H, Romieu G, Rooman I, Roques T, Rosebrock D, Rosenberg M, Rosenstiel PC, Rosenwald A, Rowe EW, Royo R, Yuan K, 
Rozen SG, Rubanova Y, Rubin MA, Rubio-Perez C, Rudneva VA, Rusev BC, Ruzzenente A, Rätsch G, Sabarinathan R, Sabelnykova VY, Zhu H, Sadeghi S, Sahinalp SC, Saini N, Saito-Adachi M, Saksena G, Salcedo A, Salgado R, Salichos L, Sallari R, Saller C, Wang W, Salvia R, Sam M, Samra JS, Sanchez-Vega F, Sander C, Sanders G, Sarin R, Sarrafi I, Sasaki-Oku A, Sauer T, Morris QD, Sauter G, Saw RPM, Scardoni M, Scarlett CJ, Scarpa A, Scelo G, Schadendorf D, Schein JE, Schilhabel MB, Schlesner M, Spellman PT, Schlomm T, Schmidt HK, Schramm SJ, Schreiber S, Schultz N, Schumacher SE, Schwarz RF, Scolyer RA, Scott D, Scully R, Wedge DC, Seethala R, Segre AV, Selander I, Semple CA, Senbabaoglu Y, Sengupta S, Sereni E, Serra S, Sgroi DC, Shackleton M, Van Loo P, Shah NC, Shahabi S, Shang CA, Shang P, Shapira O, Shelton T, Shen C, Shen H, Shepherd R, Shi R, Spellman PT, Shi Y, Shiah YJ, Shibata T, Shih J, Shimizu E, Shimizu K, Shin SJ, Shiraishi Y, Shmaya T, Shmulevich I, Wedge DC, Shorser SI, Short C, Shrestha R, Shringarpure SS, Shriver C, Shuai S, Sidiropoulos N, Siebert R, Sieuwerts AM, Sieverling L, Van Loo P, Signoretti S, Sikora KO, Simbolo M, Simon R, Simons JV, Simpson JT, Simpson PT, Singer S, Sinnott-Armstrong N, Sipahimalani P, Aaltonen LA, Skelly TJ, Smid M, Smith J, Smith-McCune K, Socci ND, Sofia HJ, Soloway MG, Song L, Sood AK, Sothi S, Abascal F, Sotiriou C, Soulette CM, Span PN, Spellman PT, Sperandio N, Spillane AJ, Spiro O, Spring J, Staaf J, Stadler PF, Abeshouse A, Staib P, Stark SG, Stebbings L, Stefánsson ÓA, Stegle O, Stein LD, Stenhouse A, Stewart C, Stilgenbauer S, Stobbe MD, Aburatani H, Stratton MR, Stretch JR, Struck AJ, Stuart JM, Stunnenberg HG, Su H, Su X, Sun RX, Sungalee S, Susak H, Adams DJ, Suzuki A, Sweep F, Szczepanowski M, Sültmann H, Yugawa T, Tam A, Tamborero D, Tan BKT, Tan D, Tan P, Agrawal N, Tanaka H, Taniguchi H, Tanskanen TJ, Tarabichi M, Tarnuzzer R, Tarpey P, Taschuk ML, Tatsuno K, Tavaré S, Taylor DF, Ahn KS, Taylor-Weiner A, Teague 
JW, Teh BT, Tembe V, Temes J, Thai K, Thayer SP, Thiessen N, Thomas G, Thomas S, Ahn SM, Thompson A, Thompson AM, Thompson JFF, Thompson RH, Thorne H, Thorne LB, Thorogood A, Tiao G, Tijanic N, Timms LE, Aikata H, Tirabosco R, Tojo M, Tommasi S, Toon CW, Toprak UH, Torrents D, Tortora G, Tost J, Totoki Y, Townend D, Akbani R, Traficante N, Treilleux I, Trotta JR, Trümper LHP, Tsao M, Tsunoda T, Tubio JMC, Tucker O, Turkington R, Turner DJ, Akdemir KC, Tutt A, Ueno M, Ueno NT, Umbricht C, Umer HM, Underwood TJ, Urban L, Urushidate T, Ushiku T, Uusküla-Reimand L, Al-Ahmadie H, Valencia A, Van Den Berg DJ, Van Laere S, Van Loo P, Van Meir EG, Van den Eynden GG, Van der Kwast T, Vasudev N, Vazquez M, Vedururu R, Al-Sedairy ST, Veluvolu U, Vembu S, Verbeke LPC, Vermeulen P, Verrill C, Viari A, Vicente D, Vicentini C, VijayRaghavan K, Viksna J, Al-Shahrour F, Vilain RE, Villasante I, Vincent-Salomon A, Visakorpi T, Voet D, Vyas P, Vázquez-García I, Waddell NM, Waddell N, Wadelius C, Alawi M, Wadi L, Wagener R, Wala JA, Wang J, Wang J, Wang L, Wang Q, Wang W, Wang Y, Wang Z, Albert M, Waring PM, Warnatz HJ, Warrell J, Warren AY, Waszak SM, Wedge DC, Weichenhan D, Weinberger P, Weinstein JN, Weischenfeldt J, Aldape K, Weisenberger DJ, Welch I, Wendl MC, Werner J, Whalley JP, Wheeler DA, Whitaker HC, Wigle D, Wilkerson MD, Williams A, Alexandrov LB, Wilmott JS, Wilson GW, Wilson JM, Wilson RK, Winterhoff B, Wintersinger JA, Wiznerowicz M, Wolf S, Wong BH, Wong T, Ally A, Wong W, Woo Y, Wood S, Wouters BG, Wright AJ, Wright DW, Wright MH, Wu CL, Wu DY, Wu G, Alsop K, Wu J, Wu K, Wu Y, Wu Z, Xi L, Xia T, Xiang Q, Xiao X, Xing R, Xiong H, Alvarez EG, Xu Q, Xu Y, Xue H, Yachida S, Yakneen S, Yamaguchi R, Yamaguchi TN, Yamamoto M, Yamamoto S, Yamaue H, Amary F, Yang F, Yang H, Yang JY, Yang L, Yang L, Yang S, Yang TP, Yang Y, Yao X, Yaspo ML, Amin SB, Yates L, Yau C, Ye C, Ye K, Yellapantula VD, Yoon CJ, Yoon SS, Yousif F, Yu J, Yu K, Aminou B, Yu W, Yu Y, Yuan K, Yuan Y, Yuen 
D, Yung CK, Zaikova O, Zamora J, Zapatka M, Zenklusen JC, Ammerpohl O, Zenz T, Zeps N, Zhang CZ, Zhang F, Zhang H, Zhang H, Zhang H, Zhang J, Zhang J, Zhang J, Anderson MJ, Zhang X, Zhang X, Zhang Y, Zhang Z, Zhao Z, Zheng L, Zheng X, Zhou W, Zhou Y, Zhu B, Ang Y, Zhu H, Zhu J, Zhu S, Zou L, Zou X, deFazio A, van As N, van Deurzen CHM, van de Vijver MJ, van’t Veer L, Antonello D, von Mering C, Anur P, Aparicio S, Appelbaum EL, Arai Y, Aretz A, Arihiro K, Ariizumi SI, Armenia J, Arnould L, Asa S, Assenov Y, Atwal G, Aukema S, Auman JT, Aure MRR, Awadalla P, Aymerich M, Bader GD, Baez-Ortega A, Bailey MH, Bailey PJ, Balasundaram M, Balu S, Bandopadhayay P, Banks RE, Barbi S, Barbour AP, Barenboim J, Barnholtz-Sloan J, Barr H, Barrera E, Bartlett J, Bartolome J, Bassi C, Bathe OF, Baumhoer D, Bavi P, Baylin SB, Bazant W, Beardsmore D, Beck TA, Behjati S, Behren A, Niu B, Bell C, Beltran S, Benz C, Berchuck A, Bergmann AK, Bergstrom EN, Berman BP, Berney DM, Bernhart SH, Beroukhim R, Berrios M, Bersani S, Bertl J, Betancourt M, Bhandari V, Bhosle SG, Biankin AV, Bieg M, Bigner D, Binder H, Birney E, Birrer M, Biswas NK, Bjerkehagen B, Bodenheimer T, Boice L, Bonizzato G, De Bono JS, Boot A, Bootwalla MS, Borg A, Borkhardt A, Boroevich KA, Borozan I, Borst C, Bosenberg M, Bosio M, Boultwood J, Bourque G, Boutros PC, Bova GS, Bowen DT, Bowlby R, Bowtell DDL, Boyault S, Boyce R, Boyd J, Brazma A, Brennan P, Brewer DS, Brinkman AB, Bristow RG, Broaddus RR, Brock JE, Brock M, Broeks A, Brooks AN, Brooks D, Brors B, Brunak S, Bruxner TJC, Bruzos AL, Buchanan A, Buchhalter I, Buchholz C, Bullman S, Burke H, Burkhardt B, Burns KH, Busanovich J, Bustamante CD, Butler AP, Butte AJ, Byrne NJ, Børresen-Dale AL, Caesar-Johnson SJ, Cafferkey A, Cahill D, Calabrese C, Caldas C, Calvo F, Camacho N, Campbell PJ, Campo E, Cantù C, Cao S, Carey TE, Carlevaro-Fita J, Carlsen R, Cataldo I, Cazzola M, Cebon J, Cerfolio R, Chadwick DE, Chakravarty D, Chalmers D, Chan CWY, Chan K, 
Chan-Seng-Yue M, Chandan VS, Chang DK, Chanock SJ, Chantrill LA, Chateigner A, Chatterjee N, Chayama K, Chen HW, Chen J, Chen K, Chen Y, Chen Z, Cherniack AD, Chien J, Chiew YE, Chin SF, Cho J, Cho S, Choi JK, Choi W, Chomienne C, Chong Z, Choo SP, Chou A, Christ AN, Christie EL, Chuah E, Cibulskis C, Cibulskis K, Cingarlini S, Clapham P, Claviez A, Cleary S, Cloonan N, Cmero M, Collins CC, Connor AA, Cooke SL, Cooper CS, Cope L, Corbo V, Cordes MG, Cordner SM, Cortés-Ciriano I, Covington K, Cowin PA, Craft B, Craft D, Creighton CJ, Cun Y, Curley E, Cutcutache I, Czajka K, Czerniak B, Dagg RA, Danilova L, Davi MV, Davidson NR, Davies H, Davis IJ, Davis-Dusenbery BN, Dawson KJ, De La Vega FM, De Paoli-Iseppi R, Defreitas T, Tos APD, Delaneau O, Demchok JA, Demeulemeester J, Demidov GM, Demircioğlu D, Dennis NM, Denroche RE, Dentro SC, Desai N, Deshpande V, Deshwar AG, Desmedt C, Deu-Pons J, Dhalla N, Dhani NC, Dhingra P, Dhir R, DiBiase A, Diamanti K, Ding L, Ding S, Dinh HQ, Dirix L, Doddapaneni H, Donmez N, Dow MT, Drapkin R, Drechsel O, Drews RM, Serge S, Dudderidge T, Dueso-Barroso A, Dunford AJ, Dunn M, Dursi LJ, Duthie FR, Dutton-Regester K, Eagles J, Easton DF, Edmonds S, Edwards PA, Edwards SE, Eeles RA, Ehinger A, Eils J, Eils R, El-Naggar A, Eldridge M, Ellrott K, Erkek S, Escaramis G, Espiritu SMG, Estivill X, Etemadmoghadam D, Eyfjord JE, Faltas BM, Fan D, Fan Y, Faquin WC, Farcas C, Fassan M, Fatima A, Favero F, Fayzullaev N, Felau I, Fereday S, Ferguson ML, Ferretti V, Feuerbach L, Field MA, Fink JL, Finocchiaro G, Fisher C, Fittall MW, Fitzgerald A, Fitzgerald RC, Flanagan AM, Fleshner NE, Flicek P, Foekens JA, Fong KM, Fonseca NA, Foster CS, Fox NS, Fraser M, Frazer S, Frenkel-Morgenstern M, Friedman W, Frigola J, Fronick CC, Fujimoto A, Fujita M, Fukayama M, Fulton LA, Fulton RS, Furuta M, Futreal PA, Füllgrabe A, Gabriel SB, Gallinger S, Gambacorti-Passerini C, Gao J, Gao S, Garraway L, Garred Ø, Garrison E, Garsed DW, Gehlenborg N, Gelpi JLL, 
George J, Gerhard DS, Gerhauser C, Gershenwald JE, Gerstein M, Gerstung M, Getz G, Ghori M, Ghossein R, Giama NH, Gibbs RA, Gibson B, Gill AJ, Gill P, Giri DD, Glodzik D, Gnanapragasam VJ, Goebler ME, Goldman MJ, Gomez C, Gonzalez S, Gonzalez-Perez A, Gordenin DA, Gossage J, Gotoh K, Govindan R, Grabau D, Graham JS, Grant RC, Green AR, Green E, Greger L, Grehan N, Grimaldi S, Grimmond SM, Grossman RL, Grundhoff A, Gundem G, Guo Q, Gupta M, Gupta S, Gut IG, Gut M, Göke J, Ha G, Haake A, Haan D, Haas S, Haase K, Haber JE, Habermann N, Hach F, Haider S, Hama N, Hamdy FC, Hamilton A, Hamilton MP, Han L, Hanna GB, Hansmann M, Haradhvala NJ, Harismendy O, Harliwong I, Harmanci AO, Harrington E, Hasegawa T, Haussler D, Hawkins S, Hayami S, Hayashi S, Hayes DN, Hayes SJ, Hayward NK, Hazell S, He Y, Heath AP, Heath SC, Hedley D, Hegde AM, Heiman DI, Heinold MC, Heins Z, Heisler LE, Hellstrom-Lindberg E, Helmy M, Heo SG, Hepperla AJ, Heredia-Genestar JM, Herrmann C, Hersey P, Hess JM, Hilmarsdottir H, Hinton J, Hirano S, Hiraoka N, Hoadley KA, Hobolth A, Hodzic E, Hoell JI, Hoffmann S, Hofmann O, Holbrook A, Holik AZ, Hollingsworth MA, Holmes O, Holt RA, Hong C, Hong EP, Hong JH, Hooijer GK, Hornshøj H, Hosoda F, Hou Y, Hovestadt V, Howat W, Hoyle AP, Hruban RH, Hu J, Hu T, Hua X, Huang KL, Huang M, Huang MN, Huang V, Huang Y, Huber W, Hudson TJ, Hummel M, Hung JA, Huntsman D, Hupp TR, Huse J, Huska MR, Hutter B, Hutter CM, Hübschmann D, Iacobuzio-Donahue CA, Imbusch CD, Imielinski M, Imoto S, Isaacs WB, Isaev K, Ishikawa S, Iskar M, Islam SMA, Ittmann M, Ivkovic S, Izarzugaza JMG, Jacquemier J, Jakrot V, Jamieson NB, Jang GH, Jang SJ, Jayaseelan JC, Jayasinghe R, Jefferys SR, Jegalian K, Jennings JL, Jeon SH, Jerman L, Ji Y, Jiao W, Johansson PA, Johns AL, Johns J, Johnson R, Johnson TA, Jolly C, Joly Y, Jonasson JG, Jones CD, Jones DR, Jones DTW, Jones N, Jones SJM, Jonkers J, Ju YS, Juhl H, Jung J, Juul M, Juul RI, Juul S, Jäger N, Kabbe R, Kahles A, Kahraman A, Kaiser 
VB, Kakavand H, Kalimuthu S, von Kalle C, Kang KJ, Karaszi K, Karlan B, Karlić R, Karsch D, Kasaian K, Kassahn KS, Katai H, Kato M, Katoh H, Kawakami Y, Kay JD, Kazakoff SH, Kazanov MD, Keays M, Kebebew E, Kefford RF, Kellis M, Kench JG, Kennedy CJ, Kerssemakers JNA, Khoo D, Khoo V, Khuntikeo N, Khurana E, Kilpinen H, Kim HK, Kim HL, Kim HY, Kim H, Kim J, Kim J, Kim JK, Kim Y, King TA, Klapper W, Kleinheinz K, Klimczak LJ, Knappskog S, Kneba M, Knoppers BM, Koh Y, Komorowski J, Komura D, Komura M, Kong G, Kool M, Korbel JO, Korchina V, Korshunov A, Koscher M, Koster R, Kote-Jarai Z, Koures A, Kovacevic M, Kremeyer B, Kretzmer H, Kreuz M, Krishnamurthy S, Kube D, Kumar K, Kumar P, Kumar S, Kumar Y, Kundra R, Kübler K, Küppers R, Lagergren J, Lai PH, Laird PW, Lakhani SR, Lalansingh CM, Lalonde E, Lamaze FC, Lambert A, Lander E, Landgraf P, Landoni L, Langerød A, Lanzós A, Larsimont D, Larsson E, Lathrop M, Lau LMS, Lawerenz C, Lawlor RT, Lawrence MS, Lazar AJ, Lazic AM, Le X, Lee D, Lee D, Lee EA, Lee HJ, Lee JJK, Lee JY, Lee J, Lee MTM, Lee-Six H, Lehmann KV, Lehrach H, Lenze D, Leonard CR, Leongamornlert DA, Leshchiner I, Letourneau L, Letunic I, Levine DA, Lewis L, Ley T, Li C, Li CH, Li HI, Li J, Li L, Li S, Li S, Li X, Li X, Li X, Li Y, Liang H, Liang SB, Lichter P, Lin P, Lin Z, Linehan WM, Lingjærde OC, Liu D, Liu EM, Liu FFF, Liu F, Liu J, Liu X, Livingstone J, Livitz D, Livni N, Lochovsky L, Loeffler M, Long GV, Lopez-Guillermo A, Lou S, Louis DN, Lovat LB, Lu Y, Lu YJ, Lu Y, Luchini C, Lungu I, Luo X, Luxton HJ, Lynch AG, Lype L, López C, López-Otín C, Ma EZ, Ma Y, MacGrogan G, MacRae S, Macintyre G, Madsen T, Maejima K, Mafficini A, Maglinte DT, Maitra A, Majumder PP, Malcovati L, Malikic S, Malleo G, Mann GJ, Mantovani-Löffler L, Marchal K, Marchegiani G, Mardis ER, Margolin AA, Marin MG, Markowetz F, Markowski J, Marks J, Marques-Bonet T, Marra MA, Marsden L, Martens JWM, Martin S, Martin-Subero JI, Martincorena I, Martinez-Fundichely A, Maruvka YE, 
Mashl RJ, Massie CE, Matthew TJ, Matthews L, Mayer E, Mayes S, Mayo M, Mbabaali F, McCune K, McDermott U, McGillivray PD, McLellan MD, McPherson JD, McPherson JR, McPherson TA, Meier SR, Meng A, Meng S, Menzies A, Merrett ND, Merson S, Meyerson M, Meyerson W, Mieczkowski PA, Mihaiescu GL, Mijalkovic S, Mikkelsen T, Milella M, Mileshkin L, Miller CA, Miller DK, Miller JK, Mills GB, Milovanovic A, Minner S, Miotto M, Arnau GM, Mirabello L, Mitchell C, Mitchell TJ, Miyano S, Miyoshi N, Mizuno S, Molnár-Gábor F, Moore MJ, Moore RA, Morganella S, Morris QD, Morrison C, Mose LE, Moser CD, Muiños F, Mularoni L, Mungall AJ, Mungall K, Musgrove EA, Mustonen V, Mutch D, Muyas F, Muzny DM, Muñoz A, Myers J, Myklebost O, Möller P, Nagae G, Nagrial AM, Nahal-Bose HK, Nakagama H, Nakagawa H, Nakamura H, Nakamura T, Nakano K, Nandi T, Nangalia J, Nastic M, Navarro A, Navarro FCP, Neal DE, Nettekoven G, Newell F, Newhouse SJ, Newton Y, Ng AWT, Ng A, Nicholson J, Nicol D, Nie Y, Nielsen GP, Nielsen MM, Nik-Zainal S, Noble MS, Nones K, Northcott PA, Notta F, O’Connor BD, O’Donnell P, O’Donovan M, O’Meara S, O’Neill BP, O’Neill JR, Ocana D, Ochoa A, Oesper L, Ogden C, Ohdan H, Ohi K, Ohno-Machado L, Oien KA, Ojesina AI, Ojima H, Okusaka T, Omberg L, Ong CK, Ossowski S, Ott G, Ouellette BFF, P’ng C, Paczkowska M, Paiella S, Pairojkul C, Pajic M, Pan-Hammarström Q, Papaemmanuil E, Papatheodorou I, Paramasivam N, Park JW, Park JW, Park K, Park K, Park PJ, Parker JS, Parsons SL, Pass H, Pasternack D, Pastore A, Patch AM, Pauporté I, Pea A, Pearson JV, Pedamallu CS, Pedersen JS, Pederzoli P, Peifer M, Pennell NA, Perou CM, Perry MD, Petersen GM, Peto M, Petrelli N, Petryszak R, Pfister SM, Phillips M, Pich O, Pickett HA, Pihl TD, Pillay N, Pinder S, Pinese M, Pinho AV. Author Correction: The evolutionary history of 2,658 cancers. Nature 2023; 614:E42. 
[PMID: 36697833 PMCID: PMC9931577 DOI: 10.1038/s41586-022-05601-4] [Indexed: 01/26/2023]
Affiliation(s)
- Moritz Gerstung
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; Wellcome Sanger Institute, Cambridge, UK
- Clemency Jolly
- The Francis Crick Institute, London, UK
- Ignaty Leshchiner
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Stefan C. Dentro
- Wellcome Sanger Institute, Cambridge, UK; The Francis Crick Institute, London, UK; Big Data Institute, University of Oxford, Oxford, UK
- Santiago Gonzalez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Daniel Rosebrock
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Thomas J. Mitchell
- Wellcome Sanger Institute, Cambridge, UK; University of Cambridge, Cambridge, UK
- Yulia Rubanova
- University of Toronto, Toronto, Ontario Canada; Vector Institute, Toronto, Ontario Canada
- Pavana Anur
- Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR USA
- Kaixian Yu
- The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Maxime Tarabichi
- Wellcome Sanger Institute, Cambridge, UK; The Francis Crick Institute, London, UK
- Amit Deshwar
- University of Toronto, Toronto, Ontario Canada; Vector Institute, Toronto, Ontario Canada
- Jeff Wintersinger
- University of Toronto, Toronto, Ontario Canada; Vector Institute, Toronto, Ontario Canada
- Kortine Kleinheinz
- German Cancer Research Center (DKFZ), Heidelberg, Germany; Heidelberg University, Heidelberg, Germany
- Ignacio Vázquez-García
- Wellcome Sanger Institute, Cambridge, UK; University of Cambridge, Cambridge, UK
- Kerstin Haase
- The Francis Crick Institute, London, UK
- Lara Jerman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK; University of Ljubljana, Ljubljana, Slovenia
- Subhajit Sengupta
- NorthShore University HealthSystem, Evanston, IL USA
- Geoff Macintyre
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Salem Malikic
- Simon Fraser University, Burnaby, British Columbia Canada; Vancouver Prostate Centre, Vancouver, British Columbia Canada
- Nilgun Donmez
- Simon Fraser University, Burnaby, British Columbia Canada; Vancouver Prostate Centre, Vancouver, British Columbia Canada
- Dimitri G. Livitz
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Marek Cmero
- University of Melbourne, Melbourne, Victoria Australia; Walter and Eliza Hall Institute, Melbourne, Victoria Australia
- Jonas Demeulemeester
- The Francis Crick Institute, London, UK; University of Leuven, Leuven, Belgium
- Steven Schumacher
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Yu Fan
- The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Xiaotong Yao
- Weill Cornell Medicine, New York, NY USA; New York Genome Center, New York, NY USA
- Juhee Lee
- University of California Santa Cruz, Santa Cruz, CA USA
- Matthias Schlesner
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Paul C. Boutros
- University of Toronto, Toronto, Ontario Canada; Ontario Institute for Cancer Research, Toronto, Ontario Canada; University of California, Los Angeles, CA USA
- David D. Bowtell
- Peter MacCallum Cancer Centre, Melbourne, Victoria Australia
- Hongtu Zhu
- The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA USA; Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA USA; Department of Pathology, Massachusetts General Hospital, Boston, MA USA; Harvard Medical School, Boston, MA USA
- Marcin Imielinski
- Weill Cornell Medicine, New York, NY USA; New York Genome Center, New York, NY USA
- Rameen Beroukhim
- Broad Institute of MIT and Harvard, Cambridge, MA USA; Dana-Farber Cancer Institute, Boston, MA USA
- S. Cenk Sahinalp
- Vancouver Prostate Centre, Vancouver, British Columbia Canada; Indiana University, Bloomington, IN USA
- Yuan Ji
- NorthShore University HealthSystem, Evanston, IL USA; The University of Chicago, Chicago, IL USA
- Martin Peifer
- University of Cologne, Cologne, Germany
- Florian Markowetz
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Ville Mustonen
- University of Helsinki, Helsinki, Finland
- Ke Yuan
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK; University of Glasgow, Glasgow, UK
- Wenyi Wang
- The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Quaid D. Morris
- University of Toronto, Toronto, Ontario Canada; Vector Institute, Toronto, Ontario Canada
- Paul T. Spellman
- Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR USA
- David C. Wedge
- Big Data Institute, University of Oxford, Oxford, UK; Oxford NIHR Biomedical Research Centre, Oxford, UK
- Peter Van Loo
- The Francis Crick Institute, London, UK; University of Leuven, Leuven, Belgium
|
6
|
Calabrese C, Davidson NR, Demircioğlu D, Fonseca NA, He Y, Kahles A, Lehmann KV, Liu F, Shiraishi Y, Soulette CM, Urban L, Greger L, Li S, Liu D, Perry MD, Xiang Q, Zhang F, Zhang J, Bailey P, Erkek S, Hoadley KA, Hou Y, Huska MR, Kilpinen H, Korbel JO, Marin MG, Markowski J, Nandi T, Pan-Hammarström Q, Pedamallu CS, Siebert R, Stark SG, Su H, Tan P, Waszak SM, Yung C, Zhu S, Awadalla P, Creighton CJ, Meyerson M, Ouellette BFF, Wu K, Yang H, Brazma A, Brooks AN, Göke J, Rätsch G, Schwarz RF, Stegle O, Zhang Z, Wu K, Yang H, Fonseca NA, Kahles A, Lehmann KV, Urban L, Soulette CM, Shiraishi Y, Liu F, He Y, Demircioğlu D, Davidson NR, Calabrese C, Zhang J, Perry MD, Xiang Q, Greger L, Li S, Liu D, Stark SG, Zhang F, Amin SB, Bailey P, Chateigner A, Cortés-Ciriano I, Craft B, Erkek S, Frenkel-Morgenstern M, Goldman M, Hoadley KA, Hou Y, Huska MR, Khurana E, Kilpinen H, Korbel JO, Lamaze FC, Li C, Li X, Li X, Liu X, Marin MG, Markowski J, Nandi T, Nielsen MM, Ojesina AI, Pan-Hammarström Q, Park PJ, Pedamallu CS, Pedersen JS, Pederzoli P, Peifer M, Pennell NA, Perou CM, Perry MD, Petersen GM, Peto M, Petrelli N, Pedamallu CS, Petryszak R, Pfister SM, Phillips M, Pich O, Pickett HA, Pihl TD, Pillay N, Pinder S, Pinese M, Pinho AV, Pedersen JS, Pitkänen E, Pivot X, Piñeiro-Yáñez E, Planko L, Plass C, Polak P, Pons T, Popescu I, Potapova O, Prasad A, Siebert R, Preston SR, Prinz M, Pritchard AL, Prokopec SD, Provenzano E, Puente XS, Puig S, Puiggròs M, Pulido-Tamayo S, Pupo GM, Su H, Purdie CA, Quinn MC, Rabionet R, Rader JS, Radlwimmer B, Radovic P, Raeder B, Raine KM, Ramakrishna M, Ramakrishnan K, Tan P, Ramalingam S, Raphael BJ, Rathmell WK, Rausch T, Reifenberger G, Reimand J, Reis-Filho J, Reuter V, Reyes-Salazar I, Reyna MA, Teh BT, Reynolds SM, Rheinbay E, Riazalhosseini Y, Richardson AL, Richter J, Ringel M, Ringnér M, Rino Y, Rippe K, Roach J, Wang J, Roberts LR, Roberts ND, Roberts SA, Robertson AG, Robertson AJ, Rodriguez JB, Rodriguez-Martin B, 
Rodríguez-González FG, Roehrl MHA, Rohde M, Waszak SM, Rokutan H, Romieu G, Rooman I, Roques T, Rosebrock D, Rosenberg M, Rosenstiel PC, Rosenwald A, Rowe EW, Royo R, Xiong H, Rozen SG, Rubanova Y, Rubin MA, Rubio-Perez C, Rudneva VA, Rusev BC, Ruzzenente A, Rätsch G, Sabarinathan R, Sabelnykova VY, Yakneen S, Sadeghi S, Sahinalp SC, Saini N, Saito-Adachi M, Saksena G, Salcedo A, Salgado R, Salichos L, Sallari R, Saller C, Ye C, Salvia R, Sam M, Samra JS, Sanchez-Vega F, Sander C, Sanders G, Sarin R, Sarrafi I, Sasaki-Oku A, Sauer T, Yung C, Sauter G, Saw RPM, Scardoni M, Scarlett CJ, Scarpa A, Scelo G, Schadendorf D, Schein JE, Schilhabel MB, Schlesner M, Zhang X, Schlomm T, Schmidt HK, Schramm SJ, Schreiber S, Schultz N, Schumacher SE, Schwarz RF, Scolyer RA, Scott D, Scully R, Zheng L, Seethala R, Segre AV, Selander I, Semple CA, Senbabaoglu Y, Sengupta S, Sereni E, Serra S, Sgroi DC, Shackleton M, Zhu J, Shah NC, Shahabi S, Shang CA, Shang P, Shapira O, Shelton T, Shen C, Shen H, Shepherd R, Shi R, Zhu S, Shi Y, Shiah YJ, Shibata T, Shih J, Shimizu E, Shimizu K, Shin SJ, Shiraishi Y, Shmaya T, Shmulevich I, Awadalla P, Shorser SI, Short C, Shrestha R, Shringarpure SS, Shriver C, Shuai S, Sidiropoulos N, Siebert R, Sieuwerts AM, Sieverling L, Creighton CJ, Signoretti S, Sikora KO, Simbolo M, Simon R, Simons JV, Simpson JT, Simpson PT, Singer S, Sinnott-Armstrong N, Sipahimalani P, Meyerson M, Skelly TJ, Smid M, Smith J, Smith-McCune K, Socci ND, Sofia HJ, Soloway MG, Song L, Sood AK, Sothi S, Ouellette BFF, Sotiriou C, Soulette CM, Span PN, Spellman PT, Sperandio N, Spillane AJ, Spiro O, Spring J, Staaf J, Stadler PF, Wu K, Staib P, Stark SG, Stebbings L, Stefánsson ÓA, Stegle O, Stein LD, Stenhouse A, Stewart C, Stilgenbauer S, Stobbe MD, Yang H, Stratton MR, Stretch JR, Struck AJ, Stuart JM, Stunnenberg HG, Su H, Su X, Sun RX, Sungalee S, Susak H, Göke J, Suzuki A, Sweep F, Szczepanowski M, Sültmann H, Yugawa T, Tam A, Tamborero D, Tan BKT, Tan D, Tan P, 
Schwarz RF, Tanaka H, Taniguchi H, Tanskanen TJ, Tarabichi M, Tarnuzzer R, Tarpey P, Taschuk ML, Tatsuno K, Tavaré S, Taylor DF, Stegle O, Taylor-Weiner A, Teague JW, Teh BT, Tembe V, Temes J, Thai K, Thayer SP, Thiessen N, Thomas G, Thomas S, Zhang Z, Thompson A, Thompson AM, Thompson JFF, Thompson RH, Thorne H, Thorne LB, Thorogood A, Tiao G, Tijanic N, Timms LE, Brazma A, Tirabosco R, Tojo M, Tommasi S, Toon CW, Toprak UH, Torrents D, Tortora G, Tost J, Totoki Y, Townend D, Rätsch G, Traficante N, Treilleux I, Trotta JR, Trümper LHP, Tsao M, Tsunoda T, Tubio JMC, Tucker O, Turkington R, Turner DJ, Brooks AN, Tutt A, Ueno M, Ueno NT, Umbricht C, Umer HM, Underwood TJ, Urban L, Urushidate T, Ushiku T, Uusküla-Reimand L, Brazma A, Valencia A, Van Den Berg DJ, Van Laere S, Van Loo P, Van Meir EG, Van den Eynden GG, Van der Kwast T, Vasudev N, Vazquez M, Vedururu R, Brooks AN, Veluvolu U, Vembu S, Verbeke LPC, Vermeulen P, Verrill C, Viari A, Vicente D, Vicentini C, VijayRaghavan K, Viksna J, Göke J, Vilain RE, Villasante I, Vincent-Salomon A, Visakorpi T, Voet D, Vyas P, Vázquez-García I, Waddell NM, Waddell N, Wadelius C, Rätsch G, Wadi L, Wagener R, Wala JA, Wang J, Wang J, Wang L, Wang Q, Wang W, Wang Y, Wang Z, Schwarz RF, Waring PM, Warnatz HJ, Warrell J, Warren AY, Waszak SM, Wedge DC, Weichenhan D, Weinberger P, Weinstein JN, Weischenfeldt J, Stegle O, Weisenberger DJ, Welch I, Wendl MC, Werner J, Whalley JP, Wheeler DA, Whitaker HC, Wigle D, Wilkerson MD, Williams A, Zhang Z, Wilmott JS, Wilson GW, Wilson JM, Wilson RK, Winterhoff B, Wintersinger JA, Wiznerowicz M, Wolf S, Wong BH, Wong T, Aaltonen LA, Wong W, Woo Y, Wood S, Wouters BG, Wright AJ, Wright DW, Wright MH, Wu CL, Wu DY, Wu G, Abascal F, Wu J, Wu K, Wu Y, Wu Z, Xi L, Xia T, Xiang Q, Xiao X, Xing R, Xiong H, Abeshouse A, Xu Q, Xu Y, Xue H, Yachida S, Yakneen S, Yamaguchi R, Yamaguchi TN, Yamamoto M, Yamamoto S, Yamaue H, Aburatani H, Yang F, Yang H, Yang JY, Yang L, Yang L, Yang S, Yang TP, Yang 
Y, Yao X, Yaspo ML, Adams DJ, Yates L, Yau C, Ye C, Ye K, Yellapantula VD, Yoon CJ, Yoon SS, Yousif F, Yu J, Yu K, Agrawal N, Yu W, Yu Y, Yuan K, Yuan Y, Yuen D, Yung CK, Zaikova O, Zamora J, Zapatka M, Zenklusen JC, Ahn KS, Zenz T, Zeps N, Zhang CZ, Zhang F, Zhang H, Zhang H, Zhang H, Zhang J, Zhang J, Zhang J, Ahn SM, Zhang X, Zhang X, Zhang Y, Zhang Z, Zhao Z, Zheng L, Zheng X, Zhou W, Zhou Y, Zhu B, Aikata H, Zhu H, Zhu J, Zhu S, Zou L, Zou X, deFazio A, van As N, van Deurzen CHM, van de Vijver MJ, van’t Veer L, Akbani R, von Mering C, Akdemir KC, Al-Ahmadie H, Al-Sedairy ST, Al-Shahrour F, Alawi M, Albert M, Aldape K, Alexandrov LB, Ally A, Alsop K, Alvarez EG, Amary F, Amin SB, Aminou B, Ammerpohl O, Anderson MJ, Ang Y, Antonello D, Anur P, Aparicio S, Appelbaum EL, Arai Y, Aretz A, Arihiro K, Ariizumi SI, Armenia J, Arnould L, Asa S, Assenov Y, Atwal G, Aukema S, Auman JT, Aure MRR, Awadalla P, Aymerich M, Bader GD, Baez-Ortega A, Bailey MH, Bailey PJ, Balasundaram M, Balu S, Bandopadhayay P, Banks RE, Barbi S, Barbour AP, Barenboim J, Barnholtz-Sloan J, Barr H, Barrera E, Bartlett J, Bartolome J, Bassi C, Bathe OF, Baumhoer D, Bavi P, Baylin SB, Bazant W, Beardsmore D, Beck TA, Behjati S, Behren A, Niu B, Bell C, Beltran S, Benz C, Berchuck A, Bergmann AK, Bergstrom EN, Berman BP, Berney DM, Bernhart SH, Beroukhim R, Berrios M, Bersani S, Bertl J, Betancourt M, Bhandari V, Bhosle SG, Biankin AV, Bieg M, Bigner D, Binder H, Birney E, Birrer M, Biswas NK, Bjerkehagen B, Bodenheimer T, Boice L, Bonizzato G, De Bono JS, Boot A, Bootwalla MS, Borg A, Borkhardt A, Boroevich KA, Borozan I, Borst C, Bosenberg M, Bosio M, Boultwood J, Bourque G, Boutros PC, Bova GS, Bowen DT, Bowlby R, Bowtell DDL, Boyault S, Boyce R, Boyd J, Brazma A, Brennan P, Brewer DS, Brinkman AB, Bristow RG, Broaddus RR, Brock JE, Brock M, Broeks A, Brooks AN, Brooks D, Brors B, Brunak S, Bruxner TJC, Bruzos AL, Buchanan A, Buchhalter I, Buchholz C, Bullman S, Burke H, Burkhardt B, Burns KH, 
Busanovich J, Bustamante CD, Butler AP, Butte AJ, Byrne NJ, Børresen-Dale AL, Caesar-Johnson SJ, Cafferkey A, Cahill D, Calabrese C, Caldas C, Calvo F, Camacho N, Campbell PJ, Campo E, Cantù C, Cao S, Carey TE, Carlevaro-Fita J, Carlsen R, Cataldo I, Cazzola M, Cebon J, Cerfolio R, Chadwick DE, Chakravarty D, Chalmers D, Chan CWY, Chan K, Chan-Seng-Yue M, Chandan VS, Chang DK, Chanock SJ, Chantrill LA, Chateigner A, Chatterjee N, Chayama K, Chen HW, Chen J, Chen K, Chen Y, Chen Z, Cherniack AD, Chien J, Chiew YE, Chin SF, Cho J, Cho S, Choi JK, Choi W, Chomienne C, Chong Z, Choo SP, Chou A, Christ AN, Christie EL, Chuah E, Cibulskis C, Cibulskis K, Cingarlini S, Clapham P, Claviez A, Cleary S, Cloonan N, Cmero M, Collins CC, Connor AA, Cooke SL, Cooper CS, Cope L, Corbo V, Cordes MG, Cordner SM, Cortés-Ciriano I, Covington K, Cowin PA, Craft B, Craft D, Creighton CJ, Cun Y, Curley E, Cutcutache I, Czajka K, Czerniak B, Dagg RA, Danilova L, Davi MV, Davidson NR, Davies H, Davis IJ, Davis-Dusenbery BN, Dawson KJ, De La Vega FM, De Paoli-Iseppi R, Defreitas T, Tos APD, Delaneau O, Demchok JA, Demeulemeester J, Demidov GM, Demircioğlu D, Dennis NM, Denroche RE, Dentro SC, Desai N, Deshpande V, Deshwar AG, Desmedt C, Deu-Pons J, Dhalla N, Dhani NC, Dhingra P, Dhir R, DiBiase A, Diamanti K, Ding L, Ding S, Dinh HQ, Dirix L, Doddapaneni H, Donmez N, Dow MT, Drapkin R, Drechsel O, Drews RM, Serge S, Dudderidge T, Dueso-Barroso A, Dunford AJ, Dunn M, Dursi LJ, Duthie FR, Dutton-Regester K, Eagles J, Easton DF, Edmonds S, Edwards PA, Edwards SE, Eeles RA, Ehinger A, Eils J, Eils R, El-Naggar A, Eldridge M, Ellrott K, Erkek S, Escaramis G, Espiritu SMG, Estivill X, Etemadmoghadam D, Eyfjord JE, Faltas BM, Fan D, Fan Y, Faquin WC, Farcas C, Fassan M, Fatima A, Favero F, Fayzullaev N, Felau I, Fereday S, Ferguson ML, Ferretti V, Feuerbach L, Field MA, Fink JL, Finocchiaro G, Fisher C, Fittall MW, Fitzgerald A, Fitzgerald RC, Flanagan AM, Fleshner NE, Flicek P, Foekens JA, Fong 
KM, Fonseca NA, Foster CS, Fox NS, Fraser M, Frazer S, Frenkel-Morgenstern M, Friedman W, Frigola J, Fronick CC, Fujimoto A, Fujita M, Fukayama M, Fulton LA, Fulton RS, Furuta M, Futreal PA, Füllgrabe A, Gabriel SB, Gallinger S, Gambacorti-Passerini C, Gao J, Gao S, Garraway L, Garred Ø, Garrison E, Garsed DW, Gehlenborg N, Gelpi JLL, George J, Gerhard DS, Gerhauser C, Gershenwald JE, Gerstein M, Gerstung M, Getz G, Ghori M, Ghossein R, Giama NH, Gibbs RA, Gibson B, Gill AJ, Gill P, Giri DD, Glodzik D, Gnanapragasam VJ, Goebler ME, Goldman MJ, Gomez C, Gonzalez S, Gonzalez-Perez A, Gordenin DA, Gossage J, Gotoh K, Govindan R, Grabau D, Graham JS, Grant RC, Green AR, Green E, Greger L, Grehan N, Grimaldi S, Grimmond SM, Grossman RL, Grundhoff A, Gundem G, Guo Q, Gupta M, Gupta S, Gut IG, Gut M, Göke J, Ha G, Haake A, Haan D, Haas S, Haase K, Haber JE, Habermann N, Hach F, Haider S, Hama N, Hamdy FC, Hamilton A, Hamilton MP, Han L, Hanna GB, Hansmann M, Haradhvala NJ, Harismendy O, Harliwong I, Harmanci AO, Harrington E, Hasegawa T, Haussler D, Hawkins S, Hayami S, Hayashi S, Hayes DN, Hayes SJ, Hayward NK, Hazell S, He Y, Heath AP, Heath SC, Hedley D, Hegde AM, Heiman DI, Heinold MC, Heins Z, Heisler LE, Hellstrom-Lindberg E, Helmy M, Heo SG, Hepperla AJ, Heredia-Genestar JM, Herrmann C, Hersey P, Hess JM, Hilmarsdottir H, Hinton J, Hirano S, Hiraoka N, Hoadley KA, Hobolth A, Hodzic E, Hoell JI, Hoffmann S, Hofmann O, Holbrook A, Holik AZ, Hollingsworth MA, Holmes O, Holt RA, Hong C, Hong EP, Hong JH, Hooijer GK, Hornshøj H, Hosoda F, Hou Y, Hovestadt V, Howat W, Hoyle AP, Hruban RH, Hu J, Hu T, Hua X, Huang KL, Huang M, Huang MN, Huang V, Huang Y, Huber W, Hudson TJ, Hummel M, Hung JA, Huntsman D, Hupp TR, Huse J, Huska MR, Hutter B, Hutter CM, Hübschmann D, Iacobuzio-Donahue CA, Imbusch CD, Imielinski M, Imoto S, Isaacs WB, Isaev K, Ishikawa S, Iskar M, Islam SMA, Ittmann M, Ivkovic S, Izarzugaza JMG, Jacquemier J, Jakrot V, Jamieson NB, Jang GH, Jang SJ, 
Jayaseelan JC, Jayasinghe R, Jefferys SR, Jegalian K, Jennings JL, Jeon SH, Jerman L, Ji Y, Jiao W, Johansson PA, Johns AL, Johns J, Johnson R, Johnson TA, Jolly C, Joly Y, Jonasson JG, Jones CD, Jones DR, Jones DTW, Jones N, Jones SJM, Jonkers J, Ju YS, Juhl H, Jung J, Juul M, Juul RI, Juul S, Jäger N, Kabbe R, Kahles A, Kahraman A, Kaiser VB, Kakavand H, Kalimuthu S, von Kalle C, Kang KJ, Karaszi K, Karlan B, Karlić R, Karsch D, Kasaian K, Kassahn KS, Katai H, Kato M, Katoh H, Kawakami Y, Kay JD, Kazakoff SH, Kazanov MD, Keays M, Kebebew E, Kefford RF, Kellis M, Kench JG, Kennedy CJ, Kerssemakers JNA, Khoo D, Khoo V, Khuntikeo N, Khurana E, Kilpinen H, Kim HK, Kim HL, Kim HY, Kim H, Kim J, Kim J, Kim JK, Kim Y, King TA, Klapper W, Kleinheinz K, Klimczak LJ, Knappskog S, Kneba M, Knoppers BM, Koh Y, Komorowski J, Komura D, Komura M, Kong G, Kool M, Korbel JO, Korchina V, Korshunov A, Koscher M, Koster R, Kote-Jarai Z, Koures A, Kovacevic M, Kremeyer B, Kretzmer H, Kreuz M, Krishnamurthy S, Kube D, Kumar K, Kumar P, Kumar S, Kumar Y, Kundra R, Kübler K, Küppers R, Lagergren J, Lai PH, Laird PW, Lakhani SR, Lalansingh CM, Lalonde E, Lamaze FC, Lambert A, Lander E, Landgraf P, Landoni L, Langerød A, Lanzós A, Larsimont D, Larsson E, Lathrop M, Lau LMS, Lawerenz C, Lawlor RT, Lawrence MS, Lazar AJ, Lazic AM, Le X, Lee D, Lee D, Lee EA, Lee HJ, Lee JJK, Lee JY, Lee J, Lee MTM, Lee-Six H, Lehmann KV, Lehrach H, Lenze D, Leonard CR, Leongamornlert DA, Leshchiner I, Letourneau L, Letunic I, Levine DA, Lewis L, Ley T, Li C, Li CH, Li HI, Li J, Li L, Li S, Li S, Li X, Li X, Li X, Li Y, Liang H, Liang SB, Lichter P, Lin P, Lin Z, Linehan WM, Lingjærde OC, Liu D, Liu EM, Liu FFF, Liu F, Liu J, Liu X, Livingstone J, Livitz D, Livni N, Lochovsky L, Loeffler M, Long GV, Lopez-Guillermo A, Lou S, Louis DN, Lovat LB, Lu Y, Lu YJ, Lu Y, Luchini C, Lungu I, Luo X, Luxton HJ, Lynch AG, Lype L, López C, López-Otín C, Ma EZ, Ma Y, MacGrogan G, MacRae S, Macintyre G, Madsen T, Maejima 
K, Mafficini A, Maglinte DT, Maitra A, Majumder PP, Malcovati L, Malikic S, Malleo G, Mann GJ, Mantovani-Löffler L, Marchal K, Marchegiani G, Mardis ER, Margolin AA, Marin MG, Markowetz F, Markowski J, Marks J, Marques-Bonet T, Marra MA, Marsden L, Martens JWM, Martin S, Martin-Subero JI, Martincorena I, Martinez-Fundichely A, Maruvka YE, Mashl RJ, Massie CE, Matthew TJ, Matthews L, Mayer E, Mayes S, Mayo M, Mbabaali F, McCune K, McDermott U, McGillivray PD, McLellan MD, McPherson JD, McPherson JR, McPherson TA, Meier SR, Meng A, Meng S, Menzies A, Merrett ND, Merson S, Meyerson M, Meyerson W, Mieczkowski PA, Mihaiescu GL, Mijalkovic S, Mikkelsen T, Milella M, Mileshkin L, Miller CA, Miller DK, Miller JK, Mills GB, Milovanovic A, Minner S, Miotto M, Arnau GM, Mirabello L, Mitchell C, Mitchell TJ, Miyano S, Miyoshi N, Mizuno S, Molnár-Gábor F, Moore MJ, Moore RA, Morganella S, Morris QD, Morrison C, Mose LE, Moser CD, Muiños F, Mularoni L, Mungall AJ, Mungall K, Musgrove EA, Mustonen V, Mutch D, Muyas F, Muzny DM, Muñoz A, Myers J, Myklebost O, Möller P, Nagae G, Nagrial AM, Nahal-Bose HK, Nakagama H, Nakagawa H, Nakamura H, Nakamura T, Nakano K, Nandi T, Nangalia J, Nastic M, Navarro A, Navarro FCP, Neal DE, Nettekoven G, Newell F, Newhouse SJ, Newton Y, Ng AWT, Ng A, Nicholson J, Nicol D, Nie Y, Nielsen GP, Nielsen MM, Nik-Zainal S, Noble MS, Nones K, Northcott PA, Notta F, O’Connor BD, O’Donnell P, O’Donovan M, O’Meara S, O’Neill BP, O’Neill JR, Ocana D, Ochoa A, Oesper L, Ogden C, Ohdan H, Ohi K, Ohno-Machado L, Oien KA, Ojesina AI, Ojima H, Okusaka T, Omberg L, Ong CK, Ossowski S, Ott G, Ouellette BFF, P’ng C, Paczkowska M, Paiella S, Pairojkul C, Pajic M, Pan-Hammarström Q, Papaemmanuil E, Papatheodorou I, Paramasivam N, Park JW, Park JW, Park K, Park K, Park PJ, Parker JS, Parsons SL, Pass H, Pasternack D, Pastore A, Patch AM, Pauporté I, Pea A, Pearson JV. Author Correction: Genomic basis for RNA alterations in cancer. Nature 2023; 614:E37. 
[PMID: 36697831 PMCID: PMC9931574 DOI: 10.1038/s41586-022-05596-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Affiliation(s)
| | - Claudia Calabrese
- grid.225360.00000 0000 9709 7726European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Natalie R. Davidson
- grid.5801.c0000 0001 2156 2780ETH Zurich, Zurich, Switzerland ,grid.51462.340000 0001 2171 9952Memorial Sloan Kettering Cancer Center, New York, NY USA ,grid.5386.8000000041936877XWeill Cornell Medical College, New York, NY USA ,grid.419765.80000 0001 2223 3006SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland ,grid.412004.30000 0004 0478 9977University Hospital Zurich, Zurich, Switzerland
| | - Deniz Demircioğlu
- grid.4280.e0000 0001 2180 6431National University of Singapore, Singapore, Singapore ,grid.418377.e0000 0004 0620 715XGenome Institute of Singapore, Singapore, Singapore
| | - Nuno A. Fonseca
- grid.225360.00000 0000 9709 7726European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Yao He
- grid.11135.370000 0001 2256 9319Peking University, Beijing, China
| | - André Kahles
- ETH Zurich, Zurich, Switzerland; Memorial Sloan Kettering Cancer Center, New York, NY, USA; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Hospital Zurich, Zurich, Switzerland
| | - Kjong-Van Lehmann
- ETH Zurich, Zurich, Switzerland; Memorial Sloan Kettering Cancer Center, New York, NY, USA; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Hospital Zurich, Zurich, Switzerland
| | - Fenglin Liu
- Peking University, Beijing, China
| | - Yuichi Shiraishi
- The University of Tokyo, Minato-ku, Japan
| | - Cameron M. Soulette
- University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Lara Urban
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Liliana Greger
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Siliang Li
- BGI-Shenzhen, Shenzhen, China; China National GeneBank-Shenzhen, Shenzhen, China
| | - Dongbing Liu
- BGI-Shenzhen, Shenzhen, China; China National GeneBank-Shenzhen, Shenzhen, China
| | - Marc D. Perry
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada; University of California, San Francisco, San Francisco, CA, USA
| | - Qian Xiang
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Fan Zhang
- Peking University, Beijing, China
| | - Junjun Zhang
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Peter Bailey
- University of Glasgow, Glasgow, UK
| | - Serap Erkek
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Katherine A. Hoadley
- The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yong Hou
- BGI-Shenzhen, Shenzhen, China; China National GeneBank-Shenzhen, Shenzhen, China
| | - Matthew R. Huska
- Berlin Institute for Medical Systems Biology, Max Delbruck Center for Molecular Medicine, Berlin, Germany
| | - Helena Kilpinen
- University College London, London, UK
| | - Jan O. Korbel
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Maximillian G. Marin
- University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Julia Markowski
- Berlin Institute for Medical Systems Biology, Max Delbruck Center for Molecular Medicine, Berlin, Germany
| | - Tannistha Nandi
- Genome Institute of Singapore, Singapore, Singapore
| | - Qiang Pan-Hammarström
- BGI-Shenzhen, Shenzhen, China; Karolinska Institutet, Stockholm, Sweden
| | - Chandra Sekhar Pedamallu
- Broad Institute, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
| | - Reiner Siebert
- Ulm University and Ulm University Medical Center, Ulm, Germany
| | - Stefan G. Stark
- ETH Zurich, Zurich, Switzerland; Memorial Sloan Kettering Cancer Center, New York, NY, USA; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Hospital Zurich, Zurich, Switzerland
| | - Hong Su
- BGI-Shenzhen, Shenzhen, China; China National GeneBank-Shenzhen, Shenzhen, China
| | - Patrick Tan
- Genome Institute of Singapore, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore
| | - Sebastian M. Waszak
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Christina Yung
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Shida Zhu
- BGI-Shenzhen, Shenzhen, China; China National GeneBank-Shenzhen, Shenzhen, China
| | - Philip Awadalla
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
| | - Chad J. Creighton
- Baylor College of Medicine, Houston, TX, USA
| | - Matthew Meyerson
- Broad Institute, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
| | | | - Kui Wu
- BGI-Shenzhen, Shenzhen, China; China National GeneBank-Shenzhen, Shenzhen, China
| | - Huanming Yang
- BGI-Shenzhen, Shenzhen, China
| | | | - Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Angela N. Brooks
- University of California, Santa Cruz, Santa Cruz, CA, USA; Broad Institute, Cambridge, MA, USA; Dana-Farber Cancer Institute, Boston, MA, USA
| | - Jonathan Göke
- Genome Institute of Singapore, Singapore, Singapore; National Cancer Centre Singapore, Singapore, Singapore
| | - Gunnar Rätsch
- ETH Zurich, Zurich, Switzerland; Memorial Sloan Kettering Cancer Center, New York, NY, USA; Weill Cornell Medical College, New York, NY, USA; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Hospital Zurich, Zurich, Switzerland
| | - Roland F. Schwarz
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK; Berlin Institute for Medical Systems Biology, Max Delbruck Center for Molecular Medicine, Berlin, Germany; German Cancer Consortium (DKTK), partner site Berlin, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Oliver Stegle
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Zemin Zhang
- Peking University, Beijing, China
|
7
|
Hari A, Zhou Q, Gonzaludo N, Harting J, Scott SA, Qin X, Scherer S, Sahinalp SC, Numanagić I. An efficient genotyper and star-allele caller for pharmacogenomics. Genome Res 2023; 33:61-70. [PMID: 36657977 PMCID: PMC9977157 DOI: 10.1101/gr.277075.122] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/11/2022] [Accepted: 12/12/2022] [Indexed: 01/20/2023]
Abstract
High-throughput sequencing provides sufficient means for determining genotypes of clinically important pharmacogenes that can be used to tailor medical decisions to individual patients. However, pharmacogene genotyping, also known as star-allele calling, is a challenging problem that requires accurate copy number calling, structural variation identification, variant calling, and phasing within each pharmacogene copy present in the sample. Here we introduce Aldy 4, a fast and efficient tool for genotyping pharmacogenes that uses combinatorial optimization for accurate star-allele calling across different sequencing technologies. Aldy 4 adds support for long reads and uses a novel phasing model and improved copy number and variant calling models. We compare Aldy 4 against the current state-of-the-art star-allele callers on a large and diverse set of samples and genes sequenced by various sequencing technologies, such as whole-genome and targeted Illumina sequencing, barcoded 10x Genomics, and Pacific Biosciences (PacBio) HiFi. We show that Aldy 4 is the most accurate star-allele caller with near-perfect accuracy in all evaluated contexts, and hope that Aldy remains an invaluable tool in the clinical toolbox even with the advent of long-read sequencing technologies.
Affiliation(s)
- Ananth Hari
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA; Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Qinghui Zhou
- Department of Computer Science, University of Victoria, Victoria, British Columbia V8P 5C2, Canada
| | | | - John Harting
- Pacific Biosciences, Menlo Park, California 94025, USA
| | - Stuart A. Scott
- Department of Pathology, Stanford University, Palo Alto, California 94304, USA
| | - Xiang Qin
- Baylor College of Medicine Human Genome Sequencing Center, Houston, Texas 77030, USA
| | - Steve Scherer
- Baylor College of Medicine Human Genome Sequencing Center, Houston, Texas 77030, USA
| | - S. Cenk Sahinalp
- Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Ibrahim Numanagić
- Department of Computer Science, University of Victoria, Victoria, British Columbia V8P 5C2, Canada
|
8
|
Reyna MA, Haan D, Paczkowska M, Verbeke LPC, Vazquez M, Kahraman A, Pulido-Tamayo S, Barenboim J, Wadi L, Dhingra P, Shrestha R, Getz G, Lawrence MS, Pedersen JS, Rubin MA, Wheeler DA, Brunak S, Izarzugaza JMG, Khurana E, Marchal K, von Mering C, Sahinalp SC, Valencia A, Reimand J, Stuart JM, Raphael BJ. Author Correction: Pathway and network analysis of more than 2500 whole cancer genomes. Nat Commun 2022; 13:7566. [PMID: 36481610 PMCID: PMC9732045 DOI: 10.1038/s41467-022-32334-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Affiliation(s)
- Matthew A Reyna
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA
- Department of Biomedical Informatics, Emory University, Atlanta, GA, 30322, USA
| | - David Haan
- Department of Biomolecular Engineering and UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95060, USA
| | - Marta Paczkowska
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Lieven P C Verbeke
- Department of Information Technology, IDLab, Ghent University, IMEC, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
| | - Miguel Vazquez
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Abdullah Kahraman
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, CH-8057, Zurich, Switzerland
- Department of Pathology and Molecular Pathology, University Hospital Zurich, CH-8091, Zurich, Switzerland
| | - Sergio Pulido-Tamayo
- Department of Information Technology, IDLab, Ghent University, IMEC, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
| | - Jonathan Barenboim
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Lina Wadi
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Priyanka Dhingra
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10065, USA
| | - Raunak Shrestha
- Vancouver Prostate Centre, 2660 Oak Street, Vancouver, BC, V6H 3Z6, Canada
| | - Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Massachusetts General Hospital Center for Cancer Research, Charlestown, MA, 02129, USA
- Harvard Medical School, 250 Longwood Avenue, Boston, MA, 02115, USA
- Massachusetts General Hospital, Department of Pathology, Boston, MA, 02114, USA
| | - Michael S Lawrence
- The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Massachusetts General Hospital Center for Cancer Research, Charlestown, MA, 02129, USA
| | - Jakob Skou Pedersen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Mark A Rubin
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10065, USA
| | - David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Søren Brunak
- DTU Bioinformatics, Department of Bio and Health Informatics, Technical University of Denmark, Kemitorvet, 2800, Kongens Lyngby, Denmark
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Jose M G Izarzugaza
- DTU Bioinformatics, Department of Bio and Health Informatics, Technical University of Denmark, Kemitorvet, 2800, Kongens Lyngby, Denmark
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Ekta Khurana
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10065, USA
| | - Kathleen Marchal
- Department of Information Technology, IDLab, Ghent University, IMEC, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
| | - Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, CH-8057, Zurich, Switzerland
| | - S Cenk Sahinalp
- Vancouver Prostate Centre, 2660 Oak Street, Vancouver, BC, V6H 3Z6, Canada
- Department of Computer Science, Indiana University, Bloomington, IN, 47405, USA
| | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- ICREA, Barcelona, 08010, Spain
| | - Jüri Reimand
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
| | - Joshua M Stuart
- Department of Biomolecular Engineering and UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95060, USA.
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA.
|
9
|
Ford MKB, Hari A, Rodriguez O, Xu J, Lack J, Oguz C, Zhang Y, Weber S, Magliocco M, Barnett J, Xirasagar S, Samuel S, Imberti L, Bonfanti P, Biondi A, Dalgard CL, Chanock S, Rosen L, Holland S, Su H, Notarangelo L, Vishkin U, Watson CT, Sahinalp SC. ImmunoTyper-SR: A computational approach for genotyping immunoglobulin heavy chain variable genes using short-read data. Cell Syst 2022; 13:808-816.e5. [PMID: 36265467 PMCID: PMC10084889 DOI: 10.1016/j.cels.2022.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 07/20/2022] [Accepted: 08/22/2022] [Indexed: 01/26/2023]
Abstract
The human immunoglobulin heavy chain (IGH) locus on chromosome 14 includes more than 40 functional copies of the variable gene (IGHV), which are critical for the structure of antibodies that identify and neutralize pathogenic invaders as part of the adaptive immune system. Because of its highly repetitive sequence composition, the IGH locus has been particularly difficult to assemble or genotype when using standard short-read sequencing technologies. Here, we introduce ImmunoTyper-SR, an algorithmic tool for the genotyping and CNV analysis of the germline IGHV genes on Illumina whole-genome sequencing (WGS) data using a combinatorial optimization formulation that resolves ambiguous read mappings. We validated ImmunoTyper-SR on 12 individuals whose IGHV allele composition had been independently validated, and assessed concordance between WGS replicates from nine individuals. We then applied ImmunoTyper-SR to 585 COVID patients to investigate the associations between IGHV alleles and anti-type I IFN autoantibodies, which were previously associated with COVID-19 severity.
Affiliation(s)
| | - Ananth Hari
- National Cancer Institute, NIH, Bethesda, MD, USA; Department of Electrical Engineering, University of Maryland, College Park, MD, USA
| | - Oscar Rodriguez
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
| | - Junyan Xu
- National Cancer Institute, NIH, Bethesda, MD, USA
| | - Justin Lack
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Cihan Oguz
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Yu Zhang
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Sarah Weber
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Mary Magliocco
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Jason Barnett
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Sandhya Xirasagar
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Smilee Samuel
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Luisa Imberti
- Diagnostic Department, ASST Spedali Civili di Brescia, Brescia, Italy
| | - Paolo Bonfanti
- University of Milano-Bicocca, Fondazione MBBM, Monza, Italy
| | - Andrea Biondi
- University of Milano-Bicocca, Fondazione MBBM, Monza, Italy
| | - Clifton L Dalgard
- Uniformed Services University of the Health Sciences, Bethesda, MD, USA
| | | | - Lindsey Rosen
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Steven Holland
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Helen Su
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Luigi Notarangelo
- National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
| | - Uzi Vishkin
- Department of Electrical Engineering, University of Maryland, College Park, MD, USA
| | - Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
|
10
|
Kızılkale C, Rashidi Mehrabadi F, Sadeqi Azer E, Pérez-Guijarro E, Marie KL, Lee MP, Day CP, Merlino G, Ergün F, Buluç A, Sahinalp SC, Malikić S. Fast intratumor heterogeneity inference from single-cell sequencing data. Nat Comput Sci 2022; 2:577-583. [PMID: 38177468 PMCID: PMC10765963 DOI: 10.1038/s43588-022-00298-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/09/2021] [Accepted: 07/14/2022] [Indexed: 01/06/2024]
Abstract
We introduce HUNTRESS, a computational method for mutational intratumor heterogeneity inference from noisy genotype matrices derived from single-cell sequencing data, whose running time is linear in the number of cells and quadratic in the number of mutations. We prove that, under reasonable conditions, HUNTRESS computes the true progression history of a tumor with high probability. On simulated and real tumor sequencing data, HUNTRESS is demonstrated to be faster than available alternatives with comparable or better accuracy. Additionally, the progression histories of tumors inferred by HUNTRESS on real single-cell sequencing datasets agree with the best known evolution scenarios for the associated tumors.
Affiliation(s)
- Can Kızılkale
- Department of Electrical Engineering and Computer Sciences, UC Berkeley, Berkeley, CA, USA
- Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Farid Rashidi Mehrabadi
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - Erfan Sadeqi Azer
- Department of Computer Science, Indiana University, Bloomington, IN, USA
- Google LLC, Sunnyvale, CA, USA
| | - Eva Pérez-Guijarro
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Kerrie L Marie
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Maxwell P Lee
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Chi-Ping Day
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Glenn Merlino
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Funda Ergün
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - Aydın Buluç
- Department of Electrical Engineering and Computer Sciences, UC Berkeley, Berkeley, CA, USA
- Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - S Cenk Sahinalp
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
| | - Salem Malikić
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
|
11
|
Ly R, Shugg T, Ratcliff R, Osei W, Pratt V, Schneider B, Radovich M, Bray S, Salisbury B, Parikh B, Sahinalp SC, Numanagić I, Skaar T. eP373: Analytical validation of a computational method for pharmacogenetic genotyping from clinical exome sequencing. Genet Med 2022. [DOI: 10.1016/j.gim.2022.01.408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
12
|
Ford M, Hari A, Rodriguez O, Xu J, Lack J, Oguz C, Zhang Y, Weber S, Magglioco M, Barnett J, Xirasagar S, Samuel S, Imberti L, Bonfanti P, Biondi A, Dalgard CL, Chanock S, Rosen L, Holland S, Su H, Notarangelo L, Vishkin U, Watson C, Sahinalp SC. ImmunoTyper-SR: A Novel Computational Approach for Genotyping Immunoglobulin Heavy Chain Variable Genes using Short Read Data. bioRxiv 2022:2022.01.31.478564. [PMID: 35132409 PMCID: PMC8820654 DOI: 10.1101/2022.01.31.478564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
The human immunoglobulin heavy chain (IGH) locus on chromosome 14 includes more than 40 functional copies of the variable gene (IGHV), which, together with the joining genes (IGHJ), diversity genes (IGHD), constant genes (IGHC) and immunoglobulin light chains, code for antibodies that identify and neutralize pathogenic invaders as part of the adaptive immune system. Because of its highly repetitive sequence composition, the IGH locus has been particularly difficult to assemble or genotype using standard short-read sequencing technologies. Here we introduce ImmunoTyper-SR, an algorithmic method for genotype and CNV analysis of the germline IGHV genes using Illumina whole-genome sequencing (WGS) data. ImmunoTyper-SR is based on a novel combinatorial optimization formulation that aims to minimize the total edit distance between reads and their assigned IGHV alleles from a given database, with constraints on the number and distribution of reads across each called allele. We have validated ImmunoTyper-SR on 12 individuals with Illumina WGS data from the 1000 Genomes Project, whose IGHV allele composition has been studied extensively using long-read and targeted sequencing platforms, as well as on nine individuals from the NIAID COVID Consortium who were each subjected to WGS twice. We then applied ImmunoTyper-SR to 585 samples from the NIAID COVID Consortium to investigate associations between distinct IGHV alleles and anti-type I IFN autoantibodies, which have been linked to COVID-19 severity.
|
13
|
El-Kebir M, Morris Q, Oesper L, Sahinalp SC. Emerging Topics in Cancer Evolution. Pac Symp Biocomput 2022; 27:397-401. [PMID: 34890166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Cancer results from an evolutionary process that yields a heterogeneous tumor with distinct subpopulations and varying sets of somatic mutations. This perspective discusses computational methods to infer models of evolutionary processes in cancer that aim to improve our understanding of tumorigenesis and ultimately enhance current clinical practice.
Affiliation(s)
- Mohammed El-Kebir
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States
|
14
|
Dokmai N, Kockan C, Zhu K, Wang X, Sahinalp SC, Cho H. Privacy-preserving genotype imputation in a trusted execution environment. Cell Syst 2021; 12:983-993.e7. [PMID: 34450045 PMCID: PMC8542641 DOI: 10.1016/j.cels.2021.08.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/02/2021] [Revised: 07/14/2021] [Accepted: 08/02/2021] [Indexed: 01/02/2023]
Abstract
Genotype imputation is an essential tool in genomics research, whereby missing genotypes are inferred using reference genomes to enhance downstream analyses. Recently, public imputation servers have allowed researchers to leverage large-scale genomic data resources for imputation. However, privacy concerns about uploading one's genetic data to a server limit the utility of these services. We introduce a secure hardware-based solution for privacy-preserving genotype imputation, which keeps the input genomes private by processing them within Intel SGX's trusted execution environment. Our solution features SMac, an efficient and secure imputation algorithm designed for Intel SGX, which employs a state-of-the-art imputation strategy also utilized by existing imputation servers. SMac achieves imputation accuracy equivalent to existing tools and provides protection against known side-channel attacks on SGX while maintaining scalability. We also show the necessity of our enhanced security by identifying vulnerabilities in existing imputation software. Our work represents a step toward privacy-preserving genomic analysis services.
Affiliation(s)
- Natnatee Dokmai
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Can Kockan
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA; Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Kaiyuan Zhu
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA; Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - XiaoFeng Wang
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
| | - S Cenk Sahinalp
- Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.
| | - Hyunghoon Cho
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
| |
|
15
|
Dokmai N, Kockan C, Zhu K, Wang X, Sahinalp SC, Cho H. Privacy-Preserving Genotype Imputation in a Trusted Execution Environment. Res Comput Mol Biol 2021; 12:983-993.e7. [PMID: 34859247 PMCID: PMC8635452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Affiliation(s)
- Natnatee Dokmai
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - Can Kockan
- Department of Computer Science, Indiana University, Bloomington, IN, USA
- Cancer Data Science Lab, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Kaiyuan Zhu
- Department of Computer Science, Indiana University, Bloomington, IN, USA
- Cancer Data Science Lab, National Cancer Institute, NIH, Bethesda, MD, USA
| | - XiaoFeng Wang
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - S. Cenk Sahinalp
- Cancer Data Science Lab, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Hyunghoon Cho
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| |
|
16
|
Mehrabadi FR, Malikić S, Marie KL, Pérez-Guijarro E, Azer ES, Yang HH, Kızılkale C, Gruen C, Liu H, Marcelus C, Buluç A, Ergün F, Lee MP, Merlino G, Day CP, Sahinalp SC. Abstract LB019: Trisicell: Scalable Tumor Phylogeny Reconstruction and Validation Reveals Developmental Origin and Therapeutic Impact of Intratumoral Heterogeneity. Cancer Res 2021. [DOI: 10.1158/1538-7445.am2021-lb019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Emerging sets of single-cell sequencing data make it appealing to apply existing tumor phylogeny reconstruction methods to analyze associated intratumor heterogeneity. Unfortunately, tumor phylogeny inference is an NP-hard problem and existing principled methods typically fail to scale up to handle thousands of cells and mutations observed in emerging single-cell data sets. Even though there are greedy heuristics to build hierarchical clustering of cells and mutations, they suffer from well-documented issues in accuracy. Additionally, even when “optimal” solutions are feasible, existing approaches only provide a single “most likely” tree to depict the evolutionary processes that may result in an observed collection of cells and mutations. To make matters worse, the vast majority of single-cell sequencing data sets are transcriptomic and, as a result, suffer from considerable variation in coverage across mutational loci.
In this paper, we introduce Trisicell, a computational toolkit for scalable tumor phylogeny reconstruction and validation from single-cell genomic, exomic or transcriptomic sequencing data. Trisicell has three components: (i) Trisicell-DnC, a new tumor phylogeny reconstruction method from genotype matrices derived from single-cell data, (ii) Trisicell-ConT, a new algorithm for constructing the consensus for two or more tumor phylogenies, which may be built through the use of different data types on the same set of cells, or built through the use of different methods on the same data, and (iii) Trisicell-PF, a new partition function method for assessing the likelihood of any user-defined subtree/set of cells to be seeded by a given set of mutations in the phylogeny. Collectively, these tools provide means of identifying and validating robust portions of a tumor phylogeny, offering the ability to focus on the most important (sub)clones and the genomic alterations that seed the associated clonal expansion.
We applied Trisicell to a panel of clonal sublines derived from single cells of a parental mouse melanoma model on which we performed both whole exome and whole transcriptome sequencing. The tumor phylogenies of the clonal sublines built on exomic and transcriptomic mutations by Trisicell-DnC were shown by Trisicell-ConT to be highly similar, and the subtrees composed of phenotypically similar clonal sublines were shown by Trisicell-PF to be strongly associated with their seeding mutations. In addition, we applied Trisicell to single-cell whole transcriptome sequencing data from a tumor derived from the same parental melanoma cell line, which was subjected to anti-CTLA-4 immunotherapy. The phylogenies generated from both studies featured distinct subtrees, strongly associated with phenotypes including cell differentiation status, tumor growth and therapeutic response. These results suggest that Trisicell can be used for scalable tumor phylogeny reconstruction and validation through both single-cell and clonal-subline sequencing data, which may reveal strong phenotypic associations. In particular, they suggest that the developmental status and phenotypic intratumoral heterogeneity of melanoma originate from observable subclonal variation.
Citation Format: Farid Rashidi Mehrabadi, Salem Malikić, Kerrie L. Marie, Eva Pérez-Guijarro, Erfan Sadeqi Azer, Howard H. Yang, Can Kızılkale, Charli Gruen, Huaitian Liu, Christina Marcelus, Aydın Buluç, Funda Ergün, Maxwell P. Lee, Glenn Merlino, Chi-Ping Day, S. Cenk Sahinalp. Trisicell: Scalable Tumor Phylogeny Reconstruction and Validation Reveals Developmental Origin and Therapeutic Impact of Intratumoral Heterogeneity [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr LB019.
Affiliation(s)
| | | | | | | | | | | | - Can Kızılkale
- Lawrence Berkeley National Laboratory, Berkeley, CA
| | | | | | | | - Aydın Buluç
- Lawrence Berkeley National Laboratory, Berkeley, CA
| | | | | | | | | | | |
|
17
|
Dentro SC, Leshchiner I, Haase K, Tarabichi M, Wintersinger J, Deshwar AG, Yu K, Rubanova Y, Macintyre G, Demeulemeester J, Vázquez-García I, Kleinheinz K, Livitz DG, Malikic S, Donmez N, Sengupta S, Anur P, Jolly C, Cmero M, Rosebrock D, Schumacher SE, Fan Y, Fittall M, Drews RM, Yao X, Watkins TBK, Lee J, Schlesner M, Zhu H, Adams DJ, McGranahan N, Swanton C, Getz G, Boutros PC, Imielinski M, Beroukhim R, Sahinalp SC, Ji Y, Peifer M, Martincorena I, Markowetz F, Mustonen V, Yuan K, Gerstung M, Spellman PT, Wang W, Morris QD, Wedge DC, Van Loo P. Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes. Cell 2021; 184:2239-2254.e39. [PMID: 33831375 PMCID: PMC8054914 DOI: 10.1016/j.cell.2021.03.009] [Citation(s) in RCA: 199] [Impact Index Per Article: 66.3] [Received: 04/06/2020] [Revised: 09/21/2020] [Accepted: 03/03/2021] [Indexed: 02/07/2023]
Abstract
Intra-tumor heterogeneity (ITH) is a mechanism of therapeutic resistance and therefore an important clinical challenge. However, the extent, origin, and drivers of ITH across cancer types are poorly understood. To address this, we extensively characterize ITH across whole-genome sequences of 2,658 cancer samples spanning 38 cancer types. Nearly all informative samples (95.1%) contain evidence of distinct subclonal expansions with frequent branching relationships between subclones. We observe positive selection of subclonal driver mutations across most cancer types and identify cancer type-specific subclonal patterns of driver gene mutations, fusions, structural variants, and copy number alterations as well as dynamic changes in mutational processes between subclonal expansions. Our results underline the importance of ITH and its drivers in tumor evolution and provide a pan-cancer resource of comprehensively annotated subclonal events from whole-genome sequencing data.
Affiliation(s)
- Stefan C Dentro
- Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK; Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK; Big Data Institute, University of Oxford, Oxford OX3 7LF, UK
| | | | - Kerstin Haase
- Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK
| | - Maxime Tarabichi
- Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK; Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK
| | - Jeff Wintersinger
- University of Toronto, Toronto, ON M5S 3E1, Canada; Vector Institute, Toronto, ON M5G 1L7, Canada
| | - Amit G Deshwar
- University of Toronto, Toronto, ON M5S 3E1, Canada; Vector Institute, Toronto, ON M5G 1L7, Canada
| | - Kaixian Yu
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Yulia Rubanova
- University of Toronto, Toronto, ON M5S 3E1, Canada; Vector Institute, Toronto, ON M5G 1L7, Canada
| | - Geoff Macintyre
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, UK
| | - Jonas Demeulemeester
- Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK; Department of Human Genetics, University of Leuven, 3000 Leuven, Belgium
| | - Ignacio Vázquez-García
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK; University of Cambridge, Cambridge CB2 0QQ, UK; Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10027, USA
| | - Kortine Kleinheinz
- German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; Heidelberg University, 69120 Heidelberg, Germany
| | | | - Salem Malikic
- Cancer Data Science Laboratory, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Nilgun Donmez
- Simon Fraser University, Burnaby, BC V5A 1S6, Canada; Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | | | - Pavana Anur
- Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR 97231, USA
| | - Clemency Jolly
- Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK
| | - Marek Cmero
- University of Melbourne, Melbourne, VIC 3010, Australia; Walter + Eliza Hall Institute, Melbourne, VIC 3000, Australia
| | | | | | - Yu Fan
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Matthew Fittall
- Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK
| | - Ruben M Drews
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, UK
| | - Xiaotong Yao
- Weill Cornell Medicine, New York, NY 10065, USA; New York Genome Center, New York, NY 10013, USA
| | - Thomas B K Watkins
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London NW1 1AT, UK
| | - Juhee Lee
- University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Hongtu Zhu
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - David J Adams
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK
| | - Nicholas McGranahan
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London WC1E 6BT, UK; Cancer Genome Evolution Research Group, University College London Cancer Institute, London WC1E 6DD, UK
| | - Charles Swanton
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, London NW1 1AT, UK; Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London WC1E 6BT, UK; Department of Medical Oncology, University College London Hospitals, London NW1 2BU, UK
| | - Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Massachusetts General Hospital Center for Cancer Research, Charlestown, MA 02129, USA; Massachusetts General Hospital, Department of Pathology, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02215, USA
| | - Paul C Boutros
- University of Toronto, Toronto, ON M5S 3E1, Canada; Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada; University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Marcin Imielinski
- Weill Cornell Medicine, New York, NY 10065, USA; New York Genome Center, New York, NY 10013, USA
| | - Rameen Beroukhim
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - S Cenk Sahinalp
- Cancer Data Science Laboratory, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Yuan Ji
- NorthShore University HealthSystem, Evanston, IL 60201, USA; The University of Chicago, Chicago, IL 60637, USA
| | - Martin Peifer
- Department of Translational Genomics, Center for Integrated Oncology Cologne-Bonn, Medical Faculty, University of Cologne, 50931 Cologne, Germany
| | | | - Florian Markowetz
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, UK
| | - Ville Mustonen
- Organismal and Evolutionary Biology Research Programme, Department of Computer Science, Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
| | - Ke Yuan
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, UK; School of Computing Science, University of Glasgow, Glasgow G12 8RZ, UK
| | - Moritz Gerstung
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge CB10 1SD, UK; European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
| | - Paul T Spellman
- Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR 97231, USA
| | - Wenyi Wang
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Quaid D Morris
- University of Toronto, Toronto, ON M5S 3E1, Canada; Vector Institute, Toronto, ON M5G 1L7, Canada; Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada; Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - David C Wedge
- Big Data Institute, University of Oxford, Oxford OX3 7LF, UK; Oxford NIHR Biomedical Research Centre, Oxford OX4 2PG, UK; Manchester Cancer Research Centre, University of Manchester, Manchester M20 4GJ, UK
| | - Peter Van Loo
- Cancer Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK.
| |
|
18
|
Sadeqi Azer E, Rashidi Mehrabadi F, Malikić S, Li XC, Bartok O, Litchfield K, Levy R, Samuels Y, Schäffer AA, Gertz EM, Day CP, Pérez-Guijarro E, Marie K, Lee MP, Merlino G, Ergun F, Sahinalp SC. PhISCS-BnB: a fast branch and bound algorithm for the perfect tumor phylogeny reconstruction problem. Bioinformatics 2021; 36:i169-i176. [PMID: 32657358 DOI: 10.1093/bioinformatics/btaa464] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/27/2022] Open
Abstract
MOTIVATION: Recent advances in single-cell sequencing (SCS) offer an unprecedented insight into tumor emergence and evolution. Principled approaches to tumor phylogeny reconstruction via SCS data are typically based on general computational methods for solving an integer linear program, or a constraint satisfaction program, which, although guaranteeing convergence to the most likely solution, are very slow. Others based on Monte Carlo Markov Chain or alternative heuristics not only offer no such guarantee, but also are not faster in practice. As a result, novel methods that can scale up to handle the size and noise characteristics of emerging SCS data are highly desirable to fully utilize this technology. RESULTS: We introduce PhISCS-BnB (phylogeny inference using SCS via branch and bound), a branch and bound algorithm to compute the most likely perfect phylogeny on an input genotype matrix extracted from an SCS dataset. PhISCS-BnB not only offers an optimality guarantee, but is also 10-100 times faster than the best available methods on simulated tumor SCS data. We also applied PhISCS-BnB on a recently published large melanoma dataset derived from the sublineages of a cell line involving 20 clones with 2367 mutations, which returned the optimal tumor phylogeny in <4 h. The resulting phylogeny agrees with and extends the published results by providing a more detailed picture on the clonal evolution of the tumor. AVAILABILITY AND IMPLEMENTATION: https://github.com/algo-cancer/PhISCS-BnB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Erfan Sadeqi Azer
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
| | - Farid Rashidi Mehrabadi
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA; Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Salem Malikić
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
| | - Xuan Cindy Li
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA; Program in Computational Biology, Bioinformatics and Genomics, University of Maryland, College Park, MD 20742, USA
| | - Osnat Bartok
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Kevin Litchfield
- Cancer Evolution and Genome Instability Laboratory, Francis Crick Institute, London NW1 1AT, UK; Cancer Research UK Lung Cancer Centre of Excellence London, University College London Cancer Institute, London WC1E 6DD, UK
| | - Ronen Levy
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Yardena Samuels
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Alejandro A Schäffer
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - E Michael Gertz
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Chi-Ping Day
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Eva Pérez-Guijarro
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Kerrie Marie
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Maxwell P Lee
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Glenn Merlino
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Funda Ergun
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
| | - S Cenk Sahinalp
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| |
|
19
|
Hodzic E, Shrestha R, Malikic S, Collins CC, Litchfield K, Turajlic S, Sahinalp SC. Identification of conserved evolutionary trajectories in tumors. Bioinformatics 2021; 36:i427-i435. [PMID: 32657374 DOI: 10.1093/bioinformatics/btaa453] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION: As multi-region, time-series and single-cell sequencing data become more widely available, it is becoming clear that certain tumors share evolutionary characteristics with others. In the last few years, several computational methods have been developed with the goal of inferring the subclonal composition and evolutionary history of tumors from tumor biopsy sequencing data. However, the phylogenetic trees that they report differ significantly between tumors (even those with similar characteristics). RESULTS: In this article, we present a novel combinatorial optimization method, CONETT, for detection of recurrent tumor evolution trajectories. Our method constructs a consensus tree of conserved evolutionary trajectories based on the information about temporal order of alteration events in a set of tumors. We apply our method to previously published datasets of 100 clear-cell renal cell carcinoma and 99 non-small-cell lung cancer patients and identify both conserved trajectories that were reported in the original studies and new trajectories. AVAILABILITY AND IMPLEMENTATION: CONETT is implemented in C++ and available at https://github.com/ehodzic/CONETT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ermin Hodzic
- Department of Computing Science, Simon Fraser University, Burnaby, BC, Canada
| | - Raunak Shrestha
- Department of Radiation Oncology, University of California San Francisco, San Francisco, CA, USA
| | - Salem Malikic
- Department of Computer Science, Indiana University Bloomington, Bloomington, IN, USA
| | - Colin C Collins
- Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada; Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, BC, Canada
| | - Kevin Litchfield
- Cancer Dynamics Laboratory, The Francis Crick Institute, Genome Instability Laboratory, Francis Crick Institute, London, UK
| | - Samra Turajlic
- Cancer Dynamics Laboratory, The Francis Crick Institute, Genome Instability Laboratory, Francis Crick Institute, London, UK; Skin and Renal Units, The Royal Marsden NHS Foundation Trust, London, UK
| | - S Cenk Sahinalp
- Cancer Data Science Lab., National Cancer Institute, NIH, Bethesda, MD, USA
| |
|
20
|
Sadeqi Azer E, Haghir Ebrahimabadi M, Malikić S, Khardon R, Sahinalp SC. Tumor Phylogeny Topology Inference via Deep Learning. iScience 2020; 23:101655. [PMID: 33117968 PMCID: PMC7582044 DOI: 10.1016/j.isci.2020.101655] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/20/2020] [Revised: 08/10/2020] [Accepted: 10/02/2020] [Indexed: 01/24/2023] Open
Abstract
Principled computational approaches for tumor phylogeny reconstruction via single-cell sequencing typically aim to build the most likely perfect phylogeny tree from the noisy genotype matrix - which represents genotype calls of single cells. This problem is NP-hard, and as a result, existing approaches aim to solve relatively small instances of it through combinatorial optimization techniques or Bayesian inference. As expected, even when the goal is to infer basic topological features of the tumor phylogeny, rather than reconstructing the topology entirely, these approaches could be prohibitively slow. In this paper, we introduce fast deep learning solutions to the problems of inferring whether the most likely tree has a linear (chain) or branching topology and whether a perfect phylogeny is feasible from a given genotype matrix. We also present a reinforcement learning approach for reconstructing the most likely tumor phylogeny. This preliminary work demonstrates that data-driven approaches can reconstruct key features of tumor evolution.
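The two questions named in the abstract (is a perfect phylogeny feasible, and is the tree linear or branching) have simple exact answers when the genotype matrix is noise-free. The sketch below is not the paper's deep learning method; it is the classical three-gametes check plus a nestedness check, assuming a binary matrix with rows as cells and columns as mutations. The function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny(matrix):
    """Three-gametes test on a binary matrix (rows = cells, columns = mutations).

    A noise-free matrix admits a perfect phylogeny rooted at the all-zero
    genotype if and only if no pair of columns contains all three of the
    row patterns (0, 1), (1, 0) and (1, 1).
    """
    if not matrix:
        return True
    n_mutations = len(matrix[0])
    conflict = {(0, 1), (1, 0), (1, 1)}
    for i, j in combinations(range(n_mutations), 2):
        seen = {(row[i], row[j]) for row in matrix}
        if conflict <= seen:
            return False
    return True


def is_linear(matrix):
    """Return True if the mutations form a chain.

    In a chain, the sets of cells carrying each mutation are totally
    ordered by inclusion; two mutations with non-nested cell sets imply
    either a branch or a conflict.
    """
    if not matrix:
        return True
    n_mutations = len(matrix[0])
    carriers = [
        {cell for cell, row in enumerate(matrix) if row[m] == 1}
        for m in range(n_mutations)
    ]
    return all(a <= b or b <= a for a, b in combinations(carriers, 2))


if __name__ == "__main__":
    branching = [[1, 0], [0, 1]]
    print(admits_perfect_phylogeny(branching), is_linear(branching))
```

On real single-cell data the matrix contains false positives, false negatives and missing entries, so these checks only apply after a denoising step, which is the hard part the paper addresses.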
Affiliation(s)
- Erfan Sadeqi Azer
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
| | - Mohammad Haghir Ebrahimabadi
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Salem Malikić
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Roni Khardon
- Department of Computer Science, Indiana University, Bloomington, IN 47408, USA
| | - S. Cenk Sahinalp
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| |
|
21
|
Ford M, Haghshenas E, Watson CT, Sahinalp SC. Erratum: Genotyping and Copy Number Analysis of Immunoglobulin Heavy Chain Variable Genes Using Long Reads. iScience 2020; 23:101508. [PMID: 32896768 PMCID: PMC7482014 DOI: 10.1016/j.isci.2020.101508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
Abstract
[This corrects the article DOI: 10.1016/j.isci.2020.100883.].
|
22
|
Ford M, Haghshenas E, Watson CT, Sahinalp SC. Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads. iScience 2020; 23:100883. [PMID: 32109676 PMCID: PMC7044747 DOI: 10.1016/j.isci.2020.100883] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/15/2019] [Revised: 11/08/2019] [Accepted: 01/29/2020] [Indexed: 11/22/2022] Open
Abstract
One of the remaining challenges to describing an individual's genetic variation lies in the highly heterogeneous and complex genomic regions that impede the use of classical reference-guided mapping and assembly approaches. One such region is the Immunoglobulin heavy chain locus (IGH), which is critical for the development of antibodies and the adaptive immune system. We describe ImmunoTyper, the first PacBio-based genotyping and copy number calling tool specifically designed for IGH V genes (IGHV). We demonstrate that ImmunoTyper's multi-stage clustering and combinatorial optimization approach represents the most comprehensive IGHV genotyping approach published to date, through validation using gold-standard IGH reference sequence. This preliminary work establishes the feasibility of fine-grained genotype and copy number analysis using error-prone long reads in complex multi-gene loci and opens the door for in-depth investigation into IGHV heterogeneity using accessible and increasingly common whole-genome sequence.
Affiliation(s)
- Michael Ford
- School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Ehsan Haghshenas
- School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville 40292, USA
| | - S Cenk Sahinalp
- Cancer Data Science Laboratory, National Cancer Institute, Bethesda, MD 20892, USA.
| |
|
23
|
Kockan C, Zhu K, Dokmai N, Karpov N, Kulekci MO, Woodruff DP, Sahinalp SC. Sketching algorithms for genomic data analysis and querying in a secure enclave. Nat Methods 2020; 17:295-301. [PMID: 32132732 DOI: 10.1038/s41592-020-0761-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/17/2019] [Accepted: 01/22/2020] [Indexed: 11/09/2022]
Abstract
Genome-wide association studies (GWAS), especially on rare diseases, may necessitate exchange of sensitive genomic data between multiple institutions. Since genomic data sharing is often infeasible due to privacy concerns, cryptographic methods, such as secure multiparty computation (SMC) protocols, have been developed with the aim of offering privacy-preserving collaborative GWAS. Unfortunately, the computational overhead of these methods remains prohibitive for human-genome-scale data. Here we introduce SkSES (https://github.com/ndokmai/sgx-genome-variants-search), a hardware-software hybrid approach for privacy-preserving collaborative GWAS, which improves the running time of the most advanced cryptographic protocols by two orders of magnitude. The SkSES approach is based on trusted execution environments (TEEs) offered by current-generation microprocessors, in particular Intel's SGX. To overcome the severe memory limitation of the TEEs, SkSES employs novel 'sketching' algorithms that maintain essential statistical information on genomic variants in input VCF files. By additionally incorporating efficient data compression and population stratification reduction methods, SkSES identifies the top k genomic variants in a cohort quickly, accurately and in a privacy-preserving manner.
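To illustrate what a sketching data structure buys under a tight memory budget, here is a minimal count-min sketch in Python. It is a generic textbook structure, not the SkSES implementation, and the abstract does not say which sketches SkSES uses; the class name, the `chrom:pos:ref>alt` variant keys, and the default `width` and `depth` are all assumptions made for this example.

```python
import hashlib


class CountMinSketch:
    """Fixed-size table of counters that approximates item frequencies.

    Memory use is width * depth counters regardless of how many distinct
    items are added. Estimates never undercount; hash collisions can only
    inflate them.
    """

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One independent hash per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # The minimum over rows is the counter least affected by collisions.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


if __name__ == "__main__":
    sketch = CountMinSketch()
    for _ in range(5):
        sketch.add("chr1:12345:A>G")  # hypothetical variant key
    sketch.add("chr2:777:C>T", 2)
    print(sketch.estimate("chr1:12345:A>G"))
```

The design trade-off is the usual one: a wider table lowers the overestimation error, and more rows lower the probability that every row collides for a given item.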
Affiliation(s)
- Can Kockan
- Department of Computer Science, Indiana University, Bloomington, IN, USA; Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Kaiyuan Zhu
- Department of Computer Science, Indiana University, Bloomington, IN, USA; Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Natnatee Dokmai
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - Nikolai Karpov
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - M Oguzhan Kulekci
- Informatics Institute, Istanbul Technical University, Istanbul, Turkey
| | - David P Woodruff
- Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - S Cenk Sahinalp
- Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
| |
|
24
|
Abstract
Many problems in applied machine learning deal with graphs (also called networks), including social networks, security, web data mining, protein function prediction, and genome informatics. The kernel paradigm beautifully decouples the learning algorithm from the underlying geometric space, which renders graph kernels important for the aforementioned applications. In this article, we give a new graph kernel, which we call graph traversal edit distance (GTED). We introduce the GTED problem and give the first polynomial time algorithm for it. Informally, the GTED is the minimum edit distance between two strings formed by the edge labels of respective Eulerian traversals of the two graphs. Also, GTED is motivated by and provides the first mathematical formalism for sequence co-assembly and de novo variation detection in bioinformatics. We demonstrate that GTED admits a polynomial time algorithm using a linear program in the graph product space that is guaranteed to yield an integer solution. To the best of our knowledge, this is the first approach to this problem. We also give a linear programming relaxation algorithm for a lower bound on GTED. We use GTED as a graph kernel and evaluate it by computing the accuracy of a support vector machine (SVM) classifier on a few data sets in the literature. Our results suggest that our kernel outperforms many of the common graph kernels in the tested data sets. As a second set of experiments, we successfully cluster viral genomes using GTED on their assembly graphs obtained from de novo assembly of next-generation sequencing reads.
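GTED is defined above as a minimum of string edit distances taken over Eulerian traversals of two graphs. The sketch below covers only the inner building block, the ordinary edit (Levenshtein) distance between two fixed strings; it does not enumerate traversals and does not implement the linear program described in the abstract. The function name is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b.

    Uses the standard dynamic program with unit costs for insertion,
    deletion and substitution, keeping only two rows in memory.
    """
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("ACGTTGCA", "ACGATGCA"))
```

A brute-force GTED would apply this function to every pair of Eulerian traversal label strings and keep the minimum, which is exponential in general; avoiding that enumeration is the point of the polynomial-time formulation the abstract describes.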
Affiliation(s)
| | - Akash Shrestha
- Department of Computer Science, Colorado State University, Fort Collins, Colorado
| | - Ali Sharifi-Zarchi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Hamidreza Chitsaz
- Department of Computer Science, Colorado State University, Fort Collins, Colorado
| |
|
25
|
Reyna MA, Haan D, Paczkowska M, Verbeke LPC, Vazquez M, Kahraman A, Pulido-Tamayo S, Barenboim J, Wadi L, Dhingra P, Shrestha R, Getz G, Lawrence MS, Pedersen JS, Rubin MA, Wheeler DA, Brunak S, Izarzugaza JMG, Khurana E, Marchal K, von Mering C, Sahinalp SC, Valencia A, Reimand J, Stuart JM, Raphael BJ. Pathway and network analysis of more than 2500 whole cancer genomes. Nat Commun 2020; 11:729. [PMID: 32024854 PMCID: PMC7002574 DOI: 10.1038/s41467-020-14367-0] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 12/12/2018] [Accepted: 12/18/2019] [Indexed: 12/14/2022] Open
Abstract
The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well characterized, and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project. This work was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations: chromatin remodeling and proliferation pathways are altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit gene expression signatures similar to those of samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.
Affiliation(s)
- Matthew A Reyna
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA
- Department of Biomedical Informatics, Emory University, Atlanta, GA, 30322, USA
| | - David Haan
- Department of Biomolecular Engineering and UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95060, USA
| | - Marta Paczkowska
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Lieven P C Verbeke
- Department of Information Technology, IDLab, Ghent University, IMEC, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
| | - Miguel Vazquez
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- Department of Clinical and Molecular Medicine, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
| | - Abdullah Kahraman
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, CH-8057, Zurich, Switzerland
- Department of Pathology and Molecular Pathology, University Hospital Zurich, CH-8091, Zurich, Switzerland
| | - Sergio Pulido-Tamayo
- Department of Information Technology, IDLab, Ghent University, IMEC, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
| | - Jonathan Barenboim
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Lina Wadi
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Priyanka Dhingra
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10065, USA
| | - Raunak Shrestha
- Vancouver Prostate Centre, 2660 Oak Street, Vancouver, BC, V6H 3Z6, Canada
| | - Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA, 02124, USA
- Massachusetts General Hospital Center for Cancer Research, Charlestown, MA, 02129, USA
- Harvard Medical School, 250 Longwood Avenue, Boston, MA, 02115, USA
- Massachusetts General Hospital, Department of Pathology, Boston, MA, 02114, USA
| | - Michael S Lawrence
- The Broad Institute of MIT and Harvard, Cambridge, MA, 02124, USA
- Massachusetts General Hospital Center for Cancer Research, Charlestown, MA, 02129, USA
| | - Jakob Skou Pedersen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Mark A Rubin
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10065, USA
| | - David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Søren Brunak
- DTU Bioinformatics, Department of Bio and Health Informatics, Technical University of Denmark, Kemitorvet, 2800, Kongens Lyngby, Denmark
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Jose M G Izarzugaza
- DTU Bioinformatics, Department of Bio and Health Informatics, Technical University of Denmark, Kemitorvet, 2800, Kongens Lyngby, Denmark
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Ekta Khurana
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10065, USA
| | - Kathleen Marchal
- Department of Information Technology, IDLab, Ghent University, IMEC, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
| | - Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, CH-8057, Zurich, Switzerland
| | - S Cenk Sahinalp
- Vancouver Prostate Centre, 2660 Oak Street, Vancouver, BC, V6H 3Z6, Canada
- Department of Computer Science, Indiana University, Bloomington, IN, 47405, USA
| | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- ICREA, Barcelona, 08010, Spain
| | - Jüri Reimand
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
| | - Joshua M Stuart
- Department of Biomolecular Engineering and UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, 95060, USA.
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ, 08540, USA.
| |
|
26
|
Aaltonen LA, Abascal F, Abeshouse A, Aburatani H, Adams DJ, Agrawal N, Ahn KS, Ahn SM, Aikata H, Akbani R, Akdemir KC, Al-Ahmadie H, Al-Sedairy ST, Al-Shahrour F, Alawi M, Albert M, Aldape K, Alexandrov LB, Ally A, Alsop K, Alvarez EG, Amary F, Amin SB, Aminou B, Ammerpohl O, Anderson MJ, Ang Y, Antonello D, Anur P, Aparicio S, Appelbaum EL, Arai Y, Aretz A, Arihiro K, Ariizumi SI, Armenia J, Arnould L, Asa S, Assenov Y, Atwal G, Aukema S, Auman JT, Aure MRR, Awadalla P, Aymerich M, Bader GD, Baez-Ortega A, Bailey MH, Bailey PJ, Balasundaram M, Balu S, Bandopadhayay P, Banks RE, Barbi S, Barbour AP, Barenboim J, Barnholtz-Sloan J, Barr H, Barrera E, Bartlett J, Bartolome J, Bassi C, Bathe OF, Baumhoer D, Bavi P, Baylin SB, Bazant W, Beardsmore D, Beck TA, Behjati S, Behren A, Niu B, Bell C, Beltran S, Benz C, Berchuck A, Bergmann AK, Bergstrom EN, Berman BP, Berney DM, Bernhart SH, Beroukhim R, Berrios M, Bersani S, Bertl J, Betancourt M, Bhandari V, Bhosle SG, Biankin AV, Bieg M, Bigner D, Binder H, Birney E, Birrer M, Biswas NK, Bjerkehagen B, Bodenheimer T, Boice L, Bonizzato G, De Bono JS, Boot A, Bootwalla MS, Borg A, Borkhardt A, Boroevich KA, Borozan I, Borst C, Bosenberg M, Bosio M, Boultwood J, Bourque G, Boutros PC, Bova GS, Bowen DT, Bowlby R, Bowtell DDL, Boyault S, Boyce R, Boyd J, Brazma A, Brennan P, Brewer DS, Brinkman AB, Bristow RG, Broaddus RR, Brock JE, Brock M, Broeks A, Brooks AN, Brooks D, Brors B, Brunak S, Bruxner TJC, Bruzos AL, Buchanan A, Buchhalter I, Buchholz C, Bullman S, Burke H, Burkhardt B, Burns KH, Busanovich J, Bustamante CD, Butler AP, Butte AJ, Byrne NJ, Børresen-Dale AL, Caesar-Johnson SJ, Cafferkey A, Cahill D, Calabrese C, Caldas C, Calvo F, Camacho N, Campbell PJ, Campo E, Cantù C, Cao S, Carey TE, Carlevaro-Fita J, Carlsen R, Cataldo I, Cazzola M, Cebon J, Cerfolio R, Chadwick DE, Chakravarty D, Chalmers D, Chan CWY, Chan K, Chan-Seng-Yue M, Chandan VS, Chang DK, Chanock SJ, Chantrill LA, Chateigner A, Chatterjee N, 
Chayama K, Chen HW, Chen J, Chen K, Chen Y, Chen Z, Cherniack AD, Chien J, Chiew YE, Chin SF, Cho J, Cho S, Choi JK, Choi W, Chomienne C, Chong Z, Choo SP, Chou A, Christ AN, Christie EL, Chuah E, Cibulskis C, Cibulskis K, Cingarlini S, Clapham P, Claviez A, Cleary S, Cloonan N, Cmero M, Collins CC, Connor AA, Cooke SL, Cooper CS, Cope L, Corbo V, Cordes MG, Cordner SM, Cortés-Ciriano I, Covington K, Cowin PA, Craft B, Craft D, Creighton CJ, Cun Y, Curley E, Cutcutache I, Czajka K, Czerniak B, Dagg RA, Danilova L, Davi MV, Davidson NR, Davies H, Davis IJ, Davis-Dusenbery BN, Dawson KJ, De La Vega FM, De Paoli-Iseppi R, Defreitas T, Tos APD, Delaneau O, Demchok JA, Demeulemeester J, Demidov GM, Demircioğlu D, Dennis NM, Denroche RE, Dentro SC, Desai N, Deshpande V, Deshwar AG, Desmedt C, Deu-Pons J, Dhalla N, Dhani NC, Dhingra P, Dhir R, DiBiase A, Diamanti K, Ding L, Ding S, Dinh HQ, Dirix L, Doddapaneni H, Donmez N, Dow MT, Drapkin R, Drechsel O, Drews RM, Serge S, Dudderidge T, Dueso-Barroso A, Dunford AJ, Dunn M, Dursi LJ, Duthie FR, Dutton-Regester K, Eagles J, Easton DF, Edmonds S, Edwards PA, Edwards SE, Eeles RA, Ehinger A, Eils J, Eils R, El-Naggar A, Eldridge M, Ellrott K, Erkek S, Escaramis G, Espiritu SMG, Estivill X, Etemadmoghadam D, Eyfjord JE, Faltas BM, Fan D, Fan Y, Faquin WC, Farcas C, Fassan M, Fatima A, Favero F, Fayzullaev N, Felau I, Fereday S, Ferguson ML, Ferretti V, Feuerbach L, Field MA, Fink JL, Finocchiaro G, Fisher C, Fittall MW, Fitzgerald A, Fitzgerald RC, Flanagan AM, Fleshner NE, Flicek P, Foekens JA, Fong KM, Fonseca NA, Foster CS, Fox NS, Fraser M, Frazer S, Frenkel-Morgenstern M, Friedman W, Frigola J, Fronick CC, Fujimoto A, Fujita M, Fukayama M, Fulton LA, Fulton RS, Furuta M, Futreal PA, Füllgrabe A, Gabriel SB, Gallinger S, Gambacorti-Passerini C, Gao J, Gao S, Garraway L, Garred Ø, Garrison E, Garsed DW, Gehlenborg N, Gelpi JLL, George J, Gerhard DS, Gerhauser C, Gershenwald JE, Gerstein M, Gerstung M, Getz G, Ghori M, 
Ghossein R, Giama NH, Gibbs RA, Gibson B, Gill AJ, Gill P, Giri DD, Glodzik D, Gnanapragasam VJ, Goebler ME, Goldman MJ, Gomez C, Gonzalez S, Gonzalez-Perez A, Gordenin DA, Gossage J, Gotoh K, Govindan R, Grabau D, Graham JS, Grant RC, Green AR, Green E, Greger L, Grehan N, Grimaldi S, Grimmond SM, Grossman RL, Grundhoff A, Gundem G, Guo Q, Gupta M, Gupta S, Gut IG, Gut M, Göke J, Ha G, Haake A, Haan D, Haas S, Haase K, Haber JE, Habermann N, Hach F, Haider S, Hama N, Hamdy FC, Hamilton A, Hamilton MP, Han L, Hanna GB, Hansmann M, Haradhvala NJ, Harismendy O, Harliwong I, Harmanci AO, Harrington E, Hasegawa T, Haussler D, Hawkins S, Hayami S, Hayashi S, Hayes DN, Hayes SJ, Hayward NK, Hazell S, He Y, Heath AP, Heath SC, Hedley D, Hegde AM, Heiman DI, Heinold MC, Heins Z, Heisler LE, Hellstrom-Lindberg E, Helmy M, Heo SG, Hepperla AJ, Heredia-Genestar JM, Herrmann C, Hersey P, Hess JM, Hilmarsdottir H, Hinton J, Hirano S, Hiraoka N, Hoadley KA, Hobolth A, Hodzic E, Hoell JI, Hoffmann S, Hofmann O, Holbrook A, Holik AZ, Hollingsworth MA, Holmes O, Holt RA, Hong C, Hong EP, Hong JH, Hooijer GK, Hornshøj H, Hosoda F, Hou Y, Hovestadt V, Howat W, Hoyle AP, Hruban RH, Hu J, Hu T, Hua X, Huang KL, Huang M, Huang MN, Huang V, Huang Y, Huber W, Hudson TJ, Hummel M, Hung JA, Huntsman D, Hupp TR, Huse J, Huska MR, Hutter B, Hutter CM, Hübschmann D, Iacobuzio-Donahue CA, Imbusch CD, Imielinski M, Imoto S, Isaacs WB, Isaev K, Ishikawa S, Iskar M, Islam SMA, Ittmann M, Ivkovic S, Izarzugaza JMG, Jacquemier J, Jakrot V, Jamieson NB, Jang GH, Jang SJ, Jayaseelan JC, Jayasinghe R, Jefferys SR, Jegalian K, Jennings JL, Jeon SH, Jerman L, Ji Y, Jiao W, Johansson PA, Johns AL, Johns J, Johnson R, Johnson TA, Jolly C, Joly Y, Jonasson JG, Jones CD, Jones DR, Jones DTW, Jones N, Jones SJM, Jonkers J, Ju YS, Juhl H, Jung J, Juul M, Juul RI, Juul S, Jäger N, Kabbe R, Kahles A, Kahraman A, Kaiser VB, Kakavand H, Kalimuthu S, von Kalle C, Kang KJ, Karaszi K, Karlan B, Karlić R, Karsch D, 
Kasaian K, Kassahn KS, Katai H, Kato M, Katoh H, Kawakami Y, Kay JD, Kazakoff SH, Kazanov MD, Keays M, Kebebew E, Kefford RF, Kellis M, Kench JG, Kennedy CJ, Kerssemakers JNA, Khoo D, Khoo V, Khuntikeo N, Khurana E, Kilpinen H, Kim HK, Kim HL, Kim HY, Kim H, Kim J, Kim J, Kim JK, Kim Y, King TA, Klapper W, Kleinheinz K, Klimczak LJ, Knappskog S, Kneba M, Knoppers BM, Koh Y, Komorowski J, Komura D, Komura M, Kong G, Kool M, Korbel JO, Korchina V, Korshunov A, Koscher M, Koster R, Kote-Jarai Z, Koures A, Kovacevic M, Kremeyer B, Kretzmer H, Kreuz M, Krishnamurthy S, Kube D, Kumar K, Kumar P, Kumar S, Kumar Y, Kundra R, Kübler K, Küppers R, Lagergren J, Lai PH, Laird PW, Lakhani SR, Lalansingh CM, Lalonde E, Lamaze FC, Lambert A, Lander E, Landgraf P, Landoni L, Langerød A, Lanzós A, Larsimont D, Larsson E, Lathrop M, Lau LMS, Lawerenz C, Lawlor RT, Lawrence MS, Lazar AJ, Lazic AM, Le X, Lee D, Lee D, Lee EA, Lee HJ, Lee JJK, Lee JY, Lee J, Lee MTM, Lee-Six H, Lehmann KV, Lehrach H, Lenze D, Leonard CR, Leongamornlert DA, Leshchiner I, Letourneau L, Letunic I, Levine DA, Lewis L, Ley T, Li C, Li CH, Li HI, Li J, Li L, Li S, Li S, Li X, Li X, Li X, Li Y, Liang H, Liang SB, Lichter P, Lin P, Lin Z, Linehan WM, Lingjærde OC, Liu D, Liu EM, Liu FFF, Liu F, Liu J, Liu X, Livingstone J, Livitz D, Livni N, Lochovsky L, Loeffler M, Long GV, Lopez-Guillermo A, Lou S, Louis DN, Lovat LB, Lu Y, Lu YJ, Lu Y, Luchini C, Lungu I, Luo X, Luxton HJ, Lynch AG, Lype L, López C, López-Otín C, Ma EZ, Ma Y, MacGrogan G, MacRae S, Macintyre G, Madsen T, Maejima K, Mafficini A, Maglinte DT, Maitra A, Majumder PP, Malcovati L, Malikic S, Malleo G, Mann GJ, Mantovani-Löffler L, Marchal K, Marchegiani G, Mardis ER, Margolin AA, Marin MG, Markowetz F, Markowski J, Marks J, Marques-Bonet T, Marra MA, Marsden L, Martens JWM, Martin S, Martin-Subero JI, Martincorena I, Martinez-Fundichely A, Maruvka YE, Mashl RJ, Massie CE, Matthew TJ, Matthews L, Mayer E, Mayes S, Mayo M, Mbabaali F, McCune K, 
McDermott U, McGillivray PD, McLellan MD, McPherson JD, McPherson JR, McPherson TA, Meier SR, Meng A, Meng S, Menzies A, Merrett ND, Merson S, Meyerson M, Meyerson W, Mieczkowski PA, Mihaiescu GL, Mijalkovic S, Mikkelsen T, Milella M, Mileshkin L, Miller CA, Miller DK, Miller JK, Mills GB, Milovanovic A, Minner S, Miotto M, Arnau GM, Mirabello L, Mitchell C, Mitchell TJ, Miyano S, Miyoshi N, Mizuno S, Molnár-Gábor F, Moore MJ, Moore RA, Morganella S, Morris QD, Morrison C, Mose LE, Moser CD, Muiños F, Mularoni L, Mungall AJ, Mungall K, Musgrove EA, Mustonen V, Mutch D, Muyas F, Muzny DM, Muñoz A, Myers J, Myklebost O, Möller P, Nagae G, Nagrial AM, Nahal-Bose HK, Nakagama H, Nakagawa H, Nakamura H, Nakamura T, Nakano K, Nandi T, Nangalia J, Nastic M, Navarro A, Navarro FCP, Neal DE, Nettekoven G, Newell F, Newhouse SJ, Newton Y, Ng AWT, Ng A, Nicholson J, Nicol D, Nie Y, Nielsen GP, Nielsen MM, Nik-Zainal S, Noble MS, Nones K, Northcott PA, Notta F, O’Connor BD, O’Donnell P, O’Donovan M, O’Meara S, O’Neill BP, O’Neill JR, Ocana D, Ochoa A, Oesper L, Ogden C, Ohdan H, Ohi K, Ohno-Machado L, Oien KA, Ojesina AI, Ojima H, Okusaka T, Omberg L, Ong CK, Ossowski S, Ott G, Ouellette BFF, P’ng C, Paczkowska M, Paiella S, Pairojkul C, Pajic M, Pan-Hammarström Q, Papaemmanuil E, Papatheodorou I, Paramasivam N, Park JW, Park JW, Park K, Park K, Park PJ, Parker JS, Parsons SL, Pass H, Pasternack D, Pastore A, Patch AM, Pauporté I, Pea A, Pearson JV, Pedamallu CS, Pedersen JS, Pederzoli P, Peifer M, Pennell NA, Perou CM, Perry MD, Petersen GM, Peto M, Petrelli N, Petryszak R, Pfister SM, Phillips M, Pich O, Pickett HA, Pihl TD, Pillay N, Pinder S, Pinese M, Pinho AV, Pitkänen E, Pivot X, Piñeiro-Yáñez E, Planko L, Plass C, Polak P, Pons T, Popescu I, Potapova O, Prasad A, Preston SR, Prinz M, Pritchard AL, Prokopec SD, Provenzano E, Puente XS, Puig S, Puiggròs M, Pulido-Tamayo S, Pupo GM, Purdie CA, Quinn MC, Rabionet R, Rader JS, Radlwimmer B, Radovic P, Raeder B, Raine KM, 
Ramakrishna M, Ramakrishnan K, Ramalingam S, Raphael BJ, Rathmell WK, Rausch T, Reifenberger G, Reimand J, Reis-Filho J, Reuter V, Reyes-Salazar I, Reyna MA, Reynolds SM, Rheinbay E, Riazalhosseini Y, Richardson AL, Richter J, Ringel M, Ringnér M, Rino Y, Rippe K, Roach J, Roberts LR, Roberts ND, Roberts SA, Robertson AG, Robertson AJ, Rodriguez JB, Rodriguez-Martin B, Rodríguez-González FG, Roehrl MHA, Rohde M, Rokutan H, Romieu G, Rooman I, Roques T, Rosebrock D, Rosenberg M, Rosenstiel PC, Rosenwald A, Rowe EW, Royo R, Rozen SG, Rubanova Y, Rubin MA, Rubio-Perez C, Rudneva VA, Rusev BC, Ruzzenente A, Rätsch G, Sabarinathan R, Sabelnykova VY, Sadeghi S, Sahinalp SC, Saini N, Saito-Adachi M, Saksena G, Salcedo A, Salgado R, Salichos L, Sallari R, Saller C, Salvia R, Sam M, Samra JS, Sanchez-Vega F, Sander C, Sanders G, Sarin R, Sarrafi I, Sasaki-Oku A, Sauer T, Sauter G, Saw RPM, Scardoni M, Scarlett CJ, Scarpa A, Scelo G, Schadendorf D, Schein JE, Schilhabel MB, Schlesner M, Schlomm T, Schmidt HK, Schramm SJ, Schreiber S, Schultz N, Schumacher SE, Schwarz RF, Scolyer RA, Scott D, Scully R, Seethala R, Segre AV, Selander I, Semple CA, Senbabaoglu Y, Sengupta S, Sereni E, Serra S, Sgroi DC, Shackleton M, Shah NC, Shahabi S, Shang CA, Shang P, Shapira O, Shelton T, Shen C, Shen H, Shepherd R, Shi R, Shi Y, Shiah YJ, Shibata T, Shih J, Shimizu E, Shimizu K, Shin SJ, Shiraishi Y, Shmaya T, Shmulevich I, Shorser SI, Short C, Shrestha R, Shringarpure SS, Shriver C, Shuai S, Sidiropoulos N, Siebert R, Sieuwerts AM, Sieverling L, Signoretti S, Sikora KO, Simbolo M, Simon R, Simons JV, Simpson JT, Simpson PT, Singer S, Sinnott-Armstrong N, Sipahimalani P, Skelly TJ, Smid M, Smith J, Smith-McCune K, Socci ND, Sofia HJ, Soloway MG, Song L, Sood AK, Sothi S, Sotiriou C, Soulette CM, Span PN, Spellman PT, Sperandio N, Spillane AJ, Spiro O, Spring J, Staaf J, Stadler PF, Staib P, Stark SG, Stebbings L, Stefánsson ÓA, Stegle O, Stein LD, Stenhouse A, Stewart C, Stilgenbauer S, 
Stobbe MD, Stratton MR, Stretch JR, Struck AJ, Stuart JM, Stunnenberg HG, Su H, Su X, Sun RX, Sungalee S, Susak H, Suzuki A, Sweep F, Szczepanowski M, Sültmann H, Yugawa T, Tam A, Tamborero D, Tan BKT, Tan D, Tan P, Tanaka H, Taniguchi H, Tanskanen TJ, Tarabichi M, Tarnuzzer R, Tarpey P, Taschuk ML, Tatsuno K, Tavaré S, Taylor DF, Taylor-Weiner A, Teague JW, Teh BT, Tembe V, Temes J, Thai K, Thayer SP, Thiessen N, Thomas G, Thomas S, Thompson A, Thompson AM, Thompson JFF, Thompson RH, Thorne H, Thorne LB, Thorogood A, Tiao G, Tijanic N, Timms LE, Tirabosco R, Tojo M, Tommasi S, Toon CW, Toprak UH, Torrents D, Tortora G, Tost J, Totoki Y, Townend D, Traficante N, Treilleux I, Trotta JR, Trümper LHP, Tsao M, Tsunoda T, Tubio JMC, Tucker O, Turkington R, Turner DJ, Tutt A, Ueno M, Ueno NT, Umbricht C, Umer HM, Underwood TJ, Urban L, Urushidate T, Ushiku T, Uusküla-Reimand L, Valencia A, Van Den Berg DJ, Van Laere S, Van Loo P, Van Meir EG, Van den Eynden GG, Van der Kwast T, Vasudev N, Vazquez M, Vedururu R, Veluvolu U, Vembu S, Verbeke LPC, Vermeulen P, Verrill C, Viari A, Vicente D, Vicentini C, VijayRaghavan K, Viksna J, Vilain RE, Villasante I, Vincent-Salomon A, Visakorpi T, Voet D, Vyas P, Vázquez-García I, Waddell NM, Waddell N, Wadelius C, Wadi L, Wagener R, Wala JA, Wang J, Wang J, Wang L, Wang Q, Wang W, Wang Y, Wang Z, Waring PM, Warnatz HJ, Warrell J, Warren AY, Waszak SM, Wedge DC, Weichenhan D, Weinberger P, Weinstein JN, Weischenfeldt J, Weisenberger DJ, Welch I, Wendl MC, Werner J, Whalley JP, Wheeler DA, Whitaker HC, Wigle D, Wilkerson MD, Williams A, Wilmott JS, Wilson GW, Wilson JM, Wilson RK, Winterhoff B, Wintersinger JA, Wiznerowicz M, Wolf S, Wong BH, Wong T, Wong W, Woo Y, Wood S, Wouters BG, Wright AJ, Wright DW, Wright MH, Wu CL, Wu DY, Wu G, Wu J, Wu K, Wu Y, Wu Z, Xi L, Xia T, Xiang Q, Xiao X, Xing R, Xiong H, Xu Q, Xu Y, Xue H, Yachida S, Yakneen S, Yamaguchi R, Yamaguchi TN, Yamamoto M, Yamamoto S, Yamaue H, Yang F, Yang H, Yang JY, Yang 
L, Yang L, Yang S, Yang TP, Yang Y, Yao X, Yaspo ML, Yates L, Yau C, Ye C, Ye K, Yellapantula VD, Yoon CJ, Yoon SS, Yousif F, Yu J, Yu K, Yu W, Yu Y, Yuan K, Yuan Y, Yuen D, Yung CK, Zaikova O, Zamora J, Zapatka M, Zenklusen JC, Zenz T, Zeps N, Zhang CZ, Zhang F, Zhang H, Zhang H, Zhang H, Zhang J, Zhang J, Zhang J, Zhang X, Zhang X, Zhang Y, Zhang Z, Zhao Z, Zheng L, Zheng X, Zhou W, Zhou Y, Zhu B, Zhu H, Zhu J, Zhu S, Zou L, Zou X, deFazio A, van As N, van Deurzen CHM, van de Vijver MJ, van’t Veer L, von Mering C. Pan-cancer analysis of whole genomes. Nature 2020; 578:82-93. [PMID: 32025007 PMCID: PMC7025898 DOI: 10.1038/s41586-020-1969-6] [Citation(s) in RCA: 1435] [Impact Index Per Article: 358.8] [Received: 07/29/2018] [Accepted: 12/11/2019] [Indexed: 02/07/2023]
Abstract
Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale [1-3]. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter [4]; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation [5,6]; analyses timings and patterns of tumour evolution [7]; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity [8,9]; and evaluates a range of more-specialized features of cancer genomes [8,10-18].
|
27
|
Gerstung M, Jolly C, Leshchiner I, Dentro SC, Gonzalez S, Rosebrock D, Mitchell TJ, Rubanova Y, Anur P, Yu K, Tarabichi M, Deshwar A, Wintersinger J, Kleinheinz K, Vázquez-García I, Haase K, Jerman L, Sengupta S, Macintyre G, Malikic S, Donmez N, Livitz DG, Cmero M, Demeulemeester J, Schumacher S, Fan Y, Yao X, Lee J, Schlesner M, Boutros PC, Bowtell DD, Zhu H, Getz G, Imielinski M, Beroukhim R, Sahinalp SC, Ji Y, Peifer M, Markowetz F, Mustonen V, Yuan K, Wang W, Morris QD, Spellman PT, Wedge DC, Van Loo P. The evolutionary history of 2,658 cancers. Nature 2020; 578:122-128. [PMID: 32025013 PMCID: PMC7054212 DOI: 10.1038/s41586-019-1907-7] [Citation(s) in RCA: 518] [Impact Index Per Article: 129.5] [Received: 08/11/2017] [Accepted: 11/18/2019] [Indexed: 01/28/2023]
Abstract
Cancer develops through a process of somatic evolution [1,2]. Sequencing data from a single biopsy represent a snapshot of this process that can reveal the timing of specific genomic aberrations and the changing influence of mutational processes [3]. Here, by whole-genome sequencing analysis of 2,658 cancers as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) [4], we reconstruct the life history and evolution of mutational processes and driver mutation sequences of 38 types of cancer. Early oncogenesis is characterized by mutations in a constrained set of driver genes, and specific copy number gains, such as trisomy 7 in glioblastoma and isochromosome 17q in medulloblastoma. The mutational spectrum changes significantly throughout tumour evolution in 40% of samples. A nearly fourfold diversification of driver genes and increased genomic instability are features of later stages. Copy number alterations often occur in mitotic crises, and lead to simultaneous gains of chromosomal segments. Timing analyses suggest that driver mutations often precede diagnosis by many years, if not decades. Together, these results determine the evolutionary trajectories of cancer, and highlight opportunities for early cancer detection.
Affiliation(s)
- Moritz Gerstung
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Wellcome Sanger Institute, Cambridge, UK
| | - Clemency Jolly
- The Francis Crick Institute, London, UK
| | - Ignaty Leshchiner
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Stefan C. Dentro
- Wellcome Sanger Institute, Cambridge, UK
- The Francis Crick Institute, London, UK
- Big Data Institute, University of Oxford, Oxford, UK
| | - Santiago Gonzalez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Daniel Rosebrock
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Thomas J. Mitchell
- Wellcome Sanger Institute, Cambridge, UK
- University of Cambridge, Cambridge, UK
| | - Yulia Rubanova
- University of Toronto, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
| | - Pavana Anur
- Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR, USA
| | - Kaixian Yu
- The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Maxime Tarabichi
- Wellcome Sanger Institute, Cambridge, UK
- The Francis Crick Institute, London, UK
| | - Amit Deshwar
- University of Toronto, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
| | - Jeff Wintersinger
- University of Toronto, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
| | - Kortine Kleinheinz
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Heidelberg University, Heidelberg, Germany
| | - Ignacio Vázquez-García
- Wellcome Sanger Institute, Cambridge, UK
- University of Cambridge, Cambridge, UK
| | - Kerstin Haase
- The Francis Crick Institute, London, UK
| | - Lara Jerman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- University of Ljubljana, Ljubljana, Slovenia
| | - Subhajit Sengupta
- NorthShore University HealthSystem, Evanston, IL, USA
| | - Geoff Macintyre
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Salem Malikic
- Simon Fraser University, Burnaby, British Columbia, Canada
- Vancouver Prostate Centre, Vancouver, British Columbia, Canada
| | - Nilgun Donmez
- Simon Fraser University, Burnaby, British Columbia, Canada
- Vancouver Prostate Centre, Vancouver, British Columbia, Canada
| | - Dimitri G. Livitz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Marek Cmero
- University of Melbourne, Melbourne, Victoria, Australia
- Walter and Eliza Hall Institute, Melbourne, Victoria, Australia
| | - Jonas Demeulemeester
- The Francis Crick Institute, London, UK
- University of Leuven, Leuven, Belgium
| | - Steven Schumacher
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Yu Fan
- The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Xiaotong Yao
- Weill Cornell Medicine, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Juhee Lee
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Matthias Schlesner
- German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Paul C. Boutros
- University of Toronto, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- University of California, Los Angeles, CA, USA
| | - David D. Bowtell
- Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
| | - Hongtu Zhu
- grid.240145.60000 0001 2291 4776The University of Texas MD Anderson Cancer Center, Houston, TX USA
| | - Gad Getz
- grid.66859.340000 0004 0546 1623Broad Institute of MIT and Harvard, Cambridge, MA USA ,grid.32224.350000 0004 0386 9924Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA USA ,grid.32224.350000 0004 0386 9924Department of Pathology, Massachusetts General Hospital, Boston, MA USA ,grid.38142.3c000000041936754XHarvard Medical School, Boston, MA USA
| | - Marcin Imielinski
- grid.5386.8000000041936877XWeill Cornell Medicine, New York, NY USA ,grid.429884.b0000 0004 1791 0895New York Genome Center, New York, NY USA
| | - Rameen Beroukhim
- grid.66859.340000 0004 0546 1623Broad Institute of MIT and Harvard, Cambridge, MA USA ,grid.65499.370000 0001 2106 9910Dana-Farber Cancer Institute, Boston, MA USA
| | - S. Cenk Sahinalp
- grid.412541.70000 0001 0684 7796Vancouver Prostate Centre, Vancouver, British Columbia Canada ,grid.411377.70000 0001 0790 959XIndiana University, Bloomington, IN USA
| | - Yuan Ji
- grid.240372.00000 0004 0400 4439NorthShore University HealthSystem, Evanston, IL USA ,grid.170205.10000 0004 1936 7822The University of Chicago, Chicago, IL USA
| | - Martin Peifer
- grid.6190.e0000 0000 8580 3777University of Cologne, Cologne, Germany
| | - Florian Markowetz
- grid.5335.00000000121885934Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Ville Mustonen
- grid.7737.40000 0004 0410 2071University of Helsinki, Helsinki, Finland
| | - Ke Yuan
- grid.5335.00000000121885934Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK ,grid.8756.c0000 0001 2193 314XUniversity of Glasgow, Glasgow, UK
| | - Wenyi Wang
- grid.240145.60000 0001 2291 4776The University of Texas MD Anderson Cancer Center, Houston, TX USA
| | - Quaid D. Morris
- grid.17063.330000 0001 2157 2938University of Toronto, Toronto, Ontario Canada ,grid.494618.6Vector Institute, Toronto, Ontario Canada
| | | | - Paul T. Spellman
- grid.5288.70000 0000 9758 5690Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR USA
| | - David C. Wedge
- grid.4991.50000 0004 1936 8948Big Data Institute, University of Oxford, Oxford, UK ,grid.454382.c0000 0004 7871 7212Oxford NIHR Biomedical Research Centre, Oxford, UK
| | - Peter Van Loo
- grid.451388.30000 0004 1795 1830The Francis Crick Institute, London, UK ,grid.5596.f0000 0001 0668 7884University of Leuven, Leuven, Belgium
| | | |
|
28
|
Gawroński AR, Lin YY, McConeghy B, LeBihan S, Asghari H, Koçkan C, Orabi B, Adra N, Pili R, Collins CC, Sahinalp SC, Hach F. Structural variation and fusion detection using targeted sequencing data from circulating cell free DNA. Nucleic Acids Res 2019; 47:e38. [PMID: 30759232 PMCID: PMC6468241 DOI: 10.1093/nar/gkz067] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/18/2018] [Revised: 12/15/2018] [Accepted: 02/01/2019] [Indexed: 12/15/2022]
Abstract
MOTIVATION Cancer is a complex disease that involves rapidly evolving cells, often forming multiple distinct clones. In order to effectively understand progression of a patient-specific tumor, one needs to comprehensively sample tumor DNA at multiple time points, ideally obtained through inexpensive and minimally invasive techniques. Current sequencing technologies make the 'liquid biopsy' possible, which involves sampling a patient's blood or urine and sequencing the circulating cell free DNA (cfDNA). A certain percentage of this DNA originates from the tumor, known as circulating tumor DNA (ctDNA). The ratio of ctDNA may be extremely low in the sample, and the ctDNA may originate from multiple tumors or clones. These factors present unique challenges for applying existing tools and workflows to the analysis of ctDNA, especially in the detection of structural variations, which rely on sufficient read coverage to be detectable. RESULTS Here we introduce SViCT, a structural variation (SV) detection tool designed to handle the challenges associated with cfDNA analysis. SViCT can detect breakpoints and sequences of various structural variations including deletions, insertions, inversions, duplications and translocations. SViCT extracts discordant read pairs, one-end anchors and soft-clipped/split reads, assembles them into contigs, and re-maps contig intervals to a reference genome using an efficient k-mer indexing approach. The intervals are then joined using a combination of graph and greedy algorithms to identify specific structural variant signatures. We assessed the performance of SViCT and compared it to state-of-the-art tools using simulated cfDNA datasets with properties matching those of real cfDNA samples. The positive predictive value and sensitivity of our tool were superior to those of all the tested tools, and reasonable performance was maintained down to the lowest dilution of 0.01% tumor DNA in simulated datasets. Additionally, SViCT was able to detect all known SVs in two real cfDNA reference datasets (at 0.6-5% ctDNA) and predict a novel structural variant in a prostate cancer cohort. AVAILABILITY SViCT is available at https://github.com/vpc-ccg/svict. Contact: faraz.hach@ubc.ca.
Affiliation(s)
- Alexander R Gawroński
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
| | - Yen-Yi Lin
- Department of Urologic Sciences, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia V52 1M9, Canada.,Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada
| | - Brian McConeghy
- Department of Urologic Sciences, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia V52 1M9, Canada.,Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada
| | - Stephane LeBihan
- Department of Urologic Sciences, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia V52 1M9, Canada.,Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada
| | - Hossein Asghari
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada.,Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada
| | - Can Koçkan
- Department of Computer Science, Indiana University, Bloomington 47405, USA
| | - Baraa Orabi
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada.,Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada
| | - Nabil Adra
- School of Medicine, Indiana University, Indianapolis, 46202, USA
| | - Roberto Pili
- School of Medicine, Indiana University, Indianapolis, 46202, USA
| | - Colin C Collins
- Department of Urologic Sciences, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia V52 1M9, Canada.,Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada
| | - S Cenk Sahinalp
- Department of Computer Science, Indiana University, Bloomington 47405, USA
| | - Faraz Hach
- Department of Urologic Sciences, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia V52 1M9, Canada.,Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada
| |
|
29
|
Haghshenas E, Sahinalp SC, Hach F. lordFAST: sensitive and Fast Alignment Search Tool for LOng noisy Read sequencing Data. Bioinformatics 2019; 35:20-27. [PMID: 30561550 DOI: 10.1093/bioinformatics/bty544] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/16/2017] [Accepted: 06/28/2018] [Indexed: 02/01/2023]
Abstract
Motivation Recent advances in genomics and precision medicine have been made possible through the application of high throughput sequencing (HTS) to large collections of human genomes. Although HTS technologies have proven their use in cataloging human genome variation, computational analysis of the data they generate is still far from being perfect. The main limitation of Illumina and other popular sequencing technologies is their short read length relative to the lengths of (common) genomic repeats. Newer (single molecule sequencing - SMS) technologies such as Pacific Biosciences and Oxford Nanopore are producing longer reads, making it theoretically possible to overcome the difficulties imposed by repeat regions. Unfortunately, because of their high sequencing error rate, reads generated by these technologies are very difficult to work with and cannot be used in many of the standard downstream analysis pipelines. Note that it is not only difficult to find the correct mapping locations of such reads in a reference genome, but also to establish their correct alignment so as to differentiate sequencing errors from real genomic variants. Furthermore, especially since newer SMS instruments provide higher throughput, mapping and alignment need to be performed much faster than before, maintaining high sensitivity. Results We introduce lordFAST, a novel long-read mapper that is specifically designed to align reads generated by PacBio and potentially other SMS technologies to a reference. lordFAST not only has higher sensitivity than the available alternatives, it is also among the fastest and has a very low memory footprint. Availability and implementation lordFAST is implemented in C++ and supports multi-threading. The source code of lordFAST is available at https://github.com/vpc-ccg/lordfast. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ehsan Haghshenas
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
| | - S Cenk Sahinalp
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada.,School of Informatics and Computing, Indiana University, Bloomington, IN, USA
| | - Faraz Hach
- Vancouver Prostate Centre, Vancouver, BC, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| |
|
30
|
Malikic S, Mehrabadi FR, Ciccolella S, Rahman MK, Ricketts C, Haghshenas E, Seidman D, Hach F, Hajirasouliha I, Sahinalp SC. PhISCS: a combinatorial approach for subperfect tumor phylogeny reconstruction via integrative use of single-cell and bulk sequencing data. Genome Res 2019; 29:1860-1877. [PMID: 31628256 PMCID: PMC6836735 DOI: 10.1101/gr.234435.118] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 08/15/2018] [Accepted: 09/11/2019] [Indexed: 12/29/2022]
Abstract
Available computational methods for tumor phylogeny inference via single-cell sequencing (SCS) data typically aim to identify the most likely perfect phylogeny tree satisfying the infinite sites assumption (ISA). However, the limitations of SCS technologies including frequent allele dropout and variable sequence coverage may prohibit a perfect phylogeny. In addition, ISA violations are commonly observed in tumor phylogenies due to the loss of heterozygosity, deletions, and convergent evolution. In order to address such limitations, we introduce the optimal subperfect phylogeny problem which asks to integrate SCS data with matching bulk sequencing data by minimizing a linear combination of potential false negatives (due to allele dropout or variance in sequence coverage), false positives (due to read errors) among mutation calls, and the number of mutations that violate ISA (real or because of incorrect copy number estimation). We then describe a combinatorial formulation to solve this problem which ensures that several lineage constraints imposed by the use of variant allele frequencies (VAFs, derived from bulk sequence data) are satisfied. We express our formulation both in the form of an integer linear program (ILP) and—as a first in tumor phylogeny reconstruction—a Boolean constraint satisfaction problem (CSP) and solve them by leveraging state-of-the-art ILP/CSP solvers. The resulting method, which we name PhISCS, is the first to integrate SCS and bulk sequencing data while accounting for ISA violating mutations. In contrast to the alternative methods, typically based on probabilistic approaches, PhISCS provides a guarantee of optimality in reported solutions. Using simulated and real data sets, we demonstrate that PhISCS is more general and accurate than all available approaches.
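As an illustration of the infinite sites assumption mentioned in this abstract, the following is a minimal sketch of the classical three-gametes test on a binary cell-by-mutation matrix. It is not the PhISCS formulation (which the abstract describes as an ILP/CSP optimization); the function name `isa_conflicts` and the toy matrices are hypothetical and chosen only for the example. A matrix with no conflicting column pair admits a perfect phylogeny rooted at the mutation-free genotype.

```python
from itertools import combinations


def isa_conflicts(matrix):
    """Return the pairs of mutation columns that fail the three-gametes test.

    matrix: list of rows (cells), each a list of 0/1 mutation calls.
    Two columns conflict when the rows jointly show (1,1), (1,0) and (0,1).
    """
    n_mut = len(matrix[0]) if matrix else 0
    conflicts = []
    for i, j in combinations(range(n_mut), 2):
        seen = {(row[i], row[j]) for row in matrix}
        if {(1, 1), (1, 0), (0, 1)} <= seen:
            conflicts.append((i, j))
    return conflicts


# Toy data (hypothetical): rows are single cells, columns are mutations.
conflict_free = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
noisy = [
    [1, 0],
    [0, 1],
    [1, 1],
]

print(isa_conflicts(conflict_free))
print(isa_conflicts(noisy))
```

In this sketch, a non-empty result signals that some entries would have to be flipped (false positives or false negatives) or some mutations allowed to violate the assumption before a tree can be built, which is the kind of trade-off the abstract says the method optimizes.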
Affiliation(s)
- Salem Malikic
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Farid Rashidi Mehrabadi
- Department of Computer Science, Indiana University, Bloomington, Indiana 47408, USA.,Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Simone Ciccolella
- Department of Computer Systems and Communication, University of Milano-Bicocca, 20136 Milan, Italy.,Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York 10065, USA
| | - Md Khaledur Rahman
- Department of Computer Science, Indiana University, Bloomington, Indiana 47408, USA
| | - Camir Ricketts
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York 10065, USA.,Tri-I Computational Biology and Medicine Graduate Program, Cornell University, New York, New York 10065, USA
| | - Ehsan Haghshenas
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Daniel Seidman
- Tri-I Computational Biology and Medicine Graduate Program, Cornell University, New York, New York 10065, USA
| | - Faraz Hach
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC V5Z 1M9, Canada.,Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Iman Hajirasouliha
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York 10065, USA.,Department of Physiology and Biophysics, Englander Institute for Precision Medicine, The Meyer Cancer Center, Weill Cornell Medicine, New York, New York 10065, USA
| | - S Cenk Sahinalp
- Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| |
|
31
|
Karpov N, Malikic S, Rahman MK, Sahinalp SC. A multi-labeled tree dissimilarity measure for comparing "clonal trees" of tumor progression. Algorithms Mol Biol 2019; 14:17. [PMID: 31372179 PMCID: PMC6661107 DOI: 10.1186/s13015-019-0152-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/31/2019] [Accepted: 07/15/2019] [Indexed: 12/18/2022]
Abstract
We introduce a new dissimilarity measure between a pair of "clonal trees", each representing the progression and mutational heterogeneity of a tumor sample, constructed by the use of single cell or bulk high throughput sequencing data. In a clonal tree, each vertex represents a specific tumor clone, and is labeled with one or more mutations in a way that each mutation is assigned to the oldest clone that harbors it. Given two clonal trees, our multi-labeled tree dissimilarity (MLTD) measure is defined as the minimum number of mutation/label deletions, (empty) leaf deletions, and vertex (clonal) expansions, applied in any order, to convert each of the two trees to the maximum common tree. We show that the MLTD measure can be computed efficiently in polynomial time and it captures the similarity between trees of different clonal granularity well.
Affiliation(s)
- Nikolai Karpov
- Department of Computer Science, Indiana University, Bloomington, IN USA
| | - Salem Malikic
- School of Computing Science, Simon Fraser University, Burnaby, BC Canada
| | | | - S. Cenk Sahinalp
- Department of Computer Science, Indiana University, Bloomington, IN USA
| |
|
32
|
Malikic S, Jahn K, Kuipers J, Sahinalp SC, Beerenwinkel N. Integrative inference of subclonal tumour evolution from single-cell and bulk sequencing data. Nat Commun 2019; 10:2750. [PMID: 31227714 PMCID: PMC6588593 DOI: 10.1038/s41467-019-10737-5] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 07/04/2018] [Accepted: 05/30/2019] [Indexed: 02/07/2023]
Abstract
Understanding the clonal architecture and evolutionary history of a tumour poses one of the key challenges to overcome treatment failure due to resistant cell populations. Previously, studies on subclonal tumour evolution have been primarily based on bulk sequencing and in some recent cases on single-cell sequencing data. Either data type alone has shortcomings with regard to this task, but methods integrating both data types have been lacking. Here, we present B-SCITE, the first computational approach that infers tumour phylogenies from combined single-cell and bulk sequencing data. Using a comprehensive set of simulated data, we show that B-SCITE systematically outperforms existing methods with respect to tree reconstruction accuracy and subclone identification. B-SCITE provides high-fidelity reconstructions even with a modest number of single cells and in cases where bulk allele frequencies are affected by copy number changes. On real tumour data, B-SCITE generated mutation histories show high concordance with expert generated trees. Intra-tumour heterogeneity provides important information about subclonal tumour evolution. Here, the authors develop B-SCITE, a computational method for inferring tumour phylogenies from combined single-cell and bulk sequencing data.
Affiliation(s)
- Salem Malikic
- School of Computing Science, Simon Fraser University, Burnaby, V5A 1S6, BC, Canada.,Vancouver Prostate Centre, Vancouver, V6H 3Z6, BC, Canada
| | - Katharina Jahn
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, 4058, Switzerland.,Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - Jack Kuipers
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, 4058, Switzerland.,Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - S Cenk Sahinalp
- Department of Computer Science, Indiana University, Bloomington, 47405, IN, USA.
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, 4058, Switzerland.,Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| |
|
33
|
Shrestha R, Nabavi N, Lin YY, Mo F, Anderson S, Volik S, Adomat HH, Lin D, Xue H, Dong X, Shukin R, Bell RH, McConeghy B, Haegert A, Brahmbhatt S, Li E, Oo HZ, Hurtado-Coll A, Fazli L, Zhou J, McConnell Y, McCart A, Lowy A, Morin GB, Chen T, Daugaard M, Sahinalp SC, Hach F, Le Bihan S, Gleave ME, Wang Y, Churg A, Collins CC. BAP1 haploinsufficiency predicts a distinct immunogenic class of malignant peritoneal mesothelioma. Genome Med 2019; 11:8. [PMID: 30777124 PMCID: PMC6378747 DOI: 10.1186/s13073-019-0620-3] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Received: 08/20/2018] [Accepted: 02/07/2019] [Indexed: 02/06/2023]
Abstract
Background Malignant peritoneal mesothelioma (PeM) is a rare and fatal cancer that originates from the peritoneal lining of the abdomen. Standard treatment of PeM is limited to cytoreductive surgery and/or chemotherapy, and no effective targeted therapies for PeM exist. Some immune checkpoint inhibitor studies of mesothelioma have found positivity to be associated with a worse prognosis. Methods To search for novel therapeutic targets for PeM, we performed a comprehensive integrative multi-omics analysis of the genome, transcriptome, and proteome of 19 treatment-naïve PeM, and in particular, we examined BAP1 mutation and copy number status and its relationship to immune checkpoint inhibitor activation. Results We found that PeM could be divided into tumors with an inflammatory tumor microenvironment and those without and that this distinction correlated with haploinsufficiency of BAP1. To further investigate the role of BAP1, we used our recently developed cancer driver gene prioritization algorithm, HIT’nDRIVE, and observed that PeM with BAP1 haploinsufficiency form a distinct molecular subtype characterized by distinct gene expression patterns of chromatin remodeling, DNA repair pathways, and immune checkpoint receptor activation. We demonstrate that this subtype is correlated with an inflammatory tumor microenvironment and thus is a candidate for immune checkpoint blockade therapies. Conclusions Our findings reveal BAP1 to be a potential, easily trackable prognostic and predictive biomarker for PeM immunotherapy that refines PeM disease classification. BAP1 stratification may improve drug response rates in ongoing phases I and II clinical trials exploring the use of immune checkpoint blockade therapies in PeM in which BAP1 status is not considered. This integrated molecular characterization provides a comprehensive foundation for improved management of a subset of PeM patients. 
Electronic supplementary material The online version of this article (10.1186/s13073-019-0620-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Raunak Shrestha
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Bioinformatics Training Program, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - Noushin Nabavi
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,BC Cancer Research Centre, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
| | - Yen-Yi Lin
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - Fan Mo
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,International Precision Medicine Research Centre, Zhejiang-California International Nanosystems Institute, Zhejiang University, Hangzhou, 310058, Zhejiang, China.,Neoantigen Therapeutics, Inc., Hangzhou, 310051, Zhejiang, China
| | - Shawn Anderson
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Stanislav Volik
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Hans H Adomat
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Dong Lin
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,BC Cancer Research Centre, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
| | - Hui Xue
- BC Cancer Research Centre, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
| | - Xin Dong
- BC Cancer Research Centre, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
| | - Robert Shukin
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Robert H Bell
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Brian McConeghy
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Anne Haegert
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Sonal Brahmbhatt
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Estelle Li
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Htoo Zarni Oo
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | | | - Ladan Fazli
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Joshua Zhou
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Yarrow McConnell
- Department of Surgery, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - Andrea McCart
- Mount Sinai Hospital, 600 University Ave, Toronto, ON, M5G 1X5, Canada
| | - Andrew Lowy
- Moores Cancer Center, 3855 Health Sciences Dr, La Jolla, CA, 92093, USA
| | - Gregg B Morin
- BC Cancer Research Centre, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
| | - Tianhui Chen
- Zhejiang Academy of Medical Sciences, Tianmushan Road 182, Hangzhou, 310013, China
| | - Mads Daugaard
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - S Cenk Sahinalp
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,School of Informatics and Computing, Indiana University, Bloomington, IN, 47408, USA
| | - Faraz Hach
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - Stephane Le Bihan
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
| | - Martin E Gleave
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - Yuzhuo Wang
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada.,BC Cancer Research Centre, 675 W 10th Ave, Vancouver, BC, V5Z 1L3, Canada
| | - Andrew Churg
- Department of Pathology, Vancouver General Hospital, Vancouver, BC, V5Z 1M9, Canada.
| | - Colin C Collins
- Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| |
|
34
|
Gawronski AR, Uhl M, Zhang Y, Lin YY, Niknafs YS, Ramnarine VR, Malik R, Feng F, Chinnaiyan AM, Collins CC, Sahinalp SC, Backofen R. MechRNA: prediction of lncRNA mechanisms from RNA-RNA and RNA-protein interactions. Bioinformatics 2018; 34:3101-3110. [PMID: 29617966 PMCID: PMC6137976 DOI: 10.1093/bioinformatics/bty208] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 01/24/2018] [Revised: 03/14/2018] [Accepted: 03/27/2018] [Indexed: 01/07/2023]
Abstract
Motivation Long non-coding RNAs (lncRNAs) are defined as transcripts longer than 200 nt that do not get translated into proteins. Often these transcripts are processed (spliced, capped and polyadenylated) and some are known to have important biological functions. However, most lncRNAs have unknown or poorly understood functions. Nevertheless, because of their potential role in cancer, lncRNAs are receiving a lot of attention, and the need for computational tools to predict their possible mechanisms of action is more than ever. Fundamentally, most of the known lncRNA mechanisms involve RNA-RNA and/or RNA-protein interactions. Through accurate predictions of each kind of interaction and integration of these predictions, it is possible to elucidate potential mechanisms for a given lncRNA. Results Here, we introduce MechRNA, a pipeline for corroborating RNA-RNA interaction prediction and protein binding prediction for identifying possible lncRNA mechanisms involving specific targets or on a transcriptome-wide scale. The first stage uses a version of IntaRNA2 with added functionality for efficient prediction of RNA-RNA interactions with very long input sequences, allowing for large-scale analysis of lncRNA interactions with little or no loss of optimality. The second stage integrates protein binding information pre-computed by GraphProt, for both the lncRNA and the target. The final stage involves inferring the most likely mechanism for each lncRNA/target pair. This is achieved by generating candidate mechanisms from the predicted interactions, the relative locations of these interactions and correlation data, followed by selection of the most likely mechanistic explanation using a combined P-value. We applied MechRNA on a number of recently identified cancer-related lncRNAs (PCAT1, PCAT29 and ARLnc1) and also on two well-studied lncRNAs (PCA3 and 7SL). 
This led to the identification of hundreds of high confidence potential targets for each lncRNA and corresponding mechanisms. These predictions include the known competitive mechanism of 7SL with HuR for binding on the tumor suppressor TP53, as well as mechanisms expanding what is known about PCAT1 and ARLnc1 and their targets BRCA2 and AR, respectively. For PCAT1-BRCA2, the mechanism involves competitive binding with HuR, which we confirmed using HuR immunoprecipitation assays. Availability and implementation MechRNA is available for download at https://bitbucket.org/compbio/mechrna. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Michael Uhl
- Centre for Biological Signalling Studies, University of Freiburg, Freiburg im Breisgau, Germany
| | - Yajia Zhang
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, Ann Arbor, MI, USA
| | - Yen-Yi Lin
- Computing Science, Simon Fraser University, Burnaby BC, Canada
- Vancouver Prostate Centre, Vancouver, BC, Canada
| | - Yashar S Niknafs
- Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, USA
| | | | - Rohit Malik
- Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Felix Feng
- Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, USA
| | - Arul M Chinnaiyan
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, Ann Arbor, MI, USA
- Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, USA
- Howard Hughes Medical Institute, University of Michigan, Ann Arbor, MI, USA
| | | | - S Cenk Sahinalp
- Vancouver Prostate Centre, Vancouver, BC, Canada
- Department of Computer Science, Indiana University, Bloomington, USA
| | - Rolf Backofen
- Centre for Biological Signalling Studies, University of Freiburg, Freiburg im Breisgau, Germany
| |
|
35
|
Jolly C, Gerstung M, Leshchiner I, Dentro SC, Gonzalez S, Mitchell TJ, Rubanova Y, Anur P, Rosebrock D, Yu K, Tarabichi M, Deshwar A, Wintersinger J, Kleinheinz K, Vásquez-García I, Haase K, Sengupta S, Macintyre G, Malikic S, Donmez N, Livitz DG, Cmero M, Demeulemeester J, Schumacher S, Fan Y, Yao X, Lee J, Schlesner M, Boutros PC, Bowtell DD, Zhu H, Getz G, Imielinski M, Beroukhim R, Sahinalp SC, Ji Y, Peifer M, Markowetz F, Mustonen V, Juan K, Wang W, Morris QD, Spellman PT, Wedge DC, Loo PV. Abstract 218: The evolutionary history of 2,658 cancers. Cancer Res 2018. [DOI: 10.1158/1538-7445.am2018-218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Cancer develops through a continuous process of somatic evolution. Whole genome sequencing provides a snapshot of the tumor genome at the point of sampling; however, the data can contain information that permits the reconstruction of a tumor's evolutionary past.
Here, we apply such life history analyses on an unprecedented scale, to a set of 2,658 tumors spanning 39 cancer types. We estimated the timing of large chromosomal gains during tumor evolution, by comparing the rates of doubled to non-doubled point mutations within gained regions. Although we find that such events typically occur in the second half of clonal evolution, we also observe distinctive and early chromosomal gains in some cancer types, such as gains of chromosomes 7, 19 and 20 in glioblastoma, and isochromosome 17q in medulloblastoma. By integrating these results with the qualitative timing of individual driver mutations, we obtained an overall ranking, from early to late, of frequent somatic events per cancer type, which both identified novel patterns of tumor evolution, and incorporated additional detail into known models, such as the progression of APC-KRAS-TP53 in colorectal cancer proposed by Vogelstein and Fearon.
To estimate how mutational processes acting on the tumor genome change over time, we classified mutations in each sample according to three broad time periods (early clonal, late clonal, and subclonal), and quantified the activity of mutational signatures in each period. Most mutational processes appear to remain remarkably constant; however, certain signatures show clear and consistent changes during clonal evolution. In particular, mutational signatures associated with exposure to carcinogens, such as smoking and UV light, tend to decrease over time. In contrast, signatures associated with defective endogenous processes, such as APOBEC mutagenesis and defective double strand break repair, show an increase between early and late phases of tumor evolution.
Making use of clock-like mutational signatures, we converted mutational time estimates for large events, such as whole genome duplication (WGD), and the emergence of the most recent common ancestor (MRCA), into real time estimates, which allowed us to combine our analyses into overall timelines of cancer evolution, per tumor type. For example, the typical timeline of ovarian adenocarcinoma development shows that early tumor evolution is characterized by mutations in TP53, and widespread genome instability, with WGD events taking place on average 8 years prior to diagnosis. In later stages of evolution, signatures of defective repair processes increase, and the MRCA emerges on average 1 year before diagnosis.
Taken together, these data reveal the common and divergent evolutionary trajectories available to a cancer, which might be crucial in understanding specific tumor biology, and in providing new opportunities for early detection and cancer prevention.
Citation Format: Clemency Jolly, Moritz Gerstung, Ignaty Leshchiner, Stefan C. Dentro, Santiago Gonzalez, Thomas J. Mitchell, Yulia Rubanova, Pavana Anur, Daniel Rosebrock, Kaixian Yu, Maxime Tarabichi, Amit Deshwar, Jeff Wintersinger, Kortine Kleinheinz, Ignacio Vásquez-García, Kerstin Haase, Subhajit Sengupta, Geoff Macintyre, Salem Malikic, Nilgun Donmez, Dimitri G. Livitz, Mark Cmero, Jonas Demeulemeester, Steve Schumacher, Yu Fan, Xiaotong Yao, Juhee Lee, Matthias Schlesner, Paul C. Boutros, David D. Bowtell, Hongtu Zhu, Gad Getz, Marcin Imielinski, Rameen Beroukhim, S Cenk Sahinalp, Yuan Ji, Martin Peifer, Florian Markowetz, Ville Mustonen, Ke Juan, Wenyi Wang, Quaid D. Morris, Paul T. Spellman, David C. Wedge, Peter Van Loo, PCAWG Evolution and Heterogeneity Working Group. The evolutionary history of 2,658 cancers [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 218.
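The abstract's idea of timing a chromosomal gain by comparing doubled to non-doubled point mutations can be illustrated with a small calculation. The sketch below is not the authors' method; it is a textbook-style simplification that assumes a single gain, a constant mutation rate per chromosome copy, clonal mutations only, and exact knowledge of how many copies carry each mutation. The function name `gain_timing` and its parameters are hypothetical. Under these assumptions, with `n_double` mutations present on two copies and `n_single` on one copy, the fraction of mutational time elapsed before the gain is `total_copies * n_double / (n_single + 2 * n_double)`.

```python
def gain_timing(n_single: int, n_double: int, total_copies: int) -> float:
    """Estimate when a single-copy gain occurred, as a fraction of mutational time.

    Illustrative assumptions (not taken from the abstract):
      * the region ends with one allele at 2 copies and the other at 0 or 1,
        so total_copies is 2 (copy-neutral LOH) or 3 (trisomy);
      * mutations accrue at a constant rate per copy;
      * n_double counts mutations on 2 copies (acquired before the gain),
        n_single counts mutations on 1 copy.
    """
    if total_copies not in (2, 3):
        raise ValueError("this sketch only covers total_copies of 2 or 3")
    denom = n_single + 2 * n_double
    if denom == 0:
        raise ValueError("no mutations observed in the gained region")
    # n_single + 2 * n_double equals total_copies * (mutations per copy over
    # the whole time span), so the ratio below is the pre-gain time fraction.
    return min(1.0, total_copies * n_double / denom)


# Trisomy: 50 doubled and 200 single-copy mutations place the gain at the midpoint.
print(gain_timing(n_single=200, n_double=50, total_copies=3))
```

An estimate near 0 means the gain happened early (few mutations had accumulated before duplication); an estimate near 1 means it happened late.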
Affiliation(s)
| | - Moritz Gerstung
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| | | | | | - Santiago Gonzalez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
| | | | | | - Pavana Anur
- Oregon Health and Science University, Portland, OR
| | | | - Kaixian Yu
- The University of Texas MD Anderson Cancer Center, Houston, TX
| | | | - Amit Deshwar
- University of Toronto, Toronto, Ontario, Canada
| | | | | | | | - Kerstin Haase
- The Francis Crick Institute, London, United Kingdom
| | | | - Geoff Macintyre
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Salem Malikic
- Simon Fraser University, Vancouver, British Columbia, Canada
| | - Nilgun Donmez
- Simon Fraser University, Vancouver, British Columbia, Canada
| | | | - Mark Cmero
- University of Melbourne, Melbourne, Australia
| | | | | | - Yu Fan
- The University of Texas MD Anderson Cancer Center, Houston, TX
| | | | - Juhee Lee
- University of California Santa Cruz, Santa Cruz, CA
| | | | | | | | - Hongtu Zhu
- The University of Texas MD Anderson Cancer Center, Houston, TX
| | - Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA
| | | | | | | | - Yuan Ji
- NorthShore University HealthSystem, Evanston, IL
| | | | - Florian Markowetz
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | | | - Ke Juan
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, United Kingdom
| | - Wenyi Wang
- The University of Texas MD Anderson Cancer Center, Houston, TX
| | | | | | | | - Peter Van Loo
- The Francis Crick Institute, London, United Kingdom
|
36
|
Lin YY, Gawronski A, Hach F, Li S, Numanagić I, Sarrafi I, Mishra S, McPherson A, Collins CC, Radovich M, Tang H, Sahinalp SC. Computational identification of micro-structural variations and their proteogenomic consequences in cancer. Bioinformatics 2018; 34:1672-1681. [PMID: 29267878 PMCID: PMC5946953 DOI: 10.1093/bioinformatics/btx807] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 08/25/2017] [Revised: 11/24/2017] [Accepted: 12/15/2017] [Indexed: 12/18/2022] Open
Abstract
Motivation Rapid advancement in high throughput genome and transcriptome sequencing (HTS) and mass spectrometry (MS) technologies has enabled the acquisition of the genomic, transcriptomic and proteomic data from the same tissue sample. We introduce a computational framework, ProTIE, to integratively analyze all three types of omics data for a complete molecular profile of a tissue sample. Our framework features MiStrVar, a novel algorithmic method to identify micro structural variants (microSVs) on genomic HTS data. Coupled with deFuse, a popular gene fusion detection method we developed earlier, MiStrVar can accurately profile structurally aberrant transcripts in tumors. Given the breakpoints obtained by MiStrVar and deFuse, our framework can then identify all relevant peptides that span the breakpoint junctions and match them with unique proteomic signatures. Observing structural aberrations in all three types of omics data validates their presence in the tumor samples. Results We have applied our framework to all The Cancer Genome Atlas (TCGA) breast cancer Whole Genome Sequencing (WGS) and/or RNA-Seq datasets, spanning all four major subtypes, for which proteomics data from Clinical Proteomic Tumor Analysis Consortium (CPTAC) have been released. A recent study on this dataset focusing on SNVs has reported many that lead to novel peptides. Complementing and significantly broadening this study, we detected 244 novel peptides from 432 candidate genomic or transcriptomic sequence aberrations. Many of the fusions and microSVs we discovered have not been reported in the literature. Interestingly, the vast majority of these translated aberrations, fusions in particular, were private, demonstrating the extensive inter-genomic heterogeneity present in breast cancer. Many of these aberrations also have matching out-of-frame downstream peptides, potentially indicating novel protein sequence and structure. 
Availability and implementation MiStrVar is available for download at https://bitbucket.org/compbio/mistrvar, and ProTIE is available at https://bitbucket.org/compbio/protie. Contact cenksahi@indiana.edu. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yen-Yi Lin
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
- Vancouver Prostate Centre, Vancouver, BC, Canada
| | | | - Faraz Hach
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
- Vancouver Prostate Centre, Vancouver, BC, Canada
- Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Sujun Li
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - Ibrahim Numanagić
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
| | - Iman Sarrafi
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
- Vancouver Prostate Centre, Vancouver, BC, Canada
| | - Swati Mishra
- Department of Surgery, Indiana University, School of Medicine, Indianapolis, IN, USA
| | - Andrew McPherson
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
| | - Colin C Collins
- Vancouver Prostate Centre, Vancouver, BC, Canada
- Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Milan Radovich
- Department of Surgery, Indiana University, School of Medicine, Indianapolis, IN, USA
| | - Haixu Tang
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - S Cenk Sahinalp
- Vancouver Prostate Centre, Vancouver, BC, Canada
- Department of Computer Science, Indiana University, Bloomington, IN, USA
|
37
|
Numanagić I, Malikić S, Ford M, Qin X, Toji L, Radovich M, Skaar TC, Pratt VM, Berger B, Scherer S, Sahinalp SC. Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes. Nat Commun 2018; 9:828. [PMID: 29483503 PMCID: PMC5826927 DOI: 10.1038/s41467-018-03273-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 05/10/2017] [Accepted: 02/01/2018] [Indexed: 12/30/2022] Open
Abstract
High-throughput sequencing provides the means to determine the allelic decomposition for any gene of interest: the number of copies and the exact sequence content of each copy of a gene. Although many clinically and functionally important genes are highly polymorphic and have undergone structural alterations, no high-throughput sequencing data analysis tool has yet been designed to effectively solve the full allelic decomposition problem. Here we introduce a combinatorial optimization framework that successfully resolves this challenging problem, including for genes with structural alterations. We provide an associated computational tool, Aldy, that performs allelic decomposition of highly polymorphic, multi-copy genes using whole or targeted genome sequencing data. For a large diverse sequencing data set, Aldy identifies multiple rare and novel alleles for several important pharmacogenes, significantly improving upon the accuracy and utility of current genotyping assays. As more data sets become available, we expect Aldy to become an essential component of genotyping toolkits.
Affiliation(s)
- Ibrahim Numanagić
- School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Salem Malikić
- School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
| | - Michael Ford
- School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
| | - Xiang Qin
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, 77030, USA
| | - Lorraine Toji
- Coriell Institute for Medical Research, Camden, NJ, 08103, USA
| | - Milan Radovich
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Todd C Skaar
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Victoria M Pratt
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Steve Scherer
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, 77030, USA
| | - S Cenk Sahinalp
- Department of Computer Science, Indiana University, Bloomington, IN, 47405, USA.
|
38
|
Chen F, Wang S, Jiang X, Ding S, Lu Y, Kim J, Sahinalp SC, Shimizu C, Burns JC, Wright VJ, Png E, Hibberd ML, Lloyd DD, Yang H, Telenti A, Bloss CS, Fox D, Lauter K, Ohno-Machado L. PRINCESS: Privacy-protecting Rare disease International Network Collaboration via Encryption through Software guard extensionS. Bioinformatics 2017; 33:871-878. [PMID: 28065902 DOI: 10.1093/bioinformatics/btw758] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 05/19/2016] [Accepted: 11/23/2016] [Indexed: 12/19/2022] Open
Abstract
Motivation We introduce PRINCESS, a privacy-preserving international collaboration framework for analyzing rare disease genetic data that are distributed across different continents. PRINCESS leverages Software Guard Extensions (SGX) and hardware for trustworthy computation. Unlike a traditional international collaboration model, where individual-level patient DNA are physically centralized at a single site, PRINCESS performs a secure and distributed computation over encrypted data, fulfilling institutional policies and regulations for protected health information. Results To demonstrate PRINCESS' performance and feasibility, we conducted a family-based allelic association study for Kawasaki Disease, with data hosted in three different continents. The experimental results show that PRINCESS provides secure and accurate analyses much faster than alternative solutions, such as homomorphic encryption and garbled circuits (over 40 000× faster). Availability and Implementation https://github.com/achenfengb/PRINCESS_opensource. Contact shw070@ucsd.edu. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Feng Chen
- Health System Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
| | - Shuang Wang
- Health System Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
| | - Xiaoqian Jiang
- Health System Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
| | - Sijie Ding
- Health System Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
| | - Yao Lu
- Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, USA
| | - Jihoon Kim
- Health System Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
| | - S Cenk Sahinalp
- Department of Computer Science and Informatics, Indiana University, Bloomington, IN, USA
| | - Chisato Shimizu
- Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
| | - Jane C Burns
- Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
| | | | - Eileen Png
- Genome Institute of Singapore, ASTAR, Singapore, Singapore
| | | | - David D Lloyd
- Department of Pediatrics, School of Medicine, Emory University, Atlanta, GA, USA
| | - Hai Yang
- Health System Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
| | | | - Cinnamon S Bloss
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Dov Fox
- School of Law, University of San Diego, San Diego, CA, USA
| | - Kristin Lauter
- Cryptography Group, Microsoft Research, San Diego, CA, USA
| | - Lucila Ohno-Machado
- Health System Department of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA
|
39
|
McPherson AW, Roth A, Ha G, Chauve C, Steif A, de Souza CPE, Eirew P, Bouchard-Côté A, Aparicio S, Sahinalp SC, Shah SP. Correction to: ReMixT: clone-specific genomic structure estimation in cancer. Genome Biol 2017; 18:188. [PMID: 28985744 PMCID: PMC5629763 DOI: 10.1186/s13059-017-1327-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/27/2017] [Accepted: 09/27/2017] [Indexed: 11/10/2022] Open
Affiliation(s)
- Andrew W McPherson
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
| | - Andrew Roth
- Department of Statistics, Oxford University, Oxford, UK
- Ludwig Institute for Cancer Research, Oxford University, Oxford, UK
| | - Gavin Ha
- Dana-Farber Cancer Institute, Boston, USA
- Eli and Edythe L. Broad Institute of MIT and Harvard, Cambridge, USA
| | - Cedric Chauve
- Department of Mathematics, Simon Fraser University, Burnaby, Canada
| | - Adi Steif
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
| | - Camila P E de Souza
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
| | - Peter Eirew
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
| | | | - Sam Aparicio
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
| | - S Cenk Sahinalp
- Vancouver Prostate Centre, Vancouver, Canada
- Department of Computer Science, Indiana University Bloomington, Bloomington, USA
| | - Sohrab P Shah
- Department of Molecular Oncology, BC Cancer Agency, Vancouver, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
|
40
|
Kalina JL, Neilson DS, Lin YY, Hamilton PT, Comber AP, Loy EMH, Sahinalp SC, Collins CC, Hach F, Lum JJ. Mutational Analysis of Gene Fusions Predicts Novel MHC Class I-Restricted T-Cell Epitopes and Immune Signatures in a Subset of Prostate Cancer. Clin Cancer Res 2017; 23:7596-7607. [PMID: 28954787 DOI: 10.1158/1078-0432.ccr-17-0618] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/03/2017] [Revised: 08/25/2017] [Accepted: 09/20/2017] [Indexed: 11/16/2022]
Abstract
Purpose: Gene fusions are frequently found in prostate cancer and may result in the formation of unique chimeric amino acid sequences (CASQ) that span the breakpoint of two fused gene products. This study evaluated the potential for fusion-derived CASQs to be a source of tumor neoepitopes, and determined their relationship to patterns of immune signatures in prostate cancer patients. Experimental Design: A computational strategy was used to identify CASQs and their corresponding predicted MHC class I epitopes using RNA-Seq data from The Cancer Genome Atlas of prostate tumors. In vitro peptide-specific T-cell expansion was performed to identify CASQ-reactive T cells. A multivariate analysis was used to relate patterns of in silico-predicted tumor-infiltrating immune cells with prostate tumors harboring these mutational events. Results: Eighty-seven percent of tumors contained gene fusions with a mean of 12 per tumor. In total, 41% of fusion-positive tumors were found to encode CASQs. Within these tumors, 87% gave rise to predicted MHC class I-binding epitopes. This observation was more prominent when patients were stratified into low- and intermediate/high-risk categories. One of the identified CASQs from the recurrent TMPRSS2:ERG type VI fusion contained several high-affinity HLA-restricted epitopes. These peptides bound HLA-A*02:01 in vitro and were recognized by CD8+ T cells. Finally, the presence of fusions and CASQs was associated with expression of immune cell infiltration. Conclusions: Mutanome analysis of gene fusion-derived CASQs can give rise to patient-specific predicted neoepitopes. Moreover, these fusions predicted patterns of immune cell infiltration within a subgroup of prostate cancer patients. Clin Cancer Res; 23(24); 7596-607. ©2017 AACR.
Affiliation(s)
- Jennifer L Kalina
- Trev & Joyce Deeley Research Centre, British Columbia Cancer Agency, Victoria, Canada
| | - David S Neilson
- Trev & Joyce Deeley Research Centre, British Columbia Cancer Agency, Victoria, Canada
- Department of Biochemistry & Microbiology, University of Victoria, Victoria, Canada
| | - Yen-Yi Lin
- School of Computing Science, Simon Fraser University, Burnaby, Canada
| | - Phineas T Hamilton
- Trev & Joyce Deeley Research Centre, British Columbia Cancer Agency, Victoria, Canada
| | - Alexandra P Comber
- Trev & Joyce Deeley Research Centre, British Columbia Cancer Agency, Victoria, Canada
| | - Emma M H Loy
- Trev & Joyce Deeley Research Centre, British Columbia Cancer Agency, Victoria, Canada
- Department of Biochemistry & Microbiology, University of Victoria, Victoria, Canada
| | - S Cenk Sahinalp
- School of Computing Science, Simon Fraser University, Burnaby, Canada
- Vancouver Prostate Centre, Vancouver, Canada
- School of Informatics & Computing, Indiana University, Bloomington, Indiana
| | | | - Faraz Hach
- School of Computing Science, Simon Fraser University, Burnaby, Canada
- Vancouver Prostate Centre, Vancouver, Canada
- Department of Urologic Sciences, University of British Columbia, Vancouver, Canada
| | - Julian J Lum
- Trev & Joyce Deeley Research Centre, British Columbia Cancer Agency, Victoria, Canada
- Department of Biochemistry & Microbiology, University of Victoria, Victoria, Canada
|
41
|
Abstract
MOTIVATION Second generation sequencing technologies paved the way to an exceptional increase in the number of sequenced genomes, both prokaryotic and eukaryotic. However, short reads are difficult to assemble and often lead to highly fragmented assemblies. The recent developments in long-read sequencing methods offer a promising way to address this issue. However, so far long reads are characterized by a high error rate, and assembling from long reads requires a high depth of coverage. This motivates the development of hybrid approaches that leverage the high quality of short reads to correct errors in long reads. RESULTS We introduce CoLoRMap, a hybrid method for correcting noisy long reads, such as the ones produced by PacBio sequencing technology, using high-quality Illumina paired-end reads mapped onto the long reads. Our algorithm is based on two novel ideas: using a classical shortest path algorithm to find a sequence of overlapping short reads that minimizes the edit score to a long read and extending corrected regions by local assembly of unmapped mates of mapped short reads. Our results on bacterial, fungal and insect data sets show that CoLoRMap compares well with existing hybrid correction methods. AVAILABILITY AND IMPLEMENTATION The source code of CoLoRMap is freely available for non-commercial use at https://github.com/sfu-compbio/colormap. CONTACT ehaghshe@sfu.ca or cedric.chauve@sfu.ca. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
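The first idea in the abstract, finding a sequence of overlapping short reads that minimizes the edit score to a long read via a shortest path, can be sketched as follows. This is a simplified illustration and not CoLoRMap's implementation: the `Alignment` record, the `best_chain` function, and the cost model (summing each read's whole edit distance, which counts overlapping bases more than once) are assumptions made for the example. Each short-read alignment is a node, an edge joins two alignments that overlap and extend coverage to the right, and Dijkstra's algorithm finds the cheapest chain spanning a region of the long read.

```python
import heapq
from typing import List, NamedTuple, Optional


class Alignment(NamedTuple):
    start: int  # first long-read position covered (inclusive)
    end: int    # one past the last long-read position covered
    edits: int  # edit distance between the short read and that segment


def best_chain(alns: List[Alignment], region_start: int,
               region_end: int) -> Optional[List[int]]:
    """Return indices of a minimum-cost chain of overlapping alignments
    covering [region_start, region_end), or None if no chain exists."""
    dist = [float("inf")] * len(alns)
    prev = [-1] * len(alns)
    heap = []
    # Chains may begin at any alignment that covers the region start.
    for i, a in enumerate(alns):
        if a.start <= region_start < a.end:
            dist[i] = a.edits
            heapq.heappush(heap, (a.edits, i))
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if alns[u].end >= region_end:
            # First finished node popped is optimal; walk back to the start.
            chain = []
            while u != -1:
                chain.append(u)
                u = prev[u]
            return chain[::-1]
        for v, b in enumerate(alns):
            # v must overlap u and reach further right than u does.
            if alns[u].start < b.start < alns[u].end and b.end > alns[u].end:
                nd = d + b.edits
                if nd < dist[v]:
                    dist[v] = nd
                    prev[v] = u
                    heapq.heappush(heap, (nd, v))
    return None


reads = [Alignment(0, 50, 2), Alignment(40, 100, 5),
         Alignment(30, 70, 0), Alignment(60, 100, 1)]
print(best_chain(reads, 0, 100))
```

In this toy input the three-read chain has total cost 3, cheaper than the two-read chain with cost 7, so the shortest path prefers more reads with fewer edits.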
Affiliation(s)
- Ehsan Haghshenas
- School of Computing Sciences; MADD-Gen Graduate Program, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Faraz Hach
- School of Computing Sciences; Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - S Cenk Sahinalp
- School of Computing Sciences; Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada; School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
| | - Cedric Chauve
- Department of Mathematics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
|
42
|
McPherson AW, Roth A, Ha G, Chauve C, Steif A, de Souza CPE, Eirew P, Bouchard-Côté A, Aparicio S, Sahinalp SC, Shah SP. ReMixT: clone-specific genomic structure estimation in cancer. Genome Biol 2017; 18:140. [PMID: 28750660 PMCID: PMC5530528 DOI: 10.1186/s13059-017-1267-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 03/16/2017] [Accepted: 07/03/2017] [Indexed: 11/10/2022] Open
Abstract
Somatic evolution of malignant cells produces tumors composed of multiple clonal populations, distinguished in part by rearrangements and copy number changes affecting chromosomal segments. Whole genome sequencing mixes the signals of sampled populations, diluting the signals of clone-specific aberrations, and complicating estimation of clone-specific genotypes. We introduce ReMixT, a method to unmix tumor and contaminating normal signals and jointly predict mixture proportions, clone-specific segment copy number, and clone specificity of breakpoints. ReMixT is free, open-source software and is available at http://bitbucket.org/dranew/remixt.
Affiliation(s)
- Andrew W McPherson
- Department of Molecular Oncology, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, 2329 West Mall, Vancouver, BC, Canada
| | - Andrew Roth
- Department of Statistics, Oxford University, 24-29 St Giles, Oxford, United Kingdom
- Ludwig Institute for Cancer Research, Oxford University, Old Road Campus Research Building, Headington, Oxford, United Kingdom
| | - Gavin Ha
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, USA
- Eli and Edythe L. Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, USA
| | - Cedric Chauve
- Department of Mathematics, Simon Fraser University, 8888 University Drive, Burnaby, BC, Canada
| | - Adi Steif
- Department of Molecular Oncology, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, Canada
| | - Camila P E de Souza
- Department of Molecular Oncology, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, 2329 West Mall, Vancouver, BC, Canada
| | - Peter Eirew
- Department of Molecular Oncology, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, Canada
| | - Alexandre Bouchard-Côté
- Department of Statistics, University of British Columbia, 2329 West Mall, Vancouver, BC, Canada
| | - Sam Aparicio
- Department of Molecular Oncology, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, 2329 West Mall, Vancouver, BC, Canada
| | - S Cenk Sahinalp
- Vancouver Prostate Centre, 2660 Oak Street, Vancouver, Canada
- Department of Computer Science, Indiana University Bloomington, 107 S. Indiana Avenue, Bloomington, IN, USA
| | - Sohrab P Shah
- Department of Molecular Oncology, BC Cancer Agency, 675 West 10th Avenue, Vancouver, BC, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, 2329 West Mall, Vancouver, BC, Canada
|
43
|
Shrestha R, Hodzic E, Sauerwald T, Dao P, Wang K, Yeung J, Anderson S, Vandin F, Haffari G, Collins CC, Sahinalp SC. HIT'nDRIVE: patient-specific multidriver gene prioritization for precision oncology. Genome Res 2017; 27:1573-1588. [PMID: 28768687 PMCID: PMC5580716 DOI: 10.1101/gr.221218.117] [Citation(s) in RCA: 67] [Impact Index Per Article: 9.6] [Received: 01/31/2017] [Accepted: 07/06/2017] [Indexed: 12/12/2022]
Abstract
Prioritizing molecular alterations that act as drivers of cancer remains a crucial bottleneck in therapeutic development. Here we introduce HIT'nDRIVE, a computational method that integrates genomic and transcriptomic data to identify a set of patient-specific, sequence-altered genes, with sufficient collective influence over dysregulated transcripts. HIT'nDRIVE aims to solve the "random walk facility location" (RWFL) problem in a gene (or protein) interaction network, which differs from the standard facility location problem by its use of an alternative distance measure: "multihitting time," the expected length of the shortest random walk from any one of the set of sequence-altered genes to an expression-altered target gene. When applied to 2200 tumors from four major cancer types, HIT'nDRIVE revealed many potentially clinically actionable driver genes. We also demonstrated that it is possible to perform accurate phenotype prediction for tumor samples by only using HIT'nDRIVE-seeded driver gene modules from gene interaction networks. In addition, we identified a number of breast cancer subtype-specific driver modules that are associated with patients' survival outcome. Furthermore, HIT'nDRIVE, when applied to a large panel of pan-cancer cell lines, accurately predicted drug efficacy using the driver genes and their seeded gene modules. Overall, HIT'nDRIVE may help clinicians contextualize massive multiomics data in therapeutic decision making, enabling widespread implementation of precision oncology.
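The distance measure in the abstract builds on the hitting time of a random walk, the expected number of steps a walk needs to first reach a target node. The sketch below computes single-source hitting times on a small unweighted graph as an illustration of that building block only; it does not implement the multihitting time over a set of source genes or the RWFL optimization, and the function name `hitting_times` and the toy graph are hypothetical. It assumes a connected graph in which every node has at least one neighbor, and solves h(target) = 0, h(u) = 1 + average of h over the neighbors of u by repeated sweeps.

```python
from typing import Dict, Hashable, List


def hitting_times(adj: Dict[Hashable, List[Hashable]], target: Hashable,
                  sweeps: int = 100_000, tol: float = 1e-12) -> Dict[Hashable, float]:
    """Expected number of steps for a simple random walk to first reach `target`.

    `adj` maps each node to its list of neighbors. The walk moves to a
    uniformly random neighbor at each step. Assumes the graph is connected
    and every node has at least one neighbor (illustrative assumption).
    """
    h = {u: 0.0 for u in adj}
    for _ in range(sweeps):
        delta = 0.0
        for u, nbrs in adj.items():
            if u == target:
                continue  # hitting time of the target from itself is 0
            new = 1.0 + sum(h[w] for w in nbrs) / len(nbrs)
            delta = max(delta, abs(new - h[u]))
            h[u] = new
        if delta < tol:
            break  # values have stopped changing
    return h


# Path graph a - b - c: a walk from the far end needs 4 steps on average.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(hitting_times(path, "c"))
```

Unlike shortest-path distance, hitting time grows when a node has many alternative routes that lead away from the target, which is why it behaves differently as a distance measure in densely connected interaction networks.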
Affiliation(s)
- Raunak Shrestha
- Bioinformatics Training Program, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4
- Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, British Columbia, Canada V6H 3Z6
| | - Ermin Hodzic
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6
| | - Thomas Sauerwald
- Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, United Kingdom
| | - Phuong Dao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Kendric Wang
- Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, British Columbia, Canada V6H 3Z6
| | - Jake Yeung
- Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, British Columbia, Canada V6H 3Z6
| | - Shawn Anderson
- Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, British Columbia, Canada V6H 3Z6
| | - Fabio Vandin
- Department of Information Engineering, University of Padova, 35131 Padova, Italy
| | - Gholamreza Haffari
- Faculty of Information Technology, Monash University, Melbourne 3800, Australia
| | - Colin C Collins
- Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, British Columbia, Canada V6H 3Z6
- Department of Urologic Sciences, University of British Columbia, Vancouver, British Columbia, Canada V5Z 1M9
| | - S Cenk Sahinalp
- Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, British Columbia, Canada V6H 3Z6
- School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada V5A 1S6
- School of Informatics and Computing, Indiana University, Bloomington, Indiana 47408, USA
|
44
|
Mo F, Lin D, Takhar M, Ramnarine VR, Dong X, Bell RH, Volik SV, Wang K, Xue H, Wang Y, Haegert A, Anderson S, Brahmbhatt S, Erho N, Wang X, Gout PW, Morris J, Karnes RJ, Den RB, Klein EA, Schaeffer EM, Ross A, Ren S, Sahinalp SC, Li Y, Xu X, Wang J, Wang J, Gleave ME, Davicioni E, Sun Y, Wang Y, Collins CC. Stromal Gene Expression is Predictive for Metastatic Primary Prostate Cancer. Eur Urol 2017; 73:524-532. [PMID: 28330676 DOI: 10.1016/j.eururo.2017.02.038] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 12/14/2016] [Accepted: 02/28/2017] [Indexed: 01/25/2023]
Abstract
BACKGROUND Clinical grading systems using clinical features alongside nomograms lack precision in guiding treatment decisions in prostate cancer (PCa). There is a critical need for identification of biomarkers that can more accurately stratify patients with primary PCa. OBJECTIVE To identify a robust prognostic signature to better distinguish indolent from aggressive prostate cancer (PCa). DESIGN, SETTING, AND PARTICIPANTS To develop the signature, whole-genome and whole-transcriptome sequencing was conducted on five PCa patient-derived xenograft (PDX) models collected from independent foci of a single primary tumor and exhibiting variable metastatic phenotypes. Multiple independent clinical cohorts including an intermediate-risk cohort were used to validate the biomarkers. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS The outcome measurement defining aggressive PCa was metastasis following radical prostatectomy. A generalized linear model with lasso regularization was used to build a 93-gene stroma-derived metastasis signature (SDMS). The SDMS association with metastasis was assessed using a Wilcoxon rank-sum test. Performance was evaluated using the area under the curve (AUC) for the receiver operating characteristic, and Kaplan-Meier curves. Univariable and multivariable regression models were used to compare the SDMS alongside clinicopathological variables and reported signatures. AUC was assessed to determine if SDMS is additive or synergistic to previously reported signatures. RESULTS AND LIMITATIONS A close association between stromal gene expression and metastatic phenotype was observed. Accordingly, the SDMS was modeled and validated in multiple independent clinical cohorts. Patients with higher SDMS scores were found to have worse prognosis. Furthermore, SDMS was an independent prognostic factor, can stratify risk in intermediate-risk PCa, and can improve the performance of other previously reported signatures. 
CONCLUSIONS Profiling of stromal gene expression led to development of an SDMS that was validated as independently prognostic for the metastatic potential of prostate tumors. PATIENT SUMMARY Our stroma-derived metastasis signature can predict the metastatic potential of early stage disease and will strengthen decisions regarding selection of active surveillance versus surgery and/or radiation therapy for prostate cancer patients. Furthermore, profiling of stroma cells should be more consistent than profiling of diverse cellular populations of heterogeneous tumors.
Affiliation(s)
- Fan Mo
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Dong Lin
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada; Department of Experimental Therapeutics, BC Cancer Agency, Vancouver, BC, Canada
| | - Mandeep Takhar
- Research and Development, GenomeDx Biosciences, Vancouver, BC, Canada
| | - Varune Rohan Ramnarine
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Xin Dong
- Department of Experimental Therapeutics, BC Cancer Agency, Vancouver, BC, Canada
| | - Robert H Bell
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Stanislav V Volik
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Kendric Wang
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Hui Xue
- Department of Experimental Therapeutics, BC Cancer Agency, Vancouver, BC, Canada
| | - Yuwei Wang
- Department of Experimental Therapeutics, BC Cancer Agency, Vancouver, BC, Canada
| | - Anne Haegert
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Shawn Anderson
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Sonal Brahmbhatt
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Nicholas Erho
- Research and Development, GenomeDx Biosciences, Vancouver, BC, Canada
| | - Xinya Wang
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Peter W Gout
- Department of Experimental Therapeutics, BC Cancer Agency, Vancouver, BC, Canada
| | - James Morris
- Department of Radiation Oncology, BC Cancer Agency, Vancouver, BC, Canada
| | - R Jeffrey Karnes
- Department of Urology, Mayo Clinic College of Medicine, Rochester, MN, USA
| | - Robert B Den
- Department of Radiation Oncology, Sidney Kimmel Medical College at Thomas Jefferson University, Philadelphia, PA, USA
| | - Eric A Klein
- Glickman Urological and Kidney Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Edward M Schaeffer
- Department of Urology, James Buchanan Brady Urological Institute, Department of Oncology, Johns Hopkins Hospital, Baltimore, MD, USA; Department of Urology, Northwestern University School of Medicine, Chicago, IL, USA
| | - Ashley Ross
- Department of Urology, James Buchanan Brady Urological Institute, Department of Oncology, Johns Hopkins Hospital, Baltimore, MD, USA
| | - Shancheng Ren
- Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, China
| | - S Cenk Sahinalp
- School of Computing Sciences, Simon Fraser University, Burnaby, BC, Canada; School of Informatics and Computing, Indiana University, Bloomington, IN, USA
| | - Xun Xu
- BGI-Shenzhen, Shenzhen, China
| | - Martin E Gleave
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Elai Davicioni
- Research and Development, GenomeDx Biosciences, Vancouver, BC, Canada
| | - Yinghao Sun
- Department of Urology, Shanghai Changhai Hospital, Second Military Medical University, Shanghai, China
| | - Yuzhuo Wang
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada; Department of Experimental Therapeutics, BC Cancer Agency, Vancouver, BC, Canada.
| | - Colin C Collins
- Vancouver Prostate Centre & Laboratory for Advanced Genome Analysis, Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada; School of Computing Sciences, Simon Fraser University, Burnaby, BC, Canada.
|
45
|
Donmez N, Malikic S, Wyatt AW, Gleave ME, Collins CC, Sahinalp SC. Clonality Inference from Single Tumor Samples Using Low-Coverage Sequence Data. J Comput Biol 2017; 24:515-523. [PMID: 28056180 DOI: 10.1089/cmb.2016.0148] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 01/03/2023] Open
Abstract
Inference of intra-tumor heterogeneity can provide valuable insight into cancer evolution. Somatic mutations detected by sequencing can help estimate the purity of a tumor sample and reconstruct its subclonal composition. Although several methods have been developed to infer intra-tumor heterogeneity, the majority of these tools rely on variant allele frequencies as estimated via ultra-deep sequencing from multiple samples of the same tumor. In practice, obtaining sequencing data from a large number of samples per patient is only feasible in a few cancer types such as liquid tumors, or in rare cases involving solid tumors selected for research. We introduce CTPsingle, which aims at inferring the subclonal composition by using low-coverage sequencing data from a single tumor sample. We show that CTPsingle is able to infer the purity and the clonality of single-sample tumors with high accuracy, even restricted to a coverage depth of ∼30×.
Affiliation(s)
- Nilgun Donmez
- 1 School of Computing Science, Simon Fraser University, Burnaby, Canada.,2 Vancouver Prostate Centre, Vancouver, Canada
| | - Salem Malikic
- 1 School of Computing Science, Simon Fraser University, Burnaby, Canada.,2 Vancouver Prostate Centre, Vancouver, Canada
| | - Alexander W Wyatt
- 2 Vancouver Prostate Centre, Vancouver, Canada.,3 Department of Urologic Sciences, University of British Columbia, Vancouver, Canada
| | - Colin C Collins
- 2 Vancouver Prostate Centre, Vancouver, Canada.,3 Department of Urologic Sciences, University of British Columbia, Vancouver, Canada
| | - S Cenk Sahinalp
- 1 School of Computing Science, Simon Fraser University, Burnaby, Canada.,2 Vancouver Prostate Centre, Vancouver, Canada.,4 School of Informatics and Computing, Indiana University, Bloomington, Indiana
|
46
|
Kockan C, Hach F, Sarrafi I, Bell RH, McConeghy B, Beja K, Haegert A, Wyatt AW, Volik SV, Chi KN, Collins CC, Sahinalp SC. SiNVICT: ultra-sensitive detection of single nucleotide variants and indels in circulating tumour DNA. Bioinformatics 2016; 33:26-34. [PMID: 27531099 DOI: 10.1093/bioinformatics/btw536] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 10/25/2015] [Revised: 08/09/2016] [Accepted: 08/11/2016] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION Successful development and application of precision oncology approaches require robust elucidation of the genomic landscape of a patient's cancer and, ideally, the ability to monitor therapy-induced genomic changes in the tumour in an inexpensive and minimally invasive manner. Thanks to recent advances in sequencing technologies, 'liquid biopsy', the sampling of a patient's bodily fluids such as blood and urine, is considered one of the most promising approaches to achieve this goal. In many cancer patients, and especially those with advanced metastatic disease, deep sequencing of circulating cell-free DNA (cfDNA) obtained from the patient's blood yields a mixture of reads originating from the normal DNA and from multiple tumour subclones, the latter called circulating tumour DNA (ctDNA). The ctDNA/cfDNA ratio as well as the proportion of ctDNA originating from specific tumour subclones depend on multiple factors, making comprehensive detection of mutations difficult, especially at early stages of cancer. Furthermore, sensitive and accurate detection of single nucleotide variants (SNVs) and indels from cfDNA is constrained by several factors such as sequencing errors, PCR artifacts, and mapping errors related to repeat regions within the genome. In this article, we introduce SiNVICT, a computational method that increases the sensitivity and specificity of SNV and indel detection at very low variant allele frequencies. SiNVICT has the capability to handle multiple sequencing platforms with different error properties; it minimizes false positives resulting from mapping errors and other technology-specific artifacts including strand bias and low base quality at read ends. 
SiNVICT also has the capability to perform time-series analysis, where samples from a patient sequenced at multiple time points are jointly examined to report locations of interest where there is a possibility that certain clones were wiped out by some treatment while some subclones gained selective advantage. RESULTS We tested SiNVICT on simulated data as well as prostate cancer cell lines and cfDNA obtained from castration-resistant prostate cancer patients. On both simulated and biological data, SiNVICT was able to detect SNVs and indels with variant allele percentages as low as 0.5%. The lowest amounts of total DNA used for the biological data where SNVs and indels could be detected with very high sensitivity were 2.5 ng on the Ion Torrent platform and 10 ng on Illumina. With increased sequencing and mapping accuracy, SiNVICT might be utilized in clinical settings, making it possible to track the progress of point mutations and indels that are associated with resistance to cancer therapies and provide patients with personalized treatment. We also compared SiNVICT with other popular SNV callers such as MuTect, VarScan2 and Freebayes. Our results show that SiNVICT performs better than these tools in most cases and allows further data exploration such as time-series analysis on cfDNA sequencing data. AVAILABILITY AND IMPLEMENTATION SiNVICT is available at: https://sfu-compbio.github.io/sinvict. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online. CONTACT cenk@sfu.ca.
Affiliation(s)
- Can Kockan
- School of Computing Science.,MADD-Gen Graduate Program, Simon Fraser University, Burnaby, (BC), V5A 1S6, Canada
| | - Faraz Hach
- School of Computing Science.,Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Robert H Bell
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Kevin Beja
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Anne Haegert
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada
| | - Alexander W Wyatt
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Kim N Chi
- Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Colin C Collins
- Vancouver Prostate Centre, Vancouver, BC V6H 3Z6, Canada.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - S Cenk Sahinalp
- School of Computing Science.,Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.,School of Informatics and Computing, Indiana University, Bloomington, IN 47405, USA
|
47
|
Abstract
Motivation: CYP2D6 is a highly polymorphic gene that encodes the CYP2D6 enzyme, which is involved in the metabolism of 20–25% of all clinically prescribed drugs and other xenobiotics in the human body. CYP2D6 genotyping is recommended prior to treatment decisions involving one or more of the numerous drugs sensitive to CYP2D6 allelic composition. In this context, high-throughput sequencing (HTS) technologies provide a promising time-efficient and cost-effective alternative to currently used genotyping techniques. To achieve accurate interpretation of HTS data, however, one needs to overcome several obstacles such as high sequence similarity and genetic recombinations between CYP2D6 and the evolutionarily related pseudogenes CYP2D7 and CYP2D8, high copy number variation among individuals and the short read lengths generated by HTS technologies. Results: In this work, we present the first algorithm to computationally infer CYP2D6 genotype at basepair resolution from HTS data. Our algorithm is able to resolve complex genotypes, including alleles that are the products of duplication, deletion and fusion events involving CYP2D6 and its evolutionarily related cousin CYP2D7. Through extensive experiments using simulated and real datasets, we show that our algorithm accurately solves this important problem with potential clinical implications. Availability and implementation: Cypiripi is available at http://sfu-compbio.github.io/cypiripi. Contact: cenk@sfu.ca.
Affiliation(s)
- Ibrahim Numanagić
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada, Department of Medicine, Division of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA and School of Informatics and Computing, Indiana University, Bloomington, IN 47401, USA
| | - Salem Malikić
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada, Department of Medicine, Division of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA and School of Informatics and Computing, Indiana University, Bloomington, IN 47401, USA
| | - Victoria M Pratt
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada, Department of Medicine, Division of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA and School of Informatics and Computing, Indiana University, Bloomington, IN 47401, USA
| | - Todd C Skaar
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada, Department of Medicine, Division of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA and School of Informatics and Computing, Indiana University, Bloomington, IN 47401, USA
| | - David A Flockhart
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada, Department of Medicine, Division of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA and School of Informatics and Computing, Indiana University, Bloomington, IN 47401, USA
| | - S Cenk Sahinalp
- School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada, Department of Medicine, Division of Clinical Pharmacology, Indiana University School of Medicine, Indianapolis, IN 46202, USA and School of Informatics and Computing, Indiana University, Bloomington, IN 47401, USA
|
48
|
Alkan C, Kavak P, Somel M, Gokcumen O, Ugurlu S, Saygi C, Dal E, Bugra K, Güngör T, Sahinalp SC, Özören N, Bekpen C. Whole genome sequencing of Turkish genomes reveals functional private alleles and impact of genetic interactions with Europe, Asia and Africa. BMC Genomics 2014; 15:963. [PMID: 25376095 PMCID: PMC4236450 DOI: 10.1186/1471-2164-15-963] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 05/12/2014] [Accepted: 10/14/2014] [Indexed: 12/30/2022] Open
Abstract
Background: Turkey is a crossroads of major population movements throughout history and has been a hotspot of cultural interactions. Several studies have investigated the complex population history of Turkey through a limited set of genetic markers. However, to date, there have been no studies to assess the genetic variation at the whole genome level using whole genome sequencing. Here, we present whole genome sequences of 16 Turkish individuals resequenced at high coverage (32×-48×). Results: We show that the genetic variation of the contemporary Turkish population clusters with South European populations, as expected, but also shows signatures of relatively recent contribution from ancestral East Asian populations. In addition, we document a significant enrichment of non-synonymous private alleles, consistent with recent observations in European populations. A number of variants associated with skin color and total cholesterol levels show frequency differentiation between the Turkish populations and European populations. Furthermore, we have analyzed the 17q21.31 inversion polymorphism region (MAPT locus) and found an increased allele frequency of 31.25% for the H1/H2 inversion polymorphism, compared with about 25% in European populations. Conclusion: This study provides the first map of common genetic variation from 16 western Asian individuals and thus helps fill an important geographical gap in analyzing natural human variation and human migration. Our data will help develop population-specific experimental designs for studies investigating disease associations and demographic history in Turkey. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-963) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Nesrin Özören
- Department of Molecular Biology and Genetics, Boğaziçi University, İstanbul 34342, Turkey.
|
49
|
Hach F, Sarrafi I, Hormozdiari F, Alkan C, Eichler EE, Sahinalp SC. mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications. Nucleic Acids Res 2014; 42:W494-500. [PMID: 24810850 PMCID: PMC4086126 DOI: 10.1093/nar/gku370] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Indexed: 12/31/2022] Open
Abstract
High throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce challenges for processing and downstream analysis. While tools that report the ‘best’ mapping location of each read provide a fast way to process HTS data, they are not suitable for many types of downstream analysis such as structural variation detection, where it is important to report multiple mapping loci for each read. For this purpose we introduce mrsFAST-Ultra, a fast, cache-oblivious, SNP-aware aligner that can handle the multi-mapping of HTS reads very efficiently. mrsFAST-Ultra improves mrsFAST, our first cache-oblivious read aligner capable of handling multi-mapping reads, through new and compact index structures that reduce not only the overall memory usage but also the number of CPU operations per alignment. In fact the size of the index generated by mrsFAST-Ultra is 10 times smaller than that of mrsFAST. As importantly, mrsFAST-Ultra introduces new features such as being able to (i) obtain the best mapping loci for each read, and (ii) return all reads that have at most n mapping loci (within an error threshold), together with these loci, for any user-specified n. Furthermore, mrsFAST-Ultra is SNP-aware, i.e. it can map reads to the reference genome while discounting the mismatches that occur at common SNP locations provided by dbSNP; this significantly increases the number of reads that can be mapped to the reference genome. Note that all of the above features are implemented within the index structure rather than as simple post-processing steps, and are thus performed highly efficiently. Finally, mrsFAST-Ultra utilizes multiple available cores and processors and can be tuned for various memory settings. Our results show that mrsFAST-Ultra is roughly five times faster than its predecessor mrsFAST. 
In comparison to newly enhanced popular tools such as Bowtie2, it is more sensitive (it can report 10 times or more mappings per read) and much faster (six times or more) in the multi-mapping mode. Furthermore, mrsFAST-Ultra has an index size of 2GB for the entire human reference genome, which is roughly half of that of Bowtie2. mrsFAST-Ultra is open source and it can be accessed at http://mrsfast.sourceforge.net.
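The SNP-aware idea in the abstract above (discounting mismatches that fall on known common SNP positions) can be illustrated with a small sketch. This is not mrsFAST-Ultra's index-based implementation. It is a naive position-by-position comparison, and the function name `snp_aware_mismatches`, the 0-based coordinates, and the set of SNP positions are all assumptions made for illustration.

```python
def snp_aware_mismatches(read, reference, offset, snp_positions):
    """Count mismatches between `read` and `reference` starting at `offset`,
    ignoring any mismatch located at a known SNP position.

    `snp_positions` is a set of 0-based reference coordinates (hypothetical
    input; a real mapper would take these from a SNP database).
    """
    window = reference[offset:offset + len(read)]
    if len(window) < len(read):
        raise ValueError("read extends past the end of the reference")
    return sum(
        1
        for i, (read_base, ref_base) in enumerate(zip(read, window))
        if read_base != ref_base and (offset + i) not in snp_positions
    )


if __name__ == "__main__":
    reference = "ACGTACGT"
    read = "AGGT"  # differs from the reference at coordinate 1
    print(snp_aware_mismatches(read, reference, 0, set()))
    print(snp_aware_mismatches(read, reference, 0, {1}))
```

Under an error threshold of zero, the read in this toy case would be rejected by a SNP-unaware comparison but accepted once coordinate 1 is listed as a known SNP, which is how discounting such positions can increase the number of mappable reads.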
Affiliation(s)
- Faraz Hach
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada, V5A 1S6
| | - Iman Sarrafi
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada, V5A 1S6
| | - Farhad Hormozdiari
- Computer Science Department, University of California, Los Angeles, CA, USA, 90095
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, 06800 Ankara, Turkey
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA, 98195
| | - S Cenk Sahinalp
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada, V5A 1S6 School of Informatics and Computing, Indiana University, Bloomington, IN, USA, 47405
|
50
|
Dao P, Numanagić I, Lin YY, Hach F, Karakoc E, Donmez N, Collins C, Eichler EE, Sahinalp SC. ORMAN: optimal resolution of ambiguous RNA-Seq multimappings in the presence of novel isoforms. Bioinformatics 2013; 30:644-51. [PMID: 24130305 DOI: 10.1093/bioinformatics/btt591] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
Abstract
MOTIVATION RNA-Seq technology is promising to uncover many novel alternative splicing events, gene fusions and other variations in RNA transcripts. For an accurate detection and quantification of transcripts, it is important to resolve the mapping ambiguity for those RNA-Seq reads that can be mapped to multiple loci: >17% of the reads from mouse RNA-Seq data and 50% of the reads from some plant RNA-Seq data have multiple mapping loci. In this study, we show how to resolve the mapping ambiguity in the presence of novel transcriptomic events such as exon skipping and novel indels towards accurate downstream analysis. We introduce ORMAN (Optimal Resolution of Multimapping Ambiguity of RNA-Seq Reads), which aims to compute the minimum number of potential transcript products for each gene and to assign each multimapping read to one of these transcripts based on the estimated distribution of the region covering the read. ORMAN achieves this objective through a combinatorial optimization formulation, which is solved through well-known approximation algorithms, integer linear programs and heuristics. RESULTS On a simulated RNA-Seq dataset including a random subset of transcripts from the UCSC database, the performance of several state-of-the-art methods for identifying and quantifying novel transcripts, such as Cufflinks, IsoLasso and CLIIQ, is significantly improved through the use of ORMAN. Furthermore, in an experiment using real RNA-Seq reads, we show that ORMAN is able to resolve multimapping to produce coverage values that are similar to the original distribution, even in genes with highly non-uniform coverage. AVAILABILITY ORMAN is available at http://orman.sf.net
Affiliation(s)
- Phuong Dao
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada, Department of Genome Sciences, University of Washington, Seattle, WA, USA, Vancouver Prostate Centre & Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada and Division of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN, USA
|