1
Chakraborty S, Sharma G, Karmakar S, Banerjee S. Multi-OMICS approaches in cancer biology: New era in cancer therapy. Biochim Biophys Acta Mol Basis Dis 2024; 1870:167120. [PMID: 38484941 DOI: 10.1016/j.bbadis.2024.167120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 04/01/2024]
Abstract
Innovative multi-omics frameworks integrate diverse datasets from the same patients to enhance our understanding of the molecular and clinical aspects of cancers. Advanced omics and multi-view clustering algorithms present unprecedented opportunities for classifying cancers into subtypes, refining survival predictions and treatment outcomes, and unravelling key pathophysiological processes across various molecular layers. However, with the increasing availability of cost-effective high-throughput technologies (HTT) that generate vast amounts of data, analyzing single layers often falls short of establishing causal relations. Integrating multi-omics data spanning genomes, epigenomes, transcriptomes, proteomes, metabolomes, and microbiomes offers unique prospects to comprehend the underlying biology of complex diseases like cancer. This discussion explores algorithmic frameworks designed to uncover cancer subtypes, disease mechanisms, and methods for identifying pivotal genomic alterations. It also underscores the significance of multi-omics in tumor classifications, diagnostics, and prognostications. Despite its unparalleled advantages, the integration of multi-omics data has been slow to find its way into everyday clinics. A major hurdle is the uneven maturity of different omics approaches and the widening gap between the generation of large datasets and the capacity to process this data. Initiatives promoting the standardization of sample processing and analytical pipelines, as well as multidisciplinary training for experts in data analysis and interpretation, are crucial for translating theoretical findings into practical applications.
Affiliation(s)
- Sohini Chakraborty
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Gaurav Sharma
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Sricheta Karmakar
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Satarupa Banerjee
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
2
Kurt S, Chen M, Toosi H, Chen X, Engblom C, Mold J, Hartman J, Lagergren J. CopyVAE: a variational autoencoder-based approach for copy number variation inference using single-cell transcriptomics. Bioinformatics 2024; 40:btae284. [PMID: 38676578 PMCID: PMC11087824 DOI: 10.1093/bioinformatics/btae284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 03/06/2024] [Accepted: 04/25/2024] [Indexed: 04/29/2024] [Open Access]
Abstract
MOTIVATION: Copy number variations (CNVs) are common genetic alterations in tumour cells. The delineation of CNVs holds promise for enhancing our comprehension of cancer progression. Moreover, accurate inference of CNVs from single-cell sequencing data is essential for unravelling intratumoral heterogeneity. However, existing inference methods face limitations in resolution and sensitivity.
RESULTS: To address these challenges, we present CopyVAE, a deep learning framework based on a variational autoencoder architecture. Through experiments, we demonstrated that CopyVAE can accurately and reliably detect CNVs from data obtained using single-cell RNA sequencing. CopyVAE surpasses existing methods in terms of sensitivity and specificity. We also discussed CopyVAE's potential to advance our understanding of genetic alterations and their impact on disease advancement.
AVAILABILITY AND IMPLEMENTATION: CopyVAE is implemented and freely available under MIT license at https://github.com/kurtsemih/copyVAE.
Affiliation(s)
- Semih Kurt
  - School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
- Mandi Chen
  - School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
- Hosein Toosi
  - School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
- Xinsong Chen
  - Department of Oncology and Pathology, Karolinska Institutet, Solna, 171 77, Sweden
- Camilla Engblom
  - Department of Cell and Molecular Biology, Karolinska Institutet, Solna, 171 77, Sweden
- Jeff Mold
  - Department of Cell and Molecular Biology, Karolinska Institutet, Solna, 171 77, Sweden
- Johan Hartman
  - Department of Oncology and Pathology, Karolinska Institutet, Solna, 171 77, Sweden
  - Department of Clinical Pathology and Cytology, Karolinska University Laboratory, Solna, 171 76, Sweden
- Jens Lagergren
  - School of EECS and SciLifeLab, KTH Royal Institute of Technology, Stockholm, 100 44, Sweden
3
Liu RH, Xiao XY, Yao L, Jia YY, Guo J, Wang XC, Kong Y, Kong QX. Eukaryotic translation initiation factor EIF4G1 p.Ser637Cys mutation in a family with Parkinson's disease with antecedent essential tremor. Exp Ther Med 2024; 27:206. [PMID: 38590578 PMCID: PMC11000071 DOI: 10.3892/etm.2024.12494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 02/09/2024] [Indexed: 04/10/2024] [Open Access]
Abstract
Essential tremor (ET) and Parkinson's disease (PD) are common chronic movement disorders that can cause a substantial degree of disability. However, the etiology underlying these two conditions remains poorly understood. In the present study, whole-exome sequencing of peripheral blood samples from the proband, Sanger sequencing of the other 18 family members, and pedigree analysis of four generations (29 individuals) of a nonconsanguineous Chinese family with both ET and PD were performed. Specifically, family members who had available medical information, including historical documentation and physical examination records, were included. A novel c.1909A>T (p.Ser637Cys) missense mutation was identified in the eukaryotic translation initiation factor 4γ1 (EIF4G1) gene as the candidate likely responsible for both conditions. In total, 9 family members exhibited tremor of the bilateral upper limbs and/or head starting from ages of ≥40 years, 3 of whom began showing evidence of PD in their 70s. Eukaryotic initiation factor 4 (eIF4)G1, a component of the translation initiation complex eIF4F, serves as a scaffold protein that interacts with many initiation factors and then binds to the 40S ribosomal subunit. The EIF4G1 p.Ser637Cys variant might inhibit the recruitment of the mRNA to the ribosome. In conclusion, the results from the present study suggested that EIF4G1 may be responsible for the hereditary PD with 'antecedent ET' reported in the family assessed.
Affiliation(s)
- Rui-Han Liu
  - Department of Pediatrics, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
  - College of TCM, Shandong University of Traditional Chinese Medicine, Jinan, Shandong 250399, P.R. China
- Xiang-Yu Xiao
  - Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, P.R. China
- Lei Yao
  - Clinical Medical College, Jining Medical University, Jining, Shandong 272000, P.R. China
- Yuan-Yuan Jia
  - Department of Neurology, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
- Jia Guo
  - Clinical Medical College, Jining Medical University, Jining, Shandong 272000, P.R. China
- Xing-Chen Wang
  - Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, P.R. China
- Yu Kong
  - Department of Medical Imaging, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
  - College of Materials Science and Engineering, Qingdao University, Qingdao, Shandong 266071, P.R. China
- Qing-Xia Kong
  - Department of Neurology, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
4
Yu X, Luo X, Cai G, Xiao F. OSCAA: A two-dimensional Gaussian mixture model for copy number variation association analysis. Genet Epidemiol 2024. [PMID: 38533840 DOI: 10.1002/gepi.22558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 01/30/2024] [Accepted: 03/05/2024] [Indexed: 03/28/2024]
Abstract
Copy number variants (CNVs) are prevalent in the human genome and are found to have a profound effect on genomic organization and human diseases. Discovering disease-associated CNVs is critical for understanding the pathogenesis of diseases and aiding their diagnosis and treatment. However, traditional methods for assessing the association between CNVs and disease risks adopt a two-stage strategy, conducting quantitative CNV measurements first and then testing for association, which may lead to biased association estimation and low statistical power, serving as a major barrier in routine genome-wide assessment of such variation. In this article, we developed One-Stage CNV-disease Association Analysis (OSCAA), a flexible algorithm to discover disease-associated CNVs for both quantitative and qualitative traits. OSCAA employs a two-dimensional Gaussian mixture model that is built upon the principal components (PCs) from copy number intensities, accounting for technical biases in CNV detection while simultaneously testing for their effect on outcome traits. In OSCAA, CNVs are identified and their associations with disease risk are evaluated simultaneously in a single step, taking into account the uncertainty of CNV identification in the statistical model. Our simulations demonstrated that OSCAA outperformed the existing one-stage method and traditional two-stage methods by yielding a more accurate estimate of the CNV-disease association, especially for short CNVs or CNVs with weak signals. In conclusion, OSCAA is a powerful and flexible approach for CNV association testing with high sensitivity and specificity, which can be easily applied to different traits and clinical risk predictions.
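The mixture-model idea in this abstract can be illustrated with a minimal sketch. It is not OSCAA itself: the component weights, means, and standard deviations below are hypothetical fixed values (OSCAA estimates its model jointly with the trait), the covariance is assumed diagonal, and the names `COMPONENTS`, `posterior`, and `expected_copy_number` are made up for illustration. The sketch only shows how a two-dimensional Gaussian mixture turns a pair of principal-component scores into posterior probabilities over copy-number states, so that calling uncertainty can be carried forward instead of a hard call.

```python
import math

# Hypothetical 2D Gaussian components (diagonal covariance) over two
# principal-component scores. All numbers are illustrative assumptions.
COMPONENTS = {
    1: {"weight": 0.1, "mean": (-1.0, -0.5), "sd": (0.3, 0.3)},  # deletion
    2: {"weight": 0.8, "mean": (0.0, 0.0), "sd": (0.3, 0.3)},    # two copies
    3: {"weight": 0.1, "mean": (0.8, 0.4), "sd": (0.3, 0.3)},    # duplication
}


def gaussian_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def posterior(point):
    """Posterior probability of each copy-number state for one 2D point."""
    joint = {}
    for copy_number, comp in COMPONENTS.items():
        density = 1.0
        # Diagonal covariance: the 2D density is a product of 1D densities.
        for x, mean, sd in zip(point, comp["mean"], comp["sd"]):
            density *= gaussian_pdf(x, mean, sd)
        joint[copy_number] = comp["weight"] * density
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("point is too far from every component")
    return {cn: p / total for cn, p in joint.items()}


def expected_copy_number(point):
    """Posterior-weighted copy number, a soft alternative to a hard call."""
    return sum(cn * p for cn, p in posterior(point).items())


ambiguous = (0.4, 0.2)  # lies between the two-copy and duplication clusters
probs = posterior(ambiguous)
dosage = expected_copy_number(ambiguous)
```

For the ambiguous point, most of the posterior mass stays on the two-copy state, so the expected dosage sits slightly above 2 rather than jumping to 3 as a hard call might.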
Affiliation(s)
- Xuanxuan Yu
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, South Carolina, USA
- Xizhi Luo
  - Data and Statistical Sciences, AbbVie Inc., North Chicago, Illinois, USA
- Guoshuai Cai
  - Department of Surgery, College of Medicine, University of Florida, Gainesville, Florida, USA
- Feifei Xiao
  - Department of Biostatistics, College of Public Health and Health Promotion & College of Medicine, University of Florida, Gainesville, Florida, USA
5
Du H, Dardas Z, Jolly A, Grochowski CM, Jhangiani SN, Li H, Muzny D, Fatih JM, Yesil G, Elçioglu NH, Gezdirici A, Marafi D, Pehlivan D, Calame DG, Carvalho CMB, Posey JE, Gambin T, Coban-Akdemir Z, Lupski JR. HMZDupFinder: a robust computational approach for detecting intragenic homozygous duplications from exome sequencing data. Nucleic Acids Res 2024; 52:e18. [PMID: 38153174 PMCID: PMC10899794 DOI: 10.1093/nar/gkad1223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 11/18/2023] [Accepted: 12/13/2023] [Indexed: 12/29/2023] [Open Access]
Abstract
Homozygous duplications contribute to genetic disease by altering gene dosage or disrupting gene regulation and can be more deleterious to organismal biology than heterozygous duplications. Intragenic exonic duplications can result in loss-of-function (LoF) or gain-of-function (GoF) alleles that, when homozygosed, i.e. brought to homozygous state at a locus by identity by descent or state, could potentially result in autosomal recessive (AR) rare disease traits. However, the detection and functional interpretation of homozygous duplications from exome sequencing (ES) data remain a challenge. We developed a framework algorithm, HMZDupFinder, that is designed to detect exonic homozygous duplications from ES data. The HMZDupFinder algorithm can efficiently process large datasets and accurately identifies small intragenic duplications, including those associated with rare disease traits. HMZDupFinder called 965 homozygous duplications with three or fewer exons from 8,707 exomes with a recall rate of 70.9% and a precision of 16.1%. We experimentally confirmed 8/10 rare homozygous duplications. Pathogenicity assessment of these copy number variant alleles allowed clinical genomics contextualization for three homozygous duplication alleles, including two affecting the known OMIM disease genes EDAR (MIM# 224900) and TNNT1 (MIM# 605355), and one in a novel candidate disease gene, PAAF1.
Affiliation(s)
- Haowei Du
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Zain Dardas
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Angad Jolly
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Shalini N Jhangiani
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- He Li
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Donna Muzny
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Jawid M Fatih
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Gozde Yesil
  - Department of Medical Genetics, Istanbul Medical Faculty, Istanbul 34093, Turkey
- Nursel H Elçioglu
  - Department of Pediatric Genetics, Marmara University Medical Faculty, Istanbul and Eastern Mediterranean University Faculty of Medicine, Mersin 10, Turkey
- Alper Gezdirici
  - Department of Medical Genetics, University of Health Sciences, Basaksehir Cam and Sakura City Hospital, 34480 Istanbul, Turkey
- Dana Marafi
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Department of Pediatrics, Faculty of Medicine, Kuwait University, Kuwait
- Davut Pehlivan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Section of Pediatric Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
  - Texas Children's Hospital, Houston, TX 77030, USA
- Daniel G Calame
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Section of Pediatric Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
  - Texas Children's Hospital, Houston, TX 77030, USA
- Claudia M B Carvalho
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Pacific Northwest Research Institute, Seattle, WA 98122, USA
- Jennifer E Posey
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Tomasz Gambin
  - Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
  - Department of Medical Genetics, Institute of Mother and Child, Warsaw, Poland
- Zeynep Coban-Akdemir
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- James R Lupski
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
  - Texas Children's Hospital, Houston, TX 77030, USA
  - Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
6
Zhao H, Baudis M. labelSeg: segment annotation for tumor copy number alteration profiles. Brief Bioinform 2024; 25:bbad541. [PMID: 38300514 PMCID: PMC10833088 DOI: 10.1093/bib/bbad541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/09/2023] [Accepted: 12/28/2023] [Indexed: 02/02/2024] [Open Access]
Abstract
Somatic copy number alterations (SCNAs) are a predominant type of oncogenomic alterations that affect a large proportion of the genome in the majority of cancer samples. Current technologies allow high-throughput measurement of such copy number aberrations, generating results consisting of frequently large sets of SCNA segments. However, the automated annotation and integration of such data are particularly challenging because the measured signals reflect biased, relative copy number ratios. In this study, we introduce labelSeg, an algorithm designed for rapid and accurate annotation of CNA segments, with the aim of enhancing the interpretation of tumor SCNA profiles. Leveraging density-based clustering and exploiting the length-amplitude relationships of SCNA, our algorithm proficiently identifies distinct relative copy number states from individual segment profiles. Its compatibility with most CNA measurement platforms makes it suitable for large-scale integrative data analysis. We confirmed its performance on both simulated and sample-derived data from The Cancer Genome Atlas reference dataset, and we demonstrated its utility in integrating heterogeneous segment profiles from different data sources and measurement platforms. Our comparative and integrative analysis revealed common SCNA patterns in cancer and protein-coding genes with a strong correlation between SCNA and messenger RNA expression, promoting the investigation into the role of SCNA in cancer development.
Affiliation(s)
- Hangjia Zhao
  - Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
  - Computational Oncogenomics Group, Swiss Institute of Bioinformatics, Winterthurerstrasse 190, 8057, Zurich, Switzerland
- Michael Baudis
  - Department of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
  - Computational Oncogenomics Group, Swiss Institute of Bioinformatics, Winterthurerstrasse 190, 8057, Zurich, Switzerland
7
Yu X, Luo X, Cai G, Xiao F. OSCAA: A Two-Dimensional Gaussian Mixture Model for Copy Number Variation Association Analysis. bioRxiv 2023:2023.09.25.559392. [PMID: 37808739 PMCID: PMC10557568 DOI: 10.1101/2023.09.25.559392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Copy number variants (CNVs) are prevalent in the human genome and have a profound effect on genomic organization and human diseases. Discovering disease-associated CNVs is critical for understanding the pathogenesis of diseases and aiding their diagnosis and treatment. However, traditional methods for assessing the association between CNVs and disease risks adopt a two-stage strategy, conducting quantitative CNV measurements first and then testing for association, which may lead to biased association estimation and low statistical power, serving as a major barrier in routine genome-wide assessment of such variation. In this article, we developed OSCAA, a flexible algorithm to discover disease-associated CNVs for both quantitative and qualitative traits. OSCAA employs a two-dimensional Gaussian mixture model that is built upon the principal components from copy number intensities, accounting for technical biases in CNV detection while simultaneously testing for their effect on outcome traits. In OSCAA, CNVs are identified and their associations with disease risk are evaluated simultaneously in a single step, taking into account the uncertainty of CNV identification in the statistical model. Our simulations demonstrated that OSCAA outperformed the existing one-stage method and traditional two-stage methods by yielding a more accurate estimate of the CNV-disease association, especially for short CNVs or CNVs with weak signals. In conclusion, OSCAA is a powerful and flexible approach for CNV association testing with high sensitivity and specificity, which can be easily applied to different traits and clinical risk predictions.
8
Babadi M, Fu JM, Lee SK, Smirnov AN, Gauthier LD, Walker M, Benjamin DI, Zhao X, Karczewski KJ, Wong I, Collins RL, Sanchis-Juan A, Brand H, Banks E, Talkowski ME. GATK-gCNV enables the discovery of rare copy number variants from exome sequencing data. Nat Genet 2023; 55:1589-1597. [PMID: 37604963 PMCID: PMC10904014 DOI: 10.1038/s41588-023-01449-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 08/30/2022] [Accepted: 06/16/2023] [Indexed: 08/23/2023]
Abstract
Copy number variants (CNVs) are major contributors to genetic diversity and disease. While standardized methods, such as the genome analysis toolkit (GATK), exist for detecting short variants, technical challenges have confounded uniform large-scale CNV analyses from whole-exome sequencing (WES) data. Given the profound impact of rare and de novo coding CNVs on genome organization and human disease, we developed GATK-gCNV, a flexible algorithm to discover rare CNVs from sequencing read-depth information, complete with open-source distribution via GATK. We benchmarked GATK-gCNV in 7,962 exomes from individuals in quartet families with matched genome sequencing and microarray data, finding up to 95% recall of rare coding CNVs at a resolution of more than two exons. We used GATK-gCNV to generate a reference catalog of rare coding CNVs in WES data from 197,306 individuals in the UK Biobank, and observed strong correlations between per-gene CNV rates and measures of mutational constraint, as well as rare CNV associations with multiple traits. In summary, GATK-gCNV is a tunable approach for sensitive and specific CNV discovery in WES data, with broad applications.
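As a rough illustration of what "read-depth information" means for CNV discovery, the sketch below applies a generic depth-ratio heuristic. It is not the GATK-gCNV model, which the abstract describes only as a flexible algorithm; the function names, the median normalization, the rounding step, and all depth values are assumptions made for this example. Each sample's per-exon depths are scaled by that sample's median depth, and the scaled value is compared with the cohort median at the same exon.

```python
from statistics import median


def normalize(depths):
    """Scale per-exon read depths by the sample's own median depth."""
    sample_median = median(depths)
    if sample_median <= 0:
        raise ValueError("median depth must be positive")
    return [depth / sample_median for depth in depths]


def copy_number_estimates(sample_depths, cohort_depths, ploidy=2):
    """Estimate per-exon copy number as ploidy * (sample / cohort median).

    Returns None for exons where the cohort baseline is zero.
    """
    sample = normalize(sample_depths)
    cohort = [normalize(depths) for depths in cohort_depths]
    estimates = []
    for index, value in enumerate(sample):
        baseline = median(row[index] for row in cohort)
        if baseline == 0:
            estimates.append(None)
        else:
            estimates.append(round(ploidy * value / baseline))
    return estimates


# Hypothetical depths for five exons; the last exon of the test sample
# has half the expected coverage, as a heterozygous deletion would.
cohort = [
    [100, 200, 150, 120, 80],
    [200, 400, 300, 240, 160],
    [50, 100, 75, 60, 40],
]
sample = [100, 200, 150, 120, 40]
calls = copy_number_estimates(sample, cohort)
```

In this toy input, `calls` reports two copies for the first four exons and one copy for the last. Real exome data are far noisier, which is why probabilistic models are used in practice rather than a simple rounded ratio.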
Affiliation(s)
- Mehrtash Babadi
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jack M Fu
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Samuel K Lee
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Andrey N Smirnov
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Laura D Gauthier
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mark Walker
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- David I Benjamin
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Xuefang Zhao
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Konrad J Karczewski
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Isaac Wong
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Ryan L Collins
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Alba Sanchis-Juan
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Harrison Brand
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Eric Banks
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Michael E Talkowski
  - Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
9
Ye B, Tang X, Liao S, Ding K. A comparison of algorithms for identifying copy number variants in family-based whole-exome sequencing data and its implications in inheritance pattern analysis. Gene 2023; 861:147237. [PMID: 36731620 DOI: 10.1016/j.gene.2023.147237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 12/27/2022] [Accepted: 01/26/2023] [Indexed: 01/31/2023]
Abstract
There remain challenges in accurately identifying constitutional or germline copy number variants (gCNVs) based on whole-exome sequencing data that have implications for genetic diagnosis for 'rare undiagnosed disease' in the clinical setting. Although multiple algorithms have been proposed, a systematic comparison of these algorithms for calling gCNVs and analyzing inheritance patterns has yet to be fully conducted. Therefore, we empirically compared seven exome-based algorithms, including XHMM, CLAMMS, CODEX2, ExomeDepth, DECoN, CN.MOPS, and GATK gCNV, for calling gCNVs in 151 individuals from 44 pedigrees, together with the gold standard of genotyping-derived gCNVs in the same cohort for the performance assessment. These algorithms demonstrated varied power in identifying gCNVs, although the distribution of gCNV size was similar. The number of shared gCNVs across these algorithms was limited (e.g., only four gCNVs shared among seven algorithms); however, several algorithms showed varying degrees of consistency (e.g., 1,843 gCNVs shared between DECoN and ExomeDepth). CLAMMS and CODEX2 outperformed the remaining algorithms according to a relatively higher F-score (i.e., 0.145 and 0.152, respectively). In addition, these algorithms exhibited different Mendelian inconsistencies of gCNVs, and significant challenges remained in inheritance pattern analysis. In conclusion, selecting good algorithms may have important implications in gCNV-based inheritance pattern analysis for family-based studies.
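For readers unfamiliar with the F-score used to rank the algorithms here, the following minimal sketch shows the standard definition as the harmonic mean of precision and recall. The abstract does not state which F-score variant or matching criterion the authors used, so the balanced F1 form (`beta=1.0`) is an assumption, and the counts in the example are hypothetical rather than taken from the study.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from call counts; returns 0.0 when it is undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted  # fraction of calls that are real
    recall = true_positives / actual        # fraction of real events called
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Hypothetical caller: 300 calls, 30 of which match the 100 gold-standard
# gCNVs, giving precision 0.10 and recall 0.30.
example = f_score(true_positives=30, false_positives=270, false_negatives=70)
```

With these made-up counts the score is about 0.15, the same order of magnitude as the values reported above, which shows how a caller with many false positives can have a low F-score even with moderate recall.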
Affiliation(s)
- Bo Ye
  - Department of Bioinformatics, School of Basic Medicine, Chongqing Medical University, Chongqing 400016, PR China
- Xia Tang
  - State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200438, PR China
- Shixiu Liao
  - Medical Genetic Institute of Henan Province, Henan Provincial People's Hospital, Henan Key Laboratory of Genetic Diseases and Functional Genomics, Henan Provincial People's Hospital of Henan University, People's Hospital of Zhengzhou University, Zhengzhou, Henan Province 450003, PR China
- Keyue Ding
  - Medical Genetic Institute of Henan Province, Henan Provincial People's Hospital, Henan Key Laboratory of Genetic Diseases and Functional Genomics, Henan Provincial People's Hospital of Henan University, People's Hospital of Zhengzhou University, Zhengzhou, Henan Province 450003, PR China
  - Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN 55905, United States
10
Lee YH, Tsai CY, Lu YS, Lin PH, Chiang YT, Yang TH, Hsu JSJ, Hsu CJ, Chen PL, Liu TC, Wu CC. Revisiting Genetic Epidemiology with a Refined Targeted Gene Panel for Hereditary Hearing Impairment in the Taiwanese Population. Genes (Basel) 2023; 14:880. [PMID: 37107638 PMCID: PMC10137978 DOI: 10.3390/genes14040880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 04/04/2023] [Accepted: 04/06/2023] [Indexed: 04/29/2023] [Open Access]
Abstract
Hearing impairment is one of the most common sensory disorders in children, and targeted next-generation sequencing (NGS)-based genetic examinations can assist in its prognostication and management. In 2020, we developed a simplified 30-gene NGS panel from the original 214-gene NGS version based on Taiwanese genetic epidemiology data to increase the accessibility of NGS-based examinations. In this study, we evaluated the diagnostic performance of the 30-gene NGS panel and compared it with that of the original 214-gene NGS panel in patient subgroups with different clinical features. Data on the clinical features, genetic etiologies, audiological profiles, and outcomes were collected from 350 patients who underwent NGS-based genetic examinations for idiopathic bilateral sensorineural hearing impairment between 2020 and 2022. The overall diagnostic yield was 52%, with slight differences in genetic etiology between patients with different degrees of hearing impairment and ages of onset. No significant difference was found in the diagnostic yields between the two panels, regardless of clinical features, except for a lower detection rate of the 30-gene panel in the late-onset group. For patients with negative genetic results, in whom no causative variant is detectable with current NGS-based methods, some of the negative results may be due to genes not covered by the panel or not yet identified. In such cases, the hearing prognosis varies and may decline over time, necessitating appropriate follow-up and consultation. In conclusion, genetic etiologies can serve as references for refining targeted NGS panels with satisfactory diagnostic performance.
Affiliation(s)
- Yen-Hui Lee
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
- Cheng-Yu Tsai
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
  - Graduate Institute of Medical Genomics and Proteomics, National Taiwan University College of Medicine, Taipei 10055, Taiwan
- Yue-Sheng Lu
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
- Pei-Hsuan Lin
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
- Yu-Ting Chiang
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
  - Graduate Institute of Medical Genomics and Proteomics, National Taiwan University College of Medicine, Taipei 10055, Taiwan
- Ting-Hua Yang
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
- Jacob Shu-Jui Hsu
  - Graduate Institute of Medical Genomics and Proteomics, National Taiwan University College of Medicine, Taipei 10055, Taiwan
- Chuan-Jen Hsu
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
  - Department of Otolaryngology, Buddhist Tzuchi General Hospital, Taichung Branch, Taichung 42743, Taiwan
- Pei-Lung Chen
  - Graduate Institute of Medical Genomics and Proteomics, National Taiwan University College of Medicine, Taipei 10055, Taiwan
  - Department of Medical Genetics, National Taiwan University Hospital, Taipei 10041, Taiwan
  - Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei 10002, Taiwan
- Tien-Chen Liu
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
  - Department of Otolaryngology, National Taiwan University College of Medicine, Taipei 10002, Taiwan
- Chen-Chi Wu
  - Department of Otolaryngology, National Taiwan University Hospital, Taipei 10002, Taiwan
  - Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei 10002, Taiwan
  - Department of Otolaryngology, National Taiwan University College of Medicine, Taipei 10002, Taiwan
  - Department of Medical Research, National Taiwan University Hospital Hsin-Chu Branch, Hsinchu 30261, Taiwan
11
Wang XC, Liu RH, Wang T, Wang Y, Jiang Y, Chen DD, Wang XY, Hou TS, Kong QX. A novel missense mutation in SPAST causes hereditary spastic paraplegia in male members of a family: A case report. Mol Med Rep 2023; 27:79. [PMID: 36825575 PMCID: PMC10018243 DOI: 10.3892/mmr.2023.12966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2022] [Accepted: 01/23/2023] [Indexed: 02/23/2023]
Abstract
Hereditary spastic paraplegia (HSP) comprises a group of hereditary and neurodegenerative diseases that are characterized by axonal degeneration or demyelination of bilateral corticospinal tracts in the spinal cord; affected patients exhibit progressive spasticity and weakness in the lower limbs. The most common form of HSP is spastic paraplegia type 4 (SPG4), which is caused by mutations in the spastin (SPAST) gene. The present study reports the clinical characteristics of affected individuals and sequencing analysis of a mutation that caused SPG4 in a family. All affected family members exhibited spasticity and weakness of the lower limbs and, notably, only male members of the family were affected. Whole‑exome sequencing revealed that all affected individuals had a novel c.1785C>A (p.Ser595Arg) missense mutation in SPAST. Bioinformatics analysis revealed changes in both secondary and tertiary structures of the mutated protein. The novel missense mutation in SPAST supported the diagnosis of SPG4 in this family and expands the spectrum of pathogenic mutations that cause SPG4. Analysis of SPAST sequences revealed that most pathogenic mutations occurred in the AAA domain of the protein, which may have a close relationship with SPG4 pathogenesis.
Affiliation(s)
- Xing-Chen Wang
  - Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, P.R. China
- Rui-Han Liu
  - Department of Pediatrics, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
- Ting Wang
  - Department of Nursing, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
- Yanling Wang
  - Department of Nursing, Affiliated Hospital of Jining Medical University, Jining, Shandong 272000, P.R. China
- Yan Jiang
  - Clinical Medical College, Jining Medical University, Jining, Shandong 272000, P.R. China
- Dan-Dan Chen
  - Clinical Medical College, Jining Medical University, Jining, Shandong 272000, P.R. China
- Xin-Yu Wang
  - Clinical Medical College, Jining Medical University, Jining, Shandong 272000, P.R. China
- Tong-Shu Hou
  - Second Clinical Medical College, Binzhou Medical University, Binzhou, Shandong 256600, P.R. China
- Qing-Xia Kong
  - Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, P.R. China
12
Testard Q, Vanhoye X, Yauy K, Naud ME, Vieville G, Rousseau F, Dauriat B, Marquet V, Bourthoumieu S, Geneviève D, Gatinois V, Wells C, Willems M, Coubes C, Pinson L, Dard R, Tessier A, Hervé B, Vialard F, Harzallah I, Touraine R, Cogné B, Deb W, Besnard T, Pichon O, Laudier B, Mesnard L, Doreille A, Busa T, Missirian C, Satre V, Coutton C, Celse T, Harbuz R, Raymond L, Taly JF, Thevenon J. Exome sequencing as a first-tier test for copy number variant detection: retrospective evaluation and prospective screening in 2418 cases. J Med Genet 2022; 59:1234-1240. [PMID: 36137615 DOI: 10.1136/jmg-2022-108439] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 01/10/2022] [Accepted: 08/10/2022] [Indexed: 01/12/2023]
Abstract
BACKGROUND: Despite the availability of whole exome (WES) and genome sequencing (WGS), chromosomal microarray (CMA) remains the first-line diagnostic test in the diagnostic workup of most rare disorders, looking for copy number variations (CNVs), with a diagnostic yield of 10%-20%. The question of the equivalence of CMA and WES in CNV calling is an organisational and economic question, especially when ordering a WGS after a negative CMA and/or WES.
METHODS: This study measures the equivalence between CMA and the GATK4 exome sequencing depth of coverage method in detecting coding CNVs on a retrospective cohort of 615 unrelated individuals. A prospective detection of WES-CNV on a cohort of 2418 unrelated individuals, including the 615 individuals from the validation cohort, was performed.
RESULTS: On the retrospective validation cohort, every CNV detectable by the method (ie, a CNV with at least one exon not in a dark zone) was accurately called (64/64 events). In the prospective cohort, 32 diagnoses were made among the 2418 individuals, with CNVs ranging from 704 bp to aneuploidy. An incidental finding was reported. The overall increase in diagnostic yield was 1.7%, varying from 1.2% in individuals with multiple congenital anomalies to 1.9% in individuals with chronic kidney failure.
CONCLUSION: Combining single-nucleotide variant (SNV) and CNV detection increases the suitability of exome sequencing as a first-tier diagnostic test for suspected rare Mendelian disorders. Before considering the prescription of a WGS after a negative WES, a careful reanalysis with updated CNV calling and SNV annotation should be considered.
Affiliation(s)
- Quentin Testard
  - Service de Génétique, Eurofins Biomnis, Lyon, France
  - Service de Génétique et Procréation, CHU Grenoble Alpes, Grenoble, France
  - CNRS UMR 5309, INSERM, U1209, Université Grenoble Alpes, Institute for Advanced Bioscience, Grenoble, France
- Kevin Yauy
  - CNRS UMR 5309, INSERM, U1209, Université Grenoble Alpes, Institute for Advanced Bioscience, Grenoble, France
  - SeqOne Genomics, Montpellier, France
- Gaelle Vieville
  - Service de Génétique et Procréation, CHU Grenoble Alpes, Grenoble, France
- Benjamin Dauriat
  - Service de Cytogénétique, Génétique Médicale et Biologie de la Reproduction, CHU Limoges, Limoges, France
- Valentine Marquet
  - Service de Cytogénétique, Génétique Médicale et Biologie de la Reproduction, CHU Limoges, Limoges, France
- Sylvie Bourthoumieu
  - Service de Cytogénétique, Génétique Médicale et Biologie de la Reproduction, CHU Limoges, Limoges, France
- David Geneviève
  - Département de Génétique Médicale, Maladies Rares et Médecine Personnalisée, CHU Montpellier, Montpellier, France
  - Unité INSERM U1183, University Montpellier 1, Montpellier, France
- Vincent Gatinois
  - Département de Génétique Médicale, Maladies Rares et Médecine Personnalisée, CHU Montpellier, Montpellier, France
- Constance Wells
  - Département de Génétique Médicale, Maladies Rares et Médecine Personnalisée, CHU Montpellier, Montpellier, France
- Marjolaine Willems
  - Département de Génétique Médicale, Maladies Rares et Médecine Personnalisée, CHU Montpellier, Montpellier, France
- Christine Coubes
  - Département de Génétique Médicale, Maladies Rares et Médecine Personnalisée, CHU Montpellier, Montpellier, France
- Lucile Pinson
  - Département de Génétique Médicale, Maladies Rares et Médecine Personnalisée, CHU Montpellier, Montpellier, France
- Rodolphe Dard
  - Département de Génétique, CHI Poissy-Saint-Germain-en-Laye, Saint-Germain-en-Laye, France
- Aude Tessier
  - Département de Génétique, CHI Poissy-Saint-Germain-en-Laye, Saint-Germain-en-Laye, France
- Bérénice Hervé
  - Département de Génétique, CHI Poissy-Saint-Germain-en-Laye, Saint-Germain-en-Laye, France
- François Vialard
  - Département de Génétique, CHI Poissy-Saint-Germain-en-Laye, Saint-Germain-en-Laye, France
- Ines Harzallah
  - Service de génétique clinique, chromosomique et moléculaire, CHU Saint-Étienne, Saint-Etienne, France
- Renaud Touraine
  - Service de génétique clinique, chromosomique et moléculaire, CHU Saint-Étienne, Saint-Etienne, France
- Benjamin Cogné
  - Service de Génétique Médicale, CHU Nantes, Nantes, France
- Wallid Deb
  - Service de Génétique Médicale, CHU Nantes, Nantes, France
- Thomas Besnard
  - Service de Génétique Médicale, CHU Nantes, Nantes, France
- Olivier Pichon
  - Service de Génétique Médicale, CHU Nantes, Nantes, France
- Béatrice Laudier
  - Laboratoire d'Immunologie et Neurogénétique Expérimentales et Moléculaires INEM UMR7355, CHR d'Orléans, Orléans, France
- Laurent Mesnard
  - Sorbonne Université, Urgences Néphrologiques et Transplantation Rénale, APHP, Hôpital Tenon, Paris, France
- Alice Doreille
  - Sorbonne Université, Urgences Néphrologiques et Transplantation Rénale, APHP, Hôpital Tenon, Paris, France
- Tiffany Busa
  - Département de génétique médicale, AP HM, Hôpital de la Timone Enfant, Marseille, France
- Chantal Missirian
  - Département de génétique médicale, AP HM, Hôpital de la Timone Enfant, Marseille, France
- Véronique Satre
  - Service de Génétique et Procréation, CHU Grenoble Alpes, Grenoble, France
  - CNRS UMR 5309, INSERM, U1209, Université Grenoble Alpes, Institute for Advanced Bioscience, Grenoble, France
- Charles Coutton
  - Service de Génétique et Procréation, CHU Grenoble Alpes, Grenoble, France
  - CNRS UMR 5309, INSERM, U1209, Université Grenoble Alpes, Institute for Advanced Bioscience, Grenoble, France
- Tristan Celse
  - Service de Génétique et Procréation, CHU Grenoble Alpes, Grenoble, France
- Radu Harbuz
  - Service de Génétique et Procréation, CHU Grenoble Alpes, Grenoble, France
- Laure Raymond
  - Service de Génétique, Eurofins Biomnis, Lyon, France
- Julien Thevenon
  - Service de Génétique et Procréation, CHU Grenoble Alpes, Grenoble, France
  - CNRS UMR 5309, INSERM, U1209, Université Grenoble Alpes, Institute for Advanced Bioscience, Grenoble, France
13
Tan R, Shen Y. Accurate in silico confirmation of rare copy number variant calls from exome sequencing data using transfer learning. Nucleic Acids Res 2022; 50:e123. [PMID: 36124672 PMCID: PMC9756945 DOI: 10.1093/nar/gkac788] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/10/2022] [Revised: 08/08/2022] [Accepted: 09/01/2022] [Indexed: 12/24/2022]
Abstract
Exome sequencing is widely used in genetic studies of human diseases and clinical genetic diagnosis. Accurate detection of copy number variants (CNVs) is important to fully utilize exome sequencing data. However, exome data are noisy. None of the existing methods alone can achieve both high precision and recall rate. A common practice is to perform heuristic filtration followed by manual inspection of read depth of putative CNVs. This approach does not scale in large studies. To address this issue, we developed a transfer learning method, CNV-espresso, for in silico confirming rare CNVs from exome sequencing data. CNV-espresso encodes candidate CNVs from exome data as images and uses pretrained convolutional neural network models to classify copy number states. We trained CNV-espresso using an offspring-parents trio exome sequencing dataset, with inherited CNVs as positives and CNVs with Mendelian errors as negatives. We evaluated the performance using additional samples that have both exome and whole-genome sequencing (WGS) data. Assuming the CNVs detected from WGS data as a proxy of ground truth, CNV-espresso significantly improves precision while keeping recall almost intact, especially for CNVs that span a small number of exons. CNV-espresso can effectively replace manual inspection of CNVs in large-scale exome sequencing studies.
Affiliation(s)
- Renjie Tan
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Yufeng Shen
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
  - JP Sulzberger Columbia Genome Center, Columbia University, New York, NY 10032, USA
14
Wang X, Xu Y, Liu R, Lai X, Liu Y, Wang S, Zhang X, Wang J. PEcnv: accurate and efficient detection of copy number variations of various lengths. Brief Bioinform 2022; 23:6686740. [PMID: 36056740 PMCID: PMC9487654 DOI: 10.1093/bib/bbac375] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/23/2022] [Revised: 06/19/2022] [Accepted: 08/08/2022] [Indexed: 11/14/2022]
Abstract
Copy number variation (CNV) is a class of key biomarkers in many complex traits and diseases. Detecting CNV from sequencing data is a substantial bioinformatics problem and a standard requirement in clinical practice. Although many proposed CNV detection approaches exist, the core statistical model at their foundation is weakened by two critical computational issues: (i) identifying the optimal setting on the sliding window and (ii) correcting for bias and noise. We designed a statistical process model to overcome these limitations by calculating regional read depths via an exponentially weighted moving average strategy. A one-run detection of CNVs of various lengths is then achieved by a dynamic sliding window, whose size is self-adapted according to the weighted averages. We also designed a novel bias/noise reduction model, accompanied by the moving average, which can handle complicated patterns and extend training data. This model, called PEcnv, accurately detects CNVs ranging from kb-scale to chromosome-arm level. The model performance was validated with simulation samples and real samples. Comparative analysis showed that PEcnv outperforms current popular approaches. Notably, PEcnv provided considerable advantages in detecting small CNVs (1 kb–1 Mb) in panel sequencing data. Thus, PEcnv fills the gap left by existing methods focusing on large CNVs. PEcnv may have broad applications in clinical testing where panel sequencing is the dominant strategy. Availability and implementation: Source code is freely available at https://github.com/Sherwin-xjtu/PEcnv
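To make the exponentially weighted moving average (EWMA) idea in this abstract concrete, the following is a minimal sketch, not PEcnv's actual model. The function names, the smoothing factor `alpha`, the `baseline` depth, and the `tol` threshold are all assumptions introduced for illustration; the dynamic sliding window and the bias/noise reduction model described by the authors are omitted.

```python
def ewma(depths, alpha=0.3):
    """Exponentially weighted moving average of per-bin read depths.

    alpha is a hypothetical smoothing factor: larger values follow the
    raw depths more closely, smaller values smooth more aggressively.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    current = None
    for depth in depths:
        # The first bin seeds the average; later bins blend with history.
        current = depth if current is None else alpha * depth + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def flag_bins(depths, baseline, alpha=0.3, tol=0.25):
    """Label each bin 'loss', 'gain' or 'neutral' from the smoothed depth ratio.

    baseline is the assumed copy-neutral depth; tol is an illustrative
    tolerance around a ratio of 1.0, not a published threshold.
    """
    labels = []
    for value in ewma(depths, alpha):
        ratio = value / baseline
        if ratio < 1 - tol:
            labels.append("loss")
        elif ratio > 1 + tol:
            labels.append("gain")
        else:
            labels.append("neutral")
    return labels


if __name__ == "__main__":
    print(flag_bins([100, 100, 50, 50, 50, 100], baseline=100, alpha=0.5))
```

Because the average carries history forward, a drop in depth is flagged one or more bins after it starts; this lag is one reason a fixed window or smoothing factor is hard to tune, which is the limitation the abstract says the self-adapting window addresses.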
Affiliation(s)
- Xuwen Wang
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Ying Xu
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Ruoyu Liu
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Xin Lai
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Yuqian Liu
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Shenjie Wang
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Xuanping Zhang
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
- Jiayin Wang
  - Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  - Institute of Data Science and Information Quality, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an 710049, China
15
Kuśmirek W. Different Strategies for Counting the Depth of Coverage in Copy Number Variation Calling Tools. Bioinform Biol Insights 2022; 16:11779322221115534. [PMID: 35935530 PMCID: PMC9354125 DOI: 10.1177/11779322221115534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Accepted: 07/02/2022] [Indexed: 12/04/2022]
Abstract
There are many copy number variation (CNV) detection tools based on the depth of coverage. A characteristic feature of all such tools is the first stage of data processing: counting the depth of coverage in the investigated sequencing regions. However, each tool implements this stage in a slightly different way. Herein, we used data from the 1000 Genomes Project to show the impact of different depth-of-coverage counting strategies on the results of the CNV detection process. In the study, we used 7 CNV calling tools: CODEX, CANOES, exomeCopy, ExomeDepth, CLAMMS, CNVkit, and CNVind; from each of these applications, we separated the process of counting the depth of coverage into an independent module. Then, we counted the depth of coverage with each of these modules, and finally, the obtained depth of coverage tables were used as the input data set to the other CNV calling tools. The experiments showed that the best methods of counting the depth of coverage are the algorithms implemented in the CLAMMS and CNVkit applications. Both yield much better sets of detected CNVs than the depth-of-coverage counting implemented in the other tools. Moreover, some CNV detection tools are reasonably robust to changes in the input depth of coverage table. In this study, we showed that the exomeCopy application gives an approximately similar set of resulting rare CNVs, regardless of the method used to count the depth of coverage.
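As an illustration of why the counting stage can differ between tools, the sketch below contrasts two generic ways of summarizing coverage over one target region. These two functions are hypothetical examples written for this note; they are not taken from any of the seven tools named in the abstract, and real implementations also differ in read filtering and other details not modelled here. Intervals are assumed to be half-open `(start, end)` pairs on the same chromosome.

```python
def read_count(target, reads):
    """Count reads that overlap the target by at least one base."""
    t_start, t_end = target
    return sum(1 for r_start, r_end in reads if r_start < t_end and r_end > t_start)


def mean_base_depth(target, reads):
    """Average number of reads covering each base of the target."""
    t_start, t_end = target
    covered_bases = sum(
        # Overlap length of each read with the target, zero if disjoint.
        max(0, min(r_end, t_end) - max(r_start, t_start))
        for r_start, r_end in reads
    )
    return covered_bases / (t_end - t_start)


if __name__ == "__main__":
    target = (100, 200)
    reads = [(50, 150), (120, 180), (190, 260), (300, 400)]
    print(read_count(target, reads))       # reads touching the target
    print(mean_base_depth(target, reads))  # per-base average depth
```

The two summaries weight partially overlapping reads differently: a read that touches the target by ten bases counts as a full read in the first function but contributes only ten bases in the second, so the resulting depth tables are not interchangeable.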
Affiliation(s)
- Wiktor Kuśmirek
  - Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
16
O'Fallon B, Durtschi J, Kellogg A, Lewis T, Close D, Best H. Algorithmic improvements for discovery of germline copy number variants in next-generation sequencing data. BMC Bioinformatics 2022; 23:285. [PMID: 35854218 PMCID: PMC9297596 DOI: 10.1186/s12859-022-04820-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Accepted: 06/06/2022] [Indexed: 12/03/2022]
Abstract
Background: Copy number variants (CNVs) play a significant role in human heredity and disease. However, sensitive and specific characterization of germline CNVs from NGS data has remained challenging, particularly for hybridization-capture data in which read counts are the primary source of copy number information.
Results: We describe two algorithmic adaptations that improve CNV detection accuracy in a Hidden Markov Model (HMM) context. First, we present a method for computing target- and copy number-specific emission distributions. Second, we demonstrate that the Pointwise Maximum a posteriori (PMAP) HMM decoding procedure yields improved sensitivity for small CNV calls compared to the more common Viterbi HMM decoder. We develop a prototype implementation, called Cobalt, and compare it to other CNV detection tools using sets of simulated and previously detected CNVs with sizes spanning a single exon to a full chromosome.
Conclusions: In both the simulation and previously detected CNV studies, Cobalt shows similar sensitivity but significantly fewer false positive detections compared to other callers. Overall sensitivity is 80–90% for deletion CNVs spanning 1–4 targets and 90–100% for larger deletion events, while sensitivity is somewhat lower for small duplication CNVs.
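The difference between the two decoders mentioned in this abstract can be sketched on a toy model. Viterbi returns the single most probable state path, whereas pointwise maximum a posteriori (PMAP) decoding picks, at each target independently, the state with the highest posterior probability from the forward-backward algorithm. The two-state model, the discretized "hi"/"lo" depth symbols, and every probability below are made-up assumptions for illustration; they are not Cobalt's emission distributions or parameters.

```python
# Hypothetical toy HMM: copy-neutral vs. deletion, observing coarse depth symbols.
STATES = ("normal", "del")
START = {"normal": 0.9, "del": 0.1}
TRANS = {"normal": {"normal": 0.95, "del": 0.05},
         "del": {"normal": 0.10, "del": 0.90}}
EMIT = {"normal": {"hi": 0.9, "lo": 0.1},
        "del": {"hi": 0.1, "lo": 0.9}}


def forward_backward(obs, states, start, trans, emit):
    """Posterior P(state at t | all observations); unscaled, so short inputs only."""
    n = len(obs)
    fwd = [{s: 0.0 for s in states} for _ in range(n)]
    bwd = [{s: 1.0 for s in states} for _ in range(n)]
    for s in states:
        fwd[0][s] = start[s] * emit[s][obs[0]]
    for t in range(1, n):
        for s in states:
            fwd[t][s] = emit[s][obs[t]] * sum(fwd[t - 1][p] * trans[p][s] for p in states)
    for t in range(n - 2, -1, -1):
        for s in states:
            bwd[t][s] = sum(trans[s][q] * emit[q][obs[t + 1]] * bwd[t + 1][q] for q in states)
    posteriors = []
    for t in range(n):
        total = sum(fwd[t][s] * bwd[t][s] for s in states)
        posteriors.append({s: fwd[t][s] * bwd[t][s] / total for s in states})
    return posteriors


def pmap_decode(obs, states, start, trans, emit):
    """Pick the most probable state at each position independently."""
    return [max(states, key=post.get) for post in forward_backward(obs, states, start, trans, emit)]


def viterbi_decode(obs, states, start, trans, emit):
    """Return the single most probable state path."""
    best = {s: start[s] * emit[s][obs[0]] for s in states}
    backpointers = []
    for symbol in obs[1:]:
        prev, best, pointer = best, {}, {}
        for s in states:
            source = max(states, key=lambda q: prev[q] * trans[q][s])
            best[s] = prev[source] * trans[source][s] * emit[s][symbol]
            pointer[s] = source
        backpointers.append(pointer)
    path = [max(states, key=best.get)]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    observed = ["hi", "hi", "lo", "hi", "hi"]
    print(viterbi_decode(observed, STATES, START, TRANS, EMIT))
    print(pmap_decode(observed, STATES, START, TRANS, EMIT))
```

Because PMAP scores each position on its own, it is not forced to pay two unlikely transitions to enter and leave a short event within one path, which is consistent with the abstract's report of better sensitivity for small CNVs. The trade-off is that a PMAP sequence is not guaranteed to be a valid path under the transition model.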
Affiliation(s)
- Brendan O'Fallon
  - ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA
- Jacob Durtschi
  - ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA
- Ana Kellogg
  - ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA
- Tracey Lewis
  - ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA
- Devin Close
  - ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA
- Hunter Best
  - ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA
17
Wenzel AT, Champa D, Venkatesh H, Sun S, Tsai CY, Mesirov JP, Bui JD, Howell SB, Harismendy O. Single-cell characterization of step-wise acquisition of carboplatin resistance in ovarian cancer. NPJ Syst Biol Appl 2022; 8:20. [PMID: 35715421 PMCID: PMC9206019 DOI: 10.1038/s41540-022-00230-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/06/2021] [Accepted: 05/25/2022] [Indexed: 02/01/2023]
Abstract
The molecular underpinnings of acquired resistance to carboplatin are poorly understood and often inconsistent between in vitro modeling studies. After sequential treatment cycles, multiple isogenic clones reached similar levels of resistance but showed significant transcriptional heterogeneity. Gene-expression based virtual synchronization of 26,772 single cells from 2 treatment steps and 4 resistant clones was used to evaluate the activity of Hallmark gene sets in proliferative (P) and quiescent (Q) phases. Two behaviors were associated with resistance: (1) broad repression in the P phase observed in all clones in early resistant steps and (2) prevalent induction in Q phase observed in the late treatment step of one clone. Furthermore, the induction of IFNα response in P phase or Wnt-signaling in Q phase was observed in distinct resistant clones. These observations suggest a model of resistance hysteresis, where functional alterations of the P and Q phase states affect the dynamics of the successive transitions between drug exposure and recovery, and call for precise monitoring of single-cell states to develop more effective schedules for, or combinations of, chemotherapy treatments.
Affiliation(s)
- Alexander T Wenzel
  - UC San Diego Bioinformatics and Systems Biology Graduate Program, San Diego, CA, USA
  - Division of Medical Genetics, Department of Medicine, University of California San Diego School of Medicine, San Diego, CA, USA
- Devora Champa
  - Moores UCSD Cancer Center, University of California San Diego School of Medicine, San Diego, CA, USA
  - Arnold & Porter LLP, 601 Massachusetts Ave NW, Washington, DC, 20001, USA
- Hrishi Venkatesh
  - UC San Diego Contiguous Bachelors-Masters program, San Diego, CA, USA
  - Microbiology, Immunology and Cancer Biology Graduate Program, University of Minnesota, Minneapolis, MN, USA
- Si Sun
  - Moores UCSD Cancer Center, University of California San Diego School of Medicine, San Diego, CA, USA
  - Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Cheng-Yu Tsai
  - Moores UCSD Cancer Center, University of California San Diego School of Medicine, San Diego, CA, USA
  - Department of Pediatrics, Stanford University School of Medicine, 300 Pasteur Drive, S-175, Stanford, CA, 94305, USA
- Jill P Mesirov
  - Division of Medical Genetics, Department of Medicine, University of California San Diego School of Medicine, San Diego, CA, USA
  - Moores UCSD Cancer Center, University of California San Diego School of Medicine, San Diego, CA, USA
- Jack D Bui
  - Department of Pathology, University of California San Diego School of Medicine, San Diego, CA, USA
- Stephen B Howell
  - Moores UCSD Cancer Center, University of California San Diego School of Medicine, San Diego, CA, USA
  - Division of Hematology/Oncology, Department of Medicine, University of California San Diego School of Medicine, San Diego, CA, USA
- Olivier Harismendy
  - Moores UCSD Cancer Center, University of California San Diego School of Medicine, San Diego, CA, USA
  - Division of Biomedical Informatics, Department of Medicine, University of California School of Medicine, San Diego, CA, USA
18
Wang X, Junqing L, Huang T. CNVABNN: An AdaBoost algorithm and neural networks-based detection of copy number variations from NGS data. Comput Biol Chem 2022; 99:107720. [DOI: 10.1016/j.compbiolchem.2022.107720] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/28/2022] [Revised: 06/22/2022] [Accepted: 06/23/2022] [Indexed: 11/03/2022]
19
CNVind: an open source cloud-based pipeline for rare CNVs detection in whole exome sequencing data based on the depth of coverage. BMC Bioinformatics 2022; 23:85. [PMID: 35247967 PMCID: PMC8897915 DOI: 10.1186/s12859-022-04617-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/17/2021] [Accepted: 02/22/2022] [Indexed: 11/16/2022]
Abstract
Background: A typical copy number variation (CNV) detection process based on the depth of coverage in whole exome sequencing (WES) data consists of several steps: (I) calculating the depth of coverage in sequencing regions, (II) quality control, (III) normalizing the depth of coverage, (IV) calling CNVs. Previous tools performed one normalization process for each chromosome: all the coverage depths in the sequencing regions from a given chromosome were normalized in a single run.
Methods: Herein, we present the new CNVind tool for calling CNVs, where the normalization process is conducted separately for each of the sequencing regions. The total number of normalizations is equal to the number of sequencing regions in the investigated dataset. For example, when analyzing a dataset composed of n sequencing regions, CNVind performs n independent depth of coverage normalizations. Before each normalization, the application selects the k sequencing regions most correlated with the region being normalized, using Pearson's correlation of the depth of coverage as the distance metric. Then, the resulting subgroup of k+1 sequencing regions is normalized, and the results of all n independent normalizations are combined; finally, the segmentation and CNV calling process is performed on the resultant dataset.
Results and conclusions: We used WES data from the 1000 Genomes project to evaluate the impact of independent normalization on CNV calling performance and compared the results with state-of-the-art tools: CODEX and exomeCopy. The results showed that independent normalization significantly improves the specificity of rare CNV detection. For example, for the investigated dataset, we reduced the number of false-positive (FP) calls from over 15,000 to around 5000 while maintaining a constant number of true-positive (TP) calls equal to about 150 CNVs. However, independent normalization of each sequencing region is a computationally expensive process; therefore, our pipeline is customized and can be easily run in a cloud computing environment, on a computer cluster, or on a single CPU server. To our knowledge, the presented application is the first attempt to implement an innovative approach to independent normalization of the depth of WES data coverage.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04617-x.
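The region-selection step described in the Methods can be sketched as follows: for each of the n regions, rank the other regions by Pearson's correlation of their per-sample depths and keep the top k, giving one subgroup of k+1 regions per normalization. This is a minimal sketch under stated assumptions, not CNVind's code: the function names and the `depth` layout (region name mapped to a list of per-sample depths) are hypothetical, and the normalization applied to each subgroup is not shown because the abstract does not specify it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length depth vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant vector: correlation undefined, treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def top_k_correlated(depth, target, k):
    """Names of the k regions whose depth profile correlates best with target."""
    scored = [(pearson(depth[target], values), name)
              for name, values in depth.items() if name != target]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def subgroups(depth, k):
    """One subgroup of k+1 regions per region, i.e. n subgroups for n regions."""
    return {region: [region] + top_k_correlated(depth, region, k) for region in depth}


if __name__ == "__main__":
    # Hypothetical depths for four regions across four samples.
    depth = {
        "A": [10, 20, 30, 40],
        "B": [11, 19, 31, 41],
        "C": [40, 30, 20, 10],
        "D": [5, 9, 16, 20],
    }
    for region, group in subgroups(depth, k=2).items():
        print(region, group)
```

Computing all pairwise correlations costs on the order of n squared vector comparisons before any normalization starts, which matches the abstract's remark that per-region normalization is computationally expensive and motivates running it on a cluster or in the cloud.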
20
Gordeeva V, Sharova E, Arapidi G. Progress in Methods for Copy Number Variation Profiling. Int J Mol Sci 2022; 23:ijms23042143. [PMID: 35216262 PMCID: PMC8879278 DOI: 10.3390/ijms23042143] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/17/2022] [Revised: 02/09/2022] [Accepted: 02/11/2022] [Indexed: 02/04/2023]
Abstract
Copy number variations (CNVs) are the predominant class of structural genomic variations involved in the processes of evolutionary adaptation, genomic disorders, and disease progression. Compared with single-nucleotide variants, there have been challenges associated with the detection of CNVs owing to their diverse sizes. However, the field has seen significant progress in the past 20–30 years. This has been made possible due to the rapid development of molecular diagnostic methods which ensure a more detailed view of the genome structure, further complemented by recent advances in computational methods. Here, we review the major approaches that have been used to routinely detect CNVs, ranging from cytogenetics to the latest sequencing technologies, and then cover their specific features.
Affiliation(s)
- Veronika Gordeeva
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
- Moscow Institute of Physics and Technology, National Research University, Moscow Oblast, 141701 Moscow, Russia
| | - Elena Sharova
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
| | - Georgij Arapidi
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
- Moscow Institute of Physics and Technology, National Research University, Moscow Oblast, 141701 Moscow, Russia
- Shemyakin–Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
| |
|
21
|
Wang R, Jiang Y. Copy Number Variation Detection by Single-Cell DNA Sequencing with SCOPE. Methods Mol Biol 2022; 2493:279-288. [PMID: 35751822 DOI: 10.1007/978-1-0716-2293-3_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Whole-genome single-cell DNA sequencing (scDNA-seq) enables the characterization of copy number profiles at the cellular level. This circumvents the averaging effects associated with bulk-tissue sequencing and has increased resolution yet decreased ambiguity in deconvolving cancer subclones and elucidating cancer evolutionary history. ScDNA-seq data is, however, sparse, noisy, and highly variable even within a homogeneous cell population, due to the biases and artifacts that are introduced during the library preparation and sequencing procedure. Here, we describe SCOPE, a normalization and copy number estimation method for scDNA-seq data. We give an overview of the methodology and illustrate SCOPE with step-by-step demonstrations.
Affiliation(s)
- Rujin Wang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Yuchao Jiang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA.
- Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC, USA.
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA.
| |
|
22
|
Almeida ARM, Neto JL, Cachucho A, Euzébio M, Meng X, Kim R, Fernandes MB, Raposo B, Oliveira ML, Ribeiro D, Fragoso R, Zenatti PP, Soares T, de Matos MR, Corrêa JR, Duque M, Roberts KG, Gu Z, Qu C, Pereira C, Pyne S, Pyne NJ, Barreto VM, Bernard-Pierrot I, Clappier E, Mullighan CG, Grosso AR, Yunes JA, Barata JT. Interleukin-7 receptor α mutational activation can initiate precursor B-cell acute lymphoblastic leukemia. Nat Commun 2021; 12:7268. [PMID: 34907175 PMCID: PMC8671594 DOI: 10.1038/s41467-021-27197-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 04/22/2020] [Accepted: 11/03/2021] [Indexed: 12/13/2022]
Abstract
Interleukin-7 receptor α (encoded by IL7R) is essential for lymphoid development. Whether acute lymphoblastic leukemia (ALL)-related IL7R gain-of-function mutations can trigger leukemogenesis remains unclear. Here, we demonstrate that lymphoid-restricted mutant IL7R, expressed at physiological levels in conditional knock-in mice, establishes a pre-leukemic stage in which B-cell precursors display self-renewal ability, initiating leukemia resembling PAX5 P80R or Ph-like human B-ALL. Full transformation associates with transcriptional upregulation of oncogenes such as Myc or Bcl2, downregulation of tumor suppressors such as Ikzf1 or Arid2, and major IL-7R signaling upregulation (involving JAK/STAT5 and PI3K/mTOR), required for leukemia cell viability. Accordingly, maximal signaling drives full penetrance and early leukemia onset in homozygous IL7R mutant animals. Notably, we identify 2 transcriptional subgroups in mouse and human Ph-like ALL, and show that dactolisib and sphingosine-kinase inhibitors are potential treatment avenues for IL-7R-related cases. Our model, a resource to explore the pathophysiology and therapeutic vulnerabilities of B-ALL, demonstrates that IL7R can initiate this malignancy.
Affiliation(s)
- Afonso R. M. Almeida
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - João L. Neto
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Ana Cachucho
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Mayara Euzébio
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal; Centro Infantil Boldrini, Campinas, SP, Brazil
| | - Xiangyu Meng
- Institut Curie, PSL Research University, CNRS, UMR144, Equipe Labellisée Ligue contre le Cancer, Paris, France
| | - Rathana Kim
- Hematology Laboratory, Saint-Louis Hospital, AP-HP, Paris, France, and Saint-Louis Research Institute, Université de Paris, INSERM U944/Centre National de la Recherche Scientifique (CNRS) Unité Mixte de Recherche (UMR) 7212, Paris, France
| | - Marta B. Fernandes
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Beatriz Raposo
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Mariana L. Oliveira
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Daniel Ribeiro
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Rita Fragoso
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | | | - Tiago Soares
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Mafalda R. de Matos
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | | | - Mafalda Duque
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| | - Kathryn G. Roberts
- Department of Pathology and Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN, USA
| | - Zhaohui Gu
- Department of Pathology and Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN, USA
| | - Chunxu Qu
- Department of Pathology and Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN, USA
| | - Clara Pereira
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin 2, Ireland
| | - Susan Pyne
- Strathclyde Institute of Pharmacy and Biomedical Sciences (SIPBS), University of Strathclyde, Glasgow, Scotland, UK
| | - Nigel J. Pyne
- Strathclyde Institute of Pharmacy and Biomedical Sciences (SIPBS), University of Strathclyde, Glasgow, Scotland, UK
| | - Vasco M. Barreto
- DNA Breaks Laboratory, CEDOC - Chronic Diseases Research Center, NOVA Medical School - Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Lisbon, Portugal
| | - Isabelle Bernard-Pierrot
- Institut Curie, PSL Research University, CNRS, UMR144, Equipe Labellisée Ligue contre le Cancer, Paris, France
| | - Emmanuelle Clappier
- Hematology Laboratory, Saint-Louis Hospital, AP-HP, Paris, France, and Saint-Louis Research Institute, Université de Paris, INSERM U944/Centre National de la Recherche Scientifique (CNRS) Unité Mixte de Recherche (UMR) 7212, Paris, France
| | - Charles G. Mullighan
- Department of Pathology and Hematological Malignancies Program, St. Jude Children’s Research Hospital, Memphis, TN, USA
| | - Ana R. Grosso
- UCIBIO, Departamento de Ciências da Vida, Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa, Caparica, Portugal
| | | | - João T. Barata
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
| |
|
23
|
Filer DL, Kuo F, Brandt AT, Tilley CR, Mieczkowski PA, Berg JS, Robasky K, Li Y, Bizon C, Tilson JL, Powell BC, Bost DM, Jeffries CD, Wilhelmsen KC. Pre-capture multiplexing provides additional power to detect copy number variation in exome sequencing. BMC Bioinformatics 2021; 22:374. [PMID: 34284719 PMCID: PMC8293537 DOI: 10.1186/s12859-021-04246-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/18/2021] [Accepted: 05/18/2021] [Indexed: 11/10/2022]
Abstract
Background: As exome sequencing (ES) integrates into clinical practice, we should make every effort to utilize all information generated. Copy-number variation can lead to Mendelian disorders, but small copy-number variants (CNVs) often get overlooked or obscured by under-powered data collection. Many groups have developed methodology for detecting CNVs from ES, but existing methods often perform poorly for small CNVs and rely on large numbers of samples not always available to clinical laboratories. Furthermore, methods often rely on Bayesian approaches requiring user-defined priors in the setting of insufficient prior knowledge. This report first demonstrates the benefit of multiplexed exome capture (pooling samples prior to capture), then presents a novel detection algorithm, mcCNV (“multiplexed capture CNV”), built around multiplexed capture. Results: We demonstrate: (1) multiplexed capture reduces inter-sample variance; (2) our mcCNV method, a novel depth-based algorithm for detecting CNVs from multiplexed capture ES data, improves the detection of small CNVs. We contrast our novel approach, agnostic to prior information, with the commonly used ExomeDepth. In a simulation study, mcCNV demonstrated a favorable false discovery rate (FDR). When compared to calls made from matched genome sequencing, we find the mcCNV algorithm performs comparably to ExomeDepth. Conclusion: Implementing multiplexed capture increases power to detect single-exon CNVs. The novel mcCNV algorithm may provide a more favorable FDR than ExomeDepth. The greatest benefits of our approach derive from (1) not requiring a database of reference samples and (2) not requiring prior information about the prevalence or size of variants. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04246-w.
Affiliation(s)
- Dayne L Filer
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA; Renaissance Computing Institute, Chapel Hill, USA
| | - Fengshen Kuo
- Renaissance Computing Institute, Chapel Hill, USA
| | - Alicia T Brandt
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA
| | | | | | - Jonathan S Berg
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA
| | - Kimberly Robasky
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA; Renaissance Computing Institute, Chapel Hill, USA; UNC School of Information and Library Science, Chapel Hill, USA
| | - Yun Li
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA; Department of Biostatistics, UNC Gillings School of Global Public Health, Chapel Hill, USA
| | - Chris Bizon
- Renaissance Computing Institute, Chapel Hill, USA
| | | | - Bradford C Powell
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA; Renaissance Computing Institute, Chapel Hill, USA
| | - Darius M Bost
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA; Renaissance Computing Institute, Chapel Hill, USA
| | | | - Kirk C Wilhelmsen
- Department of Genetics, UNC School of Medicine, Chapel Hill, USA; Renaissance Computing Institute, Chapel Hill, USA; Department of Neurology, UNC School of Medicine, Chapel Hill, USA
| |
|
24
|
Gordeeva V, Sharova E, Babalyan K, Sultanov R, Govorun VM, Arapidi G. Benchmarking germline CNV calling tools from exome sequencing data. Sci Rep 2021; 11:14416. [PMID: 34257369 PMCID: PMC8277855 DOI: 10.1038/s41598-021-93878-2] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 12/15/2020] [Accepted: 06/29/2021] [Indexed: 02/06/2023]
Abstract
Whole-exome sequencing is an attractive alternative to microarray analysis because of its low cost and potential ability to detect copy number variations (CNVs) of various sizes (from 1-2 exons to several Mb). Previous comparisons of the most popular CNV calling tools showed a high proportion of false-positive calls. Moreover, due to the lack of a gold-standard CNV set, the results are limited and not comparable across studies. Here, we aimed to perform a comprehensive analysis of the currently available tools capable of germline CNV calling, using a single CNV standard and reference sample set. Compiling variants from previous studies with a Bayesian estimation approach, we constructed an internal standard for the NA12878 sample (pilot National Institute of Standards and Technology Reference Material) including 110,050 CNV or non-CNV exons. The standard was used to evaluate the performance of 16 germline CNV calling tools on the NA12878 sample and 10 correlated exomes as a reference set with respect to length distribution, concordance, and efficiency. Each algorithm had a certain range of detected lengths and showed low concordance with other tools. Most tools are focused on detection of a limited number of CNVs one to seven exons long with a false-positive rate below 50%. EXCAVATOR2, exomeCopy, and FishingCNV focused on detection of a wide range of variations but showed low precision. Upon unified comparison, the tools were not equivalent. The analysis performed allows choosing algorithms or ensembles of algorithms most suitable for a specific goal, e.g. population studies or medical genetics.
Affiliation(s)
- Veronika Gordeeva
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia.
- Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Russia.
| | - Elena Sharova
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
| | - Konstantin Babalyan
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
| | - Rinat Sultanov
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
- Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Federal Research and Clinical Center of Physical-Chemical Medicine of the Federal Medical and Biological Agency, Moscow, Russia
| | - Vadim M Govorun
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
- Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Russia
| | - Georgij Arapidi
- Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
- Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Federal Research and Clinical Center of Physical-Chemical Medicine of the Federal Medical and Biological Agency, Moscow, Russia
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
| |
|
25
|
Wang R, Lin DY, Jiang Y. SCOPE: A Normalization and Copy-Number Estimation Method for Single-Cell DNA Sequencing. Cell Syst 2020; 10:445-452.e6. [PMID: 32437686 DOI: 10.1016/j.cels.2020.03.005] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 10/17/2019] [Revised: 02/11/2020] [Accepted: 03/26/2020] [Indexed: 01/01/2023]
Abstract
Whole-genome single-cell DNA sequencing (scDNA-seq) enables characterization of copy-number profiles at the cellular level. We propose SCOPE, a normalization and copy-number estimation method for the noisy scDNA-seq data. SCOPE's main features include the following: (1) a Poisson latent factor model for normalization, which borrows information across cells and regions to estimate bias, using in silico identified negative control cells; (2) an expectation-maximization algorithm embedded in the normalization step, which accounts for the aberrant copy-number changes and allows direct ploidy estimation without the need for post hoc adjustment; and (3) a cross-sample segmentation procedure to identify breakpoints that are shared across cells with the same genetic background. We evaluate SCOPE on a diverse set of scDNA-seq data in cancer genomics and show that SCOPE offers accurate copy-number estimates and successfully reconstructs subclonal structure. A record of this paper's transparent peer review process is included in the Supplemental Information.
Affiliation(s)
- Rujin Wang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Dan-Yu Lin
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC 27599, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Yuchao Jiang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC 27599, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC 27599, USA.
| |
|
26
|
Zhao HY, Li Q, Tian Y, Chen YH, Alvi HAK, Yuan XG. CIRCNV: Detection of CNVs Based on a Circular Profile of Read Depth from Sequencing Data. Biology (Basel) 2021; 10:584. [PMID: 34202028 PMCID: PMC8301091 DOI: 10.3390/biology10070584] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2021] [Revised: 06/10/2021] [Accepted: 06/21/2021] [Indexed: 12/29/2022]
Abstract
Simple Summary: In this study, we propose a copy number variation (CNV) detection method called CIRCNV, which is based on a circular profile of the read depth from sequencing data. The proposed method is an extended version of our previously developed method CNV-LOF. CIRCNV differs from CNV-LOF in two new features: (1) it transfers the read depth profile from a line shape to a circular shape via a polar coordinate transformation to generate a meaningful two-dimensional dataset for CNV analysis and promote fairness between the ends and middle part of the genome, and (2) it performs two rounds of CNV declaration via estimating tumor purity and recovering the true circular RD profile. We test and evaluate the performance of CIRCNV by conducting simulation studies and real sequencing tumor sample applications. The experimental results show that CIRCNV outperforms peer methods with respect to sensitivity, precision, and the F1-score. The experiments indicate that the proposed method is a reliable and effective tool in the field of variation analysis of tumor genomes.
Abstract: Copy number variation (CNV) is a common type of structural variation in the human genome. Accurate detection of CNVs from tumor genomes can provide crucial information for the study of tumor genesis and cancer precision diagnosis. However, the contamination of normal genomes in tumor genomes and the crude profiles of the read depth make such a task difficult. In this paper, we propose an alternative approach, called CIRCNV, for the detection of CNVs from sequencing data. CIRCNV is an extension of our previously developed method CNV-LOF, which uses local outlier factors to predict CNVs. Comparatively, CIRCNV can be performed on individual tumor samples and has the following two new features: (1) it transfers the read depth profile from a line shape to a circular shape via a polar coordinate transformation, in order to improve the efficiency of the read depth (RD) profile for the detection of CNVs; and (2) it performs a second round of CNV declaration based on the true circular RD profile, which is recovered by estimating tumor purity. We test and validate the performance of CIRCNV based on simulation and real sequencing data and perform comparisons with several peer methods. The results demonstrate that CIRCNV can obtain superior performance in terms of sensitivity and precision. We expect that our proposed method will be a supplement to existing methods and become a routine tool in the field of variation analysis of tumor genomes.
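The idea of moving a read depth profile from a line to a circle can be illustrated with a small sketch. The mapping used here (bin index to angle, read depth to radius) is an assumption chosen for illustration, and the function name `circular_profile` is hypothetical; the abstract does not specify the exact transformation CIRCNV applies.

```python
import math


def circular_profile(read_depth):
    """Map a linear read-depth profile to 2-D points.

    Assumed mapping: bin i of n gets angle 2*pi*i/n, and its read
    depth becomes the radius, so every bin has neighbours on both
    sides, including the first and last bins.
    """
    n = len(read_depth)
    points = []
    for i, rd in enumerate(read_depth):
        theta = 2.0 * math.pi * i / n
        points.append((rd * math.cos(theta), rd * math.sin(theta)))
    return points


# Toy profile: baseline depth 30 with one gain-like and one loss-like bin.
depths = [30.0, 31.0, 29.0, 60.0, 30.0, 30.0, 15.0, 30.0]
points = circular_profile(depths)

for (x, y), rd in zip(points, depths):
    print(f"rd={rd:5.1f} -> x={x:8.3f}, y={y:8.3f}")
```

In this toy layout, bins with aberrant depth sit noticeably farther from or closer to the origin than their angular neighbours, which is the kind of two-dimensional structure an outlier-based detector could work on.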
Affiliation(s)
- Hai-Yong Zhao
- School of Computer Science and Technology, Liaocheng University, Liaocheng 252000, China
| | - Qi Li
- School of Computer Science and Technology, Xidian University, Xi’an 710071, China
| | - Ye Tian
- School of Computer Science and Technology, Xidian University, Xi’an 710071, China
| | - Yue-Hui Chen
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Ji’nan 250022, China
| | - Haque A. K. Alvi
- School of Computer Science and Technology, Xidian University, Xi’an 710071, China
| | - Xi-Guo Yuan
- School of Computer Science and Technology, Xidian University, Xi’an 710071, China
| |
|
27
|
Qin F, Luo X, Cai G, Xiao F. Shall genomic correlation structure be considered in copy number variants detection? Brief Bioinform 2021; 22:6295811. [PMID: 34114005 DOI: 10.1093/bib/bbab215] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/16/2021] [Revised: 04/16/2021] [Accepted: 05/17/2021] [Indexed: 11/14/2022]
Abstract
Copy number variation has been identified as a major source of genomic variation associated with disease susceptibility. With the advent of whole-exome sequencing (WES) technology, massive WES data have been generated, allowing for the identification of copy number variants (CNVs) in the protein-coding regions with direct functional interpretation. We have previously shown evidence of the genomic correlation structure in array data and developed a novel chromosomal breakpoint detection algorithm, LDcnv, which showed significantly improved detection power through integrating the correlation structure in a systematic modeling manner. However, it remains unexplored whether the genomic correlation exists in WES data and how such correlation structure integration can improve the CNV detection accuracy. In this study, we first explored the correlation structure of the WES data using the 1000 Genomes Project data. Both real raw read depth and median-normalized data showed strong evidence of the correlation structure. Motivated by this fact, we proposed a correlation-based method, CORRseq, as a novel release of the LDcnv algorithm in profiling WES data. The performance of CORRseq was evaluated in extensive simulation studies and real data analysis from the 1000 Genomes Project. CORRseq outperformed the existing methods in detecting medium and large CNVs. In conclusion, it would be more advantageous to model genomic correlation structure in detecting relatively long CNVs. This study provides great insights for methodology development of CNV detection with NGS data.
Affiliation(s)
- Fei Qin
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina (USC), Discovery 449, 915 Greene St, Columbia, SC 29208, USA
| | - Xizhi Luo
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, USC, Discovery 449, 915 Greene St, Columbia, SC 29208, USA
| | - Guoshuai Cai
- Department of Environmental Health Science, Arnold School of Public Health, USC, Discovery 449, 915 Greene St, Columbia, SC 29208, USA
| | - Feifei Xiao
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, USC, Discovery 449, 915 Greene St, Columbia, SC 29208, USA
| |
|
28
|
Tsukanov AS, Barinov AA, Shubin VP, Loginova AN, Savelieva TA, Pikunov DY, Kuzminov AM, Kashnikov VN, Polyakov AV, Shelygin YA. Finding the Cause of Hereditary Disease in a Family with Adenomatous Polyposis: Why It Is Important to Accumulate Whole Exome Sequencing Data in the Russian Population. Russ J Genet 2021. [DOI: 10.1134/s1022795421060120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
29
|
Bigio B, Seeleuthner Y, Kerner G, Migaud M, Rosain J, Boisson B, Nasca C, Puel A, Bustamante J, Casanova JL, Abel L, Cobat A. Detection of homozygous and hemizygous complete or partial exon deletions by whole-exome sequencing. NAR Genom Bioinform 2021; 3:lqab037. [PMID: 34046589 PMCID: PMC8140739 DOI: 10.1093/nargab/lqab037] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/20/2020] [Revised: 03/19/2021] [Accepted: 05/03/2021] [Indexed: 12/11/2022]
Abstract
The detection of copy number variations (CNVs) in whole-exome sequencing (WES) data is important, as CNVs may underlie a number of human genetic disorders. The recently developed HMZDelFinder algorithm can detect rare homozygous and hemizygous (HMZ) deletions in WES data more effectively than other widely used tools. Here, we present HMZDelFinder_opt, an approach that outperforms HMZDelFinder for the detection of HMZ deletions, including partial exon deletions in particular, in WES data from laboratory patient collections that were generated over time in different experimental conditions. We show that using an optimized reference control set of WES data, based on a PCA-derived Euclidean distance for coverage, strongly improves the detection of HMZ complete exon deletions both in real patients carrying validated disease-causing deletions and in simulated data. Furthermore, we develop a sliding window approach enabling HMZDelFinder_opt to identify HMZ partial deletions of exons that are undiscovered by HMZDelFinder. HMZDelFinder_opt is a timely and powerful approach for detecting HMZ deletions, particularly partial exon deletions, in WES data from inherently heterogeneous laboratory patient collections.
Affiliation(s)
- Benedetta Bigio
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
| | - Yoann Seeleuthner
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, 75015 Paris, France
| | - Gaspard Kerner
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, 75015 Paris, France
| | - Mélanie Migaud
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, 75015 Paris, France
| | - Jérémie Rosain
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, 75015 Paris, France
| | - Bertrand Boisson
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
| | - Carla Nasca
- Laboratory of Neuroendocrinology, The Rockefeller University, New York, NY 10065, USA
| | - Anne Puel
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
| | - Jacinta Bustamante
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
| | - Jean-Laurent Casanova
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
| | - Laurent Abel
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
| | - Aurelie Cobat
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, 75015 Paris, France
| |
|
30
|
Jones W, Gong B, Novoradovskaya N, Li D, Kusko R, Richmond TA, Johann DJ, Bisgin H, Sahraeian SME, Bushel PR, Pirooznia M, Wilkins K, Chierici M, Bao W, Basehore LS, Lucas AB, Burgess D, Butler DJ, Cawley S, Chang CJ, Chen G, Chen T, Chen YC, Craig DJ, Del Pozo A, Foox J, Francescatto M, Fu Y, Furlanello C, Giorda K, Grist KP, Guan M, Hao Y, Happe S, Hariani G, Haseley N, Jasper J, Jurman G, Kreil DP, Łabaj P, Lai K, Li J, Li QZ, Li Y, Li Z, Liu Z, López MS, Miclaus K, Miller R, Mittal VK, Mohiyuddin M, Pabón-Peña C, Parsons BL, Qiu F, Scherer A, Shi T, Stiegelmeyer S, Suo C, Tom N, Wang D, Wen Z, Wu L, Xiao W, Xu C, Yu Y, Zhang J, Zhang Y, Zhang Z, Zheng Y, Mason CE, Willey JC, Tong W, Shi L, Xu J. A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency. Genome Biol 2021; 22:111. [PMID: 33863366 PMCID: PMC8051128 DOI: 10.1186/s13059-021-02316-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/27/2020] [Accepted: 03/18/2021] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Oncopanel genomic testing, which identifies important somatic variants, is increasingly common in medical practice and especially in clinical trials. Currently, there is a paucity of reliable genomic reference samples having a suitably large number of pre-identified variants for properly assessing oncopanel assay analytical quality and performance. The FDA-led Sequencing and Quality Control Phase 2 (SEQC2) consortium analyzes ten diverse cancer cell lines individually and their pool, termed Sample A, to develop a reference sample with suitably large numbers of coding positions with known (variant) positives and negatives for properly evaluating oncopanel analytical performance. RESULTS: In reference Sample A, we identify more than 40,000 variants down to 1% allele frequency, with more than 25,000 variants having less than 20% allele frequency and 1653 variants in COSMIC-related genes. This is 5-100× more than existing commercially available samples. We also identify an unprecedented number of negative positions in coding regions, allowing statistical rigor in assessing limit-of-detection, sensitivity, and precision. Over 300 loci are randomly selected and independently verified via droplet digital PCR with 100% concordance. Agilent normal reference Sample B can be admixed with Sample A to create new samples with a similar number of known variants at much lower allele frequency than what exists in Sample A natively, including known variants having allele frequency of 0.02%, a range suitable for assessing liquid biopsy panels. CONCLUSION: These new reference samples and their admixtures provide superior capability for performing oncopanel quality control, analytical accuracy, and validation for small to large oncopanels and liquid biopsy assays.
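The effect of admixing a variant-bearing sample with a normal sample on allele frequency can be sketched with simple dilution arithmetic. The sketch assumes that mixing is by DNA fraction, that both samples have the same copy number at the locus, and that the normal sample carries no copies of the variant unless stated; the function name `admixed_vaf` and the 1:49 mixing ratio are illustrative assumptions, not values taken from the study.

```python
def admixed_vaf(vaf_a: float, fraction_a: float, vaf_b: float = 0.0) -> float:
    """Expected variant allele frequency after mixing samples A and B.

    fraction_a is the proportion of DNA contributed by sample A;
    sample B contributes the remaining 1 - fraction_a.
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    return fraction_a * vaf_a + (1.0 - fraction_a) * vaf_b


# A variant at 1% allele frequency in Sample A, diluted 1:49 with
# a normal sample (Sample A makes up 2% of the mixture).
diluted = admixed_vaf(vaf_a=0.01, fraction_a=0.02)
print(f"expected VAF after admixture: {diluted:.4%}")
```

Under these assumptions, a 2% admixture brings a 1% variant into the 0.02% range that the abstract describes as relevant to liquid biopsy panels.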
Affiliation(s)
- Wendell Jones
- Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA.
| | - Binsheng Gong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | | | - Dan Li
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Rebecca Kusko
- Immuneering Corporation, One Broadway, 14th Floor, Cambridge, MA, 02142, USA
| | - Todd A Richmond
- Market & Application Development Bioinformatics, Roche Sequencing Solutions Inc., 4300 Hacienda Dr., Pleasanton, CA, 94588, USA
| | - Donald J Johann
- Winthrop P Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, 4301 W Markham St., Little Rock, AR, 72205, USA
| | - Halil Bisgin
- Department of Computer Science, Engineering and Physics, University of Michigan-Flint, Flint, MI, 48502, USA
| | - Sayed Mohammad Ebrahim Sahraeian
- Bioinformatics Research & Early Development, Roche Sequencing Solutions Inc., 1301 Shoreway Rd., Suite 7 #300, Belmont, CA, 94002, USA
| | - Pierre R Bushel
- National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, 27709, USA
| | - Mehdi Pirooznia
- Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Katherine Wilkins
- Agilent Technologies, 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
| | | | - Wenjun Bao
- JMP Life Sciences, SAS Institute Inc., Cary, NC, 27519, USA
| | - Lee Scott Basehore
- Agilent Technologies, 11011 N Torrey Pines Rd., La Jolla, CA, 92037, USA
| | | | - Daniel Burgess
- (formerly) Research and Development, Roche Sequencing Solutions Inc., 500 South Rosa Rd., Madison, WI, 53719, USA
| | - Daniel J Butler
- Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
| | - Simon Cawley
- (formerly) Clinical Sequencing Division, Thermo Fisher Scientific, 180 Oyster Point Blvd., South San Francisco, CA, 94080, USA
| | - Chia-Jung Chang
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA, 94304, USA
| | - Guangchun Chen
- Department of Immunology, Genomics and Microarray Core Facility, University of Texas Southwestern Medical Center, 5323 Harry Hine Blvd., Dallas, TX, 75390, USA
| | - Tao Chen
- University of Texas Southwestern Medical Center, 2330 Inwood Rd., Dallas, TX, 75390, USA
| | - Yun-Ching Chen
- Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Daniel J Craig
- Department of Medicine, College of Medicine and Life Sciences, The University of Toledo, Toledo, OH, 43614, USA
| | - Angela Del Pozo
- Institute of Medical and Molecular Genetics (INGEMM), Hospital Universitario La Paz, CIBERER Instituto de Salud Carlos III, 28046, Madrid, Spain
| | - Jonathan Foox
- Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
| | | | - Yutao Fu
- Thermo Fisher Scientific, 110 Miller Ave., Ann Arbor, MI, 48104, USA
| | | | - Kristina Giorda
- Marketing, Integrated DNA Technologies, Inc., 1710 Commercial Park, Coralville, IA, 52241, USA
| | - Kira P Grist
- Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA
| | - Meijian Guan
- JMP Life Sciences, SAS Institute Inc., Cary, NC, 27519, USA
| | - Yingyi Hao
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
| | - Scott Happe
- Agilent Technologies, 1834 State Hwy 71 West, Cedar Creek, TX, 78612, USA
| | - Gunjan Hariani
- Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA
| | - Nathan Haseley
- Illumina Inc., 5200 Illumina Way, San Diego, CA, 92122, USA
| | - Jeff Jasper
- Q2 Solutions - EA Genomics, 5927 S Miami Blvd., Morrisville, NC, 27560, USA
| | | | - David Philip Kreil
- Bioinformatics Research, Institute of Molecular Biotechnology, Boku University Vienna, Vienna, Austria
| | - Paweł Łabaj
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Department of Biotechnology, Boku University, Vienna, Austria
| | - Kevin Lai
- Bioinformatics, Integrated DNA Technologies, Inc., 1710 Commercial Park, Coralville, IA, 52241, USA
| | - Jianying Li
- Kelly Government Solutions, Inc., Research Triangle Park, NC, 27709, USA
| | - Quan-Zhen Li
- Department of Immunology, Genomics and Microarray Core Facility, University of Texas Southwestern Medical Center, 5323 Harry Hine Blvd., Dallas, TX, 75390, USA
| | - Yulong Li
- Center of Genome and Personalized Medicine, Institute of Cancer Stem Cell, Dalian Medical University, Dalian, Liaoning, China
| | - Zhiguang Li
- Center of Genome and Personalized Medicine, Institute of Cancer Stem Cell, Dalian Medical University, Dalian, Liaoning, China
| | - Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Mario Solís López
- Institute of Medical and Molecular Genetics (INGEMM), Hospital Universitario La Paz, CIBERER Instituto de Salud Carlos III, 28046, Madrid, Spain
- EATRIS ERIC- European Infrastructure for Translational Medicine, De Boelelaan 1118, 1081, HZ, Amsterdam, The Netherlands
| | - Kelci Miclaus
- JMP Life Sciences, SAS Institute Inc., Cary, NC, 27519, USA
| | - Raymond Miller
- Agilent Technologies, 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
| | - Vinay K Mittal
- Thermo Fisher Scientific, 110 Miller Ave., Ann Arbor, MI, 48104, USA
| | - Marghoob Mohiyuddin
- Bioinformatics Research & Early Development, Roche Sequencing Solutions Inc., 1301 Shoreway Rd., Suite 7 #300, Belmont, CA, 94002, USA
| | - Carlos Pabón-Peña
- Agilent Technologies, 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
| | - Barbara L Parsons
- Division of Genetic and Molecular Toxicology, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Fujun Qiu
- Research and Development, Burning Rock Biotech, Shanghai, 201114, China
| | - Andreas Scherer
- EATRIS ERIC- European Infrastructure for Translational Medicine, De Boelelaan 1118, 1081, HZ, Amsterdam, The Netherlands
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, HiLIFE Unit, Biomedicum Helsinki 2U (D302b), FI-00014 University of Helsinki, P.O. Box 20 (Tukholmankatu 8), Helsinki, Finland
| | - Tieliu Shi
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, 500 Dongchuan Rd, Shanghai, 200241, China
| | - Suzy Stiegelmeyer
- University of North Carolina Health, 101 Manning Drive, Chapel Hill, NC, 27514, USA
| | - Chen Suo
- Department of Epidemiology, School of Public Health, Fudan University, Shanghai, China
| | - Nikola Tom
- EATRIS ERIC- European Infrastructure for Translational Medicine, De Boelelaan 1118, 1081, HZ, Amsterdam, The Netherlands
- Center of Molecular Medicine, Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
| | - Dong Wang
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Zhining Wen
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
| | - Leihong Wu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wenzhong Xiao
- Stanford Genome Technology Center, Stanford University, Palo Alto, CA, 94304, USA
- Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
| | - Chang Xu
- Research and Development, QIAGEN Sciences Inc., Frederick, MD, 21703, USA
| | - Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
| | - Jiyang Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
| | - Yifan Zhang
- University of Arkansas at Little Rock, Little Rock, AR, 72204, USA
| | - Zhihong Zhang
- Research and Development, Burning Rock Biotech, Shanghai, 201114, China
| | - Yuanting Zheng
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
| | - James C Willey
- Departments of Medicine, Pathology, and Cancer Biology, College of Medicine and Life Sciences, University of Toledo Health Sciences Campus, 3000 Arlington Ave, Toledo, OH, 43614, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 201203, China
- Fudan-Gospel Joint Research Center for Precision Medicine, Fudan University, Shanghai, 200438, China
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA.
|
31
|
Tsukanov AS, Pikunov DY, Shubin VP, Barinov AA, Kashnikov VN, Shelygin YA, Kaprin AD, Filonenko EV, Sidorov DV, Maschan AA, Novichkova GA, Yasko LA, Raykina EV, Rumyantsev AG. Unique Combination of Diamond-Blackfan Anemia and Lynch Syndrome in Adult Female: A Case Report. Front Oncol 2021; 11:652696. [PMID: 33937060 PMCID: PMC8085342 DOI: 10.3389/fonc.2021.652696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2021] [Accepted: 03/22/2021] [Indexed: 11/28/2022] Open
Abstract
We present an extremely rare clinical case of a 38-year-old Russian patient with multiple malignant neoplasms of the uterus and colon caused by two genetically confirmed hereditary diseases: Diamond-Blackfan anemia and Lynch syndrome. Molecular genetic analysis by several methods (NGS, Sanger sequencing, aCGH, and MLPA) revealed a pathogenic nonsense variant in the MSH6 gene, NM_000179.2: c.742C>T, p.(Arg248Ter), as well as a novel deletion on chromosome 15 spanning the interval 82,662,932-84,816,747 bp and including the complete sequence of the RPS17 gene. Testing for microsatellite instability in endometrial tumors with the standard mononucleotide markers NR21, NR24, NR27, BAT25, and BAT26 was shown to be inexpedient. The estimated worldwide prevalence of the combination of Diamond-Blackfan anemia and Lynch syndrome is one per 480 million people.
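As a small consistency check on the variant notation above, the coding-DNA position and the protein position can be related by integer arithmetic. The sketch below is only an illustration; the helper name `codon_location` is hypothetical and the calculation assumes plain 1-based coding numbering with no intronic or UTR offsets.

```python
def codon_location(cdna_pos: int) -> tuple[int, int]:
    """Map a 1-based coding-DNA position to (codon number, position within codon)."""
    if cdna_pos < 1:
        raise ValueError("coding positions are 1-based")
    codon_number = (cdna_pos - 1) // 3 + 1
    position_in_codon = (cdna_pos - 1) % 3 + 1
    return codon_number, position_in_codon


if __name__ == "__main__":
    # c.742 falls on the first base of codon 248, matching p.(Arg248Ter).
    print(codon_location(742))  # (248, 1)
```

A C>T change at the first base of an arginine codon can create a stop codon only when the codon is CGA (CGA to TGA), which is consistent with the nonsense variant reported.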
Affiliation(s)
| | - Dmitriy Y. Pikunov
- Ryzhikh National Medical Research Center of Coloproctology, Moscow, Russia
| | - Vitaly P. Shubin
- Ryzhikh National Medical Research Center of Coloproctology, Moscow, Russia
| | - Aleksey A. Barinov
- Ryzhikh National Medical Research Center of Coloproctology, Moscow, Russia
| | | | - Yuri A. Shelygin
- Ryzhikh National Medical Research Center of Coloproctology, Moscow, Russia
| | | | | | | | - Aleksey A. Maschan
- Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
| | - Galina A. Novichkova
- Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
| | - Liudmila A. Yasko
- Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
| | - Elena V. Raykina
- Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
| | - Aleksandr G. Rumyantsev
- Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, Russia
|
32
|
Copy Number Variant Detection with Low-Coverage Whole-Genome Sequencing Represents a Viable Alternative to the Conventional Array-CGH. Diagnostics (Basel) 2021; 11:diagnostics11040708. [PMID: 33920867 PMCID: PMC8071346 DOI: 10.3390/diagnostics11040708] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/20/2021] [Revised: 04/09/2021] [Accepted: 04/13/2021] [Indexed: 12/13/2022] Open
Abstract
Copy number variations (CNVs) represent a type of structural variant involving alterations in the number of copies of specific regions of DNA that can either be deleted or duplicated. CNVs contribute substantially to normal population variability; however, abnormal CNVs cause numerous genetic disorders. At present, several methods for CNV detection are applied, ranging from conventional cytogenetic analysis, through microarray-based methods (aCGH), to next-generation sequencing (NGS). In this paper, we present GenomeScreen, an NGS-based CNV detection method for low-coverage, whole-genome sequencing. We determined the theoretical limits of its accuracy and obtained confirmation in an extensive in silico study and in real patient samples with known genotypes. In theory, at least 6 M uniquely mapped reads are required to detect a CNV with a length of 100 kilobases (kb) or more with high confidence (Z-score > 7). In practice, the in silico analysis required at least 8 M reads to obtain >99% accuracy (for 100 kb deviations). We compared GenomeScreen with one of the aCGH methods currently used in diagnostic laboratories, which has a mean resolution of 200 kb. GenomeScreen and aCGH both detected 59 deviations, while GenomeScreen additionally detected 134 other (usually smaller) variations. When compared to aCGH, the overall performance of the proposed GenomeScreen tool is comparable or superior in terms of accuracy, turn-around time, and cost-effectiveness, thus providing reasonable benefits, particularly in a prenatal diagnosis setting.
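The relationship between read count, window size, and Z-score quoted in this abstract can be approximated with a back-of-the-envelope model. The sketch below is not GenomeScreen's algorithm: it assumes reads are spread uniformly over a genome of roughly 3.1 Gb (an assumed figure), that the count in a window is Poisson distributed, and that a heterozygous deletion halves the expected count. All function and parameter names are hypothetical.

```python
import math


def expected_reads(total_reads: float, window_bp: float, genome_bp: float = 3.1e9) -> float:
    """Expected read count in a window under uniform coverage (assumed genome size)."""
    return total_reads * window_bp / genome_bp


def cnv_z_score(total_reads: float, window_bp: float,
                copy_ratio: float = 0.5, genome_bp: float = 3.1e9) -> float:
    """Z-score of a copy-number change under a simple Poisson model.

    copy_ratio is the observed/expected depth ratio: 0.5 for a heterozygous
    deletion and 1.5 for a single-copy gain in a diploid genome.
    """
    mu = expected_reads(total_reads, window_bp, genome_bp)
    observed = mu * copy_ratio
    return abs(observed - mu) / math.sqrt(mu)  # Poisson: variance equals the mean


if __name__ == "__main__":
    for reads in (2e6, 6e6, 8e6):
        z = cnv_z_score(total_reads=reads, window_bp=100_000)
        print(f"{reads / 1e6:.0f} M reads, 100 kb window: Z = {z:.2f}")
```

With these assumptions, 6 M reads give a Z-score close to 7 for a 100 kb heterozygous deletion, in line with the theoretical limit stated above; mappability, GC bias, and normalization against reference samples would change the numbers in practice.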
|
33
|
Sagath L, Lehtokari VL, Välipakka S, Vihola A, Gardberg M, Hackman P, Pelin K, Jokela M, Kiiski K, Udd B, Wallgren-Pettersson C. Congenital asymmetric distal myopathy with hemifacial weakness caused by a heterozygous large de novo mosaic deletion in nebulin. Neuromuscul Disord 2021; 31:539-545. [PMID: 33933294 DOI: 10.1016/j.nmd.2021.03.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/25/2020] [Revised: 03/10/2021] [Accepted: 03/16/2021] [Indexed: 11/25/2022]
Abstract
We report the first mosaic mutation, a deletion of exons 11-107, identified in the nebulin gene in a Finnish patient presenting with a predominantly distal congenital myopathy and asymmetric muscle weakness. The female patient is ambulant and currently 26 years old. Muscle biopsies showed myopathic features with type 1 fibre predominance, strikingly hypotrophic type 2 fibres and central nuclei, but no nemaline bodies. The deletion was detected in a copy number variation analysis based on next-generation sequencing data. The parents of the patient did not carry the deletion. Mosaicism was detected using a custom, targeted comparative genomic hybridisation array. Expression of the truncated allele, less than half the size of full-length nebulin, was confirmed by Western blotting. The clinical and histological picture resembled that of a family with a slightly smaller deletion, and that in patients with recessively inherited distal forms of nebulin-caused myopathy. Asymmetry, however, was a novel feature.
Affiliation(s)
- Lydia Sagath
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland.
| | - Vilma-Lotta Lehtokari
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland
| | - Salla Välipakka
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland
| | - Anna Vihola
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland; Neuromuscular Research Centre, Fimlab Laboratories, Tampere University and University Hospital, Tampere, Finland
| | - Maria Gardberg
- Department of Pathology, Turku University Hospital and Institute of Biomedicine, University of Turku, Turku, Finland
| | - Peter Hackman
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland
| | - Katarina Pelin
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland; Molecular and Integrative Biosciences Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
| | - Manu Jokela
- Division of Clinical Neurosciences, Turku University Hospital and University of Turku, Turku, Finland; Laboratory of Genetics, HUS Diagnostic Centre, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
| | - Kirsi Kiiski
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland; Laboratory of Genetics, HUS Diagnostic Centre, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
| | - Bjarne Udd
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland; Neuromuscular Research Centre, Tampere University and University Hospital, Tampere, Finland; Department of Neurology, Vaasa Central Hospital, Vaasa, Finland
| | - Carina Wallgren-Pettersson
- Folkhälsan Research Center, Helsinki, Finland; Department of Medical Genetics, Medicum, University of Helsinki, Finland
|
34
|
Kim YS, Johnson GD, Seo J, Barrera A, Cowart TN, Majoros WH, Ochoa A, Allen AS, Reddy TE. Correcting signal biases and detecting regulatory elements in STARR-seq data. Genome Res 2021; 31:877-889. [PMID: 33722938 PMCID: PMC8092017 DOI: 10.1101/gr.269209.120] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/21/2020] [Accepted: 03/09/2021] [Indexed: 12/13/2022]
Abstract
High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of the variance in STARR-seq data. We then develop a statistical model to correct those biases and to improve detection of regulatory elements. This approach substantially improves precision and recall over current methods, improves detection of both activating and repressive regulatory elements, and controls for false discoveries despite strong local correlations in signal.
Affiliation(s)
- Young-Sook Kim
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
| | - Graham D Johnson
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA
| | - Jungkyun Seo
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
| | - Alejandro Barrera
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA
| | - Thomas N Cowart
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA
| | - William H Majoros
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
| | - Alejandro Ochoa
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
| | - Andrew S Allen
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
| | - Timothy E Reddy
- Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27710, USA; Duke Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
|
35
|
Song C, Su SC, Huo Z, Vural S, Galvin JE, Chang LC. HCMMCNVs: hierarchical clustering mixture model of copy number variants detection using whole exome sequencing technology. Bioinformatics 2021; 37:3026-3028. [PMID: 33714997 PMCID: PMC8479678 DOI: 10.1093/bioinformatics/btab183] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/27/2020] [Revised: 03/10/2021] [Accepted: 03/12/2021] [Indexed: 02/02/2023] Open
Abstract
SUMMARY In this article, we introduce a hierarchical clustering and Gaussian mixture model with an expectation-maximization (EM) algorithm for detecting copy number variants (CNVs) using whole exome sequencing (WES) data. The R shiny package 'HCMMCNVs' is also developed for processing user-provided bam files, running the CNV detection algorithm and conducting visualization. By applying our approach to 325 cancer cell lines in 22 tumor types from the Cancer Cell Line Encyclopedia (CCLE), we show that our algorithm is competitive with other existing methods and feasible for CNV estimation using multiple cancer cell lines. In addition, by applying our approach to WES data of 120 oral squamous cell carcinoma (OSCC) samples, we show that our algorithm, using the tumor sample only, exhibits more power in detecting CNVs than methods using both tumors and matched normal counterparts. AVAILABILITY AND IMPLEMENTATION HCMMCNVs R shiny software is freely available at the GitHub repository https://github.com/lunching/HCMM_CNVs and at Zenodo https://doi.org/10.5281/zenodo.4593371. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
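To make the mixture-model step concrete, the sketch below fits a one-dimensional Gaussian mixture with EM to synthetic log2 coverage ratios. It is a generic illustration in Python, not the HCMMCNVs implementation (which is an R shiny package): the hierarchical clustering step is omitted, and the function names, starting means, and simulated copy-number states are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x: float, mean: float, sd: float) -> float:
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(values, init_means, n_iter=100, min_sd=1e-3):
    """EM for a 1-D Gaussian mixture; returns (weights, means, sds)."""
    k, n = len(init_means), len(values)
    means = list(init_means)
    spread = (max(values) - min(values)) / k or 1.0
    sds = [max(spread, min_sd)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and standard deviation per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component has no support; keep previous parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
    return weights, means, sds


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic log2 ratios: single-copy loss (-1), neutral (0), single-copy gain (~0.58).
    log_ratios = ([rng.gauss(-1.0, 0.1) for _ in range(50)]
                  + [rng.gauss(0.0, 0.1) for _ in range(200)]
                  + [rng.gauss(0.58, 0.1) for _ in range(50)])
    weights, means, sds = fit_gmm_1d(log_ratios, init_means=[-1.0, 0.0, 0.58])
    for w, m, s in zip(weights, means, sds):
        print(f"weight={w:.2f} mean={m:.2f} sd={s:.2f}")
```

Each exon or segment would then be assigned to the component with the highest responsibility; a tumor-only workflow of the kind described above would additionally need a way to anchor the neutral state without a matched normal.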
Affiliation(s)
- Chi Song
- Division of Biostatistics, Ohio State University, Columbus, OH 43210, USA
| | - Shih-Chi Su
- Whole-Genome Research Core Laboratory of Human Diseases, Chang Gung Memorial Hospital, Keelung 204, Taiwan
| | - Zhiguang Huo
- Department of Biostatistics, University of Florida, Gainesville, FL 32611, USA
| | - Suleyman Vural
- Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - James E Galvin
- Comprehensive Center for Brain Health, Department of Neurology, Miller School of Medicine, University of Miami, Miami, FL 33101, USA
| | - Lun-Ching Chang
- Department of Mathematical Sciences, Florida Atlantic University, Boca Raton, FL 33431, USA. To whom correspondence should be addressed.
|
36
|
Yang JO, Choi MH, Yoon JY, Lee JJ, Nam SO, Jun SY, Kwon HH, Yun S, Jeon SJ, Byeon I, Halder D, Kong J, Lee B, Lee J, Kang JW, Kim NS. Characteristics of Genetic Variations Associated With Lennox-Gastaut Syndrome in Korean Families. Front Genet 2021; 11:590924. [PMID: 33584793 PMCID: PMC7874053 DOI: 10.3389/fgene.2020.590924] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/03/2020] [Accepted: 12/31/2020] [Indexed: 12/21/2022] Open
Abstract
Lennox-Gastaut syndrome (LGS) is a severe type of childhood-onset epilepsy characterized by multiple types of seizures, specific discharges on electroencephalography, and intellectual disability. Most patients with LGS do not respond well to drug treatment and show poor long-term prognosis. Approximately 30% of patients without brain abnormalities have unidentifiable causes. Therefore, accurate diagnosis and treatment of LGS remain challenging. To identify causative mutations of LGS, we analyzed the whole-exome sequencing data of 17 unrelated Korean families, including patients with LGS and LGS-like epilepsy without brain abnormalities, using the Genome Analysis Toolkit. We identified 14 mutations in 14 genes as causes of LGS or LGS-like epilepsy. Sixty-four percent of the identified genes had previously been reported as LGS- or epilepsy-related genes. Many of these variations were novel and considered pathogenic or likely pathogenic. Network analysis classified the identified genes into two network clusters: neuronal signal transmission and neuronal development. Additionally, knockdown of two candidate genes with insufficient evidence of neuronal functions, SLC25A39 and TBC1D8, decreased neurite outgrowth and the expression level of MAP2, a neuronal marker. These results expand the spectrum of genetic variations and may aid the diagnosis and management of individuals with LGS.
Affiliation(s)
- Jin Ok Yang
- Korea BioInformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea; Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
| | - Min-Hyuk Choi
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea; Department of Functional Genomics, Korea University of Science and Technology, Daejeon, South Korea
| | - Ji-Yong Yoon
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea
| | - Jeong-Ju Lee
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea
| | - Sang Ook Nam
- Department of Pediatrics, Pusan National University Children's Hospital, Pusan National University School of Medicine, Yangsan, South Korea
| | - Soo Young Jun
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea
| | - Hyeok Hee Kwon
- Department of Medical Science and Anatomy, Chungnam National University, Daejeon, South Korea
| | - Sohyun Yun
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea
| | - Su-Jin Jeon
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea; Department of Functional Genomics, Korea University of Science and Technology, Daejeon, South Korea
| | - Iksu Byeon
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
| | - Debasish Halder
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea
| | - Juhyun Kong
- Department of Pediatrics, Pusan National University Children's Hospital, Pusan National University School of Medicine, Yangsan, South Korea
| | - Byungwook Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
| | - Jeehun Lee
- Department of Pediatrics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Joon-Won Kang
- Department of Pediatrics and Medical Science, Chungnam National University Hospital, College of Medicine, Chungnam National University, Daejeon, South Korea
| | - Nam-Soon Kim
- Rare-Disease Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, South Korea; Department of Functional Genomics, Korea University of Science and Technology, Daejeon, South Korea
|
37
|
Jiang Y, Li W, Lindsey-Boltz LA, Yang Y, Li Y, Sancar A. Super hotspots and super coldspots in the repair of UV-induced DNA damage in the human genome. J Biol Chem 2021; 296:100581. [PMID: 33771559 PMCID: PMC8081918 DOI: 10.1016/j.jbc.2021.100581] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/25/2021] [Revised: 03/18/2021] [Accepted: 03/22/2021] [Indexed: 02/07/2023] Open
Abstract
The formation of UV-induced DNA damage and its repair are influenced by many factors that modulate lesion formation and the accessibility of repair machinery. However, it remains unknown which genomic sites are prioritized for immediate repair after UV damage induction, and whether these prioritized sites overlap with hotspots of UV damage. We identified the super hotspots subject to the earliest repair for (6-4) pyrimidine-pyrimidone photoproduct by using the eXcision Repair-sequencing (XR-seq) method. We further identified super coldspots for (6-4) pyrimidine-pyrimidone photoproduct repair and super hotspots for cyclobutane pyrimidine dimer repair by analyzing available XR-seq time-course data. By integrating datasets of XR-seq, Damage-seq, adductSeq, and cyclobutane pyrimidine dimer-seq, we show that neither repair super hotspots nor repair super coldspots overlap hotspots of UV damage. Furthermore, we demonstrate that repair super hotspots are significantly enriched in frequently interacting regions and superenhancers. Finally, we report our discovery of an enrichment of cytosine in repair super hotspots and super coldspots. These findings suggest that local DNA features together with large-scale chromatin features contribute to the orders of magnitude variability in the rates of UV damage repair.
Affiliation(s)
- Yuchao Jiang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, USA; Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina, USA.
| | - Wentao Li
- Department of Biochemistry and Biophysics, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Laura A Lindsey-Boltz
- Department of Biochemistry and Biophysics, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Yuchen Yang
- Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Yun Li
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, USA; Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA; Department of Computer Science, College of Arts and Sciences, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Aziz Sancar
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina, USA; Department of Biochemistry and Biophysics, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA.
38
Performance of In Silico Prediction Tools for the Detection of Germline Copy Number Variations in Cancer Predisposition Genes in 4208 Female Index Patients with Familial Breast and Ovarian Cancer. Cancers (Basel) 2021; 13:118. [PMID: 33401422 PMCID: PMC7794674 DOI: 10.3390/cancers13010118] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/20/2020] [Revised: 12/17/2020] [Accepted: 12/22/2020] [Indexed: 12/12/2022]
Abstract
Simple Summary: The identification of germline copy number variants (CNVs) by targeted next-generation sequencing frequently relies on in silico prediction tools with unknown sensitivities. We investigated the performances of four in silico CNV prediction tools in 17 cancer predisposition genes in a large series of 4208 female index patients with familial breast and/or ovarian cancer. We identified 77 CNVs in 76 out of 4208 patients; six CNVs were missed by at least one of the prediction tools. Experimental verification of in silico predicted CNVs is required due to high frequencies of false positive predictions. For female index patients with familial breast and/or ovarian cancer, CNV detection should not be restricted to BRCA1/2 due to the relevant proportion of CNVs in further cancer predisposition genes.
Abstract: The identification of germline copy number variants (CNVs) by targeted next-generation sequencing (NGS) frequently relies on in silico CNV prediction tools with unknown sensitivities. We investigated the performances of four in silico CNV prediction tools, including one commercial (Sophia Genetics DDM) and three non-commercial tools (ExomeDepth, GATK gCNV, panelcn.MOPS) in 17 cancer predisposition genes in 4208 female index patients with familial breast and/or ovarian cancer (BC/OC). CNV predictions were verified via multiplex ligation-dependent probe amplification. We identified 77 CNVs in 76 out of 4208 patients (1.81%); 33 CNVs were identified in genes other than BRCA1/2, mostly in ATM, CHEK2, and RAD51C and less frequently in BARD1, MLH1, MSH2, PALB2, PMS2, RAD51D, and TP53. The Sophia Genetics DDM software showed the highest sensitivity; six CNVs were missed by at least one of the non-commercial tools. The positive predictive values ranged from 5.9% (74/1249) for panelcn.MOPS to 79.1% (72/91) for ExomeDepth. Verification of in silico predicted CNVs is required due to high frequencies of false positive predictions, particularly affecting target regions at the extremes of the GC content or target length distributions. CNV detection should not be restricted to BRCA1/2 due to the relevant proportion of CNVs in further BC/OC predisposition genes.
39
Liu G, Zhang J, Yuan X, Wei C. RKDOSCNV: A Local Kernel Density-Based Approach to the Detection of Copy Number Variations by Using Next-Generation Sequencing Data. Front Genet 2020; 11:569227. [PMID: 33329705 PMCID: PMC7673372 DOI: 10.3389/fgene.2020.569227] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/03/2020] [Accepted: 09/04/2020] [Indexed: 12/04/2022]
Abstract
Copy number variations (CNVs) are significant causes of many human cancers and genetic diseases. The detection of CNVs has become a common method by which to analyze human diseases using next-generation sequencing (NGS) data. However, effective detection of insignificant CNVs is still a challenging task. In this study, we propose a new detection method, RKDOSCNV, to meet this need. RKDOSCNV uses a kernel density estimation method to evaluate the local kernel density distribution of each read depth segment (RDS) based on an expanded nearest neighbor data set (the k-nearest neighbors, reverse nearest neighbors, and shared nearest neighbors of each RDS), and assigns a relative kernel density outlier score (RKDOS) to each RDS. According to the RKDOS profile, RKDOSCNV predicts the candidate CNVs by choosing a reasonable threshold, and then uses a split-read approach to correct the boundaries of the candidate CNVs. The performance of RKDOSCNV is assessed by comparing it with several current popular methods via experiments with simulated and real data at different tumor purity levels. The experimental results verify that the performance of RKDOSCNV is superior to that of several other methods. In summary, RKDOSCNV is a simple and effective method for the detection of CNVs from whole genome sequencing (WGS) data, especially for samples with low tumor purity.
Affiliation(s)
- Guojun Liu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Junying Zhang
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Xiguo Yuan
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Chao Wei
- School of Computer Science and Technology, Xidian University, Xi'an, China
40
Chanwigoon S, Piwluang S, Wichadakul D. inCNV: An Integrated Analysis Tool for Copy Number Variation on Whole Exome Sequencing. Evol Bioinform Online 2020; 16:1176934320956577. [PMID: 33029071 PMCID: PMC7520931 DOI: 10.1177/1176934320956577] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/24/2020] [Accepted: 08/13/2020] [Indexed: 12/13/2022]
Abstract
The detection of copy number variations (CNVs) on whole-exome sequencing (WES) represents a cost-effective technique for the study of genetic variants. This approach, however, has encountered an obstacle with high false-positive rates due to biases from exome sequencing capture kits and GC contents. Although plenty of CNV detection tools have been developed, they do not perform well with all types of CNVs. In addition, most tools lack features of genetic annotation, CNV visualization, and flexible installation, requiring users to put much effort into CNV interpretation. Here, we present "inCNV," a web-based application that can accept multiple CNV-tool results, then integrate and prioritize them with user-friendly interfaces. This application helps users analyze the importance of called CNVs by generating CNV annotations from Ensembl, Database of Genomic Variants (DGV), ClinVar, and Online Mendelian Inheritance in Man (OMIM). Moreover, users can select and export CNVs of interest including their flanking sequences for primer design and experimental verification. We demonstrated how inCNV could help users filter and narrow down the called CNVs to a potentially novel CNV, a common CNV within a group of samples of the same disease, or a de novo CNV of a sample within the same family. In addition, we have provided inCNV as a docker image for ease of installation (https://github.com/saowwapark/inCNV).
Affiliation(s)
- Saowwapark Chanwigoon
- Software Engineering Program, Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
| | - Sakkayaphab Piwluang
- Software Engineering Program, Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
| | - Duangdao Wichadakul
- Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
41
Frazier AE, Compton AG, Kishita Y, Hock DH, Welch AE, Amarasekera SSC, Rius R, Formosa LE, Imai-Okazaki A, Francis D, Wang M, Lake NJ, Tregoning S, Jabbari JS, Lucattini A, Nitta KR, Ohtake A, Murayama K, Amor DJ, McGillivray G, Wong FY, van der Knaap MS, Jeroen Vermeulen R, Wiltshire EJ, Fletcher JM, Lewis B, Baynam G, Ellaway C, Balasubramaniam S, Bhattacharya K, Freckmann ML, Arbuckle S, Rodriguez M, Taft RJ, Sadedin S, Cowley MJ, Minoche AE, Calvo SE, Mootha VK, Ryan MT, Okazaki Y, Stroud DA, Simons C, Christodoulou J, Thorburn DR. Fatal perinatal mitochondrial cardiac failure caused by recurrent de novo duplications in the ATAD3 locus. Med 2020; 2:49-73. [PMID: 33575671 DOI: 10.1016/j.medj.2020.06.004] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/12/2022]
Abstract
Background: In about half of all patients with a suspected monogenic disease, genomic investigations fail to identify the diagnosis. A contributing factor is the difficulty with repetitive regions of the genome, such as those generated by segmental duplications. The ATAD3 locus is one such region, in which recessive deletions and dominant duplications have recently been reported to cause lethal perinatal mitochondrial diseases characterized by pontocerebellar hypoplasia or cardiomyopathy, respectively. Methods: Whole exome, whole genome and long-read DNA sequencing techniques combined with studies of RNA and quantitative proteomics were used to investigate 17 subjects from 16 unrelated families with suspected mitochondrial disease. Findings: We report six different de novo duplications in the ATAD3 gene locus causing a distinctive presentation including lethal perinatal cardiomyopathy, persistent hyperlactacidemia, and frequently corneal clouding or cataracts and encephalopathy. The recurrent 68 kb ATAD3 duplications are identifiable from genome and exome sequencing but usually missed by microarrays. The ATAD3 duplications result in the formation of identical chimeric ATAD3A/ATAD3C proteins, altered ATAD3 complexes and a striking reduction in mitochondrial oxidative phosphorylation complex I and its activity in heart tissue. Conclusions: ATAD3 duplications appear to act in a dominant-negative manner and the de novo inheritance implies a low recurrence risk for families, unlike most pediatric mitochondrial diseases. More than 350 genes underlie mitochondrial diseases. In our experience the ATAD3 locus is now one of the five most common causes of nuclear-encoded pediatric mitochondrial disease but the repetitive nature of the locus means ATAD3 diagnoses may be frequently missed by current genomic strategies. Funding: Australian NHMRC, US Department of Defense, Japanese AMED and JSPS agencies, Australian Genomics Health Alliance and Australian Mito Foundation.
Affiliation(s)
- Ann E Frazier
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia (these authors contributed equally: A.E. Frazier, A.G. Compton)
| | - Alison G Compton
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia (these authors contributed equally: A.E. Frazier, A.G. Compton)
| | - Yoshihito Kishita
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan
| | - Daniella H Hock
- Department of Biochemistry and Molecular Biology and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Melbourne, VIC 3052, Australia
| | - AnneMarie E Welch
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
| | - Sumudu S C Amarasekera
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia
| | - Rocio Rius
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia
| | - Luke E Formosa
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Atsuko Imai-Okazaki
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan; Division of Genomic Medicine Research, Medical Genomics Center, National Center for Global Health and Medicine, Tokyo 162-8655, Japan
| | - David Francis
- Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
| | - Min Wang
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
| | - Nicole J Lake
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
| | - Simone Tregoning
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
| | - Jafar S Jabbari
- Australian Genome Research Facility Ltd, Victorian Comprehensive Cancer Centre, Melbourne VIC 3052, Australia
| | - Alexis Lucattini
- Australian Genome Research Facility Ltd, Victorian Comprehensive Cancer Centre, Melbourne VIC 3052, Australia
| | - Kazuhiro R Nitta
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan
| | - Akira Ohtake
- Department of Pediatrics & Clinical Genomics, Saitama Medical University Hospital, Saitama, 350-0495, Japan
| | - Kei Murayama
- Department of Metabolism, Chiba Children's Hospital, Chiba, 266-0007, Japan
| | - David J Amor
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia
| | - George McGillivray
- Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
| | - Flora Y Wong
- Ritchie Centre, Hudson Institute of Medical Research; Department of Paediatrics, Monash University; and Monash Newborn, Monash Children's Hospital, Melbourne, VIC 3168, Australia
| | - Marjo S van der Knaap
- Child Neurology, Emma Children's Hospital, Amsterdam University Medical Centers, Vrije Universiteit and Amsterdam Neuroscience, 1081 HV Amsterdam, The Netherlands; Functional Genomics, Center for Neurogenomics and Cognitive Research, Vrije Universiteit and Amsterdam Neuroscience, 1081 HV Amsterdam, The Netherlands
| | - R Jeroen Vermeulen
- Department of Neurology, Maastricht University Medical Center, 6229 HX, Maastricht, The Netherlands
| | - Esko J Wiltshire
- Department of Paediatrics and Child Health, University of Otago Wellington and Capital and Coast District Health Board, Wellington 6021, New Zealand
| | - Janice M Fletcher
- Department of Genetics and Molecular Pathology, SA Pathology, Adelaide, SA 5000, Australia
| | - Barry Lewis
- Department of Clinical Biochemistry, PathWest Laboratory Medicine Western Australia, Nedlands, WA 6009, Australia
| | - Gareth Baynam
- Western Australian Register of Developmental Anomalies and Genetic Services of Western Australia and King Edward Memorial Hospital for Women Perth, Subiaco, WA 6008, Australia; Telethon Kids Institute and School of Paediatrics and Child Health, The University of Western Australia, Perth, WA 6009, Australia
| | - Carolyn Ellaway
- Genetic Metabolic Disorders Service, Sydney Children's Hospital Network, The Children's Hospital at Westmead, Sydney, NSW 2145, Australia; Disciplines of Genomic Medicine and Child and Adolescent Health, Sydney Medical School, University of Sydney, NSW 2145, Australia
| | - Shanti Balasubramaniam
- Genetic Metabolic Disorders Service, Sydney Children's Hospital Network, The Children's Hospital at Westmead, Sydney, NSW 2145, Australia
| | - Kaustuv Bhattacharya
- Genetic Metabolic Disorders Service, Sydney Children's Hospital Network, The Children's Hospital at Westmead, Sydney, NSW 2145, Australia; Disciplines of Genomic Medicine and Child and Adolescent Health, Sydney Medical School, University of Sydney, NSW 2145, Australia
| | | | - Susan Arbuckle
- Department of Histopathology, The Children's Hospital at Westmead, Sydney Children's Hospital Network, Sydney, NSW 2145, Australia
| | - Michael Rodriguez
- Discipline of Pathology, School of Medical Sciences, The University of Sydney, Sydney, NSW 2006, Australia
| | | | - Simon Sadedin
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia
| | - Mark J Cowley
- Children's Cancer Institute, Kensington, NSW 2750, Australia; St Vincent's Clinical School, UNSW Sydney, Darlinghurst, NSW 2010, Australia; Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
| | - André E Minoche
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
| | - Sarah E Calvo
- Broad Institute, Cambridge, MA 02142, USA; Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02446, USA
| | - Vamsi K Mootha
- Broad Institute, Cambridge, MA 02142, USA; Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02446, USA
| | - Michael T Ryan
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Yasushi Okazaki
- Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Juntendo University, Graduate School of Medicine, Tokyo, 113-8421, Japan
| | - David A Stroud
- Department of Biochemistry and Molecular Biology and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Melbourne, VIC 3052, Australia
| | - Cas Simons
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia
| | - John Christodoulou
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Disciplines of Genomic Medicine and Child and Adolescent Health, Sydney Medical School, University of Sydney, NSW 2145, Australia
| | - David R Thorburn
- Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia; Department of Paediatrics, University of Melbourne, Melbourne, VIC 3052, Australia; Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC 3052, Australia (lead contact)
42
Yuan X, Bai J, Zhang J, Yang L, Duan J, Li Y, Gao M. CONDEL: Detecting Copy Number Variation and Genotyping Deletion Zygosity from Single Tumor Samples Using Sequence Data. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:1141-1153. [PMID: 30489272 DOI: 10.1109/tcbb.2018.2883333] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 06/09/2023]
Abstract
Characterizing copy number variations (CNVs) from sequenced genomes is both a feasible and cost-effective way to search for driver genes in cancer diagnosis. A number of existing algorithms for CNV detection only explored part of the features underlying sequence data and copy number structures, resulting in limited performance. Here, we describe CONDEL, a method for detecting CNVs from single tumor samples using high-throughput sequence data. CONDEL utilizes a novel statistic in combination with a peel-off scheme to assess the statistical significance of genome bins, and adopts a Bayesian approach to infer copy number gains, losses, and deletion zygosity based on statistical mixture models. We compare CONDEL to six peer methods on a large number of simulation datasets, showing improved performance in terms of true positive and false positive rates, and further validate CONDEL on three real datasets derived from the 1000 Genomes Project and the EGA archive. CONDEL obtained more consistent results than three other single-sample-based methods, and exclusively identified a number of CNVs that were previously associated with cancers. We conclude that CONDEL is a powerful tool for detecting copy number variations in single tumor samples, even if these are sequenced at low coverage.
43
Xiao F, Luo X, Hao N, Niu YS, Xiao X, Cai G, Amos CI, Zhang H. An accurate and powerful method for copy number variation detection. Bioinformatics 2019; 35:2891-2898. [PMID: 30649252 DOI: 10.1093/bioinformatics/bty1041] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/06/2018] [Revised: 11/28/2018] [Accepted: 01/09/2019] [Indexed: 12/12/2022]
Abstract
Motivation: Integration of multiple genetic sources for copy number variation (CNV) detection is a powerful approach to improve the identification of variants associated with complex traits. Although it has been shown that the widely used change point based methods can increase statistical power to identify variants, it remains challenging to effectively detect CNVs with weak signals due to the noisy nature of genotyping intensity data. We previously developed modSaRa, a normal mean-based model on a screening and ranking algorithm for copy number variation identification which presented desirable sensitivity with high computational efficiency. To boost statistical power for the identification of variants, here we present a novel improvement that integrates the relative allelic intensity with external information from empirical statistics with modeling, which we called modSaRa2. Results: Simulation studies illustrated that modSaRa2 markedly improved both sensitivity and specificity over existing methods for analyzing array-based data. The improvement in weak CNV signal detection is the most substantial, while it also simultaneously improves stability when CNV size varies. The application of the new method to a whole genome melanoma dataset identified novel candidate melanoma risk associated deletions on chromosome bands 1p22.2 and duplications on 6p22, 6q25 and 19p13 regions, which may facilitate the understanding of the possible roles of germline copy number variants in the etiology of melanoma. Availability and implementation: http://c2s2.yale.edu/software/modSaRa2 or https://github.com/FeifeiXiaoUSC/modSaRa2. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Feifei Xiao
- Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC, USA
| | - Xizhi Luo
- Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC, USA
| | - Ning Hao
- Department of Mathematics, University of Arizona, Tucson, AZ, USA
| | - Yue S Niu
- Department of Mathematics, University of Arizona, Tucson, AZ, USA
| | - Xiangjun Xiao
- Department of Quantitative Sciences, Baylor College of Medicine, Houston, TX, USA
| | - Guoshuai Cai
- Department of Environmental Health Science, University of South Carolina, Columbia, SC, USA
| | - Christopher I Amos
- Department of Quantitative Sciences, Baylor College of Medicine, Houston, TX, USA
| | - Heping Zhang
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
44
de Anda-Jáuregui G, Hernández-Lemus E. Computational Oncology in the Multi-Omics Era: State of the Art. Front Oncol 2020; 10:423. [PMID: 32318338 PMCID: PMC7154096 DOI: 10.3389/fonc.2020.00423] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 12/01/2019] [Accepted: 03/10/2020] [Indexed: 12/24/2022]
Abstract
Cancer is the quintessential complex disease. As technologies evolve faster each day, we are able to quantify the different layers of biological elements that contribute to the emergence and development of malignancies. In this multi-omics context, the use of integrative approaches is mandatory in order to gain further insights on oncological phenomena, and to move forward toward the precision medicine paradigm. In this review, we will focus on computational oncology as an integrative discipline that incorporates knowledge from the mathematical, physical, and computational fields to further the biomedical understanding of cancer. We will discuss the current roles of computation in oncology in the context of multi-omic technologies, which include: data acquisition and processing; data management in the clinical and research settings; classification, diagnosis, and prognosis; and the development of models in the research setting, including their use for therapeutic target identification. We will discuss machine learning and network approaches as two of the most promising emerging paradigms in computational oncology. These approaches provide a foundation on how to integrate different layers of biological description into coherent frameworks that allow advances both in the basic and clinical settings.
Affiliation(s)
- Guillermo de Anda-Jáuregui
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Cátedras Conacyt Para Jóvenes Investigadores, National Council on Science and Technology, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
45
Roca I, González-Castro L, Maynou J, Palacios L, Fernández H, Couce ML, Fernández-Marmiesse A. PattRec: An easy-to-use CNV detection tool optimized for targeted NGS assays with diagnostic purposes. Genomics 2020; 112:1245-1256. [DOI: 10.1016/j.ygeno.2019.07.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/31/2019] [Revised: 05/25/2019] [Accepted: 07/21/2019] [Indexed: 12/17/2022]
46
Rajagopalan R, Murrell JR, Luo M, Conlin LK. A highly sensitive and specific workflow for detecting rare copy-number variants from exome sequencing data. Genome Med 2020; 12:14. [PMID: 32000839 PMCID: PMC6993336 DOI: 10.1186/s13073-020-0712-0] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 10/31/2019] [Accepted: 01/13/2020] [Indexed: 12/12/2022]
Abstract
Background: Exome sequencing (ES) is a first-tier diagnostic test for many suspected Mendelian disorders. While it is routine to detect small sequence variants, it is not a standard practice in clinical settings to detect germline copy-number variants (CNVs) from ES data due to several reasons relating to performance. In this work, we comprehensively characterized one of the most sensitive ES-based CNV tools, ExomeDepth, against SNP array, a standard of care test in clinical settings to detect genome-wide CNVs.
Methods: We propose a modified ExomeDepth workflow by excluding exons with low mappability prior to variant calling to drastically reduce the false positives originating from the repetitive regions of the genome, and an iterative variant calling framework to assess the reproducibility. We used a cohort of 307 individuals with clinical ES data and clinical SNP array to estimate the sensitivity and false discovery rate of the CNV detection using exome sequencing. Further, we performed targeted testing of the STRC gene in 1972 individuals. To reduce the number of variants for downstream analysis, we performed a large-scale iterative variant calling process with random control cohorts to assess the reproducibility of the CNVs.
Results: The modified workflow presented in this paper reduced the number of total variants identified by one third while retaining a higher sensitivity of 97% and resulted in an improved false discovery rate of 11.4% compared to the default ExomeDepth pipeline. The exclusion of exons with low mappability removes 4.5% of the exons, including a subset of exons (0.6%) in disease-associated genes which are intractable by short-read next-generation sequencing (NGS). Results from the reproducibility analysis showed that the clinically reported variants were reproducible 100% of the time and that the modified workflow can be used to rank variants from high to low confidence. Targeted testing of 30 CNVs identified in STRC, a challenging gene to ascertain by NGS, showed a 100% validation rate.
Conclusions: In summary, we introduced a modification to the default ExomeDepth workflow to reduce the false positives originating from the repetitive regions of the genome, created a large-scale iterative variant calling framework for reproducibility, and provided recommendations for implementation in clinical settings.
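The two quantities this abstract reports, sensitivity and false discovery rate, and the idea of dropping low-mappability exons before calling can be illustrated with a minimal Python sketch. This is not the authors' ExomeDepth workflow: the `Exon` record, the `min_mappability` threshold of 0.9, and the representation of CNV calls as plain identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    name: str
    mappability: float  # hypothetical per-exon mappability score in [0, 1]


def filter_exons(exons, min_mappability=0.9):
    """Keep exons whose mappability meets the (assumed) threshold."""
    return [e for e in exons if e.mappability >= min_mappability]


def sensitivity_and_fdr(called, truth):
    """Compare called CNV identifiers against a truth set (e.g. array calls).

    sensitivity = TP / (TP + FN); false discovery rate = FP / (TP + FP).
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # calls confirmed by the truth set
    fn = len(truth - called)   # truth-set CNVs that were missed
    fp = len(called - truth)   # calls not supported by the truth set
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    fdr = fp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, fdr


if __name__ == "__main__":
    exons = [Exon("ex1", 1.0), Exon("ex2", 0.4), Exon("ex3", 0.95)]
    print([e.name for e in filter_exons(exons)])
    print(sensitivity_and_fdr({"A", "B", "C", "X"}, {"A", "B", "C", "D"}))
```

In practice CNV calls are genomic intervals, so matching a call to a truth-set event would need an overlap criterion rather than the exact identifier match assumed here.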
Affiliation(s)
- Ramakrishnan Rajagopalan
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA; School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Jill R Murrell
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Minjie Luo
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Laura K Conlin
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
47
|
Välipakka S, Savarese M, Sagath L, Arumilli M, Giugliano T, Udd B, Hackman P. Improving Copy Number Variant Detection from Sequencing Data with a Combination of Programs and a Predictive Model. J Mol Diagn 2020; 22:40-49. [DOI: 10.1016/j.jmoldx.2019.08.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/01/2019] [Revised: 06/25/2019] [Accepted: 08/08/2019] [Indexed: 12/18/2022] Open
|
48
|
Yang H, Zhu D. Combinatorial Detection Algorithm for Copy Number Variations Using High-throughput Sequencing Reads. Int J Pattern Recogn 2019. [DOI: 10.1142/s0218001419500228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Copy number variation (CNV) is a prevalent kind of genetic structural variation that leads to an abnormal number of copies of large genomic regions, such as the gain or loss of DNA segments larger than 1 kb. CNVs exist not only in the human genome but also in plant genomes. Current research has shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability and their effect on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial detection algorithm for CNV using high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, mappability of reads and the width of the analysis window. Then we create a hidden Markov model which maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. The CNV-HMM detects abnormal read count signals and obtains candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of our algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms.
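As a rough illustration of what correcting read depth for GC bias means, the Python sketch below applies a commonly used median-scaling scheme: windows are grouped by GC fraction, and each window's read count is rescaled by the ratio of the overall median to the median of its GC bin. This is a generic technique, not the new correction method proposed in the paper; the function name `gc_correct` and the bin count of 20 are assumptions.

```python
from collections import defaultdict
from statistics import median


def gc_correct(read_counts, gc_fractions, n_bins=20):
    """Rescale per-window read counts so every GC bin shares one median.

    read_counts:  read count per genomic window
    gc_fractions: GC fraction (0..1) of the same windows, in the same order
    n_bins:       number of equal-width GC bins (assumed value)
    """
    if len(read_counts) != len(gc_fractions):
        raise ValueError("read_counts and gc_fractions must have equal length")

    def bin_of(gc):
        # Clamp so that a GC fraction of exactly 1.0 falls in the last bin.
        return min(int(gc * n_bins), n_bins - 1)

    overall = median(read_counts)
    by_bin = defaultdict(list)
    for rc, gc in zip(read_counts, gc_fractions):
        by_bin[bin_of(gc)].append(rc)
    bin_median = {b: median(v) for b, v in by_bin.items()}

    corrected = []
    for rc, gc in zip(read_counts, gc_fractions):
        m = bin_median[bin_of(gc)]
        # A bin whose median is zero carries no usable signal; emit 0.0.
        corrected.append(rc * overall / m if m > 0 else 0.0)
    return corrected


if __name__ == "__main__":
    counts = [100, 100, 200, 200]
    gc = [0.32, 0.33, 0.62, 0.63]
    print(gc_correct(counts, gc))
```

After this rescaling, a depth difference that is explained only by GC content disappears, so remaining deviations are more plausibly attributable to copy number changes.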
Affiliation(s)
- Hai Yang
- School of Computer Science and Technology, Shandong University, Qingdao 266237, P. R. China
- Daming Zhu
- School of Computer Science and Technology, Shandong University, Qingdao 266237, P. R. China
|
49
|
Roh V, Abramowski P, Hiou-Feige A, Cornils K, Rivals JP, Zougman A, Aranyossy T, Thielecke L, Truan Z, Mermod M, Monnier Y, Prassolov V, Glauche I, Nowrouzi A, Abdollahi A, Fehse B, Simon C, Tolstonog GV. Cellular Barcoding Identifies Clonal Substitution as a Hallmark of Local Recurrence in a Surgical Model of Head and Neck Squamous Cell Carcinoma. Cell Rep 2018; 25:2208-2222.e7. [PMID: 30463016 DOI: 10.1016/j.celrep.2018.10.090] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 05/25/2017] [Revised: 09/04/2018] [Accepted: 10/24/2018] [Indexed: 01/04/2023] Open
Abstract
Local recurrence after surgery for head and neck squamous cell carcinoma (HNSCC) remains a common event associated with a dismal prognosis. Improving this outcome requires a better understanding of cancer cell populations that expand from postsurgical minimal residual disease (MRD). Therefore, we assessed clonal dynamics in a surgical model of barcoded HNSCC growing in the submental region of immunodeficient mice. Clonal substitution and massive reduction of clonal heterogeneity emerged as hallmarks of local recurrence, as the clones dominating in less heterogeneous recurrences were scarce in their matched primary tumors. These lineages were selected by their ability to persist after surgery and competitively expand from MRD. Clones enriched in recurrences exhibited both private and shared genetic features and likely originated from ancestors shared with clones dominating in primary tumors. They demonstrated high invasiveness and epithelial-to-mesenchymal transition, eventually providing an attractive target for obtaining better local control for these tumors.
Affiliation(s)
- Vincent Roh
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
- Pierre Abramowski
- Research Department Cell and Gene Therapy, Department of Stem Cell Transplantation, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Agnès Hiou-Feige
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
- Kerstin Cornils
- Research Department Cell and Gene Therapy, Department of Stem Cell Transplantation, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Jean-Paul Rivals
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
- Alexandre Zougman
- Clinical and Biomedical Proteomics Group, Cancer Research UK Centre, Leeds Institute of Cancer and Pathology, St. James's University Hospital, Leeds, UK
- Tim Aranyossy
- Research Department Cell and Gene Therapy, Department of Stem Cell Transplantation, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Lars Thielecke
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Zinnia Truan
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
- Maxime Mermod
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
- Yan Monnier
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
- Vladimir Prassolov
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
- Ingmar Glauche
- Institute for Medical Informatics and Biometry, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Ali Nowrouzi
- German Cancer Consortium (DKTK), Translational Radiation Oncology, German Cancer Research Center (DKFZ), Core Center Heidelberg, Heidelberg, Germany; Division of Molecular and Translational Radiation Oncology, Heidelberg University Hospital (UKHD) and DKFZ, Heidelberg Institute of Radiation Oncology (HIRO), National Center for Radiation Research in Oncology (NCRO), Heidelberg, Germany
- Amir Abdollahi
- German Cancer Consortium (DKTK), Translational Radiation Oncology, German Cancer Research Center (DKFZ), Core Center Heidelberg, Heidelberg, Germany; Division of Molecular and Translational Radiation Oncology, Heidelberg University Hospital (UKHD) and DKFZ, Heidelberg Institute of Radiation Oncology (HIRO), National Center for Radiation Research in Oncology (NCRO), Heidelberg, Germany
- Boris Fehse
- Research Department Cell and Gene Therapy, Department of Stem Cell Transplantation, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Christian Simon
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
- Genrich V Tolstonog
- Department of Otolaryngology - Head and Neck Surgery, University Hospital of Lausanne, Lausanne, Switzerland
|
50
|
Bartha Á, Győrffy B. Comprehensive Outline of Whole Exome Sequencing Data Analysis Tools Available in Clinical Oncology. Cancers (Basel) 2019; 11:E1725. [PMID: 31690036 PMCID: PMC6895801 DOI: 10.3390/cancers11111725] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/30/2019] [Revised: 10/31/2019] [Accepted: 11/01/2019] [Indexed: 12/17/2022] Open
Abstract
Whole exome sequencing (WES) enables the analysis of all protein coding sequences in the human genome. This technology enables the investigation of cancer-related genetic aberrations that are predominantly located in the exonic regions. WES delivers high-throughput results at a reasonable price. Here, we review analysis tools enabling utilization of WES data in clinical and research settings. Technically, WES initially allows the detection of single nucleotide variants (SNVs) and copy number variations (CNVs), and data obtained through these methods can be combined and further utilized. Variant calling algorithms for SNVs range from standalone tools to machine learning-based combined pipelines. Tools for CNV detection compare the number of reads aligned to a dedicated segment. Both SNVs and CNVs help to identify mutations resulting in pharmacologically druggable alterations. The identification of homologous recombination deficiency enables the use of PARP inhibitors. Determining microsatellite instability and tumor mutation burden helps to select patients eligible for immunotherapy. To pave the way for clinical applications, we have to recognize some limitations of WES, including its restricted ability to detect CNVs, low coverage compared to targeted sequencing, and the missing consensus regarding references and minimal application requirements. Recently, Galaxy became the leading platform in non-command line-based WES data processing. The maturation of next-generation sequencing is reinforced by Food and Drug Administration (FDA)-approved methods for cancer screening, detection, and follow-up. WES is on the verge of becoming an affordable and sufficiently evolved technology for everyday clinical use.
Affiliation(s)
- Áron Bartha
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, H-1094 Budapest, Hungary
- TTK Cancer Biomarker Research Group, Institute of Enzymology, Magyar tudósok körútja 2., H-1117 Budapest, Hungary
- Balázs Győrffy
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, H-1094 Budapest, Hungary
- TTK Cancer Biomarker Research Group, Institute of Enzymology, Magyar tudósok körútja 2., H-1117 Budapest, Hungary
|