1
Spealman P, De T, Chuong JN, Gresham D. Best Practices in Microbial Experimental Evolution: Using Reporters and Long-Read Sequencing to Identify Copy Number Variation in Experimental Evolution. J Mol Evol 2023; 91:356-368. [PMID: 37012421 PMCID: PMC10275804 DOI: 10.1007/s00239-023-10102-7] [Received: 09/28/2022] [Accepted: 02/21/2023] [Indexed: 04/05/2023]
Abstract
Copy number variants (CNVs), comprising gene amplifications and deletions, are a pervasive class of heritable variation. CNVs play a key role in rapid adaptation in both natural and experimental evolution. However, despite the advent of new DNA sequencing technologies, detection and quantification of CNVs in heterogeneous populations has remained challenging. Here, we summarize recent advances in the use of CNV reporters, which provide a facile means of quantifying de novo CNVs at a specific locus in the genome, and of nanopore sequencing, which resolves the often complex structures of CNVs. We provide guidance for the engineering and analysis of CNV reporters and practical guidelines for single-cell analysis of CNVs using flow cytometry. We summarize recent advances in nanopore sequencing, discuss the utility of this technology, and provide guidance for the bioinformatic analysis of these data to define the molecular structure of CNVs. The combination of reporter systems for tracking and isolating CNV lineages and long-read DNA sequencing for characterizing CNV structures enables unprecedented resolution of the mechanisms by which CNVs are generated and of their evolutionary dynamics.
Affiliation(s)
- Pieter Spealman
  - Department of Biology, New York University, New York, NY, 10003, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
- Titir De
  - Department of Biology, New York University, New York, NY, 10003, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
- Julie N Chuong
  - Department of Biology, New York University, New York, NY, 10003, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
- David Gresham
  - Department of Biology, New York University, New York, NY, 10003, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
2
Jensen M, Tyryshkina A, Pizzo L, Smolen C, Das M, Huber E, Krishnan A, Girirajan S. Combinatorial patterns of gene expression changes contribute to variable expressivity of the developmental delay-associated 16p12.1 deletion. Genome Med 2021; 13:163. [PMID: 34657631 PMCID: PMC8522054 DOI: 10.1186/s13073-021-00982-z] [Received: 03/08/2021] [Accepted: 09/28/2021] [Indexed: 12/13/2022]
Abstract
BACKGROUND Recent studies have suggested that individual variants do not sufficiently explain the variable expressivity of phenotypes observed in complex disorders. For example, the 16p12.1 deletion is associated with developmental delay and neuropsychiatric features in affected individuals, but is inherited in > 90% of cases from a mildly-affected parent. While children with the deletion are more likely to carry additional "second-hit" variants than their parents, the mechanisms for how these variants contribute to phenotypic variability are unknown. METHODS We performed detailed clinical assessments, whole-genome sequencing, and RNA sequencing of lymphoblastoid cell lines for 32 individuals in five large families with multiple members carrying the 16p12.1 deletion. We identified contributions of the 16p12.1 deletion and "second-hit" variants towards a range of expression changes in deletion carriers and their family members, including differential expression, outlier expression, alternative splicing, allele-specific expression, and expression quantitative trait loci analyses. RESULTS We found that the deletion dysregulates multiple autism and brain development genes such as FOXP1, ANK3, and MEF2. Carrier children also showed an average of 5323 gene expression changes compared with one or both parents, which matched with 33/39 observed developmental phenotypes. We identified significant enrichments for 13/25 classes of "second-hit" variants in genes with expression changes, where 4/25 variant classes were only enriched when inherited from the noncarrier parent, including loss-of-function SNVs and large duplications. In 11 instances, including for ZEB2 and SYNJ1, gene expression was synergistically altered by both the deletion and inherited "second-hits" in carrier children. Finally, brain-specific interaction network analysis showed strong connectivity between genes carrying "second-hits" and genes with transcriptome alterations in deletion carriers. 
CONCLUSIONS Our results suggest a potential mechanism for how "second-hit" variants modulate expressivity of complex disorders such as the 16p12.1 deletion through transcriptomic perturbation of gene networks important for early development. Our work further shows that family-based assessments of transcriptome data are highly relevant towards understanding the genetic mechanisms associated with complex disorders.
Affiliation(s)
- Matthew Jensen
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Bioinformatics and Genomics Program, Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
- Anastasia Tyryshkina
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Neuroscience Program, Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
- Lucilla Pizzo
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
- Corrine Smolen
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Bioinformatics and Genomics Program, Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
- Maitreya Das
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
- Emily Huber
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
- Arjun Krishnan
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
- Santhosh Girirajan
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Bioinformatics and Genomics Program, Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Neuroscience Program, Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Department of Anthropology, Pennsylvania State University, University Park, PA, 16802, USA
3
Zech M, Boesch S, Škorvánek M, Necpál J, Švantnerová J, Wagner M, Dincer Y, Sadr-Nabavi A, Serranová T, Rektorová I, Havránková P, Ganai S, Mosejová A, Příhodová I, Šarláková J, Kulcsarová K, Ulmanová O, Bechyně K, Ostrozovičová M, Haň V, Ventosa JR, Shariati M, Shoeibi A, Weber S, Mollenhauer B, Trenkwalder C, Berutti R, Strom TM, Ceballos-Baumann A, Mall V, Haslinger B, Jech R, Winkelmann J. Clinically relevant copy-number variants in exome sequencing data of patients with dystonia. Parkinsonism Relat Disord 2021; 84:129-34. [PMID: 33611074 DOI: 10.1016/j.parkreldis.2021.02.013] [Received: 12/02/2020] [Revised: 01/25/2021] [Accepted: 02/08/2021] [Indexed: 11/20/2022]
Abstract
INTRODUCTION Next-generation sequencing is now used on a routine basis for molecular testing but studies on copy-number variant (CNV) detection from next-generation sequencing data are underrepresented. Utilizing an existing whole-exome sequencing (WES) dataset, we sought to investigate the contribution of rare CNVs to the genetic causality of dystonia. METHODS The CNV read-depth analysis tool ExomeDepth was applied to the exome sequences of 953 unrelated patients with dystonia (600 with isolated dystonia and 353 with combined dystonia; 33% with additional neurological involvement). We prioritized rare CNVs that affected known disease genes and/or were known to be associated with defined microdeletion/microduplication syndromes. Pathogenicity assessment of CNVs was based on recently published standards of the American College of Medical Genetics and Genomics and the Clinical Genome Resource. RESULTS We identified pathogenic or likely pathogenic CNVs in 14 of 953 patients (1.5%). Of the 14 different CNVs, 12 were deletions and 2 were duplications, ranging in predicted size from 124 bp to 17 Mb. Within the deletion intervals, BRPF1, CHD8, DJ1, EFTUD2, FGF14, GCH1, PANK2, SGCE, UBE3A, VPS16, WARS2, and WDR45 were determined as the most clinically relevant genes. The duplications involved chromosomal regions 6q21-q22 and 15q11-q13. CNV analysis increased the diagnostic yield in the total cohort from 18.4% to 19.8%, as compared to the assessment of single-nucleotide variants and small insertions and deletions alone. CONCLUSIONS WES-based CNV analysis in dystonia is feasible, increases the diagnostic yield, and should be combined with the assessment of single-nucleotide variants and small insertions and deletions.
4
Kaseniit KE, Hogan GJ, D'Auria KM, Haverty C, Muzzey D. Strategies to minimize false positives and interpret novel microdeletions based on maternal copy-number variants in 87,000 noninvasive prenatal screens. BMC Med Genomics 2018; 11:90. [PMID: 30340588 PMCID: PMC6194617 DOI: 10.1186/s12920-018-0410-6] [Received: 03/22/2018] [Accepted: 10/01/2018] [Indexed: 12/29/2022]
Abstract
BACKGROUND Noninvasive prenatal screening (NIPS) of common aneuploidies using cell-free DNA from maternal plasma is part of routine prenatal care and is widely used in both high-risk and low-risk patient populations. High specificity is needed for clinically acceptable positive predictive values. Maternal copy-number variants (mCNVs) have been reported as a source of false-positive aneuploidy results that compromises specificity. METHODS We surveyed the mCNV landscape in 87,255 patients undergoing NIPS. We evaluated both previously reported and novel algorithmic strategies for mitigating the effects of mCNVs on the screen's specificity. Further, we analyzed the frequency, length, and positional distribution of CNVs in our large dataset to investigate the curation of novel fetal microdeletions, which can be identified by NIPS but are challenging to interpret clinically. RESULTS mCNVs are common, with 65% of expecting mothers harboring an autosomal CNV spanning more than 200 kb, underscoring the need for robust NIPS analysis strategies. By analyzing empirical and simulated data, we found that general, outlier-robust strategies reduce the rate of mCNV-caused false positives but not as appreciably as algorithms specifically designed to account for mCNVs. We demonstrate that large-scale tabulation of CNVs identified via routine NIPS could be clinically useful: together with the gene density of a putative microdeletion region, we show that the region's relative tolerance to duplications versus deletions may aid the interpretation of microdeletion pathogenicity. CONCLUSIONS Our study thoroughly investigates a common source of NIPS false positives and demonstrates how to bypass its corrupting effects. Our findings offer insight into the interpretation of NIPS results and inform the design of NIPS algorithms suitable for use in screening in the general obstetric population.
Affiliation(s)
- Kristjan Eerik Kaseniit
  - Myriad Women's Health (previously Counsyl), 180 Kimball Way, South San Francisco, CA, 94080, USA
- Gregory J Hogan
  - Myriad Women's Health (previously Counsyl), 180 Kimball Way, South San Francisco, CA, 94080, USA
- Kevin M D'Auria
  - Myriad Women's Health (previously Counsyl), 180 Kimball Way, South San Francisco, CA, 94080, USA
- Carrie Haverty
  - Myriad Women's Health (previously Counsyl), 180 Kimball Way, South San Francisco, CA, 94080, USA
- Dale Muzzey
  - Myriad Women's Health (previously Counsyl), 180 Kimball Way, South San Francisco, CA, 94080, USA
5
El-Kebir M, Raphael BJ, Shamir R, Sharan R, Zaccaria S, Zehavi M, Zeira R. Complexity and algorithms for copy-number evolution problems. Algorithms Mol Biol 2017; 12:13. [PMID: 28515774 PMCID: PMC5433102 DOI: 10.1186/s13015-017-0103-2] [Received: 12/23/2016] [Accepted: 04/11/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Cancer is an evolutionary process characterized by the accumulation of somatic mutations in a population of cells that form a tumor. One frequent type of mutations is copy number aberrations, which alter the number of copies of genomic regions. The number of copies of each position along a chromosome constitutes the chromosome's copy-number profile. Understanding how such profiles evolve in cancer can assist in both diagnosis and prognosis. RESULTS We model the evolution of a tumor by segmental deletions and amplifications, and gauge distance from profile [Formula: see text] to [Formula: see text] by the minimum number of events needed to transform [Formula: see text] into [Formula: see text]. Given two profiles, our first problem aims to find a parental profile that minimizes the sum of distances to its children. Given k profiles, the second, more general problem, seeks a phylogenetic tree, whose k leaves are labeled by the k given profiles and whose internal vertices are labeled by ancestral profiles such that the sum of edge distances is minimum. CONCLUSIONS For the former problem we give a pseudo-polynomial dynamic programming algorithm that is linear in the profile length, and an integer linear program formulation. For the latter problem we show it is NP-hard and give an integer linear program formulation that scales to practical problem instance sizes. We assess the efficiency and quality of our algorithms on simulated instances. AVAILABILITY https://github.com/raphael-group/CNT-ILP.
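The distance described in this abstract, the minimum number of segmental events needed to turn one copy-number profile into another, can be illustrated with a brute-force search on very small profiles. The sketch below is not the authors' dynamic program or integer linear program (those are in the linked repository). It assumes an event model that the abstract does not spell out: each event picks one contiguous segment and adds or subtracts one copy at every position in it, and a position that has reached zero copies stays at zero. The names `cn_distance`, `neighbors`, and the `cap` bound on copy numbers are hypothetical and exist only to keep the search finite.

```python
from collections import deque
from typing import Iterator, Optional, Tuple

Profile = Tuple[int, ...]


def neighbors(profile: Profile, cap: int) -> Iterator[Profile]:
    """Yield every profile reachable from `profile` by one segmental event.

    Assumed model: an event changes all positions in one contiguous
    interval by +1 or -1; positions already at 0 are left at 0.
    """
    n = len(profile)
    for start in range(n):
        for end in range(start, n):
            for delta in (1, -1):
                nxt = list(profile)
                for k in range(start, end + 1):
                    if nxt[k] > 0:
                        nxt[k] += delta
                candidate = tuple(nxt)
                # Skip no-op events and profiles above the search bound.
                if candidate != profile and max(candidate) <= cap:
                    yield candidate


def cn_distance(source: Profile, target: Profile,
                cap: Optional[int] = None) -> Optional[int]:
    """Breadth-first search for the fewest events turning source into target.

    Returns None if the target cannot be reached within the bound
    (for example, when a position lost in `source` is present in `target`).
    """
    if len(source) != len(target):
        raise ValueError("profiles must have the same length")
    if source == target:
        return 0
    if cap is None:
        # Assumption: intermediate copy numbers need at most one extra copy.
        cap = max(max(source), max(target)) + 1
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        profile, dist = queue.popleft()
        if profile == target:
            return dist
        for nxt in neighbors(profile, cap):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


if __name__ == "__main__":
    print(cn_distance((1, 1, 1), (2, 1, 2)))
    print(cn_distance((0, 1), (1, 1)))
```

The search space grows exponentially with profile length, so this is only useful for checking intuition on toy inputs; the reported value is the minimum over paths that stay within `cap`.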
Affiliation(s)
- Mohammed El-Kebir
  - Department of Computer Science, Princeton University, Princeton, NJ 08540 USA
  - Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, RI 02912 USA
- Benjamin J. Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ 08540 USA
  - Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, RI 02912 USA
- Ron Shamir
  - School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Roded Sharan
  - School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Simone Zaccaria
  - Department of Computer Science, Princeton University, Princeton, NJ 08540 USA
  - Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, RI 02912 USA
  - Dipartimento di Informatica Sistemistica e Comunicazione (DISCo), Univ. degli Studi di Milano-Bicocca, Milan, Italy
- Meirav Zehavi
  - School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Ron Zeira
  - School of Computer Science, Tel Aviv University, Tel Aviv, Israel
6
Royo JL, Pascual-Pons M, Lupiañez A, Sanchez-López I, Fibla J. Genotyping of common SIRPB1 copy number variant using Paralogue Ratio Test coupled to MALDI-MS quantification. Mol Cell Probes 2015; 29:517-21. [PMID: 26239731 DOI: 10.1016/j.mcp.2015.07.009] [Received: 06/17/2015] [Revised: 07/15/2015] [Accepted: 07/27/2015] [Indexed: 11/23/2022]
Abstract
Copy number variant (CNV) regions have been shown to have a significant impact on gene expression, and some have also been associated with human diseases. CNV genotyping is often prone to error, and cross-validation with independent methods is frequently required. The platform of choice depends on whether the study is a genome-wide discovery screen or a candidate CNV study, on the cohort size, on the number of CNVs included in the assay and, finally, on the available budget. Here we illustrate an affordable approach to determining the CNV genotype using matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS), based on the quantitative determination of single nucleotide duplicated mismatches (SNDMs) mapping to the CNV region and to a paralogue genomic region that is used as a two-copy reference. We genotyped nsv436327, a common CNV mapping to SIRPB1 intron 1 that has been associated with human personality behavior. The SIRP cluster region was subjected to several ancestral duplication events, which makes SIRPB1 CNV genotyping technically challenging. We designed three sets of primer pairs that amplified paralogue regions inside and outside the CNV, containing three SNDMs. Post-PCR extension analyses of sequencing oligonucleotides mapping immediately upstream of each SNDM allowed us to quantify, using MALDI-MS, the proportion of PCR products derived from the CNV region versus the external reference. In contrast to other approaches, setting up this genotyping method requires only a modest investment.
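The arithmetic behind a paralogue ratio readout can be sketched as follows. This is an illustration under stated assumptions, not the authors' published calling procedure: it assumes that the CNV-derived and reference-derived extension products amplify and ionize with equal efficiency, that the reference paralogue is present at exactly two copies, and that per-SNDM estimates are simply averaged and rounded. The function names and the peak values in the usage lines are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple


def copy_number_from_peaks(cnv_peak: float, ref_peak: float,
                           ref_copies: int = 2) -> float:
    """Estimate copy number from one SNDM as ref_copies * (CNV / reference).

    Assumes both alleles are detected with equal efficiency, so the peak
    ratio equals the ratio of template copies.
    """
    if cnv_peak < 0 or ref_peak <= 0:
        raise ValueError("peaks must be non-negative, reference strictly positive")
    return ref_copies * cnv_peak / ref_peak


def call_genotype(peak_pairs: Iterable[Tuple[float, float]],
                  ref_copies: int = 2) -> int:
    """Average the per-SNDM estimates and round to an integer copy number."""
    estimates = [copy_number_from_peaks(c, r, ref_copies) for c, r in peak_pairs]
    return round(mean(estimates))


if __name__ == "__main__":
    # Hypothetical peak areas (CNV product, reference product) for three SNDMs.
    one_copy_sample = [(1.0, 2.0), (0.9, 2.1), (1.1, 1.9)]
    print(call_genotype(one_copy_sample))
```

Averaging over several SNDMs is one simple way to damp assay noise; a real pipeline would more likely calibrate against samples of known copy number than rely on the equal-efficiency assumption.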