1
Wan N, Weinberg D, Liu TY, Niehaus K, Ariazi EA, Delubac D, Kannan A, White B, Bailey M, Bertin M, Boley N, Bowen D, Cregg J, Drake AM, Ennis R, Fransen S, Gafni E, Hansen L, Liu Y, Otte GL, Pecson J, Rice B, Sanderson GE, Sharma A, St John J, Tang C, Tzou A, Young L, Putcha G, Haque IS. Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA. BMC Cancer 2019; 19:832. [PMID: 31443703 PMCID: PMC6708173 DOI: 10.1186/s12885-019-6003-8] [Received: 02/10/2019] [Accepted: 07/31/2019] [Open access]
Abstract
Background: Blood-based methods using cell-free DNA (cfDNA) are under development as an alternative to existing screening tests. However, early-stage detection of cancer using tumor-derived cfDNA has proven challenging because of the small proportion of cfDNA derived from tumor tissue in early-stage disease. A machine learning approach to discover signatures in cfDNA, potentially reflective of both tumor and non-tumor contributions, may represent a promising direction for the early detection of cancer.
Methods: Whole-genome sequencing was performed on cfDNA extracted from plasma samples (N = 546 colorectal cancer and 271 non-cancer controls). Reads aligning to protein-coding gene bodies were extracted, and read counts were normalized. cfDNA tumor fraction was estimated using IchorCNA. Machine learning models were trained using k-fold cross-validation and confounder-based cross-validations to assess generalization performance.
Results: In a colorectal cancer cohort heavily weighted towards early-stage cancer (80% stage I/II), we achieved a mean AUC of 0.92 (95% CI 0.91–0.93) with a mean sensitivity of 85% (95% CI 83–86%) at 85% specificity. Sensitivity generally increased with tumor stage and increasing tumor fraction. Stratification by age, sequencing batch, and institution demonstrated the impact of these confounders and provided a more accurate assessment of generalization performance.
Conclusions: A machine learning approach using cfDNA achieved high sensitivity and specificity in a large, predominantly early-stage, colorectal cancer cohort. The possibility of systematic technical and institution-specific biases warrants similar confounder analyses in other studies. Prospective validation of this machine learning method and evaluation of a multi-analyte approach are underway.
Electronic supplementary material: The online version of this article (10.1186/s12885-019-6003-8) contains supplementary material, which is available to authorized users.
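The two evaluation ideas in this abstract, reporting sensitivity at a fixed specificity and holding out whole confounder groups during cross-validation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the leave-one-group-out scheme, and the toy scores are all assumptions made for illustration only.

```python
import math
from collections import defaultdict


def sensitivity_at_specificity(scores, labels, target_specificity=0.85):
    """Sensitivity at the lowest threshold that still meets the target specificity.

    labels: 1 = cancer, 0 = control. A sample is called positive when its
    score is strictly greater than the threshold.
    """
    controls = sorted(s for s, y in zip(scores, labels) if y == 0)
    cases = [s for s, y in zip(scores, labels) if y == 1]
    if not controls or not cases:
        raise ValueError("need at least one case and one control")
    # Number of controls that must score at or below the threshold
    # (small epsilon guards against floating-point round-up).
    needed = math.ceil(target_specificity * len(controls) - 1e-9)
    threshold = controls[needed - 1] if needed > 0 else float("-inf")
    return sum(s > threshold for s in cases) / len(cases)


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices).

    Hypothetical confounder-aware split: every sample from one group
    (for example one institution or sequencing batch) is held out together,
    so the model is never tested on a group it saw during training.
    """
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in sorted(by_group.items()):
        train_idx = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train_idx, test_idx
```

Comparing performance under such group-wise splits with performance under ordinary k-fold splits is one way to see how much a confounder such as institution contributes to the apparent signal.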

2
Abstract
Precision oncology seeks to leverage molecular information about cancer to improve patient outcomes. Tissue biopsy samples are widely used to characterize tumours but are limited by constraints on sampling frequency and their incomplete representation of the entire tumour bulk. Now, attention is turning to minimally invasive liquid biopsies, which enable analysis of tumour components (including circulating tumour cells and circulating tumour DNA) in bodily fluids such as blood. The potential of liquid biopsies is highlighted by studies that show they can track the evolutionary dynamics and heterogeneity of tumours and can detect very early emergence of therapy resistance, residual disease and recurrence. However, the analytical validity and clinical utility of liquid biopsies must be rigorously demonstrated before this potential can be realized.
Affiliation(s)
- Ellen Heitzer
- Institute of Human Genetics, Diagnostic and Research Center for Molecular BioMedicine, Medical University of Graz, Graz, Austria; BioTechMed-Graz, Graz, Austria; Christian Doppler Laboratory for Liquid Biopsies for Early Detection of Cancer, Graz, Austria.
- Michael R Speicher
- Institute of Human Genetics, Diagnostic and Research Center for Molecular BioMedicine, Medical University of Graz, Graz, Austria; BioTechMed-Graz, Graz, Austria.

3
Cecchi AC, Vengoechea ES, Kaseniit KE, Hardy MW, Kiger LA, Mehta N, Haque IS, Moyer K, Page PZ, Muzzey D, Grinzaid KA. Screening for Tay-Sachs disease carriers by full-exon sequencing with novel variant interpretation outperforms enzyme testing in a pan-ethnic cohort. Mol Genet Genomic Med 2019; 7:e836. [PMID: 31293106 PMCID: PMC6687860 DOI: 10.1002/mgg3.836] [Received: 11/13/2018] [Revised: 03/22/2019] [Accepted: 05/07/2019] [Open access]
Abstract
Background: Pathogenic variants in HEXA that impair β‐hexosaminidase A (Hex A) enzyme activity cause Tay‐Sachs Disease (TSD), a severe autosomal‐recessive neurodegenerative disorder. Hex A enzyme analysis demonstrates near‐zero activity in patients affected with TSD and can also identify carriers, whose single functional copy of HEXA results in reduced enzyme activity relative to noncarriers. Although enzyme testing has been optimized and widely used for carrier screening in Ashkenazi Jewish (AJ) individuals, it has unproven sensitivity and specificity in a pan‐ethnic population. The ability to detect HEXA variants via DNA analysis has evolved from limited targeting of a few ethnicity‐specific variants to next‐generation sequencing (NGS) of the entire coding region coupled with interpretation of any discovered novel variants.
Methods: We combined results of enzyme testing, retrospective computational analysis, and variant reclassification to estimate the respective clinical performance of TSD screening via enzyme analysis and NGS. We maximized NGS accuracy by reclassifying variants of uncertain significance and compared to the maximum performance of enzyme analysis estimated by calculating ethnicity‐specific frequencies of variants known to yield false‐positive or false‐negative enzyme results (e.g., pseudodeficiency and B1 alleles).
Results: In both AJ and non‐AJ populations, the estimated clinical sensitivity, specificity, and positive predictive value were higher by NGS than by enzyme testing. The differences were significant for all comparisons except for AJ clinical sensitivity, where NGS exceeded enzyme testing, but not significantly.
Conclusions: Our results suggest that performance of an NGS‐based TSD carrier screen that interrogates the entire coding region and employs novel variant interpretation exceeds that of Hex A enzyme testing, warranting a reconsideration of existing guidelines.
Affiliation(s)
- Melanie W Hardy
- Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia
- Laura A Kiger
- Myriad Women's Health, South San Francisco, California
- Nikita Mehta
- Myriad Women's Health, South San Francisco, California
- Imran S Haque
- Myriad Women's Health, South San Francisco, California
- Krista Moyer
- Myriad Women's Health, South San Francisco, California
- Patricia Z Page
- Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia
- Dale Muzzey
- Myriad Women's Health, South San Francisco, California
- Karen A Grinzaid
- Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia

4
Liu Y, Liu TY, Weinberg D, Torre CJDL, Tan CL, Schmitt AD, Selvaraj S, Tran V, Laurent LC, Bidard FC, Haque IS. Abstract 5177: Spatial co-fragmentation pattern of cell-free DNA recapitulates in vivo chromatin organization and identifies tissue-of-origin. Cancer Res 2019. [DOI: 10.1158/1538-7445.am2019-5177]
Abstract
Three-dimensional chromatin organization varies across different cell types and is essential for gene regulation. Functional genomic elements that reside kilobases to megabases away can be brought into spatial proximity by chromatin folding. In fixed cells, DNA fluorescence in situ hybridization and super-resolution microscopy can measure the distances between loci at ~10-100 nm resolution, while chromosome conformation capture followed by next-generation sequencing (Hi-C) is able to profile genome-wide chromatin organization at kilobase-pair level resolution by measuring contact probabilities between pairs of loci. These methods provide a static snapshot of genome compaction and organization in different cellular states. However, assessing in vivo genome-wide chromatin organization changes non-invasively and longitudinally in patients is challenging due to the limitations of current technologies. Recently, circulating cell-free DNA (cfDNA) in blood has been shown to be a promising biomarker for capturing the genetic and local epigenetic changes within patients.
Here, we inferred in vivo chromatin organization in blood cells from co-fragmentation patterns of cfDNA by using fragment lengths estimated from paired-end whole genome sequencing (WGS). We performed cfDNA WGS on 100 healthy, 34 colorectal cancer, 48 lung cancer, and 19 melanoma patients. The inferred chromatin organization is highly concordant with Hi-C performed on white blood cells and not explained by technical biases, sequence composition, or other epigenetic factors. Further, we developed methods to identify the tissue-of-origin of cfDNA based on its co-fragmentation pattern and Hi-C signal in reference cell types, which confirmed that most cfDNA in healthy individuals is derived from hematopoietic cells. In cancer patients, we observed an increased contribution to cfDNA from cancer cells that was quantitatively correlated with estimated tumor fraction in cfDNA and qualitatively matched tumor type. We also verified the results using publicly available cfDNA WGS data from different healthy and cancer patients. These results are consistent with previous studies that directly measured DNA methylation or that inferred nucleosome positions from WGS on cfDNA. However, our method has distinct advantages including using only low-coverage WGS, not requiring bisulfite treatment, and providing a more robust and quantitative estimation of cell type contributions. Collectively, our results demonstrate the potential of using cfDNA WGS to non-invasively assess the in vivo three-dimensional chromatin organization and determine tissue-of-origin in different physiological and pathological conditions, which may be useful for detecting, monitoring and treating different diseases.
Citation Format: Yaping Liu, Tzu-Yu Liu, David Weinberg, Chris J. De La Torre, Catherine L. Tan, Anthony D. Schmitt, Siddarth Selvaraj, Vy Tran, Louise C. Laurent, François-Clément Bidard, Imran S. Haque. Spatial co-fragmentation pattern of cell-free DNA recapitulates in vivo chromatin organization and identifies tissue-of-origin [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 5177.
Affiliation(s)
- Vy Tran
- University of California San Diego, San Diego, CA

5
Amorim CEG, Gao Z, Baker Z, Diesel JF, Simons YB, Haque IS, Pickrell J, Przeworski M. Correction: The population genetics of human disease: The case of recessive, lethal mutations. PLoS Genet 2018; 14:e1007499. [PMID: 29965964 PMCID: PMC6028076 DOI: 10.1371/journal.pgen.1007499] [Open access]

6
Haque IS, Otte G, Elemento O. Abstract 2225: Limitations on mutation detection for early detection of cancer. Cancer Res 2018. [DOI: 10.1158/1538-7445.am2018-2225]
Abstract
Introduction: Cell-free DNA (cfDNA) has potential utility for early non-invasive detection of cancer. We assess the feasibility of cfDNA mutational assays for early detection based on their physiological and economic requirements. We further review alternative biological signals of early cancer and the potential of machine learning to integrate these signals into reliable diagnostics.
Methods: A binomial model was used to assess depth and input requirements, with parameters derived from published data on cfDNA sequencing. Alternative strategies for early detection were assessed by literature review.
Results: 30,000x unique coverage is required for 95% sensitivity for 1 mutant read at 0.01% variant allele frequency, requiring 180ng cfDNA input at 50% process efficiency; current sequencing costs and reimbursement levels may make such tests economically infeasible (Table). 5th percentile plasma cfDNA concentration in the screening population is ~2.3 ng/mL, requiring >140mL blood collection for test failure rate <5%. Somatic mutational heterogeneity creates significant specificity challenges. Proteins, RNAs, and compartmentalized macromolecules are more abundant and may overcome cfDNA's issues of quantity, but irrelevant biological processes may confound the data. New techniques in machine learning, including latent variable modeling, may be used to integrate heterogeneous data from multiple sources to overcome these challenges.
Conclusions: Mutation-detection assays may not be feasible for early cancer detection due to limited sensitivity arising from low concentrations of cfDNA, heterogeneity leading to specificity challenges, and prohibitive cost. Integrating markers beyond mutations with modern machine learning may provide a potential route to statistically robust biomarker development for early cancer detection.
Table. Assay requirements for tumor liquid biopsy and mutation-based early cancer detection.
| | VAF (95% sensitivity) | Corrected depth | Raw depth | Input volume (blood) | Sequencing cost, TEC-Seq (1): 58 genes, 81 kb | Sequencing cost, Razavi et al. (2): 508 genes, 2000 kb | Sequencing cost, WES: ~20k genes, 50,000 kb |
| Tumor liquid biopsy | 0.1% | 3000x | 15,000x | 15 mL | $14 | $340 | $8300 |
| Early cancer detection | 0.01% | 30,000x | 150,000x | 150 mL | $140 | $3400 | $83,000 |
(1) Phallen J, et al. Sci Transl Med. 2017 Aug 16;9(403).
(2) Razavi P, et al. J Clin Oncol. 2017;35(suppl):abstr 11526.
Model parameters: 1) No more than 5% of samples may fail because of insufficient cfDNA quantity; 2) 95% sensitivity to detect one read from any cancer-derived allele, assuming that one is present in the sample; 3) 50% process efficiency: half of the cfDNA molecules in the input blood sample are represented in the sequencer output; 4) 5x oversampling in sequencing for error correction; 5) 100% on-target rate in target enrichment; 6) “$1000 genome” sequencing costs: US $1000/(30 x 3 Gbp) of sequencing bandwidth; 7) only sequencing costs computed; all other costs (labor, equipment, facilities, depreciation, etc.) accounted at $0; 8) panel expansion neither reduces input requirements nor increases sequencing requirements.
VAF = variant allele frequency; TEC-Seq = targeted error correction sequencing; WES = whole exome sequencing.
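The binomial reasoning behind these figures can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the mass of roughly 3 pg per haploid genome copy and the plasma fraction of roughly 55% of blood volume are assumptions not stated in the abstract, chosen because they are common approximations and give values close to those reported.

```python
import math

# Assumptions (not given in the abstract): approximate mass of one haploid
# human genome copy, and approximate plasma fraction of whole blood.
PG_PER_HAPLOID_GENOME = 3.0
PLASMA_FRACTION_OF_BLOOD = 0.55


def depth_for_sensitivity(vaf, sensitivity=0.95):
    """Smallest unique depth N with P(at least one mutant read) >= sensitivity,
    modelling the mutant read count as Binomial(N, vaf)."""
    return math.ceil(math.log(1.0 - sensitivity) / math.log(1.0 - vaf))


def input_mass_ng(unique_depth, process_efficiency=0.5):
    """cfDNA mass needed so that unique_depth genome copies survive the workflow."""
    copies_needed = unique_depth / process_efficiency
    return copies_needed * PG_PER_HAPLOID_GENOME / 1000.0


def blood_volume_ml(mass_ng, cfdna_ng_per_ml_plasma=2.3):
    """Blood draw needed at a given plasma cfDNA concentration."""
    plasma_ml = mass_ng / cfdna_ng_per_ml_plasma
    return plasma_ml / PLASMA_FRACTION_OF_BLOOD


def sequencing_cost_usd(panel_kb, raw_depth, usd_per_gbp=1000.0 / (30 * 3)):
    """Sequencing-only cost at the '$1000 genome' rate from the table footnote."""
    gbp = panel_kb * 1e3 * raw_depth / 1e9
    return gbp * usd_per_gbp


if __name__ == "__main__":
    depth = depth_for_sensitivity(vaf=1e-4)   # early-detection scenario
    mass = input_mass_ng(depth)
    print(depth, round(mass), round(blood_volume_ml(mass)))
    print(round(sequencing_cost_usd(panel_kb=81, raw_depth=150_000)))
```

Under these assumptions the sketch lands near the abstract's figures of about 30,000x unique coverage, about 180 ng input, and more than 140 mL of blood; small differences from the table are expected because the table values are rounded.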
Citation Format: Imran S. Haque, Gabriel Otte, Olivier Elemento. Limitations on mutation detection for early detection of cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2225.

7
Delubac D, Ariazi E, Berliner J, Drake A, Dulin J, Ennis R, Gafni E, Niehaus K, Otte G, Pecson J, Putcha G, Schaninger C, Sharma A, Singer M, Tzou A, Waters J, Weinberg D, White B, Haque IS. Abstract 2227: Multi-analyte profiling reveals relationships among circulating biomarkers in colorectal cancer. Cancer Res 2018. [DOI: 10.1158/1538-7445.am2018-2227]
Abstract
Introduction: Blood-based tests hold great promise as cancer diagnostics but until now have largely been restricted to the analysis of a single class of molecules (e.g., circulating tumor DNA, platelet mRNA, circulating proteins). The ability to analyze multiple analytes simultaneously from the same biological sample may increase the sensitivity and specificity of such tests by exploiting independent information between signals. Here, we describe an experimental and analytical system that we developed and implemented for the integrated analysis of multiple analytes from a single blood sample, which revealed examples of both correlations and orthogonality among individual analytes.
Methods: De-identified blood samples were obtained from healthy individuals, as well as individuals with pre-malignant conditions and stage I-IV colorectal cancer (CRC). After plasma separation, multiple types of analytes were assayed: cell-free DNA (cfDNA) content was assessed by low-coverage whole-genome sequencing (lcWGS) and whole-genome bisulfite sequencing (WGBS), cell-free microRNA (cf-miRNA) was assessed by small-RNA sequencing, and levels of circulating proteins were measured by quantitative immunoassay.
Results: lcWGS of plasma cfDNA was able to identify CRC samples with high tumor fraction (>20%) on the basis of copy number variation (CNV) across the genome. High tumor fractions, while more frequent in late-stage cancer samples, were observed in some stage I and II patients. Aberrant signals were also observed in cancer patients in each of the three other analytes: cf-miRNA profiles discordant with those in healthy controls, genome-wide hypomethylation at LINE1 (long interspersed nuclear element 1) CpG loci, and elevated levels of circulating carcinoembryonic antigen (CEA) and cytokeratin fragment 21-1 (CYFRA 21-1) proteins. Strikingly, aberrant profiles across analytes were indicative of high tumor fraction (as estimated from cfDNA CNV), rather than cancer stage.
Conclusion: Our data suggest that tumor fraction is correlated with cancer stage but has a large potential range, even in early stage samples. Previous literature on blood-based screens for detection of cancer has displayed discordance in the claimed ability of different single analytes to detect early stage cancer. Tumor fraction may be able to explain the historical disagreement, as we found that aberrant profiles among cf-miRNA, cfDNA methylation, and circulating protein levels were more strongly associated with high tumor fraction than with late stage. These findings suggest that some positive “early stage” detection results may in fact be “high tumor fraction” detection results. Our results further demonstrate that assaying multiple analytes from a single sample may enable the development of classifiers that are reliable at low tumor fraction and for detecting pre-malignant or early-stage disease.
Citation Format: Daniel Delubac, Eric Ariazi, Jonathan Berliner, Adam Drake, John Dulin, Riley Ennis, Erik Gafni, Kate Niehaus, Gabriel Otte, Jennifer Pecson, Girish Putcha, Corey Schaninger, Aarushi Sharma, Mike Singer, Abraham Tzou, Jill Waters, David Weinberg, Brandon White, Imran S. Haque. Multi-analyte profiling reveals relationships among circulating biomarkers in colorectal cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2227.
Affiliation(s)
- Adam Drake
- Freenome, Inc., South San Francisco, CA
- John Dulin
- Freenome, Inc., South San Francisco, CA
- Erik Gafni
- Freenome, Inc., South San Francisco, CA

8
Ghiossi CE, Goldberg JD, Haque IS, Lazarin GA, Wong KK. Clinical Utility of Expanded Carrier Screening: Reproductive Behaviors of At-Risk Couples. J Genet Couns 2018; 27:616-625. [PMID: 28956228 PMCID: PMC5943379 DOI: 10.1007/s10897-017-0160-1] [Received: 04/18/2017] [Accepted: 09/18/2017]
Abstract
Expanded carrier screening (ECS) analyzes dozens or hundreds of recessive genes to determine reproductive risk. Data on the clinical utility of screening conditions beyond professional guidelines are scarce. Individuals underwent ECS for up to 110 genes. Five-hundred thirty-seven at-risk couples (ARC), those in which both partners carry the same recessive disease, were invited to participate in a retrospective IRB-approved survey of their reproductive decision making after receiving ECS results. Sixty-four eligible ARC completed the survey. Of 45 respondents screened preconceptionally, 62% (n = 28) planned IVF with PGD or prenatal diagnosis (PNDx) in future pregnancies. Twenty-nine percent (n = 13) were not planning to alter reproductive decisions. The remaining 9% (n = 4) of responses were unclear. Of 19 pregnant respondents, 42% (n = 8) elected PNDx, 11% (n = 2) planned amniocentesis but miscarried, and 47% (n = 9) considered the condition insufficiently severe to warrant invasive testing. Of the 8 pregnancies that underwent PNDx, 5 were unaffected and 3 were affected. Two of 3 affected pregnancies were terminated. Disease severity was found to have significant association (p = 0.000145) with changes in decision making, whereas guideline status of diseases, controlled for severity, was not (p = 0.284). Most ARC altered reproductive planning, demonstrating the clinical utility of ECS. Severity of conditions factored into decision making.
Affiliation(s)
- Caroline E Ghiossi
- California State University Stanislaus, 1 University Cir, Turlock, CA, 95382, USA.
- Imran S Haque
- Counsyl, 180 Kimball Way, South San Francisco, 94080, CA, USA
- Kenny K Wong
- Counsyl, 180 Kimball Way, South San Francisco, 94080, CA, USA

9
Hogan GJ, Vysotskaia VS, Beauchamp KA, Seisenberger S, Grauman PV, Haas KR, Hong SH, Jeon D, Kash S, Lai HH, Melroy LM, Theilmann MR, Chu CS, Iori K, Maguire JR, Evans EA, Haque IS, Mar-Heyming R, Kang HP, Muzzey D. Validation of an Expanded Carrier Screen that Optimizes Sensitivity via Full-Exon Sequencing and Panel-wide Copy Number Variant Identification. Clin Chem 2018; 64:1063-1073. [PMID: 29760218 DOI: 10.1373/clinchem.2018.286823] [Received: 01/12/2018] [Accepted: 03/26/2018]
Abstract
BACKGROUND: By identifying pathogenic variants across hundreds of genes, expanded carrier screening (ECS) enables prospective parents to assess the risk of transmitting an autosomal recessive or X-linked condition. Detection of at-risk couples depends on the number of conditions tested, the prevalence of the respective diseases, and the screen's analytical sensitivity for identifying disease-causing variants. Disease-level analytical sensitivity is often <100% in ECS tests because copy number variants (CNVs) are typically not interrogated because of their technical complexity.
METHODS: We present an analytical validation and preliminary clinical characterization of a 235-gene sequencing-based ECS with full coverage across coding regions, targeted assessment of pathogenic noncoding variants, panel-wide CNV calling, and specialized assays for technically challenging genes. Next-generation sequencing, customized bioinformatics, and expert manual call review were used to identify single-nucleotide variants, short insertions and deletions, and CNVs for all genes except FMR1 and those whose low disease incidence or high technical complexity precluded novel variant identification or interpretation.
RESULTS: Screening of 36859 patients' blood or saliva samples revealed the substantial impact on fetal disease-risk detection attributable to novel CNVs (9.19% of risk) and technically challenging conditions (20.2% of risk), such as congenital adrenal hyperplasia. Of the 7498 couples screened, 335 were identified as at risk for an affected pregnancy, underscoring the clinical importance of the test. Validation of our ECS demonstrated >99% analytical sensitivity and >99% analytical specificity.
CONCLUSIONS: Validated high-fidelity identification of different variant types, especially for diseases with complicated molecular genetics, maximizes at-risk couple detection.

10
Amorim CEG, Gao Z, Baker Z, Diesel JF, Simons YB, Haque IS, Pickrell J, Przeworski M. The population genetics of human disease: The case of recessive, lethal mutations. PLoS Genet 2017; 13:e1006915. [PMID: 28957316 PMCID: PMC5619689 DOI: 10.1371/journal.pgen.1006915] [Received: 12/04/2016] [Accepted: 07/09/2017] [Open access]
Abstract
Do the frequencies of disease mutations in human populations reflect a simple balance between mutation and purifying selection? What other factors shape the prevalence of disease mutations? To begin to answer these questions, we focused on one of the simplest cases: recessive mutations that alone cause lethal diseases or complete sterility. To this end, we generated a hand-curated set of 417 Mendelian mutations in 32 genes reported to cause a recessive, lethal Mendelian disease. We then considered analytic models of mutation-selection balance in infinite and finite populations of constant sizes and simulations of purifying selection in a more realistic demographic setting, and tested how well these models fit allele frequencies estimated from 33,370 individuals of European ancestry. In doing so, we distinguished between CpG transitions, which occur at a substantially elevated rate, and three other mutation types. Intriguingly, the observed frequency for CpG transitions is slightly higher than expectation but close, whereas the frequencies observed for the three other mutation types are an order of magnitude higher than expected, with a bigger deviation from expectation seen for less mutable types. This discrepancy is even larger when subtle fitness effects in heterozygotes or lethal compound heterozygotes are taken into account. In principle, higher than expected frequencies of disease mutations could be due to widespread errors in reporting causal variants, compensation by other mutations, or balancing selection. It is unclear why these factors would have a greater impact on disease mutations that occur at lower rates, however. 
We argue instead that the unexpectedly high frequency of disease mutations and the relationship to the mutation rate likely reflect an ascertainment bias: of all the mutations that cause recessive lethal diseases, those that by chance have reached higher frequencies are more likely to have been identified and thus to have been included in this study. Beyond the specific application, this study highlights the parameters likely to be important in shaping the frequencies of Mendelian disease alleles.
What determines the frequencies of disease mutations in human populations? To begin to answer this question, we focus on one of the simplest cases: mutations that cause completely recessive, lethal Mendelian diseases. We first review theory about what to expect from mutation and selection in a population of finite size and generate predictions based on simulations using a plausible demographic scenario of recent human evolution. For a highly mutable type of mutation, transitions at CpG sites, we find that the predictions are close to the observed frequencies of recessive lethal disease mutations. For less mutable types, however, predictions substantially under-estimate the observed frequency. We discuss possible explanations for the discrepancy and point to a complication that, to our knowledge, is not widely appreciated: that there exists ascertainment bias in disease mutation discovery. Specifically, we suggest that alleles that have been identified to date are likely the ones that by chance have reached higher frequencies and are thus more likely to have been mapped. More generally, our study highlights the factors that influence the frequencies of Mendelian disease alleles.
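For orientation, the textbook infinite-population form of mutation-selection balance for a fully recessive lethal allele can be written out as below. This is the standard deterministic result, not the finite-population or simulation-based models used in the study; the symbols u (per-generation mutation rate to the disease allele) and q (disease allele frequency) are notation introduced here for illustration.

```latex
\begin{align*}
\Delta q_{\text{mutation}} &\approx u, \\
\Delta q_{\text{selection}} &= \frac{q}{1+q} - q = -\frac{q^{2}}{1+q} \approx -q^{2} \quad (q \ll 1), \\
\Delta q_{\text{mutation}} + \Delta q_{\text{selection}} = 0
  \;&\Longrightarrow\; \hat{q} \approx \sqrt{u}.
\end{align*}
```

As a purely hypothetical numerical example, a mutation rate of u = 10^{-8} per generation would give an equilibrium allele frequency of about 10^{-4} under this simple model.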
Affiliation(s)
- Carlos Eduardo G. Amorim
- Department of Biological Sciences, Columbia University, New York, NY, United States of America
- CAPES Foundation, Ministry of Education of Brazil, Brasília, DF, Brazil
- Ziyue Gao
- Howard Hughes Medical Institution, Stanford University, Stanford, CA, United States of America
- Zachary Baker
- Department of Systems Biology, Columbia University, New York, NY, United States of America
- Yuval B. Simons
- Department of Biological Sciences, Columbia University, New York, NY, United States of America
- Imran S. Haque
- Counsyl, 180 Kimball Way, South San Francisco, CA, United States of America
- Joseph Pickrell
- Department of Biological Sciences, Columbia University, New York, NY, United States of America
- New York Genome Center, New York, NY, United States of America
- Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, NY, United States of America
- Department of Systems Biology, Columbia University, New York, NY, United States of America

11
Artieri CG, Beauchamp KA, Vysotskaia VS, Welker NC, Evans EA, Chu C, Tezcan H, Haque IS. Abstract 5690: Optimized molecular barcoding enables accurate targeted mutation detection in circulating cell-free DNA (cfDNA). Cancer Res 2017. [DOI: 10.1158/1538-7445.am2017-5690]
Abstract
The evaluation of cfDNA allows novel approaches to noninvasive detection of actionable alterations, resistance mechanisms, and tumor monitoring in patients with cancer. Importantly, tumor-specific DNA fragments represent a small minority of the cfDNA and can be obscured by false positive (FP) variants introduced by chemical damage and sequencer error. To address this, we improved key processes in the design of NGS libraries, including a new molecular barcoding approach, that maximize molecular recovery while eliminating spurious variants. We engineered a set of Illumina sequencing chemistry compatible adaptors incorporating unique molecular identifiers (barcodes) enabling reconstruction of the sequence of both strands of the original DNA molecule. These barcodes incorporate a number of key design improvements as compared to published methodologies, which enhance sequencer cluster density, thereby increasing library diversity and molecular recovery. Our new design identified both chemical and sequencer errors, reducing incorrect base calls to rates below 5e-7. We validated our methodology for use in cfDNA using both dilution experiments and patient blood samples with known oncogenic alterations via a custom capture panel targeting actionable genomic alterations in a 55kb region. By identifying the molecular origin of each read, we found that the sensitivity of detection obtained from barcoded libraries followed ideal binomial sampling expectations. We obtained an average molecular depth of 1,000 molecules per site from the plasma extracted from a single blood collection tube, which corresponded to an 80% sensitivity of detection of known oncogenic single-nucleotide and indel mutations at 0.15% mutant allele frequency (MAF) in cfDNA with no FP calls. Furthermore, we successfully detected known gene-fusions at 0.5%, and amplifications (>10 copies) down to 1% MAF. 
We designed and validated custom-engineered, error-correcting sequencing adapters that are well suited to a broad range of applications requiring high-accuracy detection of ultra-low-frequency alterations.
Note: This abstract was not presented at the meeting.
Citation Format: Carlo G. Artieri, Kyle A. Beauchamp, Valentina S. Vysotskaia, Noah C. Welker, Eric A. Evans, Clement Chu, Haluk Tezcan, Imran S. Haque. Optimized molecular barcoding enables accurate targeted mutation detection in circulating cell-free DNA (cfDNA) [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 5690. doi:10.1158/1538-7445.AM2017-5690
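The binomial sampling statement in the abstract above can be checked with a short calculation. The sketch below assumes, for illustration only, that a variant is detected whenever at least one mutant molecule is sampled at a site; the function name `detection_probability` and the one-molecule threshold are assumptions, not details given in the abstract.

```python
from math import comb


def detection_probability(depth, maf, min_mutant_molecules=1):
    """Binomial probability of sampling at least `min_mutant_molecules`
    mutant molecules among `depth` molecules at allele frequency `maf`.

    The calling threshold is a hypothetical parameter for illustration.
    """
    p_below_threshold = sum(
        comb(depth, k) * maf**k * (1 - maf) ** (depth - k)
        for k in range(min_mutant_molecules)
    )
    return 1 - p_below_threshold


# Values quoted in the abstract: ~1,000 molecules per site, 0.15% MAF.
sensitivity = detection_probability(depth=1000, maf=0.0015)
print(f"{sensitivity:.3f}")
```

Under this simplified one-molecule assumption the result is roughly 78%, which is in line with the 80% sensitivity reported at 0.15% MAF.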
|
12
|
Beauchamp KA, Muzzey D, Wong KK, Hogan GJ, Karimi K, Candille SI, Mehta N, Mar-Heyming R, Kaseniit KE, Kang HP, Evans EA, Goldberg JD, Lazarin GA, Haque IS. Systematic design and comparison of expanded carrier screening panels. Genet Med 2017. [PMID: 28640244 PMCID: PMC5763154 DOI: 10.1038/gim.2017.69] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 01/06/2023] Open
Abstract
Purpose The recent growth in pan-ethnic expanded carrier screening (ECS) has raised questions about how such panels might be designed and evaluated systematically. Design principles for ECS panels might improve clinical detection of at-risk couples and facilitate objective discussions of panel choice. Methods Guided by medical-society statements, we propose a method for the design of ECS panels that aims to maximize the aggregate and per-disease sensitivity and specificity across a range of Mendelian disorders considered serious by a systematic classification scheme. We evaluated this method retrospectively using results from 474,644 de-identified carrier screens. We then constructed several idealized panels to highlight strengths and limitations of different ECS methodologies. Results Based on modeled fetal risks for “severe” and “profound” diseases, a commercially available ECS panel (Counsyl) is expected to detect 183 affected conceptuses per 100,000 US births. A screen’s sensitivity is greatly impacted by two factors: (i) the methodology used (e.g., full-exon sequencing finds more affected conceptuses than targeted genotyping) and (ii) the detection rate of the screen for diseases with high prevalence and complex molecular genetics (e.g., fragile X syndrome). Conclusion The described approaches enable principled, quantitative evaluation of which diseases and methodologies are appropriate for pan-ethnic expanded carrier screening.
Affiliation(s)
- Dale Muzzey
- Counsyl, South San Francisco, California, USA
|
13
|
Artieri CG, Haverty C, Evans EA, Goldberg JD, Haque IS, Yaron Y, Muzzey D. Noninvasive prenatal screening at low fetal fraction: comparing whole-genome sequencing and single-nucleotide polymorphism methods. Prenat Diagn 2017; 37:482-490. [PMID: 28317136 DOI: 10.1002/pd.5036] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 12/21/2016] [Revised: 02/15/2017] [Accepted: 03/14/2017] [Indexed: 12/14/2022]
Abstract
OBJECTIVE Performance of noninvasive prenatal screening (NIPS) methodologies when applied to low fetal fraction samples is not well established. The single-nucleotide polymorphism (SNP) method fails samples below a predetermined fetal fraction threshold, whereas some laboratories employing the whole-genome sequencing (WGS) method report aneuploidy calls for all samples. Here, the performance of the two methods was compared to determine which approach actually detects more fetal aneuploidies. METHODS Computational models were parameterized with up-to-date published data and used to compare the performance of the two methods at calling common fetal trisomies (T21, T18, T13) at low fetal fractions. Furthermore, clinical experience data were reviewed to determine aneuploidy detection rates based on compliance with recent invasive screening recommendations. RESULTS The SNP method's performance is dependent on the origin of the trisomy, and is lowest for the most common trisomies (maternal M1 nondisjunction). Consequently, the SNP method cannot maintain acceptable performance at fetal fractions below ~3%. In contrast, the WGS method maintains high specificity independent of fetal fraction and has >80% sensitivity for trisomies in low fetal fraction samples. CONCLUSION The WGS method will detect more aneuploidies below the fetal fraction threshold at which many labs issue a no-call result, avoiding unnecessary invasive procedures. © 2017 Counsyl Inc. Prenatal Diagnosis published by John Wiley & Sons, Ltd.
Affiliation(s)
- Eric A Evans
- Counsyl Inc., South San Francisco, California, USA
- Imran S Haque
- Counsyl Inc., South San Francisco, California, USA; Freenome, South San Francisco, California, USA
- Yuval Yaron
- Genetic Institute, Tel Aviv Sourasky Medical Center and Sackler Faculty of Medicine, Tel Aviv University, Israel
- Dale Muzzey
- Counsyl Inc., South San Francisco, California, USA
|
14
|
Lazarin GA, Haque IS, Evans EA, Goldberg JD. Smith-Lemli-Opitz syndrome carrier frequency and estimates of in utero mortality rates. Prenat Diagn 2017; 37:350-355. [PMID: 28166604 PMCID: PMC5413855 DOI: 10.1002/pd.5018] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/02/2016] [Revised: 01/23/2017] [Accepted: 01/25/2017] [Indexed: 12/19/2022]
Abstract
Objective To tabulate individual allele frequencies and total carrier frequency for Smith–Lemli–Opitz syndrome (SLOS) and compare expected versus observed birth incidences. Methods A total of 262,399 individuals with no known indication or increased probability of SLOS carrier status, primarily US based, were screened for SLOS mutations as part of an expanded carrier screening panel. Results were retrospectively analyzed to estimate carrier frequencies in multiple ethnic groups. SLOS birth incidences obtained from existing literature were then compared with these data to estimate the effect of SLOS on fetal survival. Results Smith–Lemli–Opitz syndrome carrier frequency is highest in Ashkenazi Jews (1 in 43) and Northern Europeans (1 in 54). Comparing predicted birth incidence with that observed in published literature suggests that approximately 42% to 88% of affected conceptuses experience prenatal demise. Conclusion Smith–Lemli–Opitz syndrome is relatively frequent in certain populations and, because of its impact on prenatal and postnatal morbidity and mortality, merits consideration for routine screening. © 2017 The Authors. Prenatal Diagnosis published by John Wiley & Sons, Ltd.
What's already known about this topic?
Smith–Lemli–Opitz syndrome is an autosomal recessive multiple congenital anomaly syndrome with varying frequency estimates. Smith–Lemli–Opitz syndrome is presumed to be associated with an increased risk for pregnancy loss, although this risk has not been quantified.
What does this study add?
By reporting results from a large, diverse tested population, these data define the carrier frequency in multiple ethnic groups. Predicted Smith–Lemli–Opitz syndrome frequency at birth is compared with actual frequencies from previous studies, enabling estimation of the pregnancy loss frequency.
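The comparison of predicted and observed birth incidence described above can be sketched as a short calculation. This is a simplified illustration, assuming random mating within one population and a purely autosomal recessive model; the function names and the observed birth incidence of 1 in 40,000 are hypothetical placeholders, not figures from the study.

```python
def expected_affected_fraction(carrier_frequency):
    """Expected fraction of conceptuses affected by a recessive condition.

    Both parents must be carriers (carrier_frequency squared), and each
    carrier couple has a 1-in-4 chance of an affected conceptus.
    """
    return carrier_frequency**2 / 4


def implied_prenatal_loss(carrier_frequency, observed_birth_incidence):
    """Fraction of affected conceptuses not observed at birth."""
    expected = expected_affected_fraction(carrier_frequency)
    return 1 - observed_birth_incidence / expected


# Carrier frequency quoted in the abstract for Northern Europeans: 1 in 54.
expected = expected_affected_fraction(1 / 54)

# HYPOTHETICAL observed incidence, chosen only to illustrate the arithmetic.
loss = implied_prenatal_loss(1 / 54, 1 / 40000)

print(f"expected: 1 in {1 / expected:.0f}")
print(f"implied prenatal loss: {loss:.1%}")
```

With a carrier frequency of 1 in 54, the expected incidence among conceptuses is 1 in 11,664; the loss estimate then depends entirely on which published birth incidence is used for comparison.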
|
15
|
Vysotskaia VS, Hogan GJ, Gould GM, Wang X, Robertson AD, Haas KR, Theilmann MR, Spurka L, Grauman PV, Lai HH, Jeon D, Haliburton G, Leggett M, Chu CS, Iori K, Maguire JR, Ready K, Evans EA, Kang HP, Haque IS. Development and validation of a 36-gene sequencing assay for hereditary cancer risk assessment. PeerJ 2017; 5:e3046. [PMID: 28243543 PMCID: PMC5326550 DOI: 10.7717/peerj.3046] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 12/07/2016] [Accepted: 01/30/2017] [Indexed: 12/12/2022] Open
Abstract
The past two decades have brought many important advances in our understanding of the hereditary susceptibility to cancer. Numerous studies have provided convincing evidence that identification of germline mutations associated with hereditary cancer syndromes can lead to reductions in morbidity and mortality through targeted risk management options. Additionally, advances in gene sequencing technology now permit the development of multigene hereditary cancer testing panels. Here, we describe the 2016 revision of the Counsyl Inherited Cancer Screen for detecting single-nucleotide variants (SNVs), short insertions and deletions (indels), and copy number variants (CNVs) in 36 genes associated with an elevated risk for breast, ovarian, colorectal, gastric, endometrial, pancreatic, thyroid, prostate, melanoma, and neuroendocrine cancers. To determine test accuracy and reproducibility, we performed a rigorous analytical validation across 341 samples, including 118 cell lines and 223 patient samples. The screen achieved 100% test sensitivity across different mutation types, with high specificity and 100% concordance with conventional Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). We also demonstrated the screen's high intra-run and inter-run reproducibility and robust performance on blood and saliva specimens. Furthermore, we showed that pathogenic Alu element insertions can be accurately detected by our test. Overall, the validation in our clinical laboratory demonstrated the analytical performance required for collecting and reporting genetic information related to risk of developing hereditary cancers.
Affiliation(s)
- Gregory J. Hogan
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Genevieve M. Gould
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Xin Wang
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Alex D. Robertson
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Current affiliation: Color Genomics, Inc., Burlingame, CA, United States
- Kevin R. Haas
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Mark R. Theilmann
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Lindsay Spurka
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Peter V. Grauman
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Henry H. Lai
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Diana Jeon
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Genevieve Haliburton
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Matt Leggett
- Project Management Department, Counsyl, Inc, South San Francisco, CA, United States
- Clement S. Chu
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Kevin Iori
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Jared R. Maguire
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Kaylene Ready
- Medical Affairs Department, Counsyl, Inc, South San Francisco, CA, United States
- Eric A. Evans
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Hyunseok P. Kang
- Clinical Laboratory, Counsyl, Inc, South San Francisco, California, United States
- Imran S. Haque
- Research and Development Department, Counsyl, Inc, South San Francisco, CA, United States
- Current affiliation: Freenome, Inc., South San Francisco, CA, United States
|
16
|
|
17
|
Kaseniit KE, Theilmann MR, Robertson A, Evans EA, Haque IS. Group Testing Approach for Trinucleotide Repeat Expansion Disorder Screening. Clin Chem 2016; 62:1401-8. [DOI: 10.1373/clinchem.2016.259796] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/27/2016] [Accepted: 07/22/2016] [Indexed: 11/06/2022]
Abstract
BACKGROUND
Fragile X syndrome (FXS, OMIM #300624) is an X-linked condition caused by trinucleotide repeat expansions in the 5′ UTR (untranslated region) of the fragile X mental retardation 1 (FMR1) gene. FXS testing is commonly performed in expanded carrier screening and has been proposed for inclusion in newborn screening. However, because pathogenic alleles are long and have low complexity (>200 CGG repeats), FXS is currently tested by a single-plex electrophoresis-resolved PCR assay rather than multiplexed approaches like next-generation sequencing or mass spectrometry. In this work, we sought an experimental design based on nonadaptive group testing that could accurately and reliably identify the size of abnormally expanded FMR1 alleles of males and females.
METHODS
We developed a new group testing scheme named StairCase (SC) that was designed to the constraints of the FXS testing problem, and compared its performance to existing group testing schemes by simulation. We experimentally evaluated SC's performance on 210 samples from the Coriell Institute biorepositories using pooled PCR followed by capillary electrophoresis on 3 replicates of each of 3 pooling layouts differing by the mapping of samples to pools.
RESULTS
The SC pooled PCR approach demonstrated perfect classification of samples by clinical category (normal, intermediate, premutation, or full mutation) for 90 positives and 1800 negatives, with a batch of 210 samples requiring only 21 assays.
CONCLUSIONS
Group testing based on SC is an implementable approach to trinucleotide repeat expansion disorder testing that offers ≥10-fold reduction in assay costs over current single-plex methods.
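The general idea of nonadaptive group testing can be illustrated with a small sketch. The abstract does not specify the StairCase layout, so the code below is not that design: it uses a hypothetical 3×3 row-and-column pooling grid and a binary positive/negative readout, whereas the assay described above resolves allele sizes by capillary electrophoresis. All names are illustrative.

```python
def decode_pools(pools, pool_is_positive):
    """Return samples that cannot be ruled out by any negative pool.

    A sample that appears in at least one negative pool must itself be
    negative; every remaining sample is a candidate positive.
    """
    all_samples = set().union(*pools)
    cleared = set()
    for members, positive in zip(pools, pool_is_positive):
        if not positive:
            cleared |= members
    return all_samples - cleared


# Hypothetical layout: 9 samples on a 3x3 grid, one pool per row and column.
rows = [set(range(r * 3, r * 3 + 3)) for r in range(3)]
cols = [set(range(c, 9, 3)) for c in range(3)]
pools = rows + cols  # 6 assays instead of 9

# Simulated ground truth: sample 4 carries an expanded allele.
true_positives = {4}
results = [bool(members & true_positives) for members in pools]

candidates = decode_pools(pools, results)
print(candidates)
```

Here 9 samples are resolved with 6 assays; the StairCase layout in the abstract reports a larger saving, with 210 samples in 21 assays.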
|
18
|
Abstract
IMPORTANCE Screening for carrier status of a limited number of single-gene conditions is the current standard of prenatal care. Methods have become available allowing rapid expanded carrier screening for a substantial number of conditions. OBJECTIVES To quantify the modeled risk of recessive conditions identifiable by an expanded carrier screening panel in individuals of diverse racial and ethnic backgrounds and to compare the results with those from current screening recommendations. DESIGN, SETTING, AND PARTICIPANTS Retrospective modeling analysis of results between January 1, 2012, and July 15, 2015, from expanded carrier screening in reproductive-aged individuals without known indication for specific genetic testing, primarily from the United States. Tests were offered by clinicians providing reproductive care. EXPOSURES Individuals were tested for carrier status for up to 94 severe or profound conditions. MAIN OUTCOMES AND MEASURES Risk was defined as the probability that a hypothetical fetus created from a random pairing of individuals (within or across 15 self-reported racial/ethnic categories; there were 11 categories with >5000 samples) would be homozygous or compound heterozygous for 2 mutations presumed to cause severe or profound disease. Severe conditions were defined as those that if left untreated cause intellectual disability or a substantially shortened lifespan; profound conditions were those causing both. RESULTS The study included 346,790 individuals. Among major US racial/ethnic categories, the calculated frequency of fetuses potentially affected by a profound or severe condition ranged from 94.5 per 100,000 (95% CI, 82.4-108.3 per 100,000) for Hispanic couples to 392.2 per 100,000 (95% CI, 366.3-420.2 per 100,000) for Ashkenazi Jewish couples. 
In most racial/ethnic categories, expanded carrier screening modeled more hypothetical fetuses at risk for severe or profound conditions than did screening based on current professional guidelines (Mann-Whitney P < .001). For Northern European couples, the 2 professional guidelines-based screening panels modeled 55.2 hypothetical fetuses affected per 100,000 (95% CI, 51.3-59.3 per 100,000) and the expanded carrier screening modeled 159.2 fetuses per 100,000 (95% CI, 150.4-168.6 per 100,000). Overall, relative to expanded carrier screening, guideline-based screening ranged from identification of 6% (95% CI, 4%-8%) of hypothetical fetuses affected for East Asian couples to 87% (95% CI, 84%-90%) for African or African American couples. CONCLUSIONS AND RELEVANCE In a population of diverse races and ethnicities, expanded carrier screening may increase the detection of carrier status for a variety of potentially serious genetic conditions compared with current recommendations from professional societies. Prospective studies comparing current standard-of-care carrier screening with expanded carrier screening in at-risk populations are warranted before expanded screening is adopted.
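The risk definition used in this abstract (a hypothetical fetus from a random pairing of two individuals) can be sketched for autosomal recessive conditions. This is a simplified illustration: it ignores X-linked conditions, treats conditions as independent, and sums small per-condition probabilities. The condition names and carrier rates are hypothetical, not values from the study.

```python
def modeled_fetal_risk(carrier_rates_a, carrier_rates_b):
    """Approximate probability that a fetus from a random pairing is affected.

    For each autosomal recessive condition, both parents must be carriers
    and each passes on the pathogenic allele with probability 1/2.
    """
    return sum(
        rate_a * carrier_rates_b.get(condition, 0.0) / 4
        for condition, rate_a in carrier_rates_a.items()
    )


# HYPOTHETICAL carrier rates for two self-reported groups.
group_a = {"condition_1": 1 / 25, "condition_2": 1 / 50}
group_b = {"condition_1": 1 / 30, "condition_2": 1 / 50}

risk = modeled_fetal_risk(group_a, group_b)
print(f"{risk * 100_000:.1f} affected per 100,000")
```

Reporting the result per 100,000 matches the units used in the abstract and makes within-group and across-group pairings directly comparable.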
Affiliation(s)
- Imran S Haque
- Departments of Medical Affairs and Research, Counsyl, South San Francisco, California
- Gabriel A Lazarin
- Departments of Medical Affairs and Research, Counsyl, South San Francisco, California
- H Peter Kang
- Departments of Medical Affairs and Research, Counsyl, South San Francisco, California
- Eric A Evans
- Departments of Medical Affairs and Research, Counsyl, South San Francisco, California
- James D Goldberg
- Departments of Medical Affairs and Research, Counsyl, South San Francisco, California
- Ronald J Wapner
- Division of Reproductive Genetics, Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, New York
|
19
|
Wong KK, Goldberg JD, Evans EA, Kang HP, Haque IS. Re: Carrier Screening is a Deficient Strategy for Determining Sperm Donor Eligibility and Reducing Risk of Disease in Recipient Children (From: Silver AJ, Larson JL, Silver MJ, et al. Genet Test Mol Biomarkers 2016;20:276-284). Genet Test Mol Biomarkers 2016; 20:413-4. [PMID: 27505438 DOI: 10.1089/gtmb.2016.29019.kkw] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
|
20
|
Mehta N, Lazarin GA, Spiegel E, Berentsen K, Brennan K, Giordano J, Haque IS, Wapner R. Tay-Sachs Carrier Screening by Enzyme and Molecular Analyses in the New York City Minority Population. Genet Test Mol Biomarkers 2016; 20:504-9. [PMID: 27362553 PMCID: PMC5314723 DOI: 10.1089/gtmb.2015.0302] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022] Open
Abstract
Background and Aims: Carrier screening for Tay-Sachs disease is performed by sequence analysis of the HEXA gene and/or hexosaminidase A enzymatic activity testing. Enzymatic analysis (EA) has been suggested as the optimal carrier screening method, especially in non-Ashkenazi Jewish (non-AJ) individuals, but its utilization and efficacy have not been fully evaluated in the general population. This study assesses the reliability of EA in comparison with HEXA sequence analysis in non-AJ populations. Methods: Five hundred eight Hispanic and African American patients (516 samples) had EA of their leukocytes performed and 12 of these patients who tested positive by EA (“carriers”) had subsequent HEXA gene sequencing performed. Results: Of the 508 patients, 25 (4.9%) were EA positive and 40 (7.9%) were inconclusive. Of the 12 patients who were sequenced, 11 did not carry a pathogenic variant and one carried a likely deleterious mutation (NM_000520.4(HEXA):c.1510C>T). Conclusions: High inconclusive rates and poor correlation between positive/inconclusive enzyme results and identification of pathogenic mutations suggest that ethnic-specific recalibration of reference ranges for EA may be necessary. Alternatively, HEXA gene sequencing could be performed.
Affiliation(s)
- Erica Spiegel
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, New York
- Kelly Brennan
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, New York
- Jessica Giordano
- Counsyl, South San Francisco, California; Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, New York; Department of OBGYN-MFM, Columbia Doctors Midtown, New York, New York
- Ronald Wapner
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, New York
|
21
|
Kang HP, Maguire JR, Chu CS, Haque IS, Lai H, Mar-Heyming R, Ready K, Vysotskaia VS, Evans EA. Design and validation of a next generation sequencing assay for hereditary BRCA1 and BRCA2 mutation testing. PeerJ 2016; 4:e2162. [PMID: 27375968 PMCID: PMC4928470 DOI: 10.7717/peerj.2162] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 10/23/2015] [Accepted: 06/01/2016] [Indexed: 12/02/2022] Open
Abstract
Hereditary breast and ovarian cancer syndrome, caused by a germline pathogenic variant in the BRCA1 or BRCA2 (BRCA1/2) genes, is characterized by an increased risk for breast, ovarian, pancreatic and other cancers. Identification of those who have a BRCA1/2 mutation is important so that they can take advantage of genetic counseling, screening, and potentially life-saving prevention strategies. We describe the design and analytic validation of the Counsyl Inherited Cancer Screen, a next-generation-sequencing-based test to detect pathogenic variation in the BRCA1 and BRCA2 genes. We demonstrate that the test is capable of detecting single-nucleotide variants (SNVs), short insertions and deletions (indels), and copy-number variants (CNVs, also known as large rearrangements) with zero errors over a 114-sample validation set consisting of samples from cell lines and deidentified patient samples, including 36 samples with BRCA1/2 pathogenic germline mutations.
Affiliation(s)
- Clement S Chu
- Counsyl Inc., South San Francisco, CA, United States
- Imran S Haque
- Counsyl Inc., South San Francisco, CA, United States
- Henry Lai
- Counsyl Inc., South San Francisco, CA, United States
- Kaylene Ready
- Counsyl Inc., South San Francisco, CA, United States
- Eric A Evans
- Counsyl Inc., South San Francisco, CA, United States
|
22
|
Abstract
Carrier screening is the practice of testing individuals to identify those at increased risks of having children affected by genetic diseases. Professional guidelines on carrier screening have been available for more than 15 years, and have historically targeted specific diseases that occur at increased frequencies in defined ethnic populations. Enabled by rapidly evolving technology, expanded carrier screening aims to identify carriers for a broader array of diseases and may be applied universally (equally across all ethnic groups). This new approach deviates from the well-established criteria for screening models. In this review, we summarize the rationale for expanded carrier screening using available literature regarding clinical and technical data, as well as provider perspectives. We also discuss important avenues for further research in this burgeoning field.
Affiliation(s)
- Imran S Haque
- Counsyl, 180 Kimball Way, South San Francisco, CA 94080
|
23
|
Lazarin GA, Hawthorne F, Collins NS, Platt EA, Evans EA, Haque IS. Systematic Classification of Disease Severity for Evaluation of Expanded Carrier Screening Panels. PLoS One 2014; 9:e114391. [PMID: 25494330 PMCID: PMC4262393 DOI: 10.1371/journal.pone.0114391] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 12/09/2013] [Accepted: 11/09/2014] [Indexed: 12/02/2022] Open
Abstract
Professional guidelines dictate that disease severity is a key criterion for carrier screening. Expanded carrier screening, which tests for hundreds to thousands of mutations simultaneously, requires an objective, systematic means of describing a given disease's severity to build screening panels. We hypothesized that diseases with characteristics deemed to be of highest impact would likewise be rated as most severe, and diseases with characteristics of lower impact would be rated as less severe. We describe a pilot test of this hypothesis in which we surveyed 192 health care professionals to determine the impact of specific disease phenotypic characteristics on perceived severity, and asked the same group to rate the severity of selected inherited diseases. The results support the hypothesis: we identified four “Tiers” of disease characteristics (1–4). Based on these responses, we developed an algorithm that, based on the combination of characteristics normally seen in an affected individual, classifies the disease as Profound, Severe, Moderate, or Mild. This algorithm allows simple classification of disease severity that is replicable and not labor intensive.
Affiliation(s)
- Gabriel A. Lazarin
- Counsyl, South San Francisco, California, United States of America
- Eric A. Evans
- Counsyl, South San Francisco, California, United States of America
- Imran S. Haque
- Counsyl, South San Francisco, California, United States of America
|
24
|
Abstract
Molecular similarity has been effectively applied to many problems in cheminformatics and computational drug discovery, but modern methods can be prohibitively expensive for large-scale applications. The SCISSORS method rapidly approximates measures of pairwise molecular similarity such as ROCS and LINGO Tanimotos, acting as a filter to quickly reduce the size of a problem. We report an in-depth analysis of SCISSORS performance, including a mapping of the SCISSORS error distribution, benchmarking, and investigation of several algorithmic modifications. We show that SCISSORS can accurately predict multiconformer similarity and suggest a method for estimating optimal SCISSORS parameters in a data set-specific manner. These results are a useful resource for researchers seeking to incorporate SCISSORS into molecular similarity applications.
Affiliation(s)
- Steven M Kearnes
- Department of Structural Biology, Stanford University, Stanford, California 94305, United States
|
25
|
|
26
|
Ready K, Haque IS, Srinivasan BS, Marshall JR. Knowledge and attitudes regarding expanded genetic carrier screening among women's healthcare providers. Fertil Steril 2011; 97:407-13. [PMID: 22137493 DOI: 10.1016/j.fertnstert.2011.11.007] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 06/17/2011] [Revised: 09/28/2011] [Accepted: 11/14/2011] [Indexed: 11/29/2022]
Abstract
OBJECTIVE To determine women's healthcare providers' knowledge and attitudes regarding genetic disorders and expanded genetic screening. DESIGN Survey of American Society for Reproductive Medicine 2010 and American College of Obstetricians and Gynecologists 2011 Annual Meeting attendees. The survey included 60 items (12 demographic, 10 knowledge, and 38 attitude). Attitudes were assessed with a 5-point Likert scale. Chi-square or t tests determined significance. SETTING American Society for Reproductive Medicine 2010 and American College of Obstetricians and Gynecologists 2011 Annual Meeting. PATIENT(S) A total of 203 participants completed the survey. Of these, 48% were male, 61% were physicians, 73% were Caucasian, and 42% were aged 35-50 years. INTERVENTION(S) None. MAIN OUTCOME MEASURE(S) None. RESULT(S) Physicians had better knowledge scores than other participants (87% vs. 79%). Knowledge was not influenced by prior personal/family experience with genetic screening. Fewer correct answers were observed for the probability of a positive test (65.2%), the risk of transmitting a gene mutation (62.2%), and the risk of having an affected child (56.2%). Very few participants (18.3%) disagreed with the notion of carrier screening as socially responsible behavior. Some had concerns about test result confidentiality (40.1%) and resulting insurance rate increases (37.0%). Assuming equal costs, most participants preferred to be tested for a larger number of diseases (77.7%) and believed posttest counseling to be helpful (83.7%). CONCLUSION(S) Women's healthcare providers generally had good knowledge and positive attitudes about genetic disorders and expanded genetic screening. Specific misperceptions, both medical and legal, require education.
|
27
|
Abstract
The SCISSORS method for approximating chemical similarities has shown excellent empirical performance on a number of real-world chemical data sets but lacks theoretically proven bounds on its worst-case error performance. This paper first proves reductions showing SCISSORS to be equivalent to two previous kernel methods: kernel principal components analysis and the rank-k Nyström approximation of a Gram matrix. These reductions allow the use of generalization bounds on these techniques to show that the expected error in SCISSORS approximations of molecular similarity kernels is bounded in expected pairwise inner product error, in matrix 2-norm and Frobenius norm for full kernel matrix approximations and in root-mean-square deviation for approximated matrices. Finally, we show that the actual performance of SCISSORS is significantly better than these worst-case bounds, indicating that chemical space is well-structured for chemical sampling algorithms.
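The Nyström approximation mentioned above can be illustrated on a toy problem. This is a generic sketch, not the SCISSORS implementation: it uses a linear kernel on 2-D points rather than a chemical similarity measure, fixes the number of landmarks at two so that the inverse can be written out by hand, and all function names are illustrative. With n points and m landmarks, only n × m kernel evaluations are needed instead of n × n.

```python
def linear_kernel(p, q):
    """Inner product of two vectors (stand-in for a similarity kernel)."""
    return sum(a * b for a, b in zip(p, q))


def nystrom_two_landmarks(kernel, points, landmark_a, landmark_b):
    """Rank-2 Nystrom approximation K ~ C W^-1 C^T using two landmarks.

    Assumes the 2x2 landmark kernel matrix W is invertible.
    """
    la, lb = points[landmark_a], points[landmark_b]
    w = [[kernel(la, la), kernel(la, lb)],
         [kernel(lb, la), kernel(lb, lb)]]
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    w_inv = [[w[1][1] / det, -w[0][1] / det],
             [-w[1][0] / det, w[0][0] / det]]
    # C holds each point's kernel values against the two landmarks only.
    c = [[kernel(p, la), kernel(p, lb)] for p in points]
    n = len(points)
    return [
        [sum(c[i][r] * w_inv[r][s] * c[j][s]
             for r in range(2) for s in range(2))
         for j in range(n)]
        for i in range(n)
    ]


points = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
approx = nystrom_two_landmarks(linear_kernel, points, 0, 1)
exact = [[linear_kernel(p, q) for q in points] for p in points]
```

In this toy case the Gram matrix has rank 2, so two linearly independent landmarks reproduce it exactly; for real similarity kernels the approximation carries an error, which is what the bounds in the abstract address.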
Affiliation(s)
- Imran S Haque
- Department of Computer Science, Stanford University, Stanford, California, United States
|
28
|
Abstract
Similarity measures based on the comparison of dense bit vectors of two-dimensional chemical features are a dominant method in chemical informatics. For large-scale problems, including compound selection and machine learning, computing the intersection between two dense bit vectors is the overwhelming bottleneck. We describe efficient implementations of this primitive as well as example applications using features of modern CPUs that allow 20-40× performance increases relative to typical code. Specifically, we describe fast methods for population count on modern x86 processors and cache-efficient matrix traversal and leader clustering algorithms that alleviate memory bandwidth bottlenecks in similarity matrix construction and clustering. The speed of our 2D comparison primitives is within a small factor of that obtained on GPUs and does not require specialized hardware.
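The bit-vector comparison described above reduces to population counts. The sketch below shows only the arithmetic, using Python integers as fingerprints; it does not reproduce the x86 population-count or cache-aware techniques the abstract refers to. The function names and the convention of returning 0.0 for two empty fingerprints are assumptions.

```python
def popcount(bits):
    """Number of set bits in a non-negative integer."""
    return bin(bits).count("1")


def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto similarity: |A and B| / |A or B| over set bits."""
    intersection = popcount(fingerprint_a & fingerprint_b)
    union = popcount(fingerprint_a) + popcount(fingerprint_b) - intersection
    # Convention assumed here: two empty fingerprints have similarity 0.0.
    return intersection / union if union else 0.0


a = 0b10110110  # 5 bits set
b = 0b10011100  # 4 bits set
similarity = tanimoto(a, b)  # 3 shared bits, 6 bits in the union
print(similarity)
```

Computing the union size from the two individual counts and the intersection means only one bitwise operation per pair is needed once per-fingerprint counts are cached.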
|
29
|
Beauchamp KA, Bowman GR, Lane TJ, Maibaum L, Haque IS, Pande VS. MSMBuilder2: Modeling Conformational Dynamics at the Picosecond to Millisecond Scale. J Chem Theory Comput 2011; 7:3412-3419. [PMID: 22125474 DOI: 10.1021/ct200463m] [Citation(s) in RCA: 298] [Impact Index Per Article: 22.9] [Indexed: 12/26/2022]
Abstract
Markov State Models provide a framework for understanding the fundamental states and rates in the conformational dynamics of biomolecules. We describe an improved protocol for constructing Markov State Models from molecular dynamics simulations. The new protocol includes advances in clustering, data preparation, and model estimation; these improvements lead to significant increases in model accuracy, as assessed by the ability to recapitulate equilibrium and kinetic properties of reference systems. A high-performance implementation of this protocol, provided in MSMBuilder2, is validated on dynamics ranging from picoseconds to milliseconds.
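To make the model-estimation step concrete, the sketch below builds a transition matrix from a trajectory that has already been assigned to discrete states, by counting transitions at a fixed lag time and normalizing each row. This is the simplest possible estimator and is shown only as an illustration; it is not the MSMBuilder2 API, and the estimator used in the paper may differ. The function name, the toy trajectory, and the handling of states with no outgoing transitions are assumptions.

```python
from collections import Counter


def transition_matrix(state_seq, n_states, lag=1):
    """Row-normalized transition counts at the given lag (lag >= 1)."""
    counts = Counter(zip(state_seq[:-lag], state_seq[lag:]))
    matrix = []
    for i in range(n_states):
        row = [counts[(i, j)] for j in range(n_states)]
        total = sum(row)
        # Assumed convention: a state with no observed exits gets a zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy trajectory of cluster (state) indices, one entry per saved frame.
traj = [0, 0, 1, 1, 0, 2, 2, 0, 0, 1]
T = transition_matrix(traj, n_states=3)
for row in T:
    print(row)
```

Each row of `T` estimates the probability of moving from one state to every other state after one lag time, which is the quantity from which equilibrium and kinetic properties are derived.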
|
30
|
Abstract
Algorithms for several emerging large-scale problems in cheminformatics have as their rate-limiting step the evaluation of relatively slow chemical similarity measures, such as structural similarity or three-dimensional (3-D) shape comparison. In this article we present SCISSORS, a linear-algebraic technique (related to multidimensional scaling and kernel principal components analysis) to rapidly estimate chemical similarities for several popular measures. We demonstrate that SCISSORS faithfully reflects its source similarity measures for both Tanimoto calculation and rank ordering. After an efficient precalculation step on a database, SCISSORS affords several orders of magnitude of speedup in database screening. SCISSORS furthermore provides an asymptotic speedup for large similarity matrix construction problems, reducing the number of conventional slow similarity evaluations required from quadratic to linear scaling.
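The quadratic-to-linear reduction can be illustrated with a small sketch under stated assumptions. Each library molecule is compared, using the slow measure, only against a fixed set of basis molecules; those similarities are turned into coordinates, and all remaining pairwise similarities are estimated as fast inner products of coordinates. The sketch factors the basis similarity matrix with a Cholesky decomposition instead of the eigendecomposition implied by the kernel PCA relationship; with a positive-definite basis matrix and no rank truncation, the resulting inner products are the same. The toy "molecules" are unit vectors, the slow measure is a plain inner product, and all names are hypothetical. The conversion from inner products back to Tanimoto values is omitted.

```python
import math


def cholesky(g):
    """Lower-triangular L with L L^T = g (g must be positive definite)."""
    n = len(g)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(g[i][i] - s)
            else:
                low[i][j] = (g[i][j] - s) / low[j][j]
    return low


def embed(low, sims_to_basis):
    """Solve low @ x = sims_to_basis by forward substitution."""
    x = []
    for i, row in enumerate(low):
        s = sum(row[k] * x[k] for k in range(i))
        x.append((sims_to_basis[i] - s) / row[i])
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def slow_similarity(u, v):
    """Stand-in for an expensive similarity measure (toy assumption)."""
    return dot(u, v)


basis = [(1.0, 0.0), (0.6, 0.8)]
library = [(0.0, 1.0), (0.8, 0.6), (-0.6, 0.8)]

# Precalculation: k*k slow evaluations for the basis, then k per molecule.
gram = [[slow_similarity(p, q) for q in basis] for p in basis]
low = cholesky(gram)
coords = [embed(low, [slow_similarity(m, b) for b in basis]) for m in library]

# Any library pair is now estimated without calling slow_similarity again.
approx = dot(coords[0], coords[1])
exact = slow_similarity(library[0], library[1])
print(approx, exact)
```

In this toy case the estimate matches the exact value because the data are truly two-dimensional and the measure is an exact inner product; for real similarity measures the result is an approximation.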
Affiliation(s)
- Imran S Haque
- Department of Computer Science and Department of Chemistry, Stanford University, Stanford, CA, USA
|
31
|
Abstract
LINGOs are a holographic measure of chemical similarity based on text comparison of SMILES strings. We present a new algorithm for calculating LINGO similarities amenable to parallelization on SIMD architectures (such as GPUs and vector units of modern CPUs). We show that it is nearly 3x as fast as existing algorithms on a CPU, and over 80x faster than existing methods when run on a GPU.
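For orientation, the sketch below shows one way a LINGO-style similarity can be computed: each SMILES string is broken into its overlapping substrings of length q, and two molecules are compared by a Tanimoto coefficient over the resulting multisets. The substring length q = 4 and the multiset form of the coefficient are assumptions made for illustration, and any SMILES preprocessing is omitted. This is a plain reference computation, not the SIMD-parallel algorithm the paper presents.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Multiset of all length-q substrings of a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_tanimoto(a: str, b: str, q: int = 4) -> float:
    """Multiset Tanimoto: shared substring count over combined count."""
    la, lb = lingos(a, q), lingos(b, q)
    shared = sum((la & lb).values())     # element-wise minimum of counts
    combined = sum((la | lb).values())   # element-wise maximum of counts
    return shared / combined if combined else 0.0


phenol = "c1ccccc1O"
aniline = "c1ccccc1N"
print(lingo_tanimoto(phenol, aniline))  # 5 shared of 7 combined substrings
```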
Affiliation(s)
- Imran S Haque
- Department of Computer Science, Stanford University, Stanford, California, USA
|
32
|
Abstract
Modern graphics processing units (GPUs) are flexibly programmable and have peak computational throughput significantly higher than that of conventional CPUs. Herein, we describe the design and implementation of PAPER, an open-source implementation of Gaussian molecular shape overlay for NVIDIA GPUs. We demonstrate speedups of one to two orders of magnitude on high-end commodity GPU hardware relative to a reference CPU implementation of the shape overlay algorithm and speedups of over one order of magnitude relative to the commercial OpenEye ROCS package. In addition, we describe errors incurred by approximations used in common implementations of the algorithm.
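The quantity being optimized in Gaussian shape overlay can be sketched as follows, under several assumptions. Each atom is modeled as a spherical Gaussian, the overlap volume of two molecules is taken as the sum of pairwise atom-atom overlap integrals (a first-order approximation, one of the kinds of approximation whose errors the abstract mentions), and a shape Tanimoto is formed from the overlaps. The width `alpha` and amplitude `p` are illustrative values shared by all atoms, and the function names are hypothetical. The sketch only scores a fixed pose; it performs no rotation or translation search and is not PAPER's GPU implementation.

```python
import math


def pair_overlap(ri, rj, alpha=0.8, p=2.7):
    """Overlap integral of two equal-width spherical Gaussians."""
    d2 = sum((a - b) ** 2 for a, b in zip(ri, rj))
    # For equal widths: alpha_i*alpha_j/(alpha_i+alpha_j) = alpha/2.
    return p * p * (math.pi / (2 * alpha)) ** 1.5 * math.exp(-alpha * d2 / 2)


def overlap(mol_a, mol_b):
    """First-order overlap volume: sum over all atom pairs."""
    return sum(pair_overlap(ri, rj) for ri in mol_a for rj in mol_b)


def shape_tanimoto(mol_a, mol_b):
    """Overlap-based Tanimoto for two molecules in a fixed pose."""
    o_ab = overlap(mol_a, mol_b)
    return o_ab / (overlap(mol_a, mol_a) + overlap(mol_b, mol_b) - o_ab)


# Toy molecules as lists of atom centres (arbitrary units).
mol = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in mol]
print(shape_tanimoto(mol, mol), shape_tanimoto(mol, shifted))
```

An overlay program would search over rigid-body transformations of one molecule to maximize this overlap; the double loop over atom pairs is the part that maps naturally onto parallel hardware.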
Affiliation(s)
- Imran S Haque
- Department of Computer Science, Stanford University, Stanford, CA, USA
|