1. Li W, Miller D, Liu X, Tosi L, Chkaiban L, Mei H, Hung PH, Parekkadan B, Sherlock G, Levy S. Arrayed in vivo barcoding for multiplexed sequence verification of plasmid DNA and demultiplexing of pooled libraries. Nucleic Acids Res 2024; 52:e47. [PMID: 38709890 PMCID: PMC11162764 DOI: 10.1093/nar/gkae332] [Received: 10/13/2023] [Revised: 02/23/2024] [Accepted: 04/16/2024] [Indexed: 05/08/2024]
Abstract
Sequence verification of plasmid DNA is critical for many cloning and molecular biology workflows. To leverage high-throughput sequencing, several methods have been developed that add a unique DNA barcode to individual samples prior to pooling and sequencing. However, these methods require an individual plasmid extraction and/or in vitro barcoding reaction for each sample processed, limiting throughput and adding cost. Here, we develop an arrayed in vivo plasmid barcoding platform that enables pooled plasmid extraction and library preparation for Oxford Nanopore sequencing. This method has a high accuracy and recovery rate, and greatly increases throughput and reduces cost relative to other plasmid barcoding methods or Sanger sequencing. We use in vivo barcoding to sequence verify >45 000 plasmids and show that the method can be used to transform error-containing dispersed plasmid pools into sequence-perfect arrays or well-balanced pools. In vivo barcoding does not require any specialized equipment beyond a low-overhead Oxford Nanopore sequencer, enabling most labs to flexibly process hundreds to thousands of plasmids in parallel.
Affiliation(s)
- Weiyi Li
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Darach Miller
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Xianan Liu
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Lorenzo Tosi
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Lamia Chkaiban
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Han Mei
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Po-Hsiang Hung
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Biju Parekkadan
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Gavin Sherlock
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Sasha F Levy
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
2. Van Deynze K, Mumm C, Maltby CJ, Switzenberg JA, Todd PK, Boyle AP. Enhanced Detection and Genotyping of Disease-Associated Tandem Repeats Using HMMSTR and Targeted Long-Read Sequencing. medRxiv 2024:2024.05.01.24306681. [PMID: 38746091 PMCID: PMC11092683 DOI: 10.1101/2024.05.01.24306681] [Indexed: 05/16/2024]
Abstract
Tandem repeat sequences comprise approximately 8% of the human genome and are linked to more than 50 neurodegenerative disorders. Accurate characterization of disease-associated repeat loci remains resource intensive and often lacks high resolution genotype calls. We introduce a multiplexed, targeted nanopore sequencing panel and HMMSTR, a sequence-based tandem repeat copy number caller. HMMSTR outperforms current signal- and sequence-based callers relative to two assemblies and we show it performs with high accuracy in heterozygous regions and at low read coverage. The flexible panel allows us to capture disease associated regions at an average coverage of >150x. Using these tools, we successfully characterize known or suspected repeat expansions in patient derived samples. In these samples we also identify unexpected expanded alleles at tandem repeat loci not previously associated with the underlying diagnosis. This genotyping approach for tandem repeat expansions is scalable, simple, flexible, and accurate, offering significant potential for diagnostic applications and investigation of expansion co-occurrence in neurodegenerative disorders.
3. Vegh P, Donovan S, Rosser S, Stracquadanio G, Fragkoudis R. Biofoundry-Scale DNA Assembly Validation Using Cost-Effective High-Throughput Long-Read Sequencing. ACS Synth Biol 2024; 13:683-686. [PMID: 38329009 PMCID: PMC10877595 DOI: 10.1021/acssynbio.3c00589] [Received: 09/20/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 02/09/2024]
Abstract
Biofoundries are automated high-throughput facilities specializing in the design, construction, and testing of engineered/synthetic DNA constructs (plasmids), often from genetic parts. A critical step of this process is assessing the fidelity of the assembled DNA construct to the desired design. Current methods utilized for this purpose are restriction digest or PCR followed by fragment analysis and sequencing. The Edinburgh Genome Foundry (EGF) has recently established a single-molecule sequencing quality control step using the Oxford Nanopore sequencing technology, along with a companion Nextflow pipeline and a Python package, to perform in-depth analysis and generate a detailed report. Our software enables researchers working with plasmids, including biofoundry scientists, to rapidly analyze and interpret sequencing data. In conclusion, we have created a laboratory and software protocol that validates assembled, cloned, or edited plasmids, using Nanopore long-reads, which can serve as a useful resource for the genetics, synthetic biology, and sequencing communities.
Affiliation(s)
- Peter Vegh
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Sophie Donovan
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Susan Rosser
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Giovanni Stracquadanio
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
- Rennos Fragkoudis
  - Edinburgh Genome Foundry, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, United Kingdom
  - Department of Biochemistry and Biotechnology, University of Thessaly, 41500 Larissa, Greece
4. Ramírez Rojas A, Brinkmann CK, Köbel TS, Schindler D. DuBA.flow - A Low-Cost, Long-Read Amplicon Sequencing Workflow for the Validation of Synthetic DNA Constructs. ACS Synth Biol 2024; 13:457-465. [PMID: 38295293 PMCID: PMC10877597 DOI: 10.1021/acssynbio.3c00522] [Received: 08/25/2023] [Revised: 10/27/2023] [Accepted: 11/13/2023] [Indexed: 02/02/2024]
Abstract
Modern biological science, especially synthetic biology, relies heavily on the construction of DNA elements, often in the form of plasmids. Plasmids are used for a variety of applications, including the expression of proteins for subsequent purification, the expression of heterologous pathways for the production of valuable compounds, and the study of biological functions and mechanisms. For all applications, a critical step after the construction of a plasmid is its sequence validation. The traditional method for sequence determination is Sanger sequencing, which is limited to approximately 1000 bp per reaction. Here, we present a highly scalable in-house method for rapid validation of amplified DNA sequences using long-read Nanopore sequencing. We developed two-step amplicon and transposase strategies to provide maximum flexibility for dual barcode sequencing. We also provide an automated analysis pipeline to quickly and reliably analyze sequencing results and provide easy-to-interpret results for each sample. The user-friendly DuBA.flow start-to-finish pipeline is widely applicable. Furthermore, we show that construct validation using DuBA.flow can be performed by barcoded colony PCR amplicon sequencing, thus accelerating research.
Affiliation(s)
- Adán A. Ramírez Rojas
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
- Cedric K. Brinkmann
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
- Tania S. Köbel
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
- Daniel Schindler
  - Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Str. 10, 35043 Marburg, Germany
  - Center for Synthetic Microbiology, Philipps-University Marburg, Karl-von-Frisch-Str. 14, 35032 Marburg, Germany
5. Li W, Miller D, Liu X, Tosi L, Chkaiban L, Mei H, Hung PH, Parekkadan B, Sherlock G, Levy SF. Arrayed in vivo barcoding for multiplexed sequence verification of plasmid DNA and demultiplexing of pooled libraries. bioRxiv 2023:2023.10.13.562064. [PMID: 37873145 PMCID: PMC10592806 DOI: 10.1101/2023.10.13.562064] [Indexed: 10/25/2023]
Abstract
Sequence verification of plasmid DNA is critical for many cloning and molecular biology workflows. To leverage high-throughput sequencing, several methods have been developed that add a unique DNA barcode to individual samples prior to pooling and sequencing. However, these methods require an individual plasmid extraction and/or in vitro barcoding reaction for each sample processed, limiting throughput and adding cost. Here, we develop an arrayed in vivo plasmid barcoding platform that enables pooled plasmid extraction and library preparation for Oxford Nanopore sequencing. This method has a high accuracy and recovery rate, and greatly increases throughput and reduces cost relative to other plasmid barcoding methods or Sanger sequencing. We use in vivo barcoding to sequence verify >45,000 plasmids and show that the method can be used to transform error-containing dispersed plasmid pools into sequence-perfect arrays or well-balanced pools. In vivo barcoding does not require any specialized equipment beyond a low-overhead Oxford Nanopore sequencer, enabling most labs to flexibly process hundreds to thousands of plasmids in parallel.
Affiliation(s)
- Weiyi Li
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Darach Miller
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Xianan Liu
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Lorenzo Tosi
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Lamia Chkaiban
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Han Mei
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
- Po-Hsiang Hung
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Biju Parekkadan
  - Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
- Gavin Sherlock
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Sasha F Levy
  - SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
  - Present Address: BacStitch DNA, Los Altos, CA, USA