151
Molecular-level similarity search brings computing to DNA data storage. Nat Commun 2021; 12:4764. [PMID: 34362913 PMCID: PMC8346626 DOI: 10.1038/s41467-021-24991-z] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 06/24/2021] [Accepted: 07/19/2021] [Indexed: 11/30/2022] Open
Abstract
As global demand for digital storage capacity grows, storage technologies based on synthetic DNA have emerged as a dense and durable alternative to traditional media. Existing approaches leverage robust error correcting codes and precise molecular mechanisms to reliably retrieve specific files from large databases. Typically, files are retrieved using a pre-specified key, analogous to a filename. However, these approaches lack the ability to perform more complex computations over the stored data, such as similarity search: e.g., finding images that look similar to an image of interest without prior knowledge of their file names. Here we demonstrate a technique for executing similarity search over a DNA-based database of 1.6 million images. Queries are implemented as hybridization probes, and a key step in our approach was to learn an image-to-sequence encoding ensuring that queries preferentially bind to targets representing visually similar images. Experimental results show that our molecular implementation performs comparably to state-of-the-art in silico algorithms for similarity search. Storage technology based on DNA is emerging as an information dense and durable medium. Here the authors use machine learning-based encoding and hybridization probes to execute similarity searches in a DNA database.
152
Wu J, Zhang S, Zhang T, Liu Y. HD-Code: End-to-End High Density Code for DNA Storage. IEEE Trans Nanobioscience 2021; 20:455-463. [PMID: 34343096 DOI: 10.1109/tnb.2021.3102122] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
With the rapid development of digital information techniques, the use of DNA media for information storage is considered a future direction of data storage. Existing DNA storage schemes simply map compressed binary multimedia data into DNA base data, which has the disadvantages of data loss, low logical storage density and high cost of synthesis. This paper presents an end-to-end high density DNA encoding algorithm (referred to as HD-code, where HD stands for high density). The novelty and contributions of this work comprise three parts. First, by taking full advantage of the statistical characteristics of the original multimedia data and considering the biological constraints on the DNA bases, the proposed scheme achieves higher logical storage density and improves the flexibility and consistency in data storage. Second, by performing data conversion, the proposed scheme can effectively encode extreme images with a large proportion of a single color. Third, the proposed method can reconstruct high quality images and reduce synthesis costs by yielding better rate-PSNR (peak signal-to-noise ratio) performance.
153
Zhang JX, Yordanov B, Gaunt A, Wang MX, Dai P, Chen YJ, Zhang K, Fang JZ, Dalchau N, Li J, Phillips A, Zhang DY. A deep learning model for predicting next-generation sequencing depth from DNA sequence. Nat Commun 2021; 12:4387. [PMID: 34282137 PMCID: PMC8290051 DOI: 10.1038/s41467-021-24497-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/23/2020] [Accepted: 06/17/2021] [Indexed: 11/29/2022] Open
Abstract
Targeted high-throughput DNA sequencing is a primary approach for genomics and molecular diagnostics, and more recently as a readout for DNA information storage. Oligonucleotide probes used to enrich gene loci of interest have different hybridization kinetics, resulting in non-uniform coverage that increases sequencing costs and decreases sequencing sensitivities. Here, we present a deep learning model (DLM) for predicting Next-Generation Sequencing (NGS) depth from DNA probe sequences. Our DLM includes a bidirectional recurrent neural network that takes as input both DNA nucleotide identities as well as the calculated probability of the nucleotide being unpaired. We apply our DLM to three different NGS panels: a 39,145-plex panel for human single nucleotide polymorphisms (SNP), a 2000-plex panel for human long non-coding RNA (lncRNA), and a 7373-plex panel targeting non-human sequences for DNA information storage. In cross-validation, our DLM predicts sequencing depth to within a factor of 3 with 93% accuracy for the SNP panel, and 99% accuracy for the non-human panel. In independent testing, the DLM predicts the lncRNA panel with 89% accuracy when trained on the SNP panel. The same model is also effective at predicting the measured single-plex kinetic rate constants of DNA hybridization and strand displacement. DNA probes used in next generation sequencing (NGS) have variable hybridisation kinetics, resulting in non-uniform coverage. Here, the authors develop a deep learning model to predict NGS depth using DNA probe sequences and apply to human and non-human sequencing panels.
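The "within a factor of 3" accuracy quoted in this abstract can be read as the fraction of probes whose predicted depth lies between one third and three times the observed depth. The sketch below is only an illustration of that reading; the function name `within_factor_accuracy` and the sample numbers are hypothetical and are not taken from the paper.

```python
def within_factor_accuracy(predicted, observed, factor=3.0):
    """Fraction of predictions within a multiplicative `factor` of the observed value.

    Assumed reading of the metric: a prediction p counts as correct when
    observed / factor <= p <= observed * factor.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    hits = 0
    for p, o in zip(predicted, observed):
        if p <= 0 or o <= 0:
            raise ValueError("depths must be positive")
        ratio = p / o
        if 1.0 / factor <= ratio <= factor:
            hits += 1
    return hits / len(predicted)


# Hypothetical depths for four probes (illustrative values only).
predicted_depth = [100, 10, 900, 50]
observed_depth = [100, 40, 300, 50]
accuracy = within_factor_accuracy(predicted_depth, observed_depth)
```

A multiplicative tolerance suits sequencing depth because depths across a panel typically span orders of magnitude, so an absolute error threshold would be dominated by the deepest probes.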
Affiliation(s)
- Jinny X Zhang
  - Department of Bioengineering, Rice University, Houston, TX, USA; Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
- Boyan Yordanov
  - Microsoft Research, Cambridge, UK; Scientific Technologies, London, UK
- Michael X Wang
  - Department of Bioengineering, Rice University, Houston, TX, USA
- Peng Dai
  - Department of Bioengineering, Rice University, Houston, TX, USA
- Kerou Zhang
  - Department of Bioengineering, Rice University, Houston, TX, USA
- John Z Fang
  - Department of Bioengineering, Rice University, Houston, TX, USA
- Jiaming Li
  - Department of Bioengineering, Rice University, Houston, TX, USA; Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
- David Yu Zhang
  - Department of Bioengineering, Rice University, Houston, TX, USA; Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
154
Zhang S, Wu J, Huang B, Liu Y. High-density information storage and random access scheme using synthetic DNA. 3 Biotech 2021; 11:328. [PMID: 34194912 PMCID: PMC8197696 DOI: 10.1007/s13205-021-02882-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2021] [Accepted: 06/03/2021] [Indexed: 10/21/2022] Open
Abstract
The high storage density, long life cycle, and low energy consumption of DNA molecules make them a promising medium for next-generation storage technology. However, DNA storage has the disadvantages of high synthesis cost and low random-access efficiency. A high-density DNA-coding scheme can effectively reduce the cost of DNA synthesis. This paper first proposes a DNA-mapping method based on a codebook and a random access method for DNA information based on encoded content. The mapping method satisfies the two biological constraints of homopolymer length and GC content. The random access method can efficiently and selectively read specific files in the DNA pool. To increase storage density, convolutional neural networks are combined with mapping methods to generate base sequences. In the experiments, our method was compared with existing DNA information storage methods, and the results showed that the proposed scheme has better information storage density.
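The two biological constraints named in this abstract, homopolymer length and GC content, are straightforward to check on a candidate sequence. The following is a minimal sketch of such a check. The thresholds (runs of at most 3 identical bases, GC fraction between 0.4 and 0.6) and all function names are assumptions chosen for illustration; the abstract does not state the limits the authors used.

```python
def max_homopolymer_run(seq):
    """Length of the longest run of identical consecutive bases."""
    longest = 0
    run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest


def gc_content(seq):
    """Fraction of bases that are G or C."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq) / len(seq)


def satisfies_constraints(seq, max_run=3, gc_min=0.4, gc_max=0.6):
    """Check both constraints; the default thresholds are illustrative assumptions."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, T")
    return max_homopolymer_run(seq) <= max_run and gc_min <= gc_content(seq) <= gc_max


ok = satisfies_constraints("ACGTACGTAC")        # balanced GC, no repeats
too_long_run = satisfies_constraints("AAAAACGTCG")  # run of five A's
no_gc = satisfies_constraints("ATATATATAT")     # GC fraction of zero
```

A codebook-based mapping of the kind the abstract describes would presumably admit only codewords that pass a check like this, though the actual construction is not given here.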
Affiliation(s)
- Shufang Zhang
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072 China
- Jianjun Wu
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072 China
- Beibei Huang
  - School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072 China
- Yuhong Liu
  - Computer Science and Engineering Department, Santa Clara University, Santa Clara, CA 95053 USA
155
Zhu J, Ermann N, Chen K, Keyser UF. Image Encoding Using Multi-Level DNA Barcodes with Nanopore Readout. Small (Weinheim an der Bergstrasse, Germany) 2021; 17:e2100711. [PMID: 34133074 DOI: 10.1002/smll.202100711] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/03/2021] [Revised: 03/30/2021] [Indexed: 05/25/2023]
Abstract
Deoxyribonucleic acid (DNA) nanostructure-based data encoding is an emerging information storage mode, offering rewritable, editable, and secure data storage. Herein, a DNA nanostructure-based storage method established on a solid-state nanopore sensing platform to save and encrypt a 2D grayscale image is proposed. DNA multi-way junctions of different sizes are attached to a double strand of DNA carriers, resulting in distinct levels of current blockades when passing through a glass nanopore with diameters around 14 nm. The resulting quaternary encoding doubles the capacity relative to a classical binary system. Through toehold-mediated strand displacement reactions, the DNA nanostructures can be precisely added to and removed from the DNA carrier. By encoding the image into 16 DNA carriers using the quaternary barcodes and reading them in one simultaneous measurement, the image is successfully saved, encrypted, and recovered. Avoiding any proteins or enzymatic reactions, the authors thus realize a pure DNA storage system on a nanopore platform with increased capacity and programmability.
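The claim that quaternary encoding doubles capacity relative to binary follows from each four-level site carrying log2(4) = 2 bits instead of 1. The sketch below illustrates this by packing bytes into base-4 digits, four digits per byte. It is a generic base-4 conversion written for illustration only: the function names are hypothetical, and the actual mapping from pixel values to junction sizes used by the authors is not described in the abstract.

```python
import math


def bits_per_site(levels):
    """Information carried by one barcode site with `levels` distinguishable states."""
    return math.log2(levels)


def bytes_to_quaternary(data):
    """Split each byte into four base-4 digits, most significant pair first."""
    digits = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            digits.append((byte >> shift) & 0b11)
    return digits


def quaternary_to_bytes(digits):
    """Inverse of bytes_to_quaternary."""
    if len(digits) % 4 != 0:
        raise ValueError("digit count must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(digits), 4):
        byte = 0
        for d in digits[i:i + 4]:
            if not 0 <= d <= 3:
                raise ValueError("digits must be in the range 0..3")
            byte = (byte << 2) | d
        out.append(byte)
    return bytes(out)


pixels = bytes([0x1B, 0xFF, 0x00])      # three hypothetical 8-bit grayscale values
encoded = bytes_to_quaternary(pixels)   # 12 quaternary digits instead of 24 bits
decoded = quaternary_to_bytes(encoded)
```

With the same number of barcode sites, a binary scheme would need twice as many positions to hold these three bytes.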
Affiliation(s)
- Jinbo Zhu
  - Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0HE, UK
- Niklas Ermann
  - Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0HE, UK
- Kaikai Chen
  - Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0HE, UK
- Ulrich F Keyser
  - Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0HE, UK
156
Kim ES, Kim JS, Chakrabarty N, Yun CH. Covalent Positioning of Single DNA Molecules for Nanopatterning. Nanomaterials 2021; 11:nano11071725. [PMID: 34209077 PMCID: PMC8307146 DOI: 10.3390/nano11071725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2021] [Revised: 06/26/2021] [Accepted: 06/28/2021] [Indexed: 11/16/2022]
Abstract
Bottom-up micropatterning or nanopatterning can be viewed as the localization of target molecules to the desired area of a surface. A majority of these processes rely on the physical adsorption of ink-like molecules to the paper-like surface, resulting in unstable immobilization of the target molecules owing to their noncovalent linkage to the surface. Herein, successive single nick-sealing facilitated the covalent immobilization of individual DNA molecules at defined positions on a dendron-coated silicon surface using atomic force microscopy. The covalently patterned ssDNA was visualized when streptavidin-coated gold nanoparticles bound to the biotinylated DNA. The successive covalent positioning of the target DNA under ambient conditions may facilitate the bottom-up construction of DNA-based durable nanostructures, nanorobots, or memory systems.
Affiliation(s)
- Eung-Sam Kim
  - Department of Biological Sciences, Research Center of Ecomimetics and Center for Next Generation Sensor Research and Development, Chonnam National University, Gwangju 61186, Korea
  - Correspondence: Tel.: +82-62-530-3416; Fax: +82-62-530-3409
- Jung Sook Kim
  - Department of Chemistry, Division of Integrative Biosciences and Biotechnology, Pohang University of Science and Technology, Pohang 37673, Korea
- Nishan Chakrabarty
  - School of Biological Sciences and Biotechnology, Chonnam National University, Gwangju 61186, Korea
- Chul-Ho Yun
  - School of Biological Sciences and Biotechnology, Chonnam National University, Gwangju 61186, Korea
157
Yuan Y, Lv H, Zhang Q. DNA strand displacement reactions to accomplish a two-degree-of-freedom PID controller and its application in subtraction gate. IEEE Trans Nanobioscience 2021; 20:554-564. [PMID: 34161242 DOI: 10.1109/tnb.2021.3091685] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/10/2022]
Abstract
Synthesis control circuits can be used to effectively control biochemical molecule processes. In controller design based on chemical reaction networks (CRNs), generally only set-point tracking is considered. However, the influence of disturbances, which are frequently encountered in biochemical systems, is often neglected, thus weakening the control effect of the system. In this article, both tracking the set-point input and suppressing the disturbance input are considered. Firstly, CRNs are used for the first time to construct a two-degree-of-freedom PID controller by combining a one-degree-of-freedom PID controller with a feedforward controller. Then, CRN expressions of the two input functions (step function and ramp function) used as input signals are defined. Furthermore, the two-degree-of-freedom PID controller is implemented with DNA strand displacement (DSD) reaction networks, because DNA is an ideal engineering material for building molecular devices based on CRNs. The overshoot of the two-degree-of-freedom PID control system is significantly reduced compared to the one-degree-of-freedom PID control system. Finally, a leak reaction is treated as an extraneous disturbance input to a subtraction gate. The influence of this external disturbance is suppressed by the two-degree-of-freedom PID controller. It is worth noting that the two-degree-of-freedom subtraction gate control system better restrains the impact of a disturbance input (leak reaction).
158
Song LF, Deng ZH, Gong ZY, Li LL, Li BZ. Large-Scale de novo Oligonucleotide Synthesis for Whole-Genome Synthesis and Data Storage: Challenges and Opportunities. Front Bioeng Biotechnol 2021; 9:689797. [PMID: 34239862 PMCID: PMC8258115 DOI: 10.3389/fbioe.2021.689797] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 04/01/2021] [Accepted: 05/27/2021] [Indexed: 11/13/2022] Open
Abstract
Over the past decades, remarkable progress on phosphoramidite chemistry-based large-scale de novo oligonucleotide synthesis has been achieved, enabling numerous novel and exciting applications. Among them, de novo genome synthesis and DNA data storage are striking. However, to make these two applications more practical, the synthesis length, speed, cost, and throughput require vast improvements, which is a challenge to be met by the phosphoramidite chemistry. Harnessing the power of enzymes, the recently emerged enzymatic methods provide a competitive route to overcome this challenge. In this review, we first summarize the status of large-scale oligonucleotide synthesis technologies including the basic methodology and large-scale synthesis approaches, with special focus on the emerging enzymatic methods. Afterward, we discuss the opportunities and challenges of large-scale oligonucleotide synthesis on de novo genome synthesis and DNA data storage respectively.
Affiliation(s)
- Li-Fu Song
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Zheng-Hua Deng
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Zi-Yi Gong
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Lu-Lu Li
  - LC-BIO Technologies Co., Ltd., Hangzhou, China
- Bing-Zhi Li
  - Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
159
Ličen M, Masiero S, Pieraccini S, Drevenšek-Olenik I. Reversible Photoisomerization in Thin Surface Films from Azo-Functionalized Guanosine Derivatives. ACS Omega 2021; 6:15421-15430. [PMID: 34151120 PMCID: PMC8210406 DOI: 10.1021/acsomega.1c01879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Accepted: 05/25/2021] [Indexed: 06/13/2023]
Abstract
Two novel azo-functionalized guanosine derivatives were synthesized, and their photoisomerization process was investigated in molecular monolayers at the air-water interface and in the Langmuir-Blodgett (LB) films on solid substrates. Measurements of surface pressure vs area isotherms, surface potential measurements, UV-visible (vis) absorption spectroscopy, Brewster angle microscopy (BAM), and atomic force microscopy (AFM) were performed. Despite not having a typical amphiphilic molecular structure, the derivatives formed stable films on the water surface. They could also undergo repeated photoisomerization in all of the investigated thin-film configurations. The observations suggest that in the films at the air-water interface, the molecules first exhibit a conformational change, and then they reorient to an energetically more favored orientation. In the LB films transferred onto solid substrates, the isomerization process occurs on a similar time scale as in solution. However, the isomerization efficiency is about an order of magnitude lower than that in solution. Our results show that DNA nucleobases functionalized with azobenzene moieties are suitable candidates for the fabrication of photoactive two-dimensional (2D) materials that can provide all beneficial functionalities of DNA-based compounds.
Affiliation(s)
- Matjaž Ličen
  - Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, SI-1000 Ljubljana, Slovenia
- Stefano Masiero
  - Dipartimento di Chimica “Giacomo Ciamician”, Alma Mater Studiorum - Università di Bologna, Via San Giacomo 11, I-40126 Bologna, Italy
- Silvia Pieraccini
  - Dipartimento di Chimica “Giacomo Ciamician”, Alma Mater Studiorum - Università di Bologna, Via San Giacomo 11, I-40126 Bologna, Italy
- Irena Drevenšek-Olenik
  - Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, SI-1000 Ljubljana, Slovenia
  - Department of Complex Matter, Jožef Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia
160
Gangadharan S, Raman K. The art of molecular computing: Whence and whither. Bioessays 2021; 43:e2100051. [PMID: 34101866 DOI: 10.1002/bies.202100051] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/12/2021] [Revised: 05/11/2021] [Accepted: 05/18/2021] [Indexed: 12/30/2022]
Abstract
An astonishingly diverse biomolecular circuitry orchestrates the functioning machinery underlying every living cell. These biomolecules and their circuits have been engineered not only for various industrial applications but also to perform other atypical functions that they did not evolve for, including computation. Various kinds of computational challenges, such as solving NP-complete problems with many variables, logical computation, neural network operations, and cryptography, have all been attempted through this unconventional computing paradigm. In this review, we highlight key experiments across three different "eras" of molecular computation, beginning with molecular solutions, transitioning to logic circuits and, ultimately, more complex molecular networks. We also discuss a variety of applications of molecular computation, from solving NP-hard problems to self-assembled nanostructures for delivering molecules, and provide a glimpse into the exciting potential that molecular computing holds for the future. Also see the video abstract here: https://youtu.be/9Mw0K0vCSQw.
Affiliation(s)
- Sahana Gangadharan
  - Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India; Initiative for Biological Systems Engineering, Indian Institute of Technology Madras, Chennai, India
- Karthik Raman
  - Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India; Initiative for Biological Systems Engineering, Indian Institute of Technology Madras, Chennai, India; Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), Indian Institute of Technology Madras, Chennai, India
161
Xu C, Zhao C, Ma B, Liu H. Uncertainties in synthetic DNA-based data storage. Nucleic Acids Res 2021; 49:5451-5469. [PMID: 33836076 PMCID: PMC8191772 DOI: 10.1093/nar/gkab230] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 11/11/2020] [Revised: 02/16/2021] [Accepted: 03/22/2021] [Indexed: 12/12/2022] Open
Abstract
Deoxyribonucleic acid (DNA) has evolved to be a naturally selected, robust biomacromolecule for gene information storage, and biological evolution and various diseases can find their origin in uncertainties in DNA-related processes (e.g. replication and expression). Recently, synthetic DNA has emerged as a compelling molecular medium for digital data storage, and it is superior to conventional electronic memory devices in theoretical retention time, power consumption, storage density, and so forth. However, uncertainties in in vitro DNA synthesis and sequencing, along with its conjugation chemistry and preservation conditions, can lead to severe errors and data loss, which limit its practical application. To maintain data integrity, complicated error correction algorithms and substantial data redundancy are usually required, which can significantly limit the efficiency and scale-up of the technology. Herein, we summarize the general procedures of the state-of-the-art DNA-based digital data storage methods (e.g. write, read, and preservation), highlighting the uncertainties involved in each step as well as potential approaches to correct them. We also discuss challenges yet to be overcome and research trends in the promising field of DNA-based data storage.
Affiliation(s)
- Chengtao Xu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu 210096, China
- Chao Zhao
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu 210096, China
- Biao Ma
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu 210096, China
- Hong Liu
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu 210096, China
162
Barnes JC. Reading and writing data by using self-immolative, sequence-defined oligourethanes. Chem 2021. [DOI: 10.1016/j.chempr.2021.05.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
163
Bhardwaj V, Pevzner PA, Rashtchian C, Safonova Y. Trace Reconstruction Problems in Computational Biology. IEEE Transactions on Information Theory 2021; 67:3295-3314. [PMID: 34176957 PMCID: PMC8224466 DOI: 10.1109/tit.2020.3030569] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
The problem of reconstructing a string from its error-prone copies, the trace reconstruction problem, was introduced by Vladimir Levenshtein two decades ago. While there has been considerable theoretical work on trace reconstruction, practical solutions have only recently started to emerge in the context of two rapidly developing research areas: immunogenomics and DNA data storage. In immunogenomics, traces correspond to mutated copies of genes, with mutations generated naturally by the adaptive immune system. In DNA data storage, traces correspond to noisy copies of DNA molecules that encode digital data, with errors being artifacts of the data retrieval process. In this paper, we introduce several new trace generation models and open questions relevant to trace reconstruction for immunogenomics and DNA data storage, survey theoretical results on trace reconstruction, and highlight their connections to computational biology. Throughout, we discuss the applicability and shortcomings of known solutions and suggest future research directions.
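In the classic formulation of trace reconstruction, each trace is produced by passing the original string through a deletion channel that drops every symbol independently with some probability. The sketch below simulates that textbook model only; it does not implement the new trace generation models this paper introduces, and the function names, deletion probability, and example sequence are all illustrative assumptions.

```python
import random


def deletion_channel(seq, p_del, rng):
    """Return a trace of `seq` in which each symbol is deleted independently with probability p_del."""
    return "".join(ch for ch in seq if rng.random() >= p_del)


def is_subsequence(trace, seq):
    """True if `trace` can be obtained from `seq` by deleting symbols."""
    remaining = iter(seq)
    # `ch in remaining` advances the iterator, so symbol order is respected.
    return all(ch in remaining for ch in trace)


def make_traces(seq, n, p_del, seed=0):
    """Generate n independent traces with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [deletion_channel(seq, p_del, rng) for _ in range(n)]


original = "ACGTACGTAC"   # hypothetical stored strand
traces = make_traces(original, n=5, p_del=0.2, seed=1)
```

Every trace produced this way is a subsequence of the original, which is the structural property that reconstruction algorithms exploit when combining many noisy copies.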
Affiliation(s)
- Vinnu Bhardwaj
  - Electrical and Computer Engineering Department, University of California San Diego, La Jolla, USA
- Pavel A. Pevzner
  - Computer Science and Engineering Department, University of California San Diego, La Jolla, USA
- Cyrus Rashtchian
  - Computer Science and Engineering Department, University of California San Diego, La Jolla, USA
  - Qualcomm Institute, University of California San Diego, La Jolla, USA
- Yana Safonova
  - Computer Science and Engineering Department, University of California San Diego, La Jolla, USA
164
Soete M, Mertens C, Aksakal R, Badi N, Du Prez F. Sequence-Encoded Macromolecules with Increased Data Storage Capacity through a Thiol-Epoxy Reaction. ACS Macro Lett 2021; 10:616-622. [PMID: 35570768 DOI: 10.1021/acsmacrolett.1c00275] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/20/2022]
Abstract
Sequence-encoded oligo(thioether urethane)s with two different coding monomers per backbone unit were prepared via a solid phase, two-step iterative protocol based on thiolactone chemistry. The first step of the synthetic cycle consists of the thiolactone ring opening with a primary amine, whereby the in situ released thiol is immediately reacted with an epoxide. In the second step, the thiolactone group is reinstalled to initiate the next cycle. This strategy allows the introduction of two different coding monomers per synthetic cycle, rendering the resulting macromolecules especially attractive in the area of (macro)molecular data storage because of their increased data storage capacity. Subsequently, the efficiency of the synthesis route reported herein and the applicability of the dual-encoded sequence-defined macromolecules as a potential data storage platform were demonstrated by unraveling the exact monomer order using tandem mass spectrometry techniques.
Affiliation(s)
- Matthieu Soete
  - Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Faculty of Sciences, Ghent University, Krijgslaan 281 S4-bis, B-9000 Ghent, Belgium
- Chiel Mertens
  - Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Faculty of Sciences, Ghent University, Krijgslaan 281 S4-bis, B-9000 Ghent, Belgium
- Resat Aksakal
  - Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Faculty of Sciences, Ghent University, Krijgslaan 281 S4-bis, B-9000 Ghent, Belgium
- Nezha Badi
  - Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Faculty of Sciences, Ghent University, Krijgslaan 281 S4-bis, B-9000 Ghent, Belgium
- Filip Du Prez
  - Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Faculty of Sciences, Ghent University, Krijgslaan 281 S4-bis, B-9000 Ghent, Belgium
165
Zhou Y, Xu X, Wei Y, Cheng Y, Guo Y, Khudyakov I, Liu F, He P, Song Z, Li Z, Gao Y, Ang EL, Zhao H, Zhang Y, Zhao S. A widespread pathway for substitution of adenine by diaminopurine in phage genomes. Science 2021; 372:512-516. [PMID: 33926954 DOI: 10.1126/science.abe4882] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 08/25/2020] [Accepted: 03/05/2021] [Indexed: 01/04/2023]
Abstract
DNA modifications vary in form and function but generally do not alter Watson-Crick base pairing. Diaminopurine (Z) is an exception because it completely replaces adenine and forms three hydrogen bonds with thymine in cyanophage S-2L genomic DNA. However, the biosynthesis, prevalence, and importance of Z genomes remain unexplored. Here, we report a multienzyme system that supports Z-genome synthesis. We identified dozens of globally widespread phages harboring such enzymes, and we further verified the Z genome in one of these phages, Acinetobacter phage SH-Ab 15497, by using liquid chromatography with ultraviolet and mass spectrometry. The Z genome endows phages with evolutionary advantages for evading the attack of host restriction enzymes, and the characterization of its biosynthetic pathway enables Z-DNA production on a large scale for a diverse range of applications.
Collapse
Affiliation(s)
- Yan Zhou
- Tianjin Key Laboratory for Modern Drug Delivery and High-Efficiency, Collaborative Innovation Center of Chemical Science and Engineering, School of Pharmaceutical Science and Technology, Tianjin University, Tianjin 300072, China.,Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Xuexia Xu
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China.,School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China.,University of Chinese Academy of Sciences, Beijing 100049, China.,Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200061, China
| | - Yifeng Wei
- Singapore Institute of Food and Biotechnology Innovation, Agency for Science, Technology and Research (A*STAR), Singapore
| | - Yu Cheng
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China.,School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Yu Guo
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China.,School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Ivan Khudyakov
- All-Russian Research Institute for Agricultural Microbiology, St. Petersburg 196608, Russia
| | - Fuli Liu
- Department of Medical Microbiology and Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
| | - Ping He
- Department of Medical Microbiology and Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
| | - Zhangyue Song
- Biomedical Big Data Platform, SIAIS, ShanghaiTech University, Shanghai 201210, China
| | - Zhi Li
- Tianjin Key Laboratory for Modern Drug Delivery and High-Efficiency, Collaborative Innovation Center of Chemical Science and Engineering, School of Pharmaceutical Science and Technology, Tianjin University, Tianjin 300072, China
| | - Yan Gao
- Tianjin Key Laboratory for Modern Drug Delivery and High-Efficiency, Collaborative Innovation Center of Chemical Science and Engineering, School of Pharmaceutical Science and Technology, Tianjin University, Tianjin 300072, China
| | - Ee Lui Ang
- Singapore Institute of Food and Biotechnology Innovation, Agency for Science, Technology and Research (A*STAR), Singapore
| | - Huimin Zhao
- Singapore Institute of Food and Biotechnology Innovation, Agency for Science, Technology and Research (A*STAR), Singapore; Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Yan Zhang
- Tianjin Key Laboratory for Modern Drug Delivery and High-Efficiency, Collaborative Innovation Center of Chemical Science and Engineering, School of Pharmaceutical Science and Technology, Tianjin University, Tianjin 300072, China; Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Suwen Zhao
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
|
166
|
Chen W, Han M, Zhou J, Ge Q, Wang P, Zhang X, Zhu S, Song L, Yuan Y. An artificial chromosome for data storage. Natl Sci Rev 2021; 8:nwab028. [PMID: 34691648 PMCID: PMC8288405 DOI: 10.1093/nsr/nwab028] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 11/04/2020] [Revised: 02/07/2021] [Accepted: 02/07/2021] [Indexed: 12/14/2022] Open
Abstract
DNA digital storage provides an alternative for information storage with high density and long-term stability. Here, we report the de novo design and synthesis of an artificial chromosome that encodes two pictures and a video clip. The encoding paradigm utilizing the superposition of sparsified error correction codewords and pseudo-random sequences tolerates base insertions/deletions and is well suited to error-prone nanopore sequencing for data retrieval. The entire 254 kb sequence was 95.27% occupied by encoded data. The Transformation-Associated Recombination method was used in the construction of this chromosome from DNA fragments and necessary autonomous replication sequences. The stability was demonstrated by transmitting the data-carrying chromosome to the 100th generation. This study demonstrates a data storage method using encoded artificial chromosomes via in vivo assembly for write-once and stable replication for multiple retrievals, similar to a compact disc, with potential in economically massive data distribution.
Affiliation(s)
- Weigang Chen
- School of Microelectronics, Tianjin University, Tianjin 300072, China
- Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Mingzhe Han
- Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
| | - Jianting Zhou
- Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
| | - Qi Ge
- School of Microelectronics, Tianjin University, Tianjin 300072, China
| | - Panpan Wang
- School of Microelectronics, Tianjin University, Tianjin 300072, China
| | - Xinchen Zhang
- Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
| | - Siyu Zhu
- Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
| | - Lifu Song
- Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
| | - Yingjin Yuan
- Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
|
167
|
Chen Z, Elowitz MB. Programmable protein circuit design. Cell 2021; 184:2284-2301. [PMID: 33848464 PMCID: PMC8087657 DOI: 10.1016/j.cell.2021.03.007] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Received: 11/14/2020] [Revised: 02/22/2021] [Accepted: 03/02/2021] [Indexed: 12/11/2022]
Abstract
A fundamental challenge in synthetic biology is to create molecular circuits that can program complex cellular functions. Because proteins can bind, cleave, and chemically modify one another and interface directly and rapidly with endogenous pathways, they could extend the capabilities of synthetic circuits beyond what is possible with gene regulation alone. However, the very diversity that makes proteins so powerful also complicates efforts to harness them as well-controlled synthetic circuit components. Recent work has begun to address this challenge, focusing on principles such as orthogonality and composability that permit construction of diverse circuit-level functions from a limited set of engineered protein components. These approaches are now enabling the engineering of circuits that can sense, transmit, and process information; dynamically control cellular behaviors; and enable new therapeutic strategies, establishing a powerful paradigm for programming biology.
Affiliation(s)
- Zibo Chen
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Michael B Elowitz
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA; Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA.
|
168
|
Berk KL, Blum SM, Funk VL, Sun Y, Yang IY, Gostomski MV, Roth PA, Liem AT, Emanuel PA, Hogan ME, Miklos AE, Lux MW. Rapid Visual Authentication Based on DNA Strand Displacement. ACS APPLIED MATERIALS & INTERFACES 2021; 13:19476-19486. [PMID: 33852293 DOI: 10.1021/acsami.1c02429] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/12/2023]
Abstract
Novel ways to track and verify items of high value or security are an ever-present need. Taggants made from deoxyribonucleic acid (DNA) have several advantageous properties, such as high information density and robust synthesis; however, existing methods require laboratory techniques to verify, limiting applications. Here, we leverage DNA nanotechnology to create DNA taggants that can be validated in the field in seconds to minutes with simple equipment. The system is driven by toehold-mediated strand-displacement reactions where matching oligonucleotide sequences drive the generation of a fluorescent signal through the potential energy of base pairing. By pooling different "input" oligonucleotide sequences in a taggant and spatially separating "reporter" oligonucleotide sequences on a paper ticket, unique, sequence-driven patterns emerge for different taggant formulations. Algorithmically generated oligonucleotide sequences show no crosstalk and ink-embedded taggants maintain activity for at least 99 days at 60 °C (equivalent to nearly 2 years at room temperature). The resulting fluorescent signals can be analyzed by eye or with a smartphone when paired with a UV flashlight and filtered glasses.
Affiliation(s)
- Kimberly L Berk
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
| | - Steven M Blum
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
| | - Vanessa L Funk
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
| | - Yuhua Sun
- Applied DNA Sciences, Stony Brook, New York 11790, United States
| | - In-Young Yang
- Applied DNA Sciences, Stony Brook, New York 11790, United States
| | - Mark V Gostomski
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
| | - Pierce A Roth
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
- DCS Corporation, Belcamp, Maryland 21017, United States
| | - Alvin T Liem
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
- DCS Corporation, Belcamp, Maryland 21017, United States
| | - Peter A Emanuel
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
| | - Michael E Hogan
- Applied DNA Sciences, Stony Brook, New York 11790, United States
| | - Aleksandr E Miklos
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
| | - Matthew W Lux
- US Army Combat Capabilities Development Command Chemical Biological Center, Aberdeen Proving Ground, Edgewood, Maryland 21010, United States
|
169
|
DNA Sequencing Flow Cells and the Security of the Molecular-Digital Interface. PROCEEDINGS ON PRIVACY ENHANCING TECHNOLOGIES 2021. [DOI: 10.2478/popets-2021-0054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
DNA sequencing is the molecular-to-digital conversion of DNA molecules, which are made up of a linear sequence of bases (A,C,G,T), into digital information. Central to this conversion are specialized fluidic devices, called sequencing flow cells, that distribute DNA onto a surface where the molecules can be read. As more computing becomes integrated with physical systems, we set out to explore how sequencing flow cell architecture can affect the security and privacy of the sequencing process and downstream data analysis. In the course of our investigation, we found that the unusual nature of molecular processing and flow cell design contributes to two security and privacy issues. First, DNA molecules are ‘sticky’ and stable for long periods of time. In a manner analogous to data recovery from discarded hard drives, we hypothesized that residual DNA attached to used flow cells could be collected and re-sequenced to recover a significant portion of the previously sequenced data. In experiments we were able to recover over 23.4% of a previously sequenced genome sample and perfectly decode image files encoded in DNA, suggesting that flow cells may be at risk of data recovery attacks. Second, we hypothesized that methods used to simultaneously sequence separate DNA samples together to increase sequencing throughput (multiplex sequencing), which incidentally leaks small amounts of data between samples, could cause data corruption and allow samples to adversarially manipulate sequencing data. We find that a maliciously crafted synthetic DNA sample can be used to alter targeted genetic variants in other samples using this vulnerability. Such a sample could be used to corrupt sequencing data or even be spiked into tissue samples, whenever untrusted samples are sequenced together. 
Taken together, these results suggest that, like many computing boundaries, the molecular-to-digital interface raises potential issues that should be considered in future sequencing and molecular sensing systems, especially as they become more ubiquitous.
|
170
|
Dahlhauser SD, Moor SR, Vera MS, York JT, Ngo P, Boley AJ, Coronado JN, Simpson ZB, Anslyn EV. Efficient molecular encoding in multifunctional self-immolative urethanes. CELL REPORTS. PHYSICAL SCIENCE 2021; 2:100393. [PMID: 34755143 PMCID: PMC8573738 DOI: 10.1016/j.xcrp.2021.100393] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 05/23/2023]
Abstract
Molecular encoding in sequence-defined polymers shows promise as a new paradigm for data storage. Here, we report what is, to our knowledge, the first use of self-immolative oligourethanes for storing and reading encoded information. As a proof of principle, we describe how a text passage from Jane Austen's Mansfield Park was encoded in sequence-defined oligourethanes and reconstructed via self-immolative sequencing. We develop Mol.E-coder, a software tool that uses a Huffman encoding scheme to convert the character table to hexadecimal. The oligourethanes are then generated by a high-throughput parallel synthesis. Sequencing of the oligourethanes by self-immolation is done concurrently in a parallel fashion, and the liquid chromatography-mass spectrometry (LC-MS) information is decoded by our Mol.E-decoder software. The passage can be reproduced wholly intact by a third party, without any purifications or the use of tandem MS (MS/MS), despite multiple rounds of compression, encoding, and synthesis.
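To make the encoding step in this abstract concrete, the following is a minimal sketch of Huffman coding followed by packing the bit stream into hexadecimal digits. It is not the Mol.E-coder software: the function names `huffman_codes`, `encode_to_hex`, and `decode_from_hex` are hypothetical, and zero-padding to a multiple of four bits is an assumption made only for this illustration.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a prefix-free bit string for every distinct character in text."""
    # Heap entries are (frequency, tie_breaker, {char: code_so_far}).
    heap = [(freq, i, {ch: ""})
            for i, (ch, freq) in enumerate(sorted(Counter(text).items()))]
    if not heap:
        return {}
    if len(heap) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {ch: "0" for ch in heap[0][2]}
    heapq.heapify(heap)
    counter = len(heap)  # unique tie breaker for merged nodes
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]


def encode_to_hex(text, codes):
    """Concatenate the codes and pack the bits into hex digits."""
    bits = "".join(codes[ch] for ch in text)
    padding = (-len(bits)) % 4  # assumed zero-padding to a full hex digit
    bits += "0" * padding
    hex_digits = "".join(format(int(bits[i:i + 4], 2), "x")
                         for i in range(0, len(bits), 4))
    return hex_digits, padding


def decode_from_hex(hex_digits, padding, codes):
    """Invert encode_to_hex using the same code table."""
    bits = "".join(format(int(d, 16), "04b") for d in hex_digits)
    if padding:
        bits = bits[:-padding]
    reverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)
```

In such a scheme the decoder needs the same code table (and the padding length) as the encoder, and frequent characters receive shorter codes, so the hexadecimal string is shorter than a fixed 8-bit-per-character encoding of the same text.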
Affiliation(s)
| | - Sarah R. Moor
- University of Texas at Austin, Austin, TX 78712, USA
| | | | | | - Phuoc Ngo
- University of Texas at Austin, Austin, TX 78712, USA
| | | | | | | | - Eric V. Anslyn
- University of Texas at Austin, Austin, TX 78712, USA
- Lead contact
|
171
|
Jiang C, Zhang Y, Wang F, Liu H. Toward Smart Information Processing with Synthetic DNA Molecules. Macromol Rapid Commun 2021; 42:e2100084. [PMID: 33864315 DOI: 10.1002/marc.202100084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2021] [Revised: 03/13/2021] [Indexed: 11/07/2022]
Abstract
DNA, a biological macromolecule, is a naturally evolved information material. From a structural point of view, an individual DNA strand can be considered a chain of data with its bases working as single units. For decades, owing to its high biochemical stability, large information storage capacity, and high recognition specificity, DNA has been recognized as an attractive material for information processing. In particular, chemical synthesis strategies and DNA sequencing techniques have developed rapidly in recent years, further enabling the encoding of information in synthetic DNA molecules. Herein, recent progress in information processing based on synthetic DNA molecules is summarized from three aspects (information storage, computation, and encryption), and the challenges and future development directions are outlined.
Affiliation(s)
- Chu Jiang
- School of Chemical Science and Engineering, Key Laboratory of Advanced Civil Engineering Materials of Ministry of Education, Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai, 200092, China
| | - Yinan Zhang
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China
- Center for Molecular Design and Biomimetics, School of Molecular Sciences, The Biodesign Institute, Arizona State University, Tempe, AZ, 85287, USA
| | - Fei Wang
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Huajie Liu
- School of Chemical Science and Engineering, Key Laboratory of Advanced Civil Engineering Materials of Ministry of Education, Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai, 200092, China
|
172
|
Abstract
In biological systems, the storage and transfer of genetic information rely on sequence-controlled nucleic acids such as DNA and RNA. It has been realized for quite some time that this property is not only crucial for life but could also be very useful in human applications. For instance, DNA has been actively investigated as a digital storage medium over the past decade. Indeed, the "hard-disk of life" is an obvious choice and a highly optimized material for storing data. Through decades of nucleic acids research, technological tools for parallel synthesis and sequencing of DNA have become readily available. Consequently, it has already been demonstrated that different types of documents (e.g., texts, images, videos, and industrial data) can be stored in chemically synthesized DNA libraries. However, DNA is subject to biological constraints, and its molecular structure cannot be easily varied to match technological needs. In fact, DNA is not the only macromolecule that enables data storage. In recent years, it has been demonstrated that a wide variety of synthetic polymers can also be used for such a purpose. Indeed, modern polymer synthesis allows the preparation of synthetic macromolecules with precisely controlled monomer sequences. Altogether, about a dozen synthetic digital polymers have already been described, and many more can be foreseen. Among them, sequence-defined poly(phosphodiester)s are one of the most promising options. These polymers are prepared by stepwise phosphoramidite chemistry like chemically synthesized oligonucleotides. However, they are constructed with non-natural building blocks and therefore share almost no structural characteristics with nucleic acids, except phosphate repeat units. Still, they contain readable digital messages that can be deciphered by nanopore sequencing or mass spectrometry sequencing.
In this Account, we describe our recent research efforts in synthesizing and sequencing optimal abiological digital poly(phosphodiester)s. A major advantage of these polymers over DNA is that their molecular structure can easily be varied to tune their properties. During the last 5 years, we have engineered the molecular structure of these polymers to adjust crucial parameters such as the storage density, storage capacity, erasability, and readability. Consequently, high-capacity poly(phosphodiester) (PPDE) chains, containing hundreds of bits per chain, can now be synthesized and efficiently sequenced using a routine mass spectrometer. Furthermore, sequencing data can be automatically decrypted with the help of decoding software. This new type of coded matter can also be edited using practical physical triggers such as light and organized in space by programmed self-assembly. All of these recent improvements are summarized and discussed herein.
Affiliation(s)
- Laurence Charles
- Aix Marseille Université, CNRS, Institute for Radical Chemistry, UMR 7273, 23 Av Escadrille Normandie-Niemen, 13397 Marseille Cedex 20, France
| | - Jean-François Lutz
- Université de Strasbourg, CNRS, Institut Charles Sadron UPR22, 23 rue du Loess, 67034 Strasbourg Cedex 2, France
|
173
|
Bhattarai-Kline S, Lear SK, Shipman SL. One-step data storage in cellular DNA. Nat Chem Biol 2021; 17:232-233. [PMID: 33500580 DOI: 10.1038/s41589-021-00737-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Affiliation(s)
- Santi Bhattarai-Kline
- Gladstone Institutes and the University of California, San Francisco, San Francisco, CA, USA
| | - Sierra K Lear
- Gladstone Institutes and the University of California, San Francisco, San Francisco, CA, USA
| | - Seth L Shipman
- Gladstone Institutes and the University of California, San Francisco, San Francisco, CA, USA.
|
174
|
Zhou Z, Wang J, Levine RD, Remacle F, Willner I. DNA-based constitutional dynamic networks as functional modules for logic gates and computing circuit operations. Chem Sci 2021; 12:5473-5483. [PMID: 34168788 PMCID: PMC8179666 DOI: 10.1039/d1sc01098k] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/24/2021] [Accepted: 03/09/2021] [Indexed: 11/21/2022] Open
Abstract
A nucleic acid-based constitutional dynamic network (CDN) is introduced as a single computational module that, in the presence of different sets of inputs, operates a variety of logic gates including a half adder, 2 : 1 multiplexer and 1 : 2 demultiplexer, a ternary multiplication matrix and a cascaded logic circuit. The CDN-based computational module leads to four logically equivalent outputs for each of the logic operations. Beyond the significance of the four logically equivalent outputs in establishing reliable and robust readout signals of the computational module, each of the outputs may be fanned out, in the presence of different inputs, to a set of different logic circuits. In addition, the ability to intercommunicate constitutional dynamic networks (CDNs) and to construct DNA-based CDNs of higher complexity provides versatile means to design computing circuits of enhanced complexity.
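For readers unfamiliar with the gates named in this abstract, the sketch below gives the Boolean definitions of a half adder and a 2:1 multiplexer. It only states the logic functions; it does not model the CDN chemistry, and the function names are hypothetical.

```python
def half_adder(a, b):
    """Return (sum_bit, carry_bit) for two input bits."""
    if a not in (0, 1) or b not in (0, 1):
        raise ValueError("inputs must be 0 or 1")
    return a ^ b, a & b  # sum is XOR, carry is AND


def mux2to1(in0, in1, select):
    """2:1 multiplexer: forward in0 when select is 0, in1 when select is 1."""
    if any(x not in (0, 1) for x in (in0, in1, select)):
        raise ValueError("inputs must be 0 or 1")
    return in1 if select else in0


def half_adder_truth_table():
    """List (a, b, sum, carry) for all four input combinations."""
    return [(a, b, *half_adder(a, b)) for a in (0, 1) for b in (0, 1)]
```

A molecular implementation would have to reproduce exactly this input-to-output mapping, which is what the four logically equivalent outputs mentioned in the abstract are read against.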
Affiliation(s)
- Zhixin Zhou
- The Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
| | - Jianbang Wang
- The Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
| | - R D Levine
- The Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
| | - Francoise Remacle
- Theoretical Physical Chemistry, UR MolSys B6c, University of Liège, B4000 Liège, Belgium
| | - Itamar Willner
- The Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
|
175
|
Laurent E, Amalian JA, Schutz T, Launay K, Clément JL, Gigmes D, Burel A, Carapito C, Charles L, Delsuc MA, Lutz JF. Storing the portrait of Antoine de Lavoisier in a single macromolecule. CR CHIM 2021. [DOI: 10.5802/crchim.72] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
|
176
|
Yim SS, McBee RM, Song AM, Huang Y, Sheth RU, Wang HH. Robust direct digital-to-biological data storage in living cells. Nat Chem Biol 2021; 17:246-253. [PMID: 33432236 PMCID: PMC7904632 DOI: 10.1038/s41589-020-00711-4] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 04/21/2020] [Revised: 10/30/2020] [Accepted: 11/12/2020] [Indexed: 02/06/2023]
Abstract
DNA has been the predominant information storage medium for biology and holds great promise as a next-generation high-density data medium in the digital era. Currently, the vast majority of DNA-based data storage approaches rely on in vitro DNA synthesis. As such, there are limited methods to encode digital data into the chromosomes of living cells in a single step. Here, we describe a new electrogenetic framework for direct storage of digital data in living cells. Using an engineered redox-responsive CRISPR adaptation system, we encoded binary data in 3-bit units into CRISPR arrays of bacterial cells by electrical stimulation. We demonstrate multiplex data encoding into barcoded cell populations to yield meaningful information storage and capacity up to 72 bits, which can be maintained over many generations in natural open environments. This work establishes a direct digital-to-biological data storage framework and advances our capacity for information exchange between silicon- and carbon-based entities.
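The figures in this abstract (3-bit units, up to 72 bits) imply that a full message spans 24 units. The sketch below shows only that bit-level bookkeeping; how each unit is written into a CRISPR array or assigned to a barcoded population is not modelled, and the function names `to_units` and `from_units` are hypothetical.

```python
def to_units(data, unit_bits=3):
    """Split bytes into a list of integers, each holding unit_bits bits."""
    bits = "".join(format(byte, "08b") for byte in data)
    if len(bits) % unit_bits:
        raise ValueError("bit length must be a multiple of unit_bits")
    return [int(bits[i:i + unit_bits], 2)
            for i in range(0, len(bits), unit_bits)]


def from_units(units, unit_bits=3):
    """Reassemble the original bytes from the list of units."""
    bits = "".join(format(u, "0{}b".format(unit_bits)) for u in units)
    if len(bits) % 8:
        raise ValueError("total bit length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Nine ASCII characters occupy 9 * 8 = 72 bits, i.e. 24 three-bit units.
message = b"hello DNA"
units = to_units(message)
```

Each unit takes one of eight values (0 to 7), so a 72-bit message corresponds to 24 such symbols in this illustration.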
Affiliation(s)
- Sung Sun Yim
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Ross M McBee
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Alan M Song
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Yiming Huang
- Department of Systems Biology, Columbia University, New York, NY, USA
- Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
| | - Ravi U Sheth
- Department of Systems Biology, Columbia University, New York, NY, USA
- Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
| | - Harris H Wang
- Department of Systems Biology, Columbia University, New York, NY, USA.
- Department of Pathology and Cell Biology, Columbia University, New York, NY, USA.
|
177
|
Matange K, Tuck JM, Keung AJ. DNA stability: a central design consideration for DNA data storage systems. Nat Commun 2021; 12:1358. [PMID: 33649304 PMCID: PMC7921107 DOI: 10.1038/s41467-021-21587-5] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 10/29/2020] [Accepted: 02/02/2021] [Indexed: 11/09/2022] Open
Abstract
Data storage in DNA is a rapidly evolving technology that could be a transformative solution for the rising energy, materials, and space needs of modern information storage. Given that the information medium is DNA itself, its stability under different storage and processing conditions will fundamentally impact and constrain design considerations and data system capabilities. Here we analyze the storage conditions, molecular mechanisms, and stabilization strategies influencing DNA stability and pose specific design configurations and scenarios for future systems that best leverage the considerable advantages of DNA storage.
Affiliation(s)
- Karishma Matange
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC, USA
| | - James M Tuck
- Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, USA.
| | - Albert J Keung
- Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC, USA.
|
178
|
Aksakal R, Mertens C, Soete M, Badi N, Du Prez F. Applications of Discrete Synthetic Macromolecules in Life and Materials Science: Recent and Future Trends. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2021; 8:2004038. [PMID: 33747749 PMCID: PMC7967060 DOI: 10.1002/advs.202004038] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 10/21/2020] [Revised: 11/22/2020] [Indexed: 05/19/2023]
Abstract
In the last decade, the field of sequence-defined polymers and related ultraprecise, monodisperse synthetic macromolecules has grown exponentially. In the early stage, the published articles and reviews were mainly dedicated to the development of synthetic routes toward their preparation. Nowadays, those synthetic methodologies, combined with the elucidation of the structure-property relationships, make it possible to envision many promising applications. Consequently, in the past 3 years, application-oriented papers based on discrete synthetic macromolecules have emerged. Hence, materials science applications such as macromolecular data storage and encryption, self-assembly of discrete structures, and foldamers have been the object of many fascinating studies. Moreover, in the area of life sciences, such structures have also been the focus of numerous research studies. Here, the aim is to highlight these recent applications and to give the reader a critical overview of the future trends in this area of research.
Affiliation(s)
- Resat Aksakal
- Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4-bis, Ghent B-9000, Belgium
| | - Chiel Mertens
- Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4-bis, Ghent B-9000, Belgium
| | - Matthieu Soete
- Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4-bis, Ghent B-9000, Belgium
| | - Nezha Badi
- Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4-bis, Ghent B-9000, Belgium
| | - Filip Du Prez
- Polymer Chemistry Research Group, Centre of Macromolecular Chemistry (CMaC), Department of Organic and Macromolecular Chemistry, Ghent University, Krijgslaan 281 S4-bis, Ghent B-9000, Belgium
|
179
|
Tan X, Ge L, Zhang T, Lu Z. Preservation of DNA for data storage. RUSSIAN CHEMICAL REVIEWS 2021. [DOI: 10.1070/rcr4994] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/08/2022]
Abstract
The preservation of DNA has attracted significant interest from scientists in diverse research fields, from ancient biological remains to information technology. In light of the different DNA safekeeping requirements (e.g., storage time, storage conditions) in these disparate fields, scientists have proposed distinct methods to maintain DNA integrity. Specifically, DNA data storage is an emerging research field in which binary digital information is converted into nucleotide sequences, leading to dense and durable data storage in the form of synthesized DNA. The intact preservation of DNA plays a significant role because it is closely related to data integrity. This review discusses DNA preservation methods, aiming to identify an appropriate one for synthetic oligonucleotides in DNA data storage. First, we analyze the factors affecting long-term DNA storage, including the intrinsic stability of DNA, environmental factors, and storage methods. Then, the benefits and disadvantages of diverse conservation approaches (e.g., encapsulation-free, chemical encapsulation) are discussed. Finally, we provide advice for storing non-genetic information in DNA in vitro. We expect these preservation suggestions to promote further research that may extend the DNA storage time.
The bibliography includes 99 references.
|
180
|
Luo T, Fan S, Liu Y, Song J. Information processing based on DNA toehold-mediated strand displacement (TMSD) reaction. Nanoscale 2021; 13:2100-2112. [PMID: 33475669 DOI: 10.1039/d0nr07865d] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 06/12/2023]
Abstract
SemiSynBio is an emerging topic toward the construction of platforms for next-generation information processing. Recent research has indicated its promising prospect toward information processing including algorithm design and pattern manipulation with the DNA TMSD reaction, which is one of the cores of the SemiSynBio technology route. The DNA TMSD reaction is the process in which an invader strand displaces the incumbent strand from the gate strand through initiation at the exposed toehold domain. Also, the DNA TMSD reaction generally involves three processes: toehold association, branch migration and strand disassociation. Herein, we review the recent progress on information processing with the DNA TMSD reaction. We highlight the diverse developments on information processing with the logic circuit, analog circuit, combinational circuit and information relay with the DNA origami structure. Additionally, we explore the current challenges and various trends toward the design and application of the DNA TMSD reaction in future information processing.
Affiliation(s)
- Tao Luo
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.
| | - Sisi Fan
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.
| | - Yan Liu
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.
| | - Jie Song
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Institute of Cancer and Basic Medicine (IBMC), Chinese Academy of Sciences; The Cancer Hospital of the University of Chinese Academy of Sciences, Hangzhou, Zhejiang 310022, China
| |
|
181
|
Lim CK, Nirantar S, Yew WS, Poh CL. Novel Modalities in DNA Data Storage. Trends Biotechnol 2021; 39:990-1003. [PMID: 33455842 DOI: 10.1016/j.tibtech.2020.12.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/27/2020] [Revised: 12/14/2020] [Accepted: 12/15/2020] [Indexed: 10/22/2022]
Abstract
The field of storing information in DNA has expanded exponentially. Most common modalities involve encoding information from bits into synthesized nucleotides, storage in liquid or dry media, and decoding via sequencing. However, limitations to this paradigm include the cost of DNA synthesis and sequencing, along with low throughput. Further unresolved questions include the appropriate media of storage and the scalability of such approaches for commercial viability. In this review, we examine various storage modalities involving the use of DNA from a systems-level perspective. We compare novel methods that draw inspiration from molecular biology techniques that have been devised to overcome the difficulties posed by standard workflows and conceptualize potential applications that can arise from these advances.
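As a rough illustration of the "bits into nucleotides" step mentioned in this abstract, the following minimal sketch maps two bits to one base and back. The mapping table and the function names `encode_bytes` and `decode_sequence` are assumptions made for illustration; they are not taken from the review, and practical schemes add error correction and sequence constraints on top.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_bytes(data: bytes) -> str:
    """Turn bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_sequence(seq: str) -> bytes:
    """Invert encode_bytes; assumes len(seq) is a multiple of 4."""
    bits = "".join(BASE_TO_BITS[base] for base in seq)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode_bytes(b"Hi")
    print(strand, decode_sequence(strand))
```

Each byte becomes four bases, so this naive mapping stores 2 bits per nucleotide before any redundancy is added.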
Affiliation(s)
- Cheng Kai Lim
- NUS Graduate School of Integrative Sciences and Engineering, National University of Singapore, Singapore 119077, Singapore; NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), Centre for Life Sciences, National University of Singapore, Singapore 117456, Singapore
| | | | - Wen Shan Yew
- Department of Biochemistry, Faculty of Medicine, National University of Singapore, Singapore 117597, Singapore; NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), Centre for Life Sciences, National University of Singapore, Singapore 117456, Singapore
| | - Chueh Loo Poh
- Department of Biomedical Engineering, Faculty of Engineering, National University of Singapore, Singapore 117583, Singapore; NUS Synthetic Biology for Clinical and Technological Innovation (SynCTI), Centre for Life Sciences, National University of Singapore, Singapore 117456, Singapore.
| |
|
182
|
McKenzie LK, El-Khoury R, Thorpe JD, Damha MJ, Hollenstein M. Recent progress in non-native nucleic acid modifications. Chem Soc Rev 2021; 50:5126-5164. [DOI: 10.1039/d0cs01430c] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Indexed: 12/15/2022]
Abstract
While Nature harnesses RNA and DNA to store, read and write genetic information, the inherent programmability, synthetic accessibility and wide functionality of these nucleic acids make them attractive tools for use in a vast array of applications.
Affiliation(s)
- Luke K. McKenzie
- Institut Pasteur
- Department of Structural Biology and Chemistry
- Laboratory for Bioorganic Chemistry of Nucleic Acids
- CNRS UMR3523
- 75724 Paris Cedex 15
| | | | | | | | - Marcel Hollenstein
- Institut Pasteur
- Department of Structural Biology and Chemistry
- Laboratory for Bioorganic Chemistry of Nucleic Acids
- CNRS UMR3523
- 75724 Paris Cedex 15
| |
|
183
|
Abstract
DNA has become a popular soft material for low energy, high-density information storage, but it is susceptible to damage through oxidation, pH, temperature, and nucleases in the environment. Here, we describe a new molecular chemotype for data archiving based on the unnatural genetic framework of α-l-threofuranosyl nucleic acid (TNA). Using a simple genetic coding strategy, 23 kilobytes of digital information were stored in DNA-primed TNA oligonucleotides and recovered with perfect accuracy after exposure to biological nucleases that destroyed equivalent DNA messages. We suggest that these results extend the capacity for nucleic acids to function as a soft material for low energy, high-density information storage by providing a safeguard against information loss caused by nuclease digestion.
Affiliation(s)
- Kefan Yang
- Departments of Pharmaceutical Sciences, Chemistry, and Molecular Biology and Biochemistry, University of California, Irvine, California 92697-3958, United States
| | - Cailen M. McCloskey
- Departments of Pharmaceutical Sciences, Chemistry, and Molecular Biology and Biochemistry, University of California, Irvine, California 92697-3958, United States
| | - John C. Chaput
- Departments of Pharmaceutical Sciences, Chemistry, and Molecular Biology and Biochemistry, University of California, Irvine, California 92697-3958, United States
| |
|
184
|
Meiser LC, Koch J, Antkowiak PL, Stark WJ, Heckel R, Grass RN. DNA synthesis for true random number generation. Nat Commun 2020; 11:5869. [PMID: 33208744 PMCID: PMC7675991 DOI: 10.1038/s41467-020-19757-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/03/2020] [Accepted: 10/28/2020] [Indexed: 11/09/2022] Open
Abstract
The volume of securely encrypted data transmission required by today's network complexity of people, transactions and interactions increases continuously. To guarantee security of encryption and decryption schemes for exchanging sensitive information, large volumes of true random numbers are required. Here we present a method to exploit the stochastic nature of chemistry by synthesizing DNA strands composed of random nucleotides. We compare three commercial random DNA syntheses giving a measure for robustness and synthesis distribution of nucleotides and show that using DNA for random number generation, we can obtain 7 million GB of randomness from one synthesis run, which can be read out using state-of-the-art sequencing technologies at rates of ca. 300 kB/s. Using the von Neumann algorithm for data compression, we remove bias introduced from human or technological sources and assess randomness using NIST's statistical test suite.
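The von Neumann algorithm named in this abstract is commonly described as a debiasing step: bits are read in non-overlapping pairs, unequal pairs emit one bit, and equal pairs are discarded. The sketch below assumes the input is already a list of 0/1 values (how the authors map sequenced nucleotides to bits is not stated here), and the function name `von_neumann_extract` is hypothetical.

```python
from typing import List


def von_neumann_extract(bits: List[int]) -> List[int]:
    """Debias a bit stream: pair (0,1) -> 0, pair (1,0) -> 1, equal pairs dropped."""
    out = []
    # Walk over non-overlapping pairs; a trailing unpaired bit is ignored.
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out


if __name__ == "__main__":
    print(von_neumann_extract([0, 1, 1, 0, 1, 1, 0, 0, 1, 0]))
```

If the input bits are independent with a fixed bias, the two unequal pairs are equally likely, so the output is unbiased at the cost of discarding a large share of the input.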
Affiliation(s)
- Linda C Meiser
- Department of Chemistry and Applied Biosciences, Institute for Chemical and Bioengineering, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093, Zurich, Switzerland
| | - Julian Koch
- Department of Chemistry and Applied Biosciences, Institute for Chemical and Bioengineering, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093, Zurich, Switzerland
| | - Philipp L Antkowiak
- Department of Chemistry and Applied Biosciences, Institute for Chemical and Bioengineering, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093, Zurich, Switzerland
| | - Wendelin J Stark
- Department of Chemistry and Applied Biosciences, Institute for Chemical and Bioengineering, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093, Zurich, Switzerland
| | - Reinhard Heckel
- Department of Electrical and Computer Engineering, Technical University of Munich, Arcistrasse 21, 80333, Munich, Germany
| | - Robert N Grass
- Department of Chemistry and Applied Biosciences, Institute for Chemical and Bioengineering, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093, Zurich, Switzerland.
| |
|
185
|
Ohshiro T, Komoto Y, Taniguchi M. Single-Molecule Counting of Nucleotide by Electrophoresis with Nanochannel-Integrated Nano-Gap Devices. Micromachines 2020; 11:mi11110982. [PMID: 33142705 PMCID: PMC7693128 DOI: 10.3390/mi11110982] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/30/2020] [Revised: 10/28/2020] [Accepted: 10/28/2020] [Indexed: 12/11/2022]
Abstract
We utilized electrophoresis to control the fluidity of sample biomolecules in aqueous sample solutions inside the nanochannel for single-molecule detection by using a nanochannel-integrated nanogap electrode, which is composed of a nano-gap sensing electrode, a nanochannel, and a tapered focusing channel. In order to suppress electro-osmotic flow and thermal convection inside this nanochannel, we optimized the reduction ratio of the tapered focusing channel; a ratio of 10 μm at the inlet to 0.5 μm at the outlet was found to give high electrophoresis performance with a low-concentration 0.05 × TBE (Tris/Borate/EDTA) buffer containing 0.1 w/v% polyvinylpyrrolidone (PVP) as a surfactant. Under the optimized conditions, single-molecule electrical measurement of deoxyguanosine monophosphate (dGMP) was performed, and the throughput was found to improve by nearly an order of magnitude compared with measurement without electrophoresis. In addition, the long-duration signals that could interfere with discrimination were significantly reduced, because the strong electrophoretic flow inside the nanochannels prevents the molecules from adsorbing near the electrodes. Single-molecule electrical measurement with nanochannel-integrated nano-gap electrodes under electrophoresis thus significantly improved both the throughput of signal detection and the identification accuracy.
|
186
|
Schwarz M, Welzel M, Kabdullayeva T, Becker A, Freisleben B, Heider D. MESA: automated assessment of synthetic DNA fragments and simulation of DNA synthesis, storage, sequencing and PCR errors. Bioinformatics 2020; 36:3322-3326. [PMID: 32129840 PMCID: PMC7267826 DOI: 10.1093/bioinformatics/btaa140] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/12/2019] [Revised: 01/30/2020] [Accepted: 02/27/2020] [Indexed: 11/18/2022] Open
Abstract
Summary: The development of de novo DNA synthesis, polymerase chain reaction (PCR), DNA sequencing and molecular cloning gave researchers unprecedented control over DNA and DNA-mediated processes. To reduce the error probabilities of these techniques, DNA composition has to adhere to method-dependent restrictions. To comply with such restrictions, a synthetic DNA fragment is often adjusted manually or by using custom-made scripts. In this article, we present MESA (Mosla Error Simulator), a web application for the assessment of DNA fragments based on limitations of DNA synthesis, amplification, cloning, sequencing methods and biological restrictions of host organisms. Furthermore, MESA can be used to simulate errors during synthesis, PCR, storage and sequencing processes.
Availability and implementation: MESA is available at mesa.mosla.de, with the source code available at github.com/umr-ds/mesa_dna_sim.
Contact: dominik.heider@uni-marburg.de
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | | | | | - Anke Becker
- Department of Biology, SYNMIKRO, University of Marburg, Marburg D-35032, Germany
| | | | | |
|
187
|
Lee H, Wiegand DJ, Griswold K, Punthambaker S, Chun H, Kohman RE, Church GM. Photon-directed multiplexed enzymatic DNA synthesis for molecular digital data storage. Nat Commun 2020; 11:5246. [PMID: 33067441 PMCID: PMC7567835 DOI: 10.1038/s41467-020-18681-5] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 03/03/2020] [Accepted: 08/27/2020] [Indexed: 01/07/2023] Open
Abstract
New storage technologies are needed to keep up with the global demands of data generation. DNA is an ideal storage medium due to its stability, information density and ease-of-readout with advanced sequencing techniques. However, progress in writing DNA is stifled by the continued reliance on chemical synthesis methods. The enzymatic synthesis of DNA is a promising alternative, but thus far has not been well demonstrated in a parallelized manner. Here, we report a multiplexed enzymatic DNA synthesis method using maskless photolithography. Rapid uncaging of Co2+ ions by patterned UV light activates Terminal deoxynucleotidyl Transferase (TdT) for spatially-selective synthesis on an array surface. Spontaneous quenching of reactions by the diffusion of excess caging molecules confines synthesis to light patterns and controls the extension length. We show that our multiplexed synthesis method can be used to store digital data by encoding 12 unique DNA oligonucleotide sequences with video game music, which is equivalent to 84 trits or 110 bits of data.
Affiliation(s)
- Howon Lee
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Wyss Institute for Biologically Inspired Engineering, Boston, MA, 02115, USA
| | - Daniel J Wiegand
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Wyss Institute for Biologically Inspired Engineering, Boston, MA, 02115, USA
| | - Kettner Griswold
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Wyss Institute for Biologically Inspired Engineering, Boston, MA, 02115, USA
- MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Charles Stark Draper Laboratory, Cambridge, MA, 02139, USA
| | - Sukanya Punthambaker
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Wyss Institute for Biologically Inspired Engineering, Boston, MA, 02115, USA
| | - Honggu Chun
- Department of Biomedical Engineering, Korea University, 466 Hana Science Hall, 145 Anamro, Seongbukgu, 02841, Seoul, South Korea
| | - Richie E Kohman
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA.
- Wyss Institute for Biologically Inspired Engineering, Boston, MA, 02115, USA.
| | - George M Church
- Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA.
- Wyss Institute for Biologically Inspired Engineering, Boston, MA, 02115, USA.
| |
|
188
|
Metastable hybridization-based DNA information storage to allow rapid and permanent erasure. Nat Commun 2020; 11:5008. [PMID: 33024123 PMCID: PMC7538566 DOI: 10.1038/s41467-020-18842-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/11/2019] [Accepted: 09/14/2020] [Indexed: 11/25/2022] Open
Abstract
The potential of DNA as an information storage medium is rapidly growing due to advances in DNA synthesis and sequencing. However, the chemical stability of DNA challenges the complete erasure of information encoded in DNA sequences. Here, we encode information in a DNA information solution, a mixture of true message- and false message-encoded oligonucleotides, which enables rapid and permanent erasure of information. True messages are differentiated by their hybridization to a "truth marker" oligonucleotide, and only true messages can be read; binding of the truth marker can be effectively randomized even by a brief exposure to elevated temperature. We show that 8 separate bitmap images can be stably encoded and read after storage at 25 °C for 65 days with an average of over 99% correct information recall, which extrapolates to a half-life of over 15 years at 25 °C. Heating to 95 °C for 5 minutes, however, permanently erases the message.
The chemical stability of DNA makes complete erasure of DNA-encoded data difficult. Here the authors mix true and false messages, differentiated by whether a truth marker oligo is bound to them, and show that brief exposure to elevated temperatures randomizes the binding of truth markers, preventing data recovery.
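The abstract does not say how the half-life was extrapolated from the 65-day measurement. One simple way to reason about such a figure, assuming first-order (exponential) decay of correct recall, is sketched below; the decay model and the function name `half_life_days` are assumptions, not the authors' stated method.

```python
import math


def half_life_days(fraction_remaining: float, elapsed_days: float) -> float:
    """Half-life under an assumed first-order decay N(t) = N0 * exp(-k * t)."""
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be strictly between 0 and 1")
    rate = -math.log(fraction_remaining) / elapsed_days  # decay constant k per day
    return math.log(2) / rate


if __name__ == "__main__":
    # Example input: exactly 99% recall after 65 days (the abstract reports "over 99%").
    years = half_life_days(0.99, 65) / 365.25
    print(f"half-life under this model: {years:.1f} years")
```

Under this assumed model the half-life is very sensitive to the measured fraction, so a recall only slightly above 99% corresponds to a noticeably longer half-life than exactly 99%.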
|
189
|
Stanley PM, Strittmatter LM, Vickers AM, Lee KCK. Decoding DNA data storage for investment. Biotechnol Adv 2020; 45:107639. [PMID: 33002583 PMCID: PMC7521213 DOI: 10.1016/j.biotechadv.2020.107639] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/11/2020] [Revised: 09/25/2020] [Accepted: 09/26/2020] [Indexed: 11/24/2022]
Abstract
While DNA's perpetual role in biology and life science is well documented, its burgeoning digital applications are beginning to garner significant interest. As the development of novel technologies requires continuous research, product development, startup creation, and financing, this work provides an overview of each respective area and highlights current trends, challenges, and opportunities. These are supported by numerous interviews with key opinion leaders from across academia, government agencies and the commercial sector, as well as investment data analysis. Our findings illustrate the societal and economic need for technological innovation and disruption in data storage, paving the way for nature's own time-tested, advantageous, and unrivaled solution. We anticipate a significant increase in available investment capital and continuous scientific progress, creating a ripe environment on which DNA data storage-enabling startups can capitalize to bring DNA data storage into daily life.
- Overview on current DNA data storage technologies and commercialization hurdles
- Insights from leading DNA data storage experts and investment financing data
- DNA synthesis remains the biggest challenge in the industry
- Archiving cold data is the low-hanging fruit in DNA data storage
- Upwards trend in investment landscape suggests optimal startup fundraising period
Affiliation(s)
- Philip M Stanley
- M Ventures, Gustav Mahlerplein 102, 20th Floor, 1082 MA Amsterdam, The Netherlands.
| | - Lisa M Strittmatter
- M Ventures, Gustav Mahlerplein 102, 20th Floor, 1082 MA Amsterdam, The Netherlands
| | - Alice M Vickers
- M Ventures, Gustav Mahlerplein 102, 20th Floor, 1082 MA Amsterdam, The Netherlands
| | - Kevin C K Lee
- M Ventures, Gustav Mahlerplein 102, 20th Floor, 1082 MA Amsterdam, The Netherlands
| |
|
192
|
Hao M, Qiao H, Gao Y, Wang Z, Qiao X, Chen X, Qi H. A mixed culture of bacterial cells enables an economic DNA storage on a large scale. Commun Biol 2020; 3:416. [PMID: 32737399 PMCID: PMC7395121 DOI: 10.1038/s42003-020-01141-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/03/2020] [Accepted: 07/02/2020] [Indexed: 11/25/2022] Open
Abstract
DNA has emerged as a novel potential material for mass data storage, offering the possibility of cheaply solving a major data storage problem. Large oligonucleotide pools have demonstrated high potential for large-scale data storage in the test tube, while living cells offer high fidelity in information replication. Here we show that a mixed culture of bacterial cells carrying a large oligo pool, assembled in a high-copy-number plasmid, is a stable material for large-scale data storage. The underlying principle was explored by deep bioinformatic analysis. Although homology assembly showed sequence-context-dependent bias, the large oligonucleotide pools in the mixed culture were constant over multiple successive passages. Finally, over ten thousand distinct oligos, encompassing 2304 Kbps and encoding 445 KB of digital data, were stored in cells. This is the largest storage in living cells reported so far and presents a previously unreported approach for bridging the gap between in vitro and in vivo systems.
Hao, Qiao, Gao et al. show that over ten thousand oligonucleotides encoding 445 KB of digital data can be stored in cultured bacterial cells. Data storage in living cells increases the information storage capacity while enabling its economical propagation.
Affiliation(s)
- Min Hao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
| | - Hongyan Qiao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
| | - Yanmin Gao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
| | - Zhaoguan Wang
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
| | - Xin Qiao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
| | - Xin Chen
- Center for Applied Mathematics, Tianjin University, Tianjin, China
| | - Hao Qi
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China.
| |
|
193
|
Abstract
DNA polymerases play a central role in biology by transferring genetic information from one generation to the next during cell division. Harnessing the power of these enzymes in the laboratory has fueled an increase in biomedical applications that involve the synthesis, amplification, and sequencing of DNA. However, the high substrate specificity exhibited by most naturally occurring DNA polymerases often precludes their use in practical applications that require modified substrates. Moving beyond natural genetic polymers requires sophisticated enzyme-engineering technologies that can be used to direct the evolution of engineered polymerases that function with tailor-made activities. Such efforts are expected to uniquely drive emerging applications in synthetic biology by enabling the synthesis, replication, and evolution of synthetic genetic polymers with new physicochemical properties.
|
194
|
HEDGES error-correcting code for DNA storage corrects indels and allows sequence constraints. Proc Natl Acad Sci U S A 2020; 117:18489-18496. [PMID: 32675237 PMCID: PMC7414044 DOI: 10.1073/pnas.2004821117] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Indexed: 11/18/2022] Open
Abstract
This paper constructs an error-correcting code for the {A,C,G,T} alphabet of DNA. By contrast with previous work, the code corrects insertions and deletions directly, in a single strand of DNA, without the need for multiple alignment of strands. This code, when coupled to a standard outer code, can achieve error-free storage of petabyte-scale data even when ∼10% of all nucleotides are erroneous. Synthetic DNA is rapidly emerging as a durable, high-density information storage platform. A major challenge for DNA-based information encoding strategies is the high rate of errors that arise during DNA synthesis and sequencing. Here, we describe the HEDGES (Hash Encoded, Decoded by Greedy Exhaustive Search) error-correcting code that repairs all three basic types of DNA errors: insertions, deletions, and substitutions. HEDGES also converts unresolved or compound errors into substitutions, restoring synchronization for correction via a standard Reed–Solomon outer code that is interleaved across strands. Moreover, HEDGES can incorporate a broad class of user-defined sequence constraints, such as avoiding excess repeats, or too high or too low windowed guanine–cytosine (GC) content. We test our code both via in silico simulations and with synthesized DNA. From its measured performance, we develop a statistical model applicable to much larger datasets. Predicted performance indicates the possibility of error-free recovery of petabyte- and exabyte-scale data from DNA degraded with as much as 10% errors. As the cost of DNA synthesis and sequencing continues to drop, we anticipate that HEDGES will find applications in large-scale error-free information encoding.
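The sequence constraints mentioned in this abstract (avoiding excess repeats and keeping windowed GC content within bounds) can be illustrated with a small checker. This is only a sketch of the kind of constraint involved, not the HEDGES encoder or decoder; the function names and all default thresholds (`window`, `gc_min`, `gc_max`, `max_run`) are hypothetical.

```python
def gc_fraction(window: str) -> float:
    """Fraction of G and C bases in a non-empty window."""
    return sum(base in "GC" for base in window) / len(window)


def satisfies_constraints(seq: str, window: int = 12, gc_min: float = 0.3,
                          gc_max: float = 0.7, max_run: int = 3) -> bool:
    """Check homopolymer run length and windowed GC content (illustrative thresholds)."""
    # Reject any run of identical bases longer than max_run.
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return False
    # Slide a window over the strand; a strand shorter than the window is checked once.
    for start in range(max(len(seq) - window + 1, 1)):
        chunk = seq[start:start + window]
        if chunk and not gc_min <= gc_fraction(chunk) <= gc_max:
            return False
    return True


if __name__ == "__main__":
    for strand in ("ACGTACGTACGT", "AAAAACGTACGT", "ATATATATATAT"):
        print(strand, satisfies_constraints(strand))
```

A code that incorporates such constraints produces only strands that pass checks of this kind, rather than filtering strands after encoding.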
|
195
|
Yimyai T, Phakkeeree T, Crespy D. Tattooing Plastics with Reversible and Irreversible Encryption. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2020; 7:1903785. [PMID: 32670754 PMCID: PMC7341078 DOI: 10.1002/advs.201903785] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/25/2019] [Revised: 02/28/2020] [Indexed: 05/18/2023]
Abstract
Self-healing materials are explored for restoring mechanical, electrical, and chemical properties. Inspired by the process of tattooing on human skin, a method for engraving non-permanent or permanent messages on plastics is developed. A self-healing polymer containing dynamic disulfide bonds is employed as substrate for encryption of written messages. The polymer is engraved with a dye solution which is subsequently covered by the polymer matrix upon activation with temperature increase. The dye is then located at the subsurface of the substrate so that the information cannot be removed easily by wear or extraction with solvents. Therefore, self-healing polymers can be applied as sustainable substrates for reversibly and irreversibly engraving information.
Affiliation(s)
- Tiwa Yimyai
- Department of Chemical and Biomolecular Engineering, School of Energy Science and Engineering, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
| | - Treethip Phakkeeree
- Department of Materials Science and Engineering, School of Molecular Science and Engineering, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
| | - Daniel Crespy
- Department of Materials Science and Engineering, School of Molecular Science and Engineering, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
| |
|
196
|
Chen YJ, Takahashi CN, Organick L, Bee C, Ang SD, Weiss P, Peck B, Seelig G, Ceze L, Strauss K. Quantifying molecular bias in DNA data storage. Nat Commun 2020; 11:3264. [PMID: 32601272 PMCID: PMC7324401 DOI: 10.1038/s41467-020-16958-3] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 10/08/2019] [Accepted: 05/19/2020] [Indexed: 11/10/2022] Open
Abstract
DNA has recently emerged as an attractive medium for archival data storage. Recent work has demonstrated proof-of-principle prototype systems; however, very uneven (biased) sequencing coverage has been reported, which indicates inefficiencies in the storage process. Deviations from the average coverage in the sequence copy distribution can either cause wasteful provisioning in sequencing or excessive number of missing sequences. Here, we use millions of unique sequences from a DNA-based digital data archival system to study the oligonucleotide copy unevenness problem and show that the two paramount sources of bias are the synthesis and amplification (PCR) processes. Based on these findings, we develop a statistical model for each molecular process as well as the overall process. We further use our model to explore the trade-offs between synthesis bias, storage physical density, logical redundancy, and sequencing redundancy, providing insights for engineering efficient, robust DNA data storage systems.
Affiliation(s)
| | - Christopher N Takahashi
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington, 98195, USA
| | - Lee Organick
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington, 98195, USA
| | - Callista Bee
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington, 98195, USA
| | | | - Patrick Weiss
- Twist Bioscience, San Francisco, California, 94158, USA
| | - Bill Peck
- Twist Bioscience, San Francisco, California, 94158, USA
| | - Georg Seelig
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington, 98195, USA; Department of Electrical and Computer Engineering, University of Washington, Seattle, Washington, 98195, USA
| | - Luis Ceze
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington, 98195, USA.
| | - Karin Strauss
- Microsoft Research, Redmond, Washington, 98052, USA.
| |
|
197
|
Na D. DNA steganography: hiding undetectable secret messages within the single nucleotide polymorphisms of a genome and detecting mutation-induced errors. Microb Cell Fact 2020; 19:128. [PMID: 32527315 PMCID: PMC7291742 DOI: 10.1186/s12934-020-01387-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/06/2020] [Accepted: 06/06/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: As cell engineering technology advances, more complex synthetically designed cells and metabolically engineered cells are being developed. Engineered cells are important resources in industry. Similar to image watermarking, engineered cells should be watermarked for protection against improper use.
RESULTS: In this study, a DNA steganography methodology was developed to hide messages in variable regions (single nucleotide polymorphisms) of the genome, creating hidden messages and thereby preventing hacking. Additionally, to detect errors (mutations) within the encrypted messages, a block sum check algorithm was employed, similar to that used in network data transmission to detect noise-induced information changes.
CONCLUSIONS: This DNA steganography methodology could be used to hide secret messages in a genome and detect errors within the encrypted messages. This approach is expected to be useful for tracking cells and protecting biological assets (e.g., engineered cells).
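A block sum check, as used in data transmission, arranges bits in a block and stores one parity bit per row and per column, so that a single flipped bit shows up as one row mismatch and one column mismatch. The sketch below shows that general idea on plain bit rows. How the paper maps message bits onto single nucleotide polymorphisms is not described in the abstract, and the function names are hypothetical.

```python
from typing import List, Tuple

Block = List[List[int]]


def block_parities(rows: Block) -> Tuple[List[int], List[int]]:
    """Even-parity bit for every row and every column of an equal-width bit block."""
    row_parity = [sum(row) % 2 for row in rows]
    col_parity = [sum(col) % 2 for col in zip(*rows)]
    return row_parity, col_parity


def find_mismatches(rows: Block, row_parity: List[int],
                    col_parity: List[int]) -> Tuple[List[int], List[int]]:
    """Indices of rows and columns whose recomputed parity differs from the stored one."""
    new_rows, new_cols = block_parities(rows)
    bad_rows = [i for i, (a, b) in enumerate(zip(new_rows, row_parity)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(new_cols, col_parity)) if a != b]
    return bad_rows, bad_cols


if __name__ == "__main__":
    block = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
    stored_rows, stored_cols = block_parities(block)
    block[1][2] ^= 1  # simulate a single mutation-like bit flip
    print(find_mismatches(block, stored_rows, stored_cols))
```

A single flipped bit is located at the intersection of the mismatching row and column; some multi-bit patterns cancel out in the parities and go undetected, which is a known limit of this kind of check.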
Affiliation(s)
- Dokyun Na
- Department of Biomedical Engineering, School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea.
| |
|
198
|
Fan S, Wang D, Cheng J, Liu Y, Luo T, Cui D, Ke Y, Song J. Information Coding in a Reconfigurable DNA Origami Domino Array. Angew Chem Int Ed Engl 2020; 59:12991-12997. [PMID: 32304157 DOI: 10.1002/anie.202003823] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 03/14/2020] [Indexed: 01/26/2023]
Abstract
DNA nanostructures with programmable nanoscale patterns have been achieved in the past decades, and molecular information coding (MIC) on those designed nanostructures has gained increasing attention for information security. However, achieving steganography and cryptography synchronously on DNA nanostructures remains a challenge. Herein, we demonstrated MIC in a reconfigurable DNA origami domino array (DODA), which can reconfigure intrinsic patterns but keep the DODA outline the same for steganography. When a set of keys (DNA strands) is added, the cryptographic data can be translated into visible patterns within DODA. More complex cryptography with the ASCII code within a programmable 6×6 lattice is demonstrated to show the versatility of MIC in the DODA. Furthermore, an anti-counterfeiting approach based on conformational transformation-mediated toehold strand displacement reaction is designed to protect MIC from decoding and falsification.
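To make the idea of ASCII coding in a 6×6 lattice concrete, here is a hypothetical sketch that packs 7-bit ASCII characters row by row into a 36-cell binary grid. The abstract does not describe the actual bit layout, the role of the key strands, or the capacity used, so the constants and function names below are illustrative assumptions only.

```python
# Hypothetical packing of ASCII text into a 6x6 binary lattice.
# Assumptions (not from the paper): 7 bits per character, row-major order,
# unused trailing cells set to 0. Key-dependent reconfiguration is not modeled.

SIZE = 6            # lattice is SIZE x SIZE cells
BITS_PER_CHAR = 7   # standard ASCII fits in 7 bits

def encode_lattice(text):
    """Pack up to 5 ASCII characters into a 6x6 grid of 0/1 cells."""
    capacity = (SIZE * SIZE) // BITS_PER_CHAR
    if len(text) > capacity:
        raise ValueError(f"at most {capacity} characters fit in the lattice")
    bits = [int(b) for byte in text.encode("ascii") for b in format(byte, "07b")]
    bits += [0] * (SIZE * SIZE - len(bits))  # pad the unused cells
    return [bits[r * SIZE:(r + 1) * SIZE] for r in range(SIZE)]

def decode_lattice(lattice, n_chars):
    """Read n_chars characters back out of the lattice in row-major order."""
    bits = [b for row in lattice for b in row]
    chars = []
    for i in range(n_chars):
        chunk = bits[i * BITS_PER_CHAR:(i + 1) * BITS_PER_CHAR]
        chars.append(chr(int("".join(map(str, chunk)), 2)))
    return "".join(chars)

if __name__ == "__main__":
    lattice = encode_lattice("DODA")
    for row in lattice:
        print("".join(str(cell) for cell in row))
    print(decode_lattice(lattice, 4))
```

Under these assumptions a 36-cell lattice holds five 7-bit characters with one cell left over. An 8-bit encoding would hold four characters with four spare cells.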
Affiliation(s)
- Sisi Fan
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Dongfang Wang
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Jin Cheng
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yan Liu
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Tao Luo
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Daxiang Cui
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yonggang Ke
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
| | - Jie Song
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Cancer and Basic Medicine (IBMC), Chinese Academy of Sciences, The Cancer Hospital of the University of Chinese Academy of Sciences, Hangzhou, Zhejiang, 310022, China
| |
|
199
|
Fan S, Wang D, Cheng J, Liu Y, Luo T, Cui D, Ke Y, Song J. Information Coding in a Reconfigurable DNA Origami Domino Array. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.202003823] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/16/2022]
Affiliation(s)
- Sisi Fan
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Dongfang Wang
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Jin Cheng
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yan Liu
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Tao Luo
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Daxiang Cui
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yonggang Ke
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
| | - Jie Song
- Institute of Nano Biomedicine and Engineering, Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
- Institute of Cancer and Basic Medicine (IBMC), Chinese Academy of Sciences, The Cancer Hospital of the University of Chinese Academy of Sciences, Hangzhou, Zhejiang, 310022, China
| |
|
200
|
Chen K, Zhu J, Bošković F, Keyser UF. Nanopore-Based DNA Hard Drives for Rewritable and Secure Data Storage. Nano Lett 2020; 20:3754-3760. [PMID: 32223267 DOI: 10.1021/acs.nanolett.0c00755] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Indexed: 05/25/2023]
Abstract
Nanopores are powerful single-molecule tools for label-free sensing of nanoscale molecules including DNA that can be used for building designed nanostructures and performing computations. Here, DNA hard drives (DNA-HDs) are introduced based on DNA nanotechnology and nanopore sensing as a rewritable molecular memory system, allowing for storing, operating, and reading data in the changeable three-dimensional structure of DNA. Writing and erasing data are significantly improved compared to previous molecular storage systems by employing controllable attachment and removal of molecules on a long double-stranded DNA. Data reading is achieved by detecting the single molecules at the millisecond time scale using nanopores. The DNA-HD also ensures secure data storage where the data can only be read after providing the correct physical molecular keys. Our approach allows for easy-writing and easy-reading, rewritable, and secure data storage toward a promising miniature scale integration for molecular data storage and computation.
Affiliation(s)
- Kaikai Chen
- Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, United Kingdom
| | - Jinbo Zhu
- Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, United Kingdom
| | - Filip Bošković
- Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, United Kingdom
| | - Ulrich F Keyser
- Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE, United Kingdom
| |
|