1
Wang S, Mao X, Wang F, Zuo X, Fan C. Data Storage Using DNA. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2024; 36:e2307499. [PMID: 37800877 DOI: 10.1002/adma.202307499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 10/01/2023] [Indexed: 10/07/2023]
Abstract
The exponential growth of global data has outpaced the storage capacities of current technologies, necessitating innovative storage strategies. DNA, as a natural medium for preserving genetic information, has emerged as a highly promising candidate for a next-generation storage medium. Storing data in DNA offers several advantages, including ultrahigh physical density and exceptional durability. Facilitated by significant advancements in various technologies, such as DNA synthesis, DNA sequencing, and DNA nanotechnology, remarkable progress has been made in the field of DNA data storage over the past decade. However, several challenges still need to be addressed to realize practical applications of DNA data storage. In this review, the processes and strategies of in vitro DNA data storage are first introduced, highlighting recent advancements. Next, a brief overview of in vivo DNA data storage is provided, with a focus on the various writing strategies developed to date. Finally, the challenges encountered in each step of DNA data storage are summarized, and promising techniques that could overcome these obstacles are discussed.
Affiliation(s)
- Shaopeng Wang
  - Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acids Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
- Xiuhai Mao
  - Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acids Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
- Fei Wang
  - School of Chemistry and Chemical Engineering, New Cornerstone Science Laboratory, Frontiers Science Center for Transformative Molecules, Zhangjiang Institute for Advanced Study and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xiaolei Zuo
  - Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acids Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
  - School of Chemistry and Chemical Engineering, New Cornerstone Science Laboratory, Frontiers Science Center for Transformative Molecules, Zhangjiang Institute for Advanced Study and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China
- Chunhai Fan
  - Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acids Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, China
  - School of Chemistry and Chemical Engineering, New Cornerstone Science Laboratory, Frontiers Science Center for Transformative Molecules, Zhangjiang Institute for Advanced Study and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China
2
El-Shaikh A, Seeger B. Content-based filter queries on DNA data storage systems. Sci Rep 2023; 13:7053. [PMID: 37120614 PMCID: PMC10148835 DOI: 10.1038/s41598-023-34160-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 04/25/2023] [Indexed: 05/01/2023] Open
Abstract
Recent developments in DNA data storage systems have revealed the great potential to store large amounts of data at a very high density with extremely long persistence and low cost. However, despite recent contributions to robust data encoding, current DNA storage systems offer limited support for random access on DNA storage devices due to restrictive biochemical constraints. Moreover, state-of-the-art approaches do not support content-based filter queries on DNA storage. This paper introduces the first encoding for DNA that enables content-based searches on structured data like relational database tables. We provide the details of the methods for coding and decoding millions of directly accessible data objects on DNA. We evaluate the derived codes on real data sets and verify their robustness.
Affiliation(s)
- Alex El-Shaikh
  - Department of Mathematics and Computer Science, University of Marburg, 35037, Marburg, Germany
- Bernhard Seeger
  - Department of Mathematics and Computer Science, University of Marburg, 35037, Marburg, Germany
3
Welzel M, Schwarz PM, Löchel HF, Kabdullayeva T, Clemens S, Becker A, Freisleben B, Heider D. DNA-Aeon provides flexible arithmetic coding for constraint adherence and error correction in DNA storage. Nat Commun 2023; 14:628. [PMID: 36746948 PMCID: PMC9902613 DOI: 10.1038/s41467-023-36297-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 06/27/2022] [Accepted: 01/25/2023] [Indexed: 02/08/2023] Open
Abstract
The extensive information capacity of DNA, coupled with decreasing costs for DNA synthesis and sequencing, makes DNA an attractive alternative to traditional data storage. The processes of writing, storing, and reading DNA exhibit specific error profiles and constraints DNA sequences have to adhere to. We present DNA-Aeon, a concatenated coding scheme for DNA data storage. It supports the generation of variable-sized encoded sequences with a user-defined Guanine-Cytosine (GC) content, homopolymer length limitation, and the avoidance of undesired motifs. It further enables users to provide custom codebooks adhering to further constraints. DNA-Aeon can correct substitution errors, insertions, deletions, and the loss of whole DNA strands. Comparisons with other codes show better error-correction capabilities of DNA-Aeon at similar redundancy levels with decreased DNA synthesis costs. In-vitro tests indicate high reliability of DNA-Aeon even in the case of skewed sequencing read distributions and high read-dropout.
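The three sequence constraints named in this abstract (GC content, homopolymer length, and undesired motifs) can be illustrated with a small check. The following is a minimal sketch only, not DNA-Aeon's code or API: the function names, the 40-60% GC window, the maximum run length of 3, and the choice of the EcoRI recognition site `GAATTC` as a forbidden motif are all assumptions made for illustration.

```python
import re


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def satisfies_constraints(
    seq: str,
    gc_range: tuple = (0.4, 0.6),   # assumed GC window, not a DNA-Aeon default
    max_run: int = 3,               # assumed homopolymer limit
    forbidden: tuple = ("GAATTC",), # assumed motif list (EcoRI site)
) -> bool:
    """Return True if seq meets all three illustrative constraints."""
    if not seq:
        return False
    low, high = gc_range
    return (
        low <= gc_content(seq) <= high
        and longest_homopolymer(seq) <= max_run
        and not any(motif in seq for motif in forbidden)
    )
```

A checker like this only accepts or rejects candidate sequences; a coding scheme as described in the abstract generates sequences that satisfy the constraints by construction, which this sketch does not attempt.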
Affiliation(s)
- Marius Welzel
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
- Peter Michael Schwarz
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
- Hannah F Löchel
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
- Tolganay Kabdullayeva
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
- Sandra Clemens
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
- Anke Becker
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
- Bernd Freisleben
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
- Dominik Heider
  - Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), University of Marburg, Marburg, Germany
4
Song Z, Liang Y, Yang J. Nanopore Detection Assisted DNA Information Processing. NANOMATERIALS (BASEL, SWITZERLAND) 2022; 12:nano12183135. [PMID: 36144924 PMCID: PMC9504103 DOI: 10.3390/nano12183135] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/06/2022] [Revised: 09/04/2022] [Accepted: 09/06/2022] [Indexed: 05/27/2023]
Abstract
The deoxyribonucleic acid (DNA) molecule is a stable carrier for large amounts of genetic information and provides an ideal storage medium for next-generation information processing technologies. Technologies that process DNA information, representing a cross-disciplinary integration of biology and computer techniques, have become attractive substitutes for technologies that process electronic information alone. The detailed applications of DNA technologies can be divided into three components: storage, computing, and self-assembly. The quality of DNA information processing relies on the accuracy of DNA reading. Nanopore detection allows researchers to accurately sequence nucleotides and is thus widely used to read DNA. In this paper, we introduce the principles and development history of nanopore detection and conduct a systematic review of recent developments and specific applications in DNA information processing involving nanopore detection and nanopore-based storage. We also discuss the potential of artificial intelligence in nanopore detection and DNA information processing. This work not only provides new avenues for future nanopore detection development, but also offers a foundation for the construction of more advanced DNA information processing technologies.
Affiliation(s)
- Zichen Song
  - School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China
- Yuan Liang
  - Department of Computer Science and Technology, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
- Jing Yang
  - School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China
5
Ezekannagha C, Becker A, Heider D, Hattab G. Design considerations for advancing data storage with synthetic DNA for long-term archiving. Mater Today Bio 2022; 15:100306. [PMID: 35677811 PMCID: PMC9167972 DOI: 10.1016/j.mtbio.2022.100306] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/21/2022] [Revised: 05/05/2022] [Accepted: 05/22/2022] [Indexed: 11/22/2022]
Abstract
Deoxyribonucleic acid (DNA) is increasingly emerging as a serious medium for long-term archival data storage because of its remarkable high-capacity, high-storage-density characteristics and its lasting ability to store data for thousands of years. Various encoding algorithms are generally required to store digital information in DNA and to maintain data integrity. Indeed, since DNA is the information carrier, its performance under different processing and storage conditions significantly impacts the capabilities of the data storage system. Therefore, the design of a DNA storage system must meet specific design considerations to be less error-prone, robust and reliable. In this work, we summarize the general processes and technologies employed when using synthetic DNA as a storage medium. We also share the design considerations for sustainable engineering to include viability. We expect this work to provide insight into how sustainable design can be used to develop an efficient and robust synthetic DNA-based storage system for long-term archiving.
Affiliation(s)
- Chisom Ezekannagha
  - Department of Mathematics and Computer Science, Philipps-Universität Marburg, Hans-Meerwein-Str. 6, D-35043, Marburg, Germany
  - Corresponding author.
- Anke Becker
  - Center for Synthetic Microbiology (SYNMIKRO), Philipps-Universität Marburg, Karl-von-Frisch-Str. 14, D-35043, Marburg, Germany
- Dominik Heider
  - Department of Mathematics and Computer Science, Philipps-Universität Marburg, Hans-Meerwein-Str. 6, D-35043, Marburg, Germany
- Georges Hattab
  - Department of Mathematics and Computer Science, Philipps-Universität Marburg, Hans-Meerwein-Str. 6, D-35043, Marburg, Germany