1
Sabary O, Yucovich A, Shapira G, Yaakobi E. Reconstruction algorithms for DNA-storage systems. Sci Rep 2024; 14:1951. [PMID: 38263421 PMCID: PMC10806084 DOI: 10.1038/s41598-024-51730-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Accepted: 01/09/2024] [Indexed: 01/25/2024] Open
Abstract
Motivated by DNA storage systems, this work presents the DNA reconstruction problem, in which a length-n string passes through the DNA-storage channel, which introduces deletion, insertion and substitution errors. This channel generates multiple noisy copies of the transmitted string, which are called traces. A DNA reconstruction algorithm is a mapping that receives t traces as input and produces an estimate of the original string. The goal in the DNA reconstruction problem is to minimize the edit distance between the original string and the algorithm's estimate. In this work, we present several new algorithms for this problem. Our algorithms look globally at the entire sequence of the traces and use dynamic programming algorithms, which are used for the shortest common supersequence and the longest common subsequence problems, in order to decode the original string. Our algorithms do not require any limitations on the input or the number of traces and, moreover, they perform well even for error probabilities as high as 0.27. The algorithms have been tested on simulated data, on data from previous DNA storage experiments, and on a new synthesized dataset, and are shown to outperform previous algorithms in reconstruction accuracy.
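The two building blocks named in this abstract, the longest common subsequence (LCS) dynamic program and the edit distance used as the accuracy measure, can be illustrated with a minimal sketch. This is the textbook formulation for two strings, not the authors' reconstruction algorithm; the function names `lcs` and `edit_distance` are hypothetical and chosen only for this illustration.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (textbook DP)."""
    m, n = len(a), len(b)
    # dp[i][j] = length of an LCS of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]
```

A reconstruction algorithm in the sense of the abstract would be scored by calling `edit_distance(original, estimate)`; a smaller value means a better estimate.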
Affiliation(s)
- Omer Sabary
- The Henry and Marilyn Taub Faculty of Computer Science, Technion, 3200003, Haifa, Israel.
- Alexander Yucovich
- The Henry and Marilyn Taub Faculty of Computer Science, Technion, 3200003, Haifa, Israel
- Guy Shapira
- The Henry and Marilyn Taub Faculty of Computer Science, Technion, 3200003, Haifa, Israel
- Eitan Yaakobi
- The Henry and Marilyn Taub Faculty of Computer Science, Technion, 3200003, Haifa, Israel
2
Gao R, Wei XS, Chen Z, Xie A, Dong W. Leveraging DNA-Based Nanostructures for Advanced Error Detection and Correction in Data Communication. ACS Nano 2023; 17:18055-18061. [PMID: 37498772 DOI: 10.1021/acsnano.3c04777] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/29/2023]
Abstract
This study demonstrates the implementation of the Hamming code using DNA-based nanostructures for error detection and correction in communication systems. The designed DNA nanostructures conduct logical operations to compute check codes and identify and correct erroneous data based on fluorescence signals. The execution of intricate DNA logic operations requires individuals with specialized training. By interpretation of the fluorescence signals generated by the DNA nanostructures, binary language can be extracted, effectively protecting data security. The findings highlight the potential of DNA as a versatile platform for reliable data transmission.
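For readers unfamiliar with the Hamming code mentioned in this abstract, the following is a minimal software sketch of the standard Hamming(7,4) code: three check bits are computed from four data bits, and a syndrome locates and corrects any single-bit error. It illustrates only the coding logic; it does not model the DNA nanostructures or fluorescence readout of the study, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based error position or 0 if none)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4  # the syndrome spells out the error position
    if pos:
        c[pos - 1] ^= 1  # flip the erroneous bit back
    return c, pos


def hamming74_decode(codeword):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]
```

In use, a received word is passed through `hamming74_correct` and then `hamming74_decode`; the code corrects one flipped bit per 7-bit word and cannot reliably handle two.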
Affiliation(s)
- Ruru Gao
- School of Chemistry and Chemical Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Xiu-Shen Wei
- School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China
- Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, Nanjing, 210096, China
- Zelin Chen
- School of Chemistry and Chemical Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Aming Xie
- School of Chemistry and Chemical Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Wei Dong
- School of Chemistry and Chemical Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
3
Xie XS. Round-Trip Journey of a Physical Chemist. J Phys Chem B 2023; 127:7800-7809. [PMID: 37731371 DOI: 10.1021/acs.jpcb.3c05597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Affiliation(s)
- Xiaoliang Sunney Xie
- Biomedical Pioneering Innovation Center, Peking University, 5 Yiheyuan Road, Beijing 100871, China
4
Mao C, Wang S, Li J, Feng Z, Zhang T, Wang R, Fan C, Jiang X. Metal-Organic Frameworks in Microfluidics Enable Fast Encapsulation/Extraction of DNA for Automated and Integrated Data Storage. ACS Nano 2023; 17:2840-2850. [PMID: 36728704 DOI: 10.1021/acsnano.2c11241] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 06/18/2023]
Abstract
DNA as an exceptional data storage medium offers high information density. However, DNA storage requires specialized equipment and tightly controlled environments for storage. Fast encapsulation within minutes for enhanced DNA stability to do away with specialized equipment and fast DNA extraction remain a challenge. Here, we report a DNA microlibrary that can be encapsulated by metal-organic frameworks (MOFs) within 10 min and extracted (5 min) in a single microfluidic chip for automated and integrated DNA-based data storage. The DNA microlibrary@MOFs enhances the stability of data-encoded DNA against harsh environments. The encoded information can be read out perfectly after accelerated aging, equivalent to being readable after 10 years of storage at 25 °C, 50% relative humidity, and 10 000 lx sunlight radiation. Moreover, the library enables fast retrieval of target data via flow cytometry and can be reproduced after each access.
Affiliation(s)
- Cuiping Mao
- Guangdong Provincial Key Laboratory of Advanced Biomaterials, Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, No 1088, Xueyuan Road, Nanshan District, Shenzhen, Guangdong 518055, People's Republic of China
- Shuchen Wang
- Guangdong Provincial Key Laboratory of Advanced Biomaterials, Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, No 1088, Xueyuan Road, Nanshan District, Shenzhen, Guangdong 518055, People's Republic of China
- Jiankai Li
- Guangdong Provincial Key Laboratory of Advanced Biomaterials, Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, No 1088, Xueyuan Road, Nanshan District, Shenzhen, Guangdong 518055, People's Republic of China
- Zhuowei Feng
- Guangdong Provincial Key Laboratory of Advanced Biomaterials, Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, No 1088, Xueyuan Road, Nanshan District, Shenzhen, Guangdong 518055, People's Republic of China
- Tong Zhang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, No 1088, Xueyuan Road, Nanshan District, Shenzhen, Guangdong 518055, People's Republic of China
- Rui Wang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, No 1088, Xueyuan Road, Nanshan District, Shenzhen, Guangdong 518055, People's Republic of China
- Chunhai Fan
- Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, Shanghai Jiao Tong University, No 800, DongChuan Road, Minhang District, Shanghai 200240, People's Republic of China
- Xingyu Jiang
- Guangdong Provincial Key Laboratory of Advanced Biomaterials, Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, No 1088, Xueyuan Road, Nanshan District, Shenzhen, Guangdong 518055, People's Republic of China
5
Cheng C, Fei Z, Xiao P. Methods to improve the accuracy of next-generation sequencing. Front Bioeng Biotechnol 2023; 11:982111. [PMID: 36741756 PMCID: PMC9895957 DOI: 10.3389/fbioe.2023.982111] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/30/2022] [Accepted: 01/11/2023] [Indexed: 01/21/2023] Open
Abstract
Next-generation sequencing (NGS) is present in all fields of life science; it has greatly promoted the development of basic research and is gradually being applied in clinical diagnosis. However, the cost and throughput advantages of next-generation sequencing are offset by large tradeoffs with respect to read length and accuracy. Specifically, its high error rate makes it extremely difficult to detect SNPs or low-abundance mutations, limiting its clinical applications, such as pharmacogenomics studies primarily based on SNPs and early clinical diagnosis primarily based on low-abundance mutations. Currently, Sanger sequencing is still considered to be the gold standard due to its high accuracy, so the results of next-generation sequencing require verification by Sanger sequencing in clinical practice. In order to maintain high-quality next-generation sequencing data, a variety of improvements at the levels of template preparation, sequencing strategy and data processing have been developed. This study summarizes the general procedures of next-generation sequencing platforms, highlighting the improvements involved in eliminating errors at each step. Furthermore, the challenges and future development of next-generation sequencing in clinical application are discussed.
6
Cheng C, Fei Z, Xiao P, Huang H, Zhou G, Lu Z. Analysis of mutational genotyping using correctable decoding sequencing with superior specificity. Analyst 2023; 148:402-411. [PMID: 36537878 DOI: 10.1039/d2an01805e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
Abstract
The ability to accurately identify SNPs or low-abundance mutations is important for early clinical diagnosis of diseases, but the existing high-throughput sequencing platforms are limited in terms of their accuracy. Here, we propose a correctable decoding sequencing strategy that may be used for high-throughput sequencing platforms. This strategy is based on adding a mixture of two types of mononucleotides, natural nucleotide and cyclic reversible termination (CRT), for cyclic sequencing. Using the synthetic characteristic of CRTs, about 75% of the calls are unambiguous for a single sequencing run, and the remaining ambiguous sequence can be accurately deduced by two parallel sequencing runs. We demonstrate the feasibility of this strategy, and its cycle efficiency can reach approximately 99.3%. This strategy is proved to be effective for correcting errors and identifying whether the sequencing information is correct or not. And its conservative theoretical error rate was determined to be 0.0009%, which is lower than that of Sanger sequencing. In addition, we establish that the information of only a single sequencing run can be used to detect samples with known mutation sites. We apply this strategy to accurately identify a mutation site in mitochondrial DNA from human cells.
Affiliation(s)
- Chu Cheng
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China.
- Zhongjie Fei
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China.
- Pengfeng Xiao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China.
- Huan Huang
- Department of Obstetrics and Gynecology, The first Affiliated Hospital of Nanjing Medical University, Nanjing, 210029, China.
- Guohua Zhou
- Department of Clinical Pharmacy, Jinling Hospital, State Key Laboratory of Analytical Chemistry for Life Science & Jiangsu Key Laboratory of Molecular, Medical School of Nanjing University, Nanjing, 210000, China.
- Zuhong Lu
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China.
7
Zhang Y, Ren Y, Liu Y, Wang F, Zhang H, Liu K. Preservation and Encryption in DNA Digital Data Storage. ChemPlusChem 2022; 87:e202200183. [PMID: 35856827 DOI: 10.1002/cplu.202200183] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2022] [Revised: 07/01/2022] [Indexed: 11/08/2022]
Abstract
The exponential growth of the total amount of global data presents a huge challenge to mainstream storage media. The emergence of molecular digital storage inspires the development of a new generation of higher-density digital data storage. In particular, DNA, with its high storage density, reproducibility, and long recoverable lifetime, serves as an ideal representative of molecular digital storage media. With the development of DNA synthesis and sequencing technologies and the reduction of cost, DNA digital storage has attracted more and more attention and achieved significant breakthroughs. This Review briefly describes the workflow of DNA storage and highlights the storage step of DNA digital data storage. Then, according to different information storage forms, the current DNA information encryption methods are discussed in detail. Finally, brief perspectives on the current challenges and optimization proposals in DNA information preservation and encryption are presented.
Affiliation(s)
- Yi Zhang
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, 130022, P. R. China
- Yubin Ren
- Department of Chemistry, Tsinghua University, Beijing, 100084, P. R. China
- Yangyi Liu
- Department of Chemistry, Tsinghua University, Beijing, 100084, P. R. China
- Fan Wang
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, 130022, P. R. China
- Hongjie Zhang
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, 130022, P. R. China
- Department of Chemistry, Tsinghua University, Beijing, 100084, P. R. China
- Kai Liu
- State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, 130022, P. R. China
- Department of Chemistry, Tsinghua University, Beijing, 100084, P. R. China
8
Cheng C, Xiao P. Evaluation of the correctable decoding sequencing as a new powerful strategy for DNA sequencing. Life Sci Alliance 2022; 5(8):e202101294. [PMID: 35422436 PMCID: PMC9012935 DOI: 10.26508/lsa.202101294] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/10/2021] [Revised: 04/01/2022] [Accepted: 04/01/2022] [Indexed: 12/01/2022] Open
Abstract
This article proposed the correctable decoding sequencing technology with conservative theoretical error rate of 0.0009%, and evaluated its robustness by simulation. This technology can provide a powerful new protocol for NGS platforms, enabling accurate identification of rare mutations in medicine. Next-generation sequencing (NGS) promises to revolutionize precision medicine, but the existing sequencing technologies are limited in accuracy. To overcome this limitation, we propose the correctable decoding sequencing strategy, which is a duplex sequencing protocol with conservative theoretical error rates of 0.0009%. This rate is lower than that for Sanger sequencing. Here, we simulate the sequencing reactions by the self-developed software, and find that this approach has great potential in NGS in terms of sequence decoding, reassembly, error correction, and sequencing accuracy. Besides, this approach can be compatible with most SBS-based sequencing platforms, and also has the ability to compensate for some of the shortcomings of NGS platforms, thereby broadening its application for researchers. Hopefully, it can provide a powerful new protocol that can be used as an alternative to the existing NGS platforms, enabling accurate identification of rare mutations in a variety of applications in biology and medicine.
Affiliation(s)
- Chu Cheng
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Pengfeng Xiao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
9
Zhou W, Kang L, Duan H, Qiao S, Tao L, Chen Z, Huang Y. A virtual sequencer reveals the dephasing patterns in error-correction code DNA sequencing. Natl Sci Rev 2021; 8:nwaa227. [PMID: 34691637 PMCID: PMC8288425 DOI: 10.1093/nsr/nwaa227] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2020] [Revised: 08/16/2020] [Accepted: 08/16/2020] [Indexed: 12/12/2022] Open
Abstract
An error-correction code (ECC) sequencing approach has recently been reported to effectively reduce sequencing errors by interrogating a DNA fragment with three orthogonal degenerate sequencing-by-synthesis (SBS) reactions. However, similar to other non-single-molecule SBS methods, the reaction will gradually lose its synchronization within a molecular colony in ECC sequencing. This phenomenon, called dephasing, causes sequencing error, and in ECC sequencing, induces distinctive dephasing patterns. To understand the characteristic dephasing patterns of the dual-base flowgram in ECC sequencing and to generate a correction algorithm, we built a virtual sequencer in silico. Starting from first principles and based on sequencing chemical reactions, we simulated ECC sequencing results, identified the key factors of dephasing in ECC sequencing chemistry and designed an effective dephasing algorithm. The results show that our dephasing algorithm is applicable to sequencing signals with at least 500 cycles, or 1000-bp average read length, with acceptably low error rate for further parity checks and ECC deduction. Our virtual sequencer with our dephasing algorithm can further be extended to a dichromatic form of ECC sequencing, allowing for a potentially much more accurate sequencing approach.
Affiliation(s)
- Wenxiong Zhou
- Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Beijing Advanced Innovation Center for Genomics (ICG), and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Li Kang
- Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Beijing Advanced Innovation Center for Genomics (ICG), and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Haifeng Duan
- Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Beijing Advanced Innovation Center for Genomics (ICG), and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Shuo Qiao
- Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Beijing Advanced Innovation Center for Genomics (ICG), and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Louis Tao
- Center for Bioinformatics, State Key Laboratory of Protein Engineering and Plant Genetic Engineering, Peking University, Beijing 100871, China
- Zitian Chen
- Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Beijing Advanced Innovation Center for Genomics (ICG), and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Yanyi Huang
- Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Beijing Advanced Innovation Center for Genomics (ICG), and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
10
Liu Y, Li J. Hamming-shifting graph of genomic short reads: Efficient construction and its application for compression. PLoS Comput Biol 2021; 17:e1009229. [PMID: 34280186 PMCID: PMC8321399 DOI: 10.1371/journal.pcbi.1009229] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/06/2021] [Revised: 07/29/2021] [Accepted: 06/30/2021] [Indexed: 11/21/2022] Open
Abstract
Graphs such as de Bruijn graphs and OLC (overlap-layout-consensus) graphs have been widely adopted for the de novo assembly of genomic short reads. This work studies another important problem in the field: how graphs can be used for high-performance compression of large-scale sequencing data. We present a novel graph definition named Hamming-Shifting graph to address this problem. The definition originates from the technological characteristics of next-generation sequencing machines, aiming to link all pairs of distinct reads that have a small Hamming distance or a small shifting offset or both. We compute multiple lexicographically minimal k-mers to index the reads for an efficient search of the weight-lightest edges, and we prove a very high probability of successfully detecting these edges. The resulting graph creates a full mutual reference of the reads to cascade a code-minimized transfer of every child-read for an optimal compression. We conducted compression experiments on the minimum spanning forest of this extremely sparse graph, and achieved a 10-30% greater file size reduction compared to the best compression results using existing algorithms. As future work, the separation and connectivity degrees of these giant graphs can be used as economical measurements or protocols for quick quality assessment of wet-lab machines, for sufficiency control of genomic library preparation, and for accurate de novo genome assembly.
We present a novel graph-based algorithm to compress next-generation short sequencing reads. The novelty of the algorithm is attributed to a new definition of genomic sequence graph named Hamming-Shifting graph. It consists of edges between distinct reads that have a small Hamming distance or a small shifting offset or both. Efficient construction of Hamming-Shifting graphs is challenging. We introduce a heuristic technique to detect the weight-lightest edges through multiple minimizers from each read, then search the minimum spanning trees and forests of the Hamming-Shifting graph for a high-performance compression of the reads. Our method achieves an additional 10-30% file size reduction compared to contemporary compression techniques.
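The quantities this abstract builds on can be sketched in a few lines: the Hamming distance between two reads, the number of mismatches in the overlap when one read is shifted against another, and the lexicographically smallest k-mers of a read used as index keys. This is only an illustration under simplifying assumptions (equal-length reads, plain string comparison); it is not the authors' construction, and the function names and the `count` parameter are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def shifted_mismatches(a: str, b: str, shift: int) -> int:
    """Mismatches in the overlap when read b starts `shift` bases into read a.

    Assumes equal-length reads and 0 <= shift <= len(a).
    """
    if not 0 <= shift <= len(a):
        raise ValueError("shift out of range")
    overlap = len(a) - shift
    return hamming(a[shift:], b[:overlap])


def minimal_kmers(read: str, k: int, count: int = 1) -> list:
    """The `count` lexicographically smallest distinct k-mers of a read."""
    kmers = sorted({read[i:i + k] for i in range(len(read) - k + 1)})
    return kmers[:count]
```

Two reads that share a minimal k-mer are candidates for an edge; the edge weight would then be derived from `hamming` or `shifted_mismatches`, depending on whether the reads align directly or with an offset.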
Affiliation(s)
- Yuansheng Liu
- Data Science Institute, University of Technology Sydney, Sydney, Australia
- Jinyan Li
- Data Science Institute, University of Technology Sydney, Sydney, Australia
11
Zhao Y, Zuo X, Li Q, Chen F, Chen YR, Deng J, Han D, Hao C, Huang F, Huang Y, Ke G, Kuang H, Li F, Li J, Li M, Li N, Lin Z, Liu D, Liu J, Liu L, Liu X, Lu C, Luo F, Mao X, Sun J, Tang B, Wang F, Wang J, Wang L, Wang S, Wu L, Wu ZS, Xia F, Xu C, Yang Y, Yuan BF, Yuan Q, Zhang C, Zhu Z, Yang C, Zhang XB, Yang H, Tan W, Fan C. Nucleic Acids Analysis. Sci China Chem 2020; 64:171-203. [PMID: 33293939 PMCID: PMC7716629 DOI: 10.1007/s11426-020-9864-7] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 07/09/2020] [Accepted: 09/04/2020] [Indexed: 12/11/2022]
Abstract
Nucleic acids are natural biopolymers of nucleotides that store, encode, transmit and express genetic information, which play central roles in diverse cellular events and diseases in living things. The analysis of nucleic acids and nucleic acids-based analysis have been widely applied in biological studies, clinical diagnosis, environmental analysis, food safety and forensic analysis. During the past decades, the field of nucleic acids analysis has been rapidly advancing with many technological breakthroughs. In this review, we focus on the methods developed for analyzing nucleic acids, nucleic acids-based analysis, device for nucleic acids analysis, and applications of nucleic acids analysis. The representative strategies for the development of new nucleic acids analysis in this field are summarized, and key advantages and possible limitations are discussed. Finally, a brief perspective on existing challenges and further research development is provided.
Affiliation(s)
- Yongxi Zhao
- Institute of Analytical Chemistry and Instrument for Life Science, The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, 710049 China
- Xiaolei Zuo
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- Qian Li
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Institute of Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240 China
- Feng Chen
- Institute of Analytical Chemistry and Instrument for Life Science, The Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, 710049 China
- Yan-Ru Chen
- Cancer Metastasis Alert and Prevention Center, Fujian Provincial Key Laboratory of Cancer Metastasis Chemoprevention and Chemotherapy, State Key Laboratory of Photocatalysis on Energy and Environment, College of Chemistry, Fuzhou University, Fuzhou, 350108 China
- Jinqi Deng
- CAS Key Laboratory of Standardization and Measurement for Nanotechnology, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology, Beijing, 100190 China
- Da Han
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- Changlong Hao
- State Key Lab of Food Science and Technology, International Joint Research Laboratory for Biointerface and Biodetection, School of Food Science and Technology, Jiangnan University, Wuxi, 214122 China
- Fujian Huang
- Faculty of Materials Science and Chemistry, Engineering Research Center of Nano-Geomaterials of Ministry of Education, China University of Geosciences, Wuhan, 430074 China
- Yanyi Huang
- College of Chemistry and Molecular Engineering, Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871 China
- Guoliang Ke
- State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha, 410082 China
- Hua Kuang
- State Key Lab of Food Science and Technology, International Joint Research Laboratory for Biointerface and Biodetection, School of Food Science and Technology, Jiangnan University, Wuxi, 214122 China
- Fan Li
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- Jiang Li
- Division of Physical Biology, CAS Key Laboratory of Interfacial Physics and Technology, Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai, 201800 China
- Bioimaging Center, Shanghai Synchrotron Radiation Facility, Zhangjiang Laboratory, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210 China
- Min Li
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- Na Li
- College of Chemistry, Chemical Engineering and Materials Science, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Shandong Normal University, Jinan, 250014 China
- Zhenyu Lin
- Ministry of Education Key Laboratory for Analytical Science of Food Safety and Biology, Fujian Provincial Key Laboratory of Analysis and Detection for Food Safety, College of Chemistry, Fuzhou University, Fuzhou, 350116 China
- Dingbin Liu
- College of Chemistry, Research Center for Analytical Sciences, State Key Laboratory of Medicinal Chemical Biology, and Tianjin Key Laboratory of Molecular Recognition and Biosensing, Nankai University, Tianjin, 300071 China
- Juewen Liu
- Department of Chemistry, Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, Ontario N2L 3G1 Canada
- Libing Liu
- Laboratory of Organic Solids, Institute of Chemistry, Chinese Academy of Sciences, Beijing, 100190 China
- College of Chemistry, University of Chinese Academy of Sciences, Beijing, 100049 China
- Xiaoguo Liu
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Institute of Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240 China
- Chunhua Lu
- Ministry of Education Key Laboratory for Analytical Science of Food Safety and Biology, Fujian Provincial Key Laboratory of Analysis and Detection for Food Safety, College of Chemistry, Fuzhou University, Fuzhou, 350116 China
- Fang Luo
- Ministry of Education Key Laboratory for Analytical Science of Food Safety and Biology, Fujian Provincial Key Laboratory of Analysis and Detection for Food Safety, College of Chemistry, Fuzhou University, Fuzhou, 350116 China
- Xiuhai Mao
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- Jiashu Sun
- CAS Key Laboratory of Standardization and Measurement for Nanotechnology, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology, Beijing, 100190 China
- Bo Tang
- College of Chemistry, Chemical Engineering and Materials Science, Key Laboratory of Molecular and Nano Probes, Ministry of Education, Shandong Normal University, Jinan, 250014 China
- Fei Wang
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Institute of Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240 China
- Jianbin Wang
- School of Life Sciences, Tsinghua-Peking Center for Life Sciences, Beijing Advanced Innovation Center for Structural Biology (ICSB), Chinese Institute for Brain Research (CIBR), Tsinghua University, Beijing, 100084 China
| | - Lihua Wang
- Division of Physical Biology, CAS Key Laboratory of Interfacial Physics and Technology, Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai, 201800 China
- Bioimaging Center, Shanghai Synchrotron Radiation Facility, Zhangjiang Laboratory, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210 China
| | - Shu Wang
- Department of Chemistry, Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, Ontario N2L 3G1 Canada
| | - Lingling Wu
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
| | - Zai-Sheng Wu
- Cancer Metastasis Alert and Prevention Center, Fujian Provincial Key Laboratory of Cancer Metastasis Chemoprevention and Chemotherapy, State Key Laboratory of Photocatalysis on Energy and Environment, College of Chemistry, Fuzhou University, Fuzhou, 350108 China
| | - Fan Xia
- Faculty of Materials Science and Chemistry, Engineering Research Center of Nano-Geomaterials of Ministry of Education, China University of Geosciences, Wuhan, 430074 China
| | - Chuanlai Xu
- State Key Lab of Food Science and Technology, International Joint Research Laboratory for Biointerface and Biodetection, School of Food Science and Technology, Jiangnan University, Wuxi, 214122 China
| | - Yang Yang
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
| | - Bi-Feng Yuan
- Department of Chemistry, Wuhan University, Wuhan, 430072 China
| | - Quan Yuan
- State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha, 410082 China
| | - Chao Zhang
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
| | - Zhi Zhu
- The MOE Key Laboratory of Spectrochemical Analysis and Instrumentation, Key Laboratory for Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemical Biology, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005 China
| | - Chaoyong Yang
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- The MOE Key Laboratory of Spectrochemical Analysis and Instrumentation, Key Laboratory for Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemical Biology, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005 China
| | - Xiao-Bing Zhang
- State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha, 410082 China
| | - Huanghao Yang
- Ministry of Education Key Laboratory for Analytical Science of Food Safety and Biology, Fujian Provincial Key Laboratory of Analysis and Detection for Food Safety, College of Chemistry, Fuzhou University, Fuzhou, 350116 China
| | - Weihong Tan
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha, 410082 China
| | - Chunhai Fan
- Institute of Molecular Medicine, Shanghai Key Laboratory for Nucleic Acid Chemistry and Nanomedicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127 China
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Institute of Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240 China
|
12
|
Chanda P, Costa E, Hu J, Sukumar S, Van Hemert J, Walia R. Information Theory in Computational Biology: Where We Stand Today. Entropy (Basel, Switzerland) 2020; 22:E627. [PMID: 33286399 PMCID: PMC7517167 DOI: 10.3390/e22060627] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/30/2020] [Revised: 05/31/2020] [Accepted: 06/03/2020] [Indexed: 12/30/2022]
Abstract
"A Mathematical Theory of Communication" was published in 1948 by Claude Shannon to address problems in data compression and communication over (noisy) communication channels. Since then, the concepts and ideas developed in Shannon's work have formed the basis of information theory, a cornerstone of statistical learning and inference, and have played a key role in disciplines such as physics and thermodynamics, probability and statistics, computational sciences and biological sciences. In this article we review the basic information-theory-based concepts and describe their key applications in multiple major areas of research in computational biology: gene expression and transcriptomics, alignment-free sequence comparison, sequencing and error correction, genome-wide disease-gene association mapping, metabolic networks and metabolomics, and protein sequence, structure and interaction analysis.
Affiliation(s)
- Pritam Chanda
- Corteva Agriscience™, Indianapolis, IN 46268, USA
- Computer and Information Science, Indiana University-Purdue University, Indianapolis, IN 46202, USA
| | - Eduardo Costa
- Corteva Agriscience™, Mogi Mirim, Sao Paulo 13801-540, Brazil
| | - Jie Hu
- Corteva Agriscience™, Indianapolis, IN 46268, USA
| | | | | | - Rasna Walia
- Corteva Agriscience™, Johnston, IA 50131, USA
|
13
|
Sun F, Zhao S, Peng M, Fu Q, Gao H, Jia Y, Na N, Ouyang J. Sequencing of Small DNA Fragments with Aggregated-Induced-Emission Molecule-Labeled Nucleotides. Anal Chem 2020; 92:7179-7185. [PMID: 32329345 DOI: 10.1021/acs.analchem.0c00707] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/17/2023]
Abstract
Sequencing by synthesis is a significant method for high-throughput DNA sequencing. Herein, we synthesized terminal aggregated-induced-emission luminogen (AIEgen) labeled nucleotides (dNTPs-HCAP) that could serve as substrates for some polymerases and applied them to the sequencing of small DNA fragments. In the process of DNA amplification, ratiometric AIEgens are released from dNTPs-HCAP and aggregate through the effects of phosphatase, which results in changes in the ratiometric fluorescent signals. With the AIEgen-labeled nucleotides, we accomplished the sequencing of small DNA fragments through double changes in fluorescence. In addition, we achieved the differentiation of single nucleotide polymorphisms through rolling circle amplification reactions without the addition of signal probes, which is fast and cost-effective. The introduction of ratiometric AIEgens into DNA synthesis makes the detection of DNA sequences more efficient and accurate. Therefore, the development of AIEgen-labeled nucleotides is meaningful for the study of DNA sequencing methods.
Affiliation(s)
- Feifei Sun
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, 100875, China
| | - Shengnan Zhao
- Hebei Provincial Laboratory for Research and Development of Chinese Medicine, Chengde Medical College, Hebei Chengde, 067000, China
| | - Manshu Peng
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, 100875, China
| | - Qiang Fu
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, 100875, China
| | - Huimin Gao
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, 100875, China
| | - Yijing Jia
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, 100875, China
| | - Na Na
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, 100875, China
| | - Jin Ouyang
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, 100875, China
|
14
|
Mitchell K, Brito JJ, Mandric I, Wu Q, Knyazev S, Chang S, Martin LS, Karlsberg A, Gerasimov E, Littman R, Hill BL, Wu NC, Yang HT, Hsieh K, Chen L, Littman E, Shabani T, Enik G, Yao D, Sun R, Schroeder J, Eskin E, Zelikovsky A, Skums P, Pop M, Mangul S. Benchmarking of computational error-correction methods for next-generation sequencing data. Genome Biol 2020; 21:71. [PMID: 32183840 PMCID: PMC7079412 DOI: 10.1186/s13059-020-01988-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/18/2019] [Accepted: 03/06/2020] [Indexed: 12/16/2022] Open
Abstract
Background: Recent advancements in next-generation sequencing have rapidly improved our ability to study genomic material at an unprecedented scale. Despite substantial improvements in sequencing technologies, errors present in the data still risk confounding downstream analysis and limiting the applicability of sequencing technologies in clinical tools. Computational error correction promises to eliminate sequencing errors, but the relative accuracy of error correction algorithms remains unknown. Results: In this paper, we evaluate the ability of error correction algorithms to fix errors across different types of datasets that contain various levels of heterogeneity. We highlight the advantages and limitations of computational error correction techniques across different domains of biology, including immunogenomics and virology. To demonstrate the efficacy of our technique, we apply the UMI-based high-fidelity sequencing protocol to eliminate sequencing errors from both simulated data and the raw reads. We then perform a realistic evaluation of error-correction methods. Conclusions: In terms of accuracy, we find that method performance varies substantially across different types of datasets with no single method performing best on all types of examined data. Finally, we also identify the techniques that offer a good balance between precision and sensitivity.
Affiliation(s)
- Keith Mitchell
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Jaqueline J Brito
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA
| | - Igor Mandric
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Qiaozhen Wu
- Department of Mathematics, University of California Los Angeles, 520 Portola Plaza, Los Angeles, CA, 90095, USA
| | - Sergey Knyazev
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Sei Chang
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Lana S Martin
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA
| | - Aaron Karlsberg
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA
| | - Ekaterina Gerasimov
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Russell Littman
- UCLA Bioinformatics, 621 Charles E Young Dr S, Los Angeles, CA, 90024, USA
| | - Brian L Hill
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Nicholas C Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
| | - Harry Taegyun Yang
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Kevin Hsieh
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Linus Chen
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Eli Littman
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Taylor Shabani
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - German Enik
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Douglas Yao
- Department of Molecular, Cell, and Developmental Biology, University of California Los Angeles, 650 Charles E. Young Drive South, Los Angeles, CA, 90095, USA
| | - Ren Sun
- Department of Molecular and Medical Pharmacology, University of California Los Angeles, 650 Charles E. Young Drive South, Los Angeles, CA, 90095, USA
| | - Jan Schroeder
- Epigenetics & Reprogramming Laboratory, Monash University, 15 Innovation Walk, Melbourne, VIC, 3800, Australia
| | - Eleazar Eskin
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia, 119991
| | - Pavel Skums
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Mihai Pop
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, 20742, USA
| | - Serghei Mangul
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA.
|
15
|
Zhao Y, Fang X, Chen F, Bai M, Fan C, Zhao Y. Locus-patterned sequence oriented enrichment for multi-dimensional gene analysis. Chem Sci 2019; 10:8421-8427. [PMID: 31803421 PMCID: PMC6844269 DOI: 10.1039/c9sc02496d] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/22/2019] [Accepted: 07/22/2019] [Indexed: 11/21/2022] Open
Abstract
Multi-dimensional gene analysis provides in-depth insights into gene sequence, locus variations and molecular abundance, but it is vulnerable to the perturbation of complex reaction networks and always compromises on the discrimination of analogous sequences. Here, we present a sequence oriented enrichment method patterned by the prescribed locus without crosstalk between concurrent reactions. Energetically favourable structures of nucleic acid probes are theoretically derived and oriented to a specific gene locus. We designed a pair of universal probes for multiple conserved loci to avoid side reactions from undesired interactions among increased probe sets. Furthermore, competitive probes were customized to sink analogues for differentiating the reaction equilibrium and kinetics of sequence enrichment from the target, so variant loci can be synchronously identified with nucleotide-level resolution. Thus, the gene locus guides sequence enrichment and combinatorial signals to create unique codes, which provides access to multidimensional and precise information for gene decoding.
Affiliation(s)
- Yue Zhao
- Institute of Analytical Chemistry and Instrument for Life Science, Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xianning West Road, Xi'an, Shaanxi 710049, P. R. China.
| | - Xiaoxing Fang
- Institute of Analytical Chemistry and Instrument for Life Science, Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xianning West Road, Xi'an, Shaanxi 710049, P. R. China.
| | - Feng Chen
- Institute of Analytical Chemistry and Instrument for Life Science, Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xianning West Road, Xi'an, Shaanxi 710049, P. R. China.
| | - Min Bai
- Institute of Analytical Chemistry and Instrument for Life Science, Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xianning West Road, Xi'an, Shaanxi 710049, P. R. China.
| | - Chunhai Fan
- School of Chemistry and Chemical Engineering, Institute of Molecular Medicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
| | - Yongxi Zhao
- Institute of Analytical Chemistry and Instrument for Life Science, Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xianning West Road, Xi'an, Shaanxi 710049, P. R. China.
|
16
|
Data storage in DNA with fewer synthesis cycles using composite DNA letters. Nat Biotechnol 2019; 37:1229-1236. [PMID: 31501560 DOI: 10.1038/s41587-019-0240-x] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 09/26/2018] [Accepted: 07/25/2019] [Indexed: 12/24/2022]
Abstract
The density and long-term stability of DNA make it an appealing storage medium, particularly for long-term data archiving. Existing DNA storage technologies involve the synthesis and sequencing of multiple nominally identical molecules in parallel, resulting in information redundancy. We report the development of encoding and decoding methods that exploit this redundancy using composite DNA letters. A composite DNA letter is a representation of a position in a sequence that consists of a mixture of all four DNA nucleotides in a predetermined ratio. Our methods encode data using fewer synthesis cycles. We encode 6.4 MB into composite DNA, with distinguishable composition medians, using 20% fewer synthesis cycles per unit of data, as compared to previous reports. We also simulate encoding with larger composite alphabets, with distinguishable composition deciles, to show that 75% fewer synthesis cycles are potentially sufficient. We describe applicable error-correcting codes and inference methods, and investigate error patterns in the context of composite DNA letters.
|
17
|
Wang J, Wu A. A head-to-toe makeover for classical sequencing-by-synthesis helps users to squeeze more out of each base. Natl Sci Rev 2019; 6:3-4. [PMID: 34691818 PMCID: PMC8291463 DOI: 10.1093/nsr/nwy013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Affiliation(s)
- Jianbin Wang
- School of Life Sciences, Tsinghua University, China
| | - Angela Wu
- Division of Life Science, Hong Kong University of Science and Technology, China
- Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, China
|
18
|
Sloan DB, Broz AK, Sharbrough J, Wu Z. Detecting Rare Mutations and DNA Damage with Sequencing-Based Methods. Trends Biotechnol 2018; 36:729-740. [PMID: 29550161 PMCID: PMC6004327 DOI: 10.1016/j.tibtech.2018.02.009] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 01/24/2018] [Revised: 02/16/2018] [Accepted: 02/20/2018] [Indexed: 12/18/2022]
Abstract
There is a great need in biomedical and genetic research to detect DNA damage and de novo mutations, but doing so is inherently challenging because of the rarity of these events. The enormous capacity of current DNA sequencing technologies has opened the door for quantifying sequence variants present at low frequencies in vivo, such as within cancerous tissues. However, these sequencing technologies are error prone, resulting in high noise thresholds. Most DNA sequencing methods are also generally incapable of identifying chemically modified bases arising from DNA damage. In recent years, numerous specialized modifications to sequencing methods have been developed to address these shortcomings. Here, we review this landscape of emerging techniques, highlighting their respective strengths, weaknesses, and target applications.
Affiliation(s)
- Daniel B Sloan
- Department of Biology, Colorado State University, Fort Collins, CO, USA.
| | - Amanda K Broz
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| | - Joel Sharbrough
- Department of Biology, Colorado State University, Fort Collins, CO, USA
| | - Zhiqiang Wu
- Department of Biology, Colorado State University, Fort Collins, CO, USA
|
19
|
Guo S, Lin WN, Hu Y, Sun G, Phan DT, Chen CH. Ultrahigh-throughput droplet microfluidic device for single-cell miRNA detection with isothermal amplification. Lab on a Chip 2018; 18:1914-1920. [PMID: 29877542 DOI: 10.1039/c8lc00390d] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 05/12/2023]
Abstract
Analysis of microRNA (miRNA), a pivotal primary regulator of fundamental cellular processes, at the single-cell level is essential to elucidate regulated gene expression precisely. Most single-cell gene sequencing methods use the polymerase chain reaction (PCR) to increase the concentration of the target gene for detection, thus requiring a barcoding process for cell identification and creating a challenge for real-time, large-scale screening of sequences in cells to rapidly profile physiological samples. In this study, a rapid, PCR-free, single-cell miRNA assay is developed from a continuous-flow microfluidic process employing a DNA hybridization chain reaction to amplify the target miRNA signal. Individual cells are encapsulated with DNA amplifiers in water-in-oil droplets and then lysed. The released target miRNA interacts with the DNA amplifiers to trigger hybridization reactions, producing fluorescence signals. Afterward, the target sequences are recycled to trigger a cyclic cascade reaction and significantly amplify the fluorescence signals without using PCR thermal cycling. Multiple DNA amplifiers with distinct fluorescence signals can be encapsulated simultaneously in a droplet to measure multiple miRNAs from a single cell simultaneously. Moreover, this process converts the lab bench PCR assay to a real-time droplet assay with the post-reaction fluorescence signal as a readout to allow flow cytometry-like continuous-flow measurement of sequences in a single cell with an ultrahigh throughput (300-500 cells per minute) for rapid biomedical identification.
Affiliation(s)
- Song Guo
- Department of Biomedical Engineering, National University of Singapore, 21 Lower Kent Ridge Road, 119077 Singapore.
|
20
|
Nawy T. Sequencing DNA, no mistake. Nat Methods 2018. [DOI: 10.1038/nmeth.4571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
21
|
DNA sequencing at ultra-high fidelity. Nat Biotechnol 2017; 35:1143-1144. [DOI: 10.1038/nbt.4001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|