1
Al'Khafaji AM, Smith JT, Garimella KV, Babadi M, Popic V, Sade-Feldman M, Gatzen M, Sarkizova S, Schwartz MA, Blaum EM, Day A, Costello M, Bowers T, Gabriel S, Banks E, Philippakis AA, Boland GM, Blainey PC, Hacohen N. High-throughput RNA isoform sequencing using programmed cDNA concatenation. Nat Biotechnol 2024; 42:582-586. [PMID: 37291427 DOI: 10.1038/s41587-023-01815-7] [Citation(s) in RCA: 33] [Impact Index Per Article: 33.0] [Received: 10/01/2021] [Accepted: 05/02/2023] [Indexed: 06/10/2023]
Abstract
Full-length RNA-sequencing methods using long-read technologies can capture complete transcript isoforms, but their throughput is limited. We introduce multiplexed arrays isoform sequencing (MAS-ISO-seq), a technique for programmably concatenating complementary DNAs (cDNAs) into molecules optimal for long-read sequencing, increasing the throughput >15-fold to nearly 40 million cDNA reads per run on the Sequel IIe sequencer. When applied to single-cell RNA sequencing of tumor-infiltrating T cells, MAS-ISO-seq demonstrated a 12- to 32-fold increase in the discovery of differentially spliced genes.
Affiliation(s)
- Moshe Sade-Feldman
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medicine, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Marc A Schwartz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatric Oncology, Dana Farber Cancer Institute, Boston, MA, USA
- Emily M Blaum
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medicine, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Allyson Day
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Tera Bowers
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eric Banks
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Genevieve M Boland
- Division of Surgical Oncology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Paul C Blainey
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Koch Institute for Integrative Cancer Research at the Massachusetts Institute of Technology, Cambridge, MA, USA.
- Nir Hacohen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Medicine, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA.
- Harvard Medical School, Boston, MA, USA.
- Center for Immunology and Inflammatory Diseases, Massachusetts General Hospital, Charlestown, MA, USA.
2
Wijeratne S, Gonzalez MEH, Roach K, Miller KE, Schieffer KM, Fitch JR, Leonard J, White P, Kelly BJ, Cottrell CE, Mardis ER, Wilson RK, Miller AR. Full-length isoform concatenation sequencing to resolve cancer transcriptome complexity. BMC Genomics 2024; 25:122. [PMID: 38287261 PMCID: PMC10823626 DOI: 10.1186/s12864-024-10021-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 01/16/2024] [Indexed: 01/31/2024] Open
Abstract
BACKGROUND Cancers exhibit complex transcriptomes with aberrant splicing that induces isoform-level differential expression compared to non-diseased tissues. Transcriptomic profiling using short-read sequencing has utility in providing a cost-effective approach for evaluating isoform expression, although short-read assembly displays limitations in the accurate inference of full-length transcripts. Long-read RNA sequencing (Iso-Seq), using the Pacific Biosciences (PacBio) platform, can overcome such limitations by providing full-length isoform sequence resolution which requires no read assembly and represents native expressed transcripts. A constraint of the Iso-Seq protocol is due to fewer reads output per instrument run, which, as an example, can consequently affect the detection of lowly expressed transcripts. To address these deficiencies, we developed a concatenation workflow, PacBio Full-Length Isoform Concatemer Sequencing (PB_FLIC-Seq), designed to increase the number of unique, sequenced PacBio long-reads thereby improving overall detection of unique isoforms. In addition, we anticipate that the increase in read depth will help improve the detection of moderate to low-level expressed isoforms. RESULTS In sequencing a commercial reference (Spike-In RNA Variants; SIRV) with known isoform complexity we demonstrated a 3.4-fold increase in read output per run and improved SIRV recall when using the PB_FLIC-Seq method compared to the same samples processed with the Iso-Seq protocol. We applied this protocol to a translational cancer case, also demonstrating the utility of the PB_FLIC-Seq method for identifying differential full-length isoform expression in a pediatric diffuse midline glioma compared to its adjacent non-malignant tissue. 
Our data analysis revealed increased expression of extracellular matrix (ECM) genes within the tumor sample, including an isoform of the Secreted Protein Acidic and Cysteine Rich (SPARC) gene that was expressed 11,676-fold higher than in the adjacent non-malignant tissue. Finally, by using the PB_FLIC-Seq method, we detected several cancer-specific novel isoforms. CONCLUSION This work describes a concatenation-based methodology for increasing the number of sequenced full-length isoform reads on the PacBio platform, yielding improved discovery of expressed isoforms. We applied this workflow to profile the transcriptome of a pediatric diffuse midline glioma and adjacent non-malignant tissue. Our findings of cancer-specific novel isoform expression further highlight the importance of long-read sequencing for characterization of complex tumor transcriptomes.
Affiliation(s)
- Saranga Wijeratne
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Maria E Hernandez Gonzalez
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Kelli Roach
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Katherine E Miller
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Kathleen M Schieffer
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Department of Pathology, The Ohio State University College of Medicine, Columbus, OH, USA
- James R Fitch
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Jeffrey Leonard
- Department of Neurosurgery, Nationwide Children's Hospital, Columbus, OH, USA
- Department of Neurosurgery, The Ohio State University College of Medicine, Columbus, OH, USA
- Peter White
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Benjamin J Kelly
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Catherine E Cottrell
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Department of Pathology, The Ohio State University College of Medicine, Columbus, OH, USA
- Elaine R Mardis
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Department of Neurosurgery, The Ohio State University College of Medicine, Columbus, OH, USA
- Richard K Wilson
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Anthony R Miller
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA.
3
Swaminath S, Russell AB. The use of single-cell RNA-seq to study heterogeneity at varying levels of virus-host interactions. PLoS Pathog 2024; 20:e1011898. [PMID: 38236826 PMCID: PMC10796064 DOI: 10.1371/journal.ppat.1011898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2024] Open
Abstract
The outcome of viral infection depends on the diversity of the infecting viral population and the heterogeneity of the cell population that is infected. Until almost a decade ago, the study of these dynamic processes during viral infection was challenging and limited to certain targeted measurements. Presently, with the use of single-cell sequencing technology, the complex interface defined by the interactions of cells with infecting virus can now be studied across the breadth of the transcriptome in thousands of individual cells simultaneously. In this review, we will describe the use of single-cell RNA sequencing (scRNA-seq) to study the heterogeneity of viral infections, ranging from individual virions to the immune response between infected individuals. In addition, we highlight certain key experimental limitations and methodological decisions that are critical to analyzing scRNA-seq data at each scale.
Affiliation(s)
- Sharmada Swaminath
- School of Biological Sciences, University of California, San Diego, La Jolla, California, United States of America
- Alistair B. Russell
- School of Biological Sciences, University of California, San Diego, La Jolla, California, United States of America
4
Lozachmeur G, Bramoulle A, Aubert A, Stüder F, Moehlin J, Madrange L, Yates F, Deslys JP, Mendoza-Parra MA. Three-dimensional molecular cartography of human cerebral organoids revealed by double-barcoded spatial transcriptomics. Cell Rep Methods 2023; 3:100573. [PMID: 37751695 PMCID: PMC10545904 DOI: 10.1016/j.crmeth.2023.100573] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/13/2023] [Revised: 06/29/2023] [Accepted: 08/07/2023] [Indexed: 09/28/2023]
Abstract
Spatially resolved transcriptomics is revolutionizing our understanding of complex tissues, but scaling these approaches to multiple tissue sections and three-dimensional tissue reconstruction remains challenging and cost prohibitive. In this work, we present a low-cost strategy for manufacturing molecularly double-barcoded DNA arrays, enabling large-scale spatially resolved transcriptomics studies. We applied this technique to spatially resolve gene expression in several human brain organoids, including the reconstruction of a three-dimensional view from multiple consecutive sections, revealing gene expression heterogeneity throughout the tissue.
Affiliation(s)
- Gwendoline Lozachmeur
- UMR 8030 Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, University of Evry-val-d'Essonne, University Paris-Saclay, 91057 Évry, France
- Aude Bramoulle
- UMR 8030 Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, University of Evry-val-d'Essonne, University Paris-Saclay, 91057 Évry, France
- Antoine Aubert
- UMR 8030 Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, University of Evry-val-d'Essonne, University Paris-Saclay, 91057 Évry, France
- François Stüder
- UMR 8030 Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, University of Evry-val-d'Essonne, University Paris-Saclay, 91057 Évry, France
- Julien Moehlin
- UMR 8030 Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, University of Evry-val-d'Essonne, University Paris-Saclay, 91057 Évry, France
- Lucie Madrange
- Service d'Etude des Prions et des Infections Atypiques (SEPIA), Institut François Jacob, Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Université Paris Saclay, Fontenay-aux-Roses, France; Sup'Biotech Engineering School - CellTechs Team, Villejuif, France
- Frank Yates
- Service d'Etude des Prions et des Infections Atypiques (SEPIA), Institut François Jacob, Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Université Paris Saclay, Fontenay-aux-Roses, France; Sup'Biotech Engineering School - CellTechs Team, Villejuif, France
- Jean-Philippe Deslys
- Service d'Etude des Prions et des Infections Atypiques (SEPIA), Institut François Jacob, Commissariat à l'Energie Atomique et aux Energies Alternatives (CEA), Université Paris Saclay, Fontenay-aux-Roses, France
- Marco Antonio Mendoza-Parra
- UMR 8030 Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, University of Evry-val-d'Essonne, University Paris-Saclay, 91057 Évry, France.
5
Hook PW, Timp W. Beyond assembly: the increasing flexibility of single-molecule sequencing technology. Nat Rev Genet 2023; 24:627-641. [PMID: 37161088 PMCID: PMC10169143 DOI: 10.1038/s41576-023-00600-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Accepted: 03/30/2023] [Indexed: 05/11/2023]
Abstract
The maturation of high-throughput short-read sequencing technology over the past two decades has shaped the way genomes are studied. Recently, single-molecule, long-read sequencing has emerged as an essential tool in deciphering genome structure and function, including filling gaps in the human reference genome, measuring the epigenome and characterizing splicing variants in the transcriptome. With recent technological developments, these single-molecule technologies have moved beyond genome assembly and are being used in a variety of ways, including to selectively sequence specific loci with long reads, measure chromatin state and protein-DNA binding in order to investigate the dynamics of gene regulation, and rapidly determine copy number variation. These increasingly flexible uses of single-molecule technologies highlight a young and fast-moving part of the field that is leading to a more accessible era of nucleic acid sequencing.
Affiliation(s)
- Paul W Hook
- Department of Biomedical Engineering, Molecular Biology and Genetics, and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Winston Timp
- Department of Biomedical Engineering, Molecular Biology and Genetics, and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA.
6
Creydt M, Fischer M. Artefact Profiling: Panomics Approaches for Understanding the Materiality of Written Artefacts. Molecules 2023; 28:4872. [PMID: 37375427 DOI: 10.3390/molecules28124872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 06/15/2023] [Accepted: 06/18/2023] [Indexed: 06/29/2023] Open
Abstract
This review explains the strategies behind genomics, proteomics, metabolomics, metallomics and isotopolomics approaches and their applicability to written artefacts. The respective sub-chapters give an insight into the analytical procedure and the conclusions drawn from such analyses. A distinction is made between information that can be obtained from the materials used in the respective manuscript and meta-information that cannot be obtained from the manuscript itself, but from residues of organisms such as bacteria or the authors and readers. In addition, various sampling techniques are discussed in particular, which pose a special challenge in manuscripts. The focus is on high-resolution, non-targeted strategies that can be used to extract the maximum amount of information about ancient objects. The combination of the various omics disciplines (panomics) especially offers potential added value in terms of the best possible interpretations of the data received. The information obtained can be used to understand the production of ancient artefacts, to gain impressions of former living conditions, to prove their authenticity, to assess whether there is a toxic hazard in handling the manuscripts, and to be able to determine appropriate measures for their conservation and restoration.
Affiliation(s)
- Marina Creydt
- Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
- Cluster of Excellence, Understanding Written Artefacts, University of Hamburg, Warburgstraße 26, 20354 Hamburg, Germany
- Markus Fischer
- Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
- Cluster of Excellence, Understanding Written Artefacts, University of Hamburg, Warburgstraße 26, 20354 Hamburg, Germany
7
Shi ZX, Chen ZC, Zhong JY, Hu KH, Zheng YF, Chen Y, Xie SQ, Bo XC, Luo F, Tang C, Xiao CL, Liu YZ. High-throughput and high-accuracy single-cell RNA isoform analysis using PacBio circular consensus sequencing. Nat Commun 2023; 14:2631. [PMID: 37149708 PMCID: PMC10164132 DOI: 10.1038/s41467-023-38324-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 08/01/2022] [Accepted: 04/24/2023] [Indexed: 05/08/2023] Open
Abstract
Although long-read single-cell RNA isoform sequencing (scISO-Seq) can reveal alternative RNA splicing in individual cells, it suffers from a low read throughput. Here, we introduce HIT-scISOseq, a method that removes most artifact cDNAs and concatenates multiple cDNAs for PacBio circular consensus sequencing (CCS) to achieve high-throughput and high-accuracy single-cell RNA isoform sequencing. HIT-scISOseq can yield >10 million high-accuracy long-reads in a single PacBio Sequel II SMRT Cell 8M. We also report the development of scISA-Tools that demultiplex HIT-scISOseq concatenated reads into single-cell cDNA reads with >99.99% accuracy and specificity. We apply HIT-scISOseq to characterize the transcriptomes of 3375 corneal limbus cells and reveal cell-type-specific isoform expression in them. HIT-scISOseq is a high-throughput, high-accuracy, technically accessible method and it can accelerate the burgeoning field of long-read single-cell transcriptomics.
Affiliation(s)
- Zhuo-Xing Shi
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510060, China
- Zhi-Chao Chen
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
- Jia-Yong Zhong
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510060, China
- Kun-Hua Hu
- Guangdong Key Laboratory of Liver Disease Research, the Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510630, China
- Ying-Feng Zheng
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510060, China
- Ying Chen
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510060, China
- Shang-Qian Xie
- Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants, Ministry of Education, College of Forestry, Hainan University, Haikou, 570228, China
- Xiao-Chen Bo
- Beijing Institute of Radiation Medicine, Beijing, China.
- Feng Luo
- School of Computing, Clemson University, Clemson, SC, 29634-0974, USA.
- Chong Tang
- BGI Genomics, BGI Shenzhen, Shenzhen, China.
- Chuan-Le Xiao
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510060, China.
- Yi-Zhi Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, 510060, China.
- Research Unit of Ocular Development and Regeneration, Chinese Academy of Medical Sciences, Beijing, China.
8
Solanki A, Chen T, Riedel M. Computing mathematical functions with chemical reactions via stochastic logic. PLoS One 2023; 18:e0281574. [PMID: 37155644 PMCID: PMC10166555 DOI: 10.1371/journal.pone.0281574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 01/26/2023] [Indexed: 05/10/2023] Open
Abstract
This paper presents a novel strategy for computing mathematical functions with molecular reactions, based on theory from the realm of digital design. It demonstrates how to design chemical reaction networks based on truth tables that specify analog functions, computed by stochastic logic. The theory of stochastic logic entails the use of random streams of zeros and ones to represent probabilistic values. A link is made between the representation of random variables with stochastic logic on the one hand, and the representation of variables in molecular systems as the concentration of molecular species, on the other. Research in stochastic logic has demonstrated that many mathematical functions of interest can be computed with simple circuits built with logic gates. This paper presents a general and efficient methodology for translating mathematical functions computed by stochastic logic circuits into chemical reaction networks. Simulations show that the computation performed by the reaction networks is accurate and robust to variations in the reaction rates, within a log-order constraint. Reaction networks are given that compute functions for applications such as image and signal processing, as well as machine learning: arctan, exponential, Bessel, and sinc. An implementation is proposed with a specific experimental chassis: DNA strand displacement with units called DNA "concatemers".
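As a minimal illustration of the stochastic-logic idea summarized in this abstract (random bit streams standing for probabilities), the sketch below multiplies two values with a bitwise AND. It is not taken from the paper: the function names, stream length, and seed are assumptions chosen for illustration, and it does not model the chemical reaction networks themselves.

```python
import random


def to_stream(p, n, rng):
    """Encode probability p as n random bits, each equal to 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def from_stream(bits):
    """Decode a bit stream back to a probability estimate (fraction of ones)."""
    return sum(bits) / len(bits)


def stochastic_multiply(a, b):
    """AND two independent streams; the fraction of ones approximates p_a * p_b."""
    return [x & y for x, y in zip(a, b)]


rng = random.Random(0)  # fixed seed (illustrative) so the run is repeatable
n = 100_000             # longer streams give a tighter estimate

a = to_stream(0.5, n, rng)
b = to_stream(0.4, n, rng)
estimate = from_stream(stochastic_multiply(a, b))
print(f"estimated product: {estimate:.3f} (exact value 0.200)")
```

The estimate is only approximate; its error shrinks roughly with the square root of the stream length, which is the usual trade-off of this representation.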
Affiliation(s)
- Arnav Solanki
- Department of Electrical and Computer Engineering, University of Minnesota Twin-Cities, Minneapolis, MN, United States of America
- Tonglin Chen
- Department of Electrical and Computer Engineering, University of Minnesota Twin-Cities, Minneapolis, MN, United States of America
- Marc Riedel
- Department of Electrical and Computer Engineering, University of Minnesota Twin-Cities, Minneapolis, MN, United States of America
9
Deeg CM, Sutherland BJG, Ming TJ, Wallace C, Jonsen K, Flynn KL, Rondeau EB, Beacham TD, Miller KM. In-field genetic stock identification of overwintering coho salmon in the Gulf of Alaska: Evaluation of Nanopore sequencing for remote real-time deployment. Mol Ecol Resour 2022; 22:1824-1835. [PMID: 35212146 PMCID: PMC9303916 DOI: 10.1111/1755-0998.13595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Revised: 01/24/2022] [Accepted: 02/03/2022] [Indexed: 11/27/2022]
Abstract
Genetic stock identification (GSI) from genotyping‐by‐sequencing of single nucleotide polymorphism (SNP) loci has become the gold standard for stock of origin identification in Pacific salmon. The sequencing platforms currently applied require large batch sizes and multiday processing in specialized facilities to perform genotyping by the thousands. However, recent advances in third‐generation single‐molecule sequencing platforms, such as the Oxford Nanopore minION, provide base calling on portable, pocket‐sized sequencers and promise real‐time, in‐field stock identification of variable batch sizes. Here we evaluate utility and comparability to established GSI platforms of at‐sea stock identification of coho salmon (Oncorhynchus kisutch) using targeted SNP amplicon sequencing on the minION platform during a high‐sea winter expedition to the Gulf of Alaska. As long read sequencers are not optimized for short amplicons, we concatenate amplicons to increase coverage and throughput. Nanopore sequencing at‐sea yielded data sufficient for stock assignment for 50 out of 80 individuals. Nanopore‐based SNP calls agreed with Ion Torrent‐based genotypes in 83.25%, but assignment of individuals to stock of origin only agreed in 61.5% of individuals, highlighting inherent challenges of Nanopore sequencing, such as resolution of homopolymer tracts and indels. However, poor representation of assayed salmon in the queried baseline data set contributed to poor assignment confidence on both platforms. Future improvements will focus on lowering turnaround time and cost, increasing accuracy and throughput, as well as augmentation of the existing baselines. If successfully implemented, Nanopore sequencing will provide an alternative method to the large‐scale laboratory approach by providing mobile small batch genotyping to diverse stakeholders.
Affiliation(s)
- Christoph M Deeg
- Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada; Pacific Salmon Foundation, Vancouver, British Columbia, Canada
- Ben J G Sutherland
- Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
- Tobi J Ming
- Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
- Colin Wallace
- Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
- Kim Jonsen
- Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
- Kelsey L Flynn
- Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
- Eric B Rondeau
- Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
- Terry D Beacham
- Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
- Kristina M Miller
- Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada; Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, British Columbia, Canada
10
Kuiper BP, Prins RC, Billerbeck S. Oligo Pools as an Affordable Source of Synthetic DNA for Cost-Effective Library Construction in Protein- and Metabolic Pathway Engineering. Chembiochem 2021; 23:e202100507. [PMID: 34817110 PMCID: PMC9300125 DOI: 10.1002/cbic.202100507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Revised: 11/23/2021] [Indexed: 11/11/2022]
Abstract
The construction of custom libraries is critical for rational protein engineering and directed evolution. Array‐synthesized oligo pools of thousands of user‐defined sequences (up to ∼350 bases in length) have emerged as a low‐cost commercially available source of DNA. These pools cost ≤10 % (depending on error rate and length) of other commercial sources of custom DNA, and this significant cost difference can determine whether an enzyme engineering project can be realized on a given research budget. However, while being cheap, oligo pools do suffer from a low concentration of individual oligos and relatively high error rates. Several powerful techniques that specifically make use of oligo pools have been developed and proven valuable or even essential for next‐generation protein and pathway engineering strategies, such as sequence‐function mapping, enzyme minimization, or de‐novo design. Here we consolidate the knowledge on these techniques and their applications to facilitate the use of oligo pools within the protein engineering community.
Affiliation(s)
- Bastiaan P Kuiper
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
- Rianne C Prins
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
- Sonja Billerbeck
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
11
PacBio sequencing output increased through uniform and directional fivefold concatenation. Sci Rep 2021; 11:18065. [PMID: 34508117 PMCID: PMC8433307 DOI: 10.1038/s41598-021-96829-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/06/2021] [Accepted: 08/17/2021] [Indexed: 12/20/2022] Open
Abstract
Advances in sequencing technology have allowed researchers to sequence DNA with greater ease and at decreasing costs. Main developments have focused on either sequencing many short sequences or fewer large sequences. Methods for sequencing mid-sized sequences of 600-5,000 bp are currently less efficient. For example, the PacBio Sequel I system yields ~100,000-300,000 reads with an accuracy per base pair of 90-99%. We sought to sequence several DNA populations of ~870 bp in length with a sequencing accuracy of 99% and to the greatest depth possible. We optimised a simple, robust method to concatenate genes of ~870 bp five times and then sequenced the resulting DNA of ~5,000 bp by PacBio SMRT long-read sequencing. Our method improved upon previously published concatenation attempts, leading to a greater sequencing depth, high-quality reads and limited sample preparation at little expense. We applied this efficient concatenation protocol to sequence nine DNA populations from a protein engineering study. The improved method is accompanied by a simple and user-friendly analysis pipeline, DeCatCounter, to sequence medium-length sequences efficiently at one-fifth of the cost.
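To make the idea of reading several units from one concatenated molecule concrete, the sketch below splits a read back into its units. It is an illustration only and does not describe how DeCatCounter works: it assumes units are joined by one known, exact linker sequence, and the linker and toy units shown are hypothetical.

```python
def split_concatemer(read, linker):
    """Split a concatenated read into its units at each exact linker occurrence."""
    return [unit for unit in read.split(linker) if unit]


# Hypothetical linker and toy units; real units would be ~870 bp each.
LINKER = "GGTCTCA"
units = ["ACGTACGTAA", "TTGACCAGTA", "CCATGGTTAC", "AGGTTCCAAT", "GATTACAGAT"]

# Build a fivefold concatemer, then recover the individual units from it.
read = LINKER.join(units)
recovered = split_concatemer(read, LINKER)
print(len(recovered), "units recovered from one read")
```

A real pipeline would also need to tolerate sequencing errors in the linker and handle reverse-complemented reads, both of which this exact-match sketch ignores.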
|
12
|
Oikonomopoulos S, Bayega A, Fahiminiya S, Djambazian H, Berube P, Ragoussis J. Methodologies for Transcript Profiling Using Long-Read Technologies. Front Genet 2020; 11:606. [PMID: 32733532 PMCID: PMC7358353 DOI: 10.3389/fgene.2020.00606] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 01/08/2020] [Accepted: 05/19/2020] [Indexed: 12/28/2022] Open
Abstract
RNA sequencing using next-generation sequencing technologies (NGS) is currently the standard approach for gene expression profiling, particularly for large-scale high-throughput studies. NGS technologies comprise high-throughput, cost-efficient short-read RNA-Seq, while emerging single molecule, long-read RNA-Seq technologies have enabled new approaches to study the transcriptome and its function. The emerging single molecule, long-read technologies are currently commercially available from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), while new methodologies based on short-read sequencing approaches are also being developed in order to provide long-range single molecule level information, for example, the ones represented by the 10x Genomics linked read methodology. The shift toward long-read sequencing technologies for transcriptome characterization is based on current increases in throughput and decreases in cost, making these attractive for de novo transcriptome assembly, isoform expression quantification, and in-depth RNA species analysis. These types of analyses were challenging with standard short sequencing approaches, due to the complex nature of the transcriptome, which consists of variable lengths of transcripts and multiple alternatively spliced isoforms for most genes, as well as the high sequence similarity of highly abundant species of RNA, such as rRNAs. Here we aim to focus on single molecule level sequencing technologies and single-cell technologies that, combined with perturbation tools, allow the analysis of complete RNA species, whether short or long, at high resolution. In parallel, these tools have opened new ways in understanding gene functions at the tissue, network, and pathway levels, as well as their detailed functional characterization.
Analysis of the epi-transcriptome, including RNA methylation and modification and the effects of such modifications on biological systems is now enabled through direct RNA sequencing instead of classical indirect approaches. However, many difficulties and challenges remain, such as methodologies to generate full-length RNA or cDNA libraries from all different species of RNAs, not only poly-A containing transcripts, and the identification of allele-specific transcripts due to current error rates of single molecule technologies, while the bioinformatics analysis on long-read data for accurate identification of 5' and 3' UTRs is still in development.
Affiliation(s)
- Spyros Oikonomopoulos
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Anthony Bayega
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Somayyeh Fahiminiya
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Haig Djambazian
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Pierre Berube
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Jiannis Ragoussis
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
- Department of Bioengineering, McGill University, Montréal, QC, Canada
| |
|
13
|
Prabakar RK, Xu L, Hicks J, Smith AD. SMURF-seq: efficient copy number profiling on long-read sequencers. Genome Biol 2019; 20:134. [PMID: 31287019 PMCID: PMC6615205 DOI: 10.1186/s13059-019-1732-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/30/2018] [Accepted: 06/06/2019] [Indexed: 12/21/2022] Open
Abstract
We present SMURF-seq, a protocol to efficiently sequence short DNA molecules on a long-read sequencer by randomly ligating them to form long molecules. Applying SMURF-seq using the Oxford Nanopore MinION yields up to 30 fragments per read, providing an average of 6.2 and up to 7.5 million mappable fragments per run, increasing information throughput for read-counting applications. We apply SMURF-seq on the MinION to generate copy number profiles. A comparison with profiles from Illumina sequencing reveals that SMURF-seq attains similar accuracy. More broadly, SMURF-seq expands the utility of long-read sequencers for read-counting applications.
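The read-counting idea behind copy number profiling can be illustrated with a short sketch. This is a generic illustration and not the SMURF-seq analysis pipeline, which the abstract does not describe: it assumes each mapped fragment has already been reduced to a start position on a single chromosome, counts fragments in fixed-width bins, and divides each bin count by the median count. The bin size, the positions, and the function names are all hypothetical.

```python
from collections import Counter
from statistics import median


def bin_counts(positions: list[int], bin_size: int, n_bins: int) -> list[int]:
    """Count mapped fragment start positions in consecutive fixed-width bins."""
    counts = Counter(pos // bin_size for pos in positions)
    return [counts.get(i, 0) for i in range(n_bins)]


def relative_copy_number(counts: list[int]) -> list[float]:
    """Scale bin counts by their median so a typical bin has ratio 1.0."""
    typical = median(counts)
    if typical == 0:
        raise ValueError("median bin count is zero; too few fragments to normalise")
    return [c / typical for c in counts]


# Hypothetical fragment start positions on one chromosome (toy scale).
positions = [10, 50, 120, 180, 210, 220, 250, 290, 310, 390]

counts = bin_counts(positions, bin_size=100, n_bins=4)
ratios = relative_copy_number(counts)
```

In this toy input the third bin holds twice as many fragments as the others, so its ratio comes out at 2.0 while the rest sit at 1.0. Because the signal is a count per bin, more mappable fragments per run directly reduce the noise in each bin, which is why packing many short fragments into each long read helps this kind of application.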
Affiliation(s)
- Rishvanth K. Prabakar
- Quantitative and Computational Biology Section, Department of Biological Sciences, University of Southern California, 1050 Childs Way, Los Angeles, 90089 USA
| | - Liya Xu
- Michelson Center for Convergent Bioscience, University of Southern California, 1002 Childs Way, Los Angeles, 90089 USA
| | - James Hicks
- Michelson Center for Convergent Bioscience, University of Southern California, 1002 Childs Way, Los Angeles, 90089 USA
| | - Andrew D. Smith
- Quantitative and Computational Biology Section, Department of Biological Sciences, University of Southern California, 1050 Childs Way, Los Angeles, 90089 USA
| |
|