1
Romero-López C, Díaz-González R, Berzal-Herranz A. RNA Selection and Evolution In Vitro: Powerful Techniques for the Analysis and Identification of New Molecular Tools. Biotechnol Biotechnol Equip 2014. [DOI: 10.1080/13102818.2007.10817461] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 02/08/2023]
2
Library preparation methods for next-generation sequencing: tone down the bias. Exp Cell Res 2014; 322:12-20. [PMID: 24440557 DOI: 10.1016/j.yexcr.2014.01.008] [Citation(s) in RCA: 250] [Impact Index Per Article: 25.0] [Received: 10/25/2013] [Revised: 01/07/2014] [Accepted: 01/08/2014] [Indexed: 11/22/2022]
Abstract
Next-generation sequencing (NGS) has caused a revolution in biology. NGS requires the preparation of libraries in which (fragments of) DNA or RNA molecules are fused with adapters followed by PCR amplification and sequencing. It is evident that robust library preparation methods that produce a representative, non-biased source of nucleic acid material from the genome under investigation are of crucial importance. Nevertheless, it has become clear that NGS libraries for all types of applications contain biases that compromise the quality of NGS datasets and can lead to their erroneous interpretation. A detailed knowledge of the nature of these biases will be essential for a careful interpretation of NGS data on the one hand and will help to find ways to improve library quality or to develop bioinformatics tools to compensate for the bias on the other hand. In this review we discuss the literature on bias in the most common NGS library preparation protocols, both for DNA sequencing (DNA-seq) as well as for RNA sequencing (RNA-seq). Strikingly, almost all steps of the various protocols have been reported to introduce bias, especially in the case of RNA-seq, which is technically more challenging than DNA-seq. For each type of bias we discuss methods for improvement with a view to providing some useful advice to the researcher who wishes to convert any kind of raw nucleic acid into an NGS library.
3
Kwok CK, Ding Y, Sherlock ME, Assmann SM, Bevilacqua PC. A hybridization-based approach for quantitative and low-bias single-stranded DNA ligation. Anal Biochem 2013; 435:181-6. [PMID: 23399535 DOI: 10.1016/j.ab.2013.01.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 10/27/2012] [Revised: 01/09/2013] [Accepted: 01/12/2013] [Indexed: 01/11/2023]
Abstract
Single-stranded DNA (ssDNA) ligation is a crucial step in many biochemical assays. Efficient ways of carrying out this reaction, however, are lacking. We show here that existing ssDNA ligation methods suffer from slow kinetics, poor yield, and severe nucleotide preference. To resolve these issues, we introduce a hybridization-based strategy that provides efficient and low-bias ligation of ssDNA. Our method uses a hairpin DNA to hybridize to any incoming acceptor ssDNA with low bias, with ligation of these strands mediated by T4 DNA ligase. This technique potentially can be applied in protocols that require ligation of ssDNA, including ligation-mediated polymerase chain reaction (LMPCR) and complementary DNA (cDNA) library construction.
Affiliation(s)
- Chun Kit Kwok
- Department of Chemistry, The Pennsylvania State University, University Park, PA 16802, USA
4
Sorefan K, Pais H, Hall AE, Kozomara A, Griffiths-Jones S, Moulton V, Dalmay T. Reducing ligation bias of small RNAs in libraries for next generation sequencing. Silence 2012; 3:4. [PMID: 22647250 PMCID: PMC3489589 DOI: 10.1186/1758-907x-3-4] [Citation(s) in RCA: 148] [Impact Index Per Article: 12.3] [Received: 01/18/2012] [Accepted: 05/30/2012] [Indexed: 02/07/2023]
Abstract
Background: The use of nucleic acid-modifying enzymes has driven the rapid advancement in molecular biology. Understanding their function is important for modifying or improving their activity. However, functional analysis usually relies upon low-throughput experiments. Here we present a method for functional analysis of nucleic acid-modifying enzymes using next generation sequencing. Findings: We demonstrate that sequencing data of libraries generated by RNA ligases can reveal novel secondary structure preferences of these enzymes, which are used in small RNA cloning and library preparation for NGS. Using this knowledge we demonstrate that the cloning bias in small RNA libraries is RNA ligase-dependent. We developed a high definition (HD) protocol that reduces the RNA ligase-dependent cloning bias. The HD protocol doubled read coverage, is quantitative and found previously unidentified microRNAs. In addition, we show that microRNAs in miRBase are those preferred by the adapters of the main sequencing platform. Conclusions: Sequencing bias of small RNAs partially influenced which microRNAs have been studied in depth; therefore most previous small RNA profiling experiments should be re-evaluated. New microRNAs are likely to be found, which were selected against by existing adapters. Preference of currently used adapters towards known microRNAs suggests that the annotation of all existing small RNAs, including miRNAs, siRNAs and piRNAs, has been biased.
Affiliation(s)
- Karim Sorefan
- School of Biological Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
- Helio Pais
- School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
- Adam E Hall
- School of Biological Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
- Ana Kozomara
- Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT, UK
- Vincent Moulton
- School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
- Tamas Dalmay
- School of Biological Sciences, University of East Anglia, Norwich, NR4 7TJ, UK
5
A primer-free method that selects high-affinity single-stranded DNA aptamers using thermostable RNA ligase. Anal Biochem 2011; 414:246-53. [PMID: 21420926 DOI: 10.1016/j.ab.2011.03.018] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 02/01/2011] [Revised: 03/11/2011] [Accepted: 03/15/2011] [Indexed: 11/22/2022]
Abstract
This article describes a method for selecting single-stranded DNA (ssDNA) molecules (aptamers) that bind with high affinity to specific target proteins. This SELEX (systematic evolution of ligands by exponential enrichment) method is similar to other "primer-free" approaches where the random sequence ssDNA starting pool has no fixed sequences at the 5' and 3' termini. Therefore, there are no predetermined sequences that could bias selection. Like other SELEX methods, repeated cycles (typically 5-15) of selection and then amplification and reselection are used. The method differs from other primer-free approaches in that the key step for regenerating new material for subsequent rounds is ligation of the selected ssDNA to a defined sequence oligonucleotide using thermostable RNA ligase. Under specific conditions, this ligase ligated 30-nt random sequence ssDNA (5'-N(30)-3') to a specified 20-nt ssDNA with approximately 50% efficiency. Efficiency was improved to approximately 90% by the addition of a single T residue to the 3' end (5'-N(29)T-3'). High efficiency in this step is critical, especially early in the procedure because any selected material that is not ligated is lost. In this study, human immunodeficiency virus reverse transcriptase was used as the target protein, but the method could be applied to essentially any protein.
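As a rough illustration of why the ligation efficiency reported above matters most in early rounds, the sketch below assumes (this is not stated in the abstract) that each copy of a selected sequence is ligated independently with the given efficiency, so a sequence is lost only when none of its copies is ligated. The function name and the copy numbers are hypothetical; only the 50% and 90% efficiencies come from the abstract.

```python
def survival_probability(copies: int, ligation_efficiency: float) -> float:
    """Probability that at least one of `copies` identical molecules is ligated.

    Assumes independent ligation events (an illustrative simplification).
    """
    return 1.0 - (1.0 - ligation_efficiency) ** copies


if __name__ == "__main__":
    # Efficiencies from the abstract: ~50% (N30) and ~90% (N29T).
    for efficiency in (0.5, 0.9):
        # Hypothetical copy numbers for a rare sequence in an early round.
        for copies in (1, 2, 5):
            p = survival_probability(copies, efficiency)
            print(f"efficiency={efficiency:.0%} copies={copies} survival={p:.3f}")
```

Under this simplified model the gap between the two efficiencies is largest when a sequence is present in only one or a few copies, which is the situation the abstract describes for early rounds.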
6
Frith MC, Valen E, Krogh A, Hayashizaki Y, Carninci P, Sandelin A. A code for transcription initiation in mammalian genomes. Genome Res 2008; 18:1-12. [PMID: 18032727 PMCID: PMC2134772 DOI: 10.1101/gr.6831208] [Citation(s) in RCA: 182] [Impact Index Per Article: 11.4] [Received: 01/21/2007] [Accepted: 10/14/2007] [Indexed: 11/24/2022]
Abstract
Genome-wide detection of transcription start sites (TSSs) has revealed that RNA Polymerase II transcription initiates at millions of positions in mammalian genomes. Most core promoters do not have a single TSS, but an array of closely located TSSs with different rates of initiation. As a rule, genes have more than one such core promoter; however, defining the boundaries between core promoters is not trivial. These discoveries prompt a re-evaluation of our models for transcription initiation. We describe a new framework for understanding the organization of transcription initiation. We show that initiation events are clustered on the chromosomes at multiple scales (clusters within clusters), indicating multiple regulatory processes. Within the smallest of such clusters, which can be interpreted as core promoters, the local DNA sequence predicts the relative transcription start usage of each nucleotide with a remarkable 91% accuracy, implying the existence of a DNA code that determines TSS selection. Conversely, the total expression strength of such clusters is only partially determined by the local DNA sequence. Thus, the overall control of transcription can be understood as a combination of large- and small-scale effects; the selection of transcription start sites is largely governed by the local DNA sequence, whereas the transcriptional activity of a locus is regulated at a different level; it is affected by distal features or events such as enhancers and chromatin remodeling.
Affiliation(s)
- Martin C. Frith
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
- ARC Centre in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Qld 4072, Australia
- Eivind Valen
- The Bioinformatics Centre, Department of Molecular Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 København N, Denmark
- Anders Krogh
- The Bioinformatics Centre, Department of Molecular Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 København N, Denmark
- Yoshihide Hayashizaki
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
- Piero Carninci
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
- Albin Sandelin
- The Bioinformatics Centre, Department of Molecular Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 København N, Denmark
7
Abstract
Measurement of DNA damage and repair at the nucleotide level in intact cells has provided compelling evidence for the molecular details of these events as they occur in intact organisms. Furthermore, these measurements give the most accurate picture of the rates of repair in different structural domains of DNA in chromatin. In this report, we describe two methods currently used in our laboratories to map DNA lesions at (or near) nucleotide resolution in yeast cells. The low-resolution method couples damage-specific strand breaks in DNA with indirect end-labeling to measure DNA lesions over a span of 1.5 to 2 kb of DNA sequence. The resolution of this method is limited by the resolution of DNA length measurements on alkaline agarose gels (about ±20 bp on average). The high-resolution method uses streptavidin magnetic beads and special biotinylated oligonucleotides to facilitate end-labeling of DNA fragments specifically cleaved at damage sites. The latter method maps DNA damage sites at nucleotide resolution over a shorter distance (<500 bp), and is constrained to the length of DNA resolvable on DNA sequencing gels. These methods are used in tandem for answering questions regarding DNA damage and repair in different chromatin domains and states of gene expression.
Affiliation(s)
- S Li
- Biochemistry and Biophysics, School of Molecular Biosciences, Pullman, Washington, 99164-4660, USA
8
9
Landweber LF, Pokrovskaya ID. Emergence of a dual-catalytic RNA with metal-specific cleavage and ligase activities: the spandrels of RNA evolution. Proc Natl Acad Sci U S A 1999; 96:173-8. [PMID: 9874791 PMCID: PMC15112 DOI: 10.1073/pnas.96.1.173] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.8] [Received: 08/07/1998] [Accepted: 10/26/1998] [Indexed: 11/18/2022]
Abstract
In vitro selection, or directed molecular evolution, allows the isolation and amplification of rare sequences that satisfy a functional-selection criterion. This technique can be used to isolate novel ribozymes (RNA enzymes) from large pools of random sequences. We used in vitro evolution to select a ribozyme that catalyzes a novel template-directed RNA ligation that requires surprisingly few nucleotides for catalytic activity. With the exception of two nucleotides, most of the ribozyme contributes to a template, suggesting that it is a general prebiotic ligase. More surprisingly, the catalytic core built from randomized sequences actually contains a 7-nt manganese-dependent self-cleavage motif originally discovered in the Tetrahymena group I intron. Further experiments revealed that we have selected a dual-catalytic RNA from random sequences: the RNA promotes both cleavage at one site and ligation at another site, suggesting two conformations surrounding at least one divalent metal ion-binding site. Together, these results imply that similar catalytic RNA motifs can arise under fairly simple conditions and that multiple catalytic structures, including bifunctional ligases, can evolve from very small preexisting parts. By breaking apart and joining different RNA strands, such ribozymes could have led to the production of longer and more complex RNA polymers in prebiotic evolution.
Affiliation(s)
- L F Landweber
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA.
10
James KD, Boles AR, Henckel D, Ellington AD. The fidelity of template-directed oligonucleotide ligation and its relevance to DNA computation. Nucleic Acids Res 1998; 26:5203-11. [PMID: 9801320 PMCID: PMC147951 DOI: 10.1093/nar/26.22.5203] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
Several different computational problems have been solved using DNA as a medium. However, the DNA computations that have so far been carried out have examined a relatively small number of possible sequence solutions in order to find correct sequence solutions. We have encoded a search algorithm in DNA that required the evaluation of >16 000 000 possible sequence solutions in order to find a single, correct sequence solution. Experimental evaluation of the search algorithm revealed bounds for the accuracies of answers to other large, computationally complex problems and suggested methods for the optimization of DNA computations in general. Short oligonucleotide substrates performed substantially better than longer substrates. Large, computationally complex problems whose evaluation requires hybridization and ligation can likely best be encoded and evaluated using short oligonucleotides at mesophilic temperatures.
Affiliation(s)
- K D James
- Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
11
Mules EH, Uzun O, Gabriel A. Replication errors during in vivo Ty1 transposition are linked to heterogeneous RNase H cleavage sites. Mol Cell Biol 1998; 18:1094-104. [PMID: 9448007 PMCID: PMC108822 DOI: 10.1128/mcb.18.2.1094] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 02/05/2023]
Abstract
We previously identified a mutational hotspot upstream of the Ty1 U5-primer binding site (PBS) border and proposed a novel mechanism to account for this phenomenon during Ty1 replication. In this report, we verify key points of our model and show that in vivo RNase H cleavage of Ty1 RNA during minus-strand strong-stop synthesis creates heterogeneous 5' RNA ends. The preferred cleavage sites closest to the PBS are 6 and 3 bases upstream of the U5-PBS border. Minus-strand cDNA synthesis terminates at multiple sites determined by RNase H cleavage, and DNA intermediates frequently contain 3'-terminal sequence changes at or near their template ends. These data indicate that nontemplated terminal base addition during reverse transcription is a real in vivo phenomenon and suggest that this mechanism is a major source of sequence variability among retrotransposed genetic elements.
Affiliation(s)
- E H Mules
- Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey 08855, USA
12
Harada K, Orgel LE. In vitro selection of optimal DNA substrates for ligation by a water-soluble carbodiimide. J Mol Evol 1994; 38:558-60. [PMID: 8083882 DOI: 10.1007/bf00175874] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
Abstract
We have used in vitro selection to investigate the sequence requirements for efficient template-directed ligation of oligonucleotides at 0 °C using a water-soluble carbodiimide as condensing agent. We find that only 2 bp at each side of the ligation junction are needed. We also studied chemical ligation of substrate ensembles that we have previously selected as optimal for ligation by RNA ligase or by DNA ligase. As anticipated, we find that substrates selected with DNA ligase ligate efficiently with a chemical ligating agent, and vice versa. Substrates selected using RNA ligase are not ligated by the chemical condensing agent and vice versa. The implications of these results for prebiotic chemistry are discussed.
Affiliation(s)
- K Harada
- Salk Institute for Biological Studies, San Diego, CA 92186-5800