51
|
Scatena C, Murtas D, Tomei S. Cutaneous Melanoma Classification: The Importance of High-Throughput Genomic Technologies. Front Oncol 2021; 11:635488. [PMID: 34123788 PMCID: PMC8193952 DOI: 10.3389/fonc.2021.635488] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 11/30/2020] [Accepted: 03/30/2021] [Indexed: 02/06/2023] Open
Abstract
Cutaneous melanoma is an aggressive tumor responsible for 90% of mortality related to skin cancer. In recent years, the discovery of driver mutations in melanoma has led to better treatment approaches. The last decade has seen a genomic revolution in the field of cancer, which has produced an unprecedented amount of data. High-throughput genomic technologies have facilitated the genomic, transcriptomic and epigenomic profiling of several cancers, including melanoma. Nevertheless, there are a number of newer genomic technologies that have not yet been employed in large studies. In this article we describe the current classification of cutaneous melanoma, review the current knowledge of its main genetic alterations and their impact on targeted therapies, and describe the most recent high-throughput genomic technologies, highlighting their advantages and disadvantages. We hope that this review will also help scientists identify the most suitable technology to address melanoma-related questions. The translation of this knowledge and of all current advancements into clinical practice will help to better define the different molecular subsets of melanoma patients and provide new tools to address relevant questions on disease management. Genomic technologies might indeed allow better prediction of the biological, and subsequently clinical, behavior of each subset of melanoma patients, as well as identification of the molecular changes in tumor cell populations during disease evolution, toward a real achievement of personalized medicine.
Affiliation(s)
- Cristian Scatena
- Division of Pathology, Department of Translational Research and New Technologies in Medicine and Surgery, University of Pisa, Pisa, Italy
| | - Daniela Murtas
- Department of Biomedical Sciences, Section of Cytomorphology, University of Cagliari, Cagliari, Italy
| | - Sara Tomei
- Omics Core, Integrated Genomics Services, Research Department, Sidra Medicine, Doha, Qatar
| |
|
52
|
Marangio P, Law KYT, Sanguinetti G, Granneman S. diffBUM-HMM: a robust statistical modeling approach for detecting RNA flexibility changes in high-throughput structure probing data. Genome Biol 2021; 22:165. [PMID: 34044851 PMCID: PMC8157727 DOI: 10.1186/s13059-021-02379-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/13/2020] [Accepted: 05/10/2021] [Indexed: 11/21/2022] Open
Abstract
Advancing RNA structural probing techniques with next-generation sequencing has generated demands for complementary computational tools to robustly extract RNA structural information amidst sampling noise and variability. We present diffBUM-HMM, a noise-aware model that enables accurate detection of RNA flexibility and conformational changes from high-throughput RNA structure-probing data. diffBUM-HMM is widely compatible, accounts for sampling variation and sequence coverage biases, and displays higher sensitivity than existing methods while remaining robust against false positives. Our analyses of datasets generated with a variety of RNA probing chemistries demonstrate the value of diffBUM-HMM for quantitatively detecting RNA structural changes and RNA-binding protein binding sites.
Affiliation(s)
- Paolo Marangio
- School of Informatics, The University of Edinburgh, Edinburgh, UK
- SISSA Data Science Excellence Department Initiative, Trieste, Italy
| | - Ka Ying Toby Law
- Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, UK
| | - Guido Sanguinetti
- Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, UK.
- School of Informatics, The University of Edinburgh, Edinburgh, UK.
- SISSA Data Science Excellence Department Initiative, Trieste, Italy.
| | - Sander Granneman
- Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, UK.
| |
|
53
|
Bias in RNA-seq Library Preparation: Current Challenges and Solutions. BIOMED RESEARCH INTERNATIONAL 2021; 2021:6647597. [PMID: 33987443 PMCID: PMC8079181 DOI: 10.1155/2021/6647597] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 10/22/2020] [Accepted: 04/09/2021] [Indexed: 12/26/2022]
Abstract
Although RNA sequencing (RNA-seq) has become the most advanced technology for transcriptome analysis, it also faces various challenges. The RNA-seq workflow is extremely complicated and can easily introduce bias. This may damage the quality of an RNA-seq dataset and lead to an incorrect interpretation of the sequencing results. A detailed understanding of the source and nature of these biases is therefore essential for interpreting RNA-seq data, for finding methods to improve the quality of RNA-seq experiments, and for developing bioinformatics tools to compensate for these biases. Here, we discuss the sources of experimental bias in RNA-seq. For each type of bias, we discuss methods for improvement, in order to provide useful suggestions for researchers conducting RNA-seq experiments.
|
54
|
Hu T, Chitnis N, Monos D, Dinh A. Next-generation sequencing technologies: An overview. Hum Immunol 2021; 82:801-811. [PMID: 33745759 DOI: 10.1016/j.humimm.2021.02.012] [Citation(s) in RCA: 223] [Impact Index Per Article: 74.3] [Received: 12/14/2020] [Revised: 02/18/2021] [Accepted: 02/23/2021] [Indexed: 12/14/2022]
Abstract
Since the days of Sanger sequencing, next-generation sequencing technologies have significantly evolved to provide increased data output, efficiencies, and applications. These next generations of technologies can be categorized based on read length. This review provides an overview of these technologies as two paradigms: short-read, or "second-generation," technologies, and long-read, or "third-generation," technologies. Herein, short-read sequencing approaches are represented by the most prevalent technologies, Illumina and Ion Torrent, and long-read sequencing approaches are represented by Pacific Biosciences and Oxford Nanopore technologies. All technologies are reviewed along with reported advantages and disadvantages. Until recently, short-read sequencing was thought to provide high accuracy limited by read-length, while long-read technologies afforded much longer read-lengths at the expense of accuracy. Emerging developments for third-generation technologies hold promise for the next wave of sequencing evolution, with the co-existence of longer read lengths and high accuracy.
Affiliation(s)
- Taishan Hu
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States
| | - Nilesh Chitnis
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Surgery, Baylor College of Medicine, Houston, TX, United States
| | - Dimitri Monos
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| | - Anh Dinh
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| |
|
55
|
Collins JH, Keating KW, Jones TR, Balaji S, Marsan CB, Çomo M, Newlon ZJ, Mitchell T, Bartley B, Adler A, Roehner N, Young EM. Engineered yeast genomes accurately assembled from pure and mixed samples. Nat Commun 2021; 12:1485. [PMID: 33674578 PMCID: PMC7935868 DOI: 10.1038/s41467-021-21656-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/21/2020] [Accepted: 02/04/2021] [Indexed: 01/31/2023] Open
Abstract
Yeast whole genome sequencing (WGS) lacks end-to-end workflows that identify genetic engineering. Here we present Prymetime, a tool that assembles yeast plasmids and chromosomes and annotates genetic engineering sequences. It is a hybrid workflow: it uses short and long reads as inputs to perform separate linear and circular assembly steps. This structure is necessary to accurately resolve genetic engineering sequences in plasmids and the genome. We show this by assembling diverse engineered yeasts, in some cases revealing unintended deletions and integrations. Furthermore, the resulting whole genomes are high quality, although the underlying assembly software does not consistently resolve highly repetitive genome features. Finally, we assemble plasmids and genome integrations from metagenomic sequencing, even with 1 engineered cell in 1000. This work is a blueprint for building WGS workflows and establishes WGS-based identification of yeast genetic engineering.
Affiliation(s)
- Joseph H Collins
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Kevin W Keating
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Trent R Jones
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Shravani Balaji
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Celeste B Marsan
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Marina Çomo
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Zachary J Newlon
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA
| | - Tom Mitchell
- Synthetic Biology, Raytheon BBN Technologies, Cambridge, MA, USA
| | - Bryan Bartley
- Synthetic Biology, Raytheon BBN Technologies, Cambridge, MA, USA
| | - Aaron Adler
- Synthetic Biology, Raytheon BBN Technologies, Cambridge, MA, USA
| | - Nicholas Roehner
- Synthetic Biology, Raytheon BBN Technologies, Cambridge, MA, USA
| | - Eric M Young
- Department of Chemical Engineering, Worcester Polytechnic Institute, Worcester, MA, USA.
| |
|
56
|
Potla P, Ali SA, Kapoor M. A bioinformatics approach to microRNA-sequencing analysis. OSTEOARTHRITIS AND CARTILAGE OPEN 2021; 3:100131. [DOI: 10.1016/j.ocarto.2020.100131] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/12/2020] [Accepted: 12/14/2020] [Indexed: 01/20/2023] Open
|
57
|
Xiao Y, Sosa F, de Armas LR, Pan L, Hansen PJ. An improved method for specific-target preamplification PCR analysis of single blastocysts useful for embryo sexing and high-throughput gene expression analysis. J Dairy Sci 2021; 104:3722-3735. [PMID: 33455782 PMCID: PMC8050830 DOI: 10.3168/jds.2020-19497] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/19/2020] [Accepted: 10/19/2020] [Indexed: 12/15/2022]
Abstract
Gene expression analysis in preimplantation embryos has been used for answering fundamental questions related to development, prediction of pregnancy outcome, and other topics. Limited amounts of mRNA in preimplantation embryos hinder progress in studying the preimplantation embryo. Here, a method was developed involving direct synthesis and specific-target preamplification (STA) of cDNA for gene expression analysis in single blastocysts. Effective cell lysis and genomic DNA removal steps were incorporated into the method. In addition, conditions for real-time PCR of cDNA generated from these processes were improved. By using this system, reliable embryo sexing results based on expression of sex-chromosome linked genes were demonstrated. Calibration curve analysis of PCR results using the Fluidigm Biomark microfluidic platform (Fluidigm, South San Francisco, CA) was performed to evaluate 96 STA cDNA from single blastocysts. In total, 93.75% of the genes were validated. Robust amplification was detected even when STA cDNA from a single blastocyst was diluted 1,024-fold. Further analysis showed that within-assay variation increased when cycle threshold values exceeded 18. Overall, STA quantitative real-time PCR analysis was shown to be useful for analysis of gene expression of multiple specific targets in single blastocysts.
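The abstract does not spell out how its calibration curves were evaluated. As a rough illustration only, the conventional approach fits cycle threshold (Ct) against log10 of relative template quantity and converts the slope into an amplification efficiency. The function names, the two-fold dilution series, and the Ct values below are assumptions made for this sketch, not data from the study.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def amplification_efficiency(relative_quantities, ct_values):
    """Efficiency E = 10**(-1/slope) - 1, where slope is Ct vs log10(quantity).

    E = 1.0 means the product doubles every cycle.
    """
    log_q = [math.log10(q) for q in relative_quantities]
    slope = fit_slope(log_q, ct_values)
    return 10 ** (-1 / slope) - 1


# Hypothetical two-fold series reaching a 1,024-fold (2**10) dilution.
quantities = [1 / 2 ** k for k in range(11)]
# Idealised Ct values: one extra cycle per two-fold dilution (assumed, not measured).
cts = [10 + k for k in range(11)]
efficiency = amplification_efficiency(quantities, cts)
```

With these idealised inputs the fitted slope is about -3.32 cycles per ten-fold dilution, which corresponds to an efficiency close to 1.0.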
Affiliation(s)
- Yao Xiao
- Department of Animal Sciences, D.H. Barron Reproductive and Perinatal Biology Research Program, University of Florida, Gainesville 32611-0910
| | - Froylan Sosa
- Department of Animal Sciences, D.H. Barron Reproductive and Perinatal Biology Research Program, University of Florida, Gainesville 32611-0910
| | - Lesley R de Armas
- Department of Microbiology and Immunology, University of Miami Miller School of Medicine, Miami, FL 33136
| | - Li Pan
- Department of Microbiology and Immunology, University of Miami Miller School of Medicine, Miami, FL 33136
| | - Peter J Hansen
- Department of Animal Sciences, D.H. Barron Reproductive and Perinatal Biology Research Program, University of Florida, Gainesville 32611-0910.
| |
|
58
|
Lanner J, Gstöttenmayer F, Curto M, Geslin B, Huchler K, Orr MC, Pachinger B, Sedivy C, Meimberg H. Evidence for multiple introductions of an invasive wild bee species currently under rapid range expansion in Europe. BMC Ecol Evol 2021; 21:17. [PMID: 33546597 PMCID: PMC7866639 DOI: 10.1186/s12862-020-01729-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/16/2020] [Accepted: 11/30/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Invasive species are increasingly driving biodiversity decline, and knowledge of colonization dynamics, including both drivers and dispersal modes, is important to prevent future invasions. The bee species Megachile sculpturalis (Hymenoptera: Megachilidae), native to East Asia, was first recognized in southeastern France in 2008, and has since spread throughout much of Europe. The spread is very fast, and colonization may result from multiple fronts. RESULT To track the history of this invasion, codominant markers were genotyped using Illumina sequencing and the invasion history and degree of connectivity between populations across the European invasion axis were investigated. Distinctive genetic clusters were detected with east-west differentiations in Middle-Europe. CONCLUSION We hypothesize that the observed cluster formation resulted from multiple, independent introductions of the species to the European continent. This study draws a first picture of an early invasion stage of this wild bee and forms a foundation for further investigations, including studies of the species in their native Asian range and in the invaded range in North America.
Affiliation(s)
- Julia Lanner
- Institute for Integrative Nature Conservation Research, University of Natural Resources and Life Sciences Vienna (BOKU), Gregor-Mendel-Straße 33, 1180, Vienna, Austria.
| | - Fabian Gstöttenmayer
- Insect Pest Control Laboratory, Joint FAO/IAEA Division of Nuclear Techniques in Food & Agriculture, Wagramer Straße 5, 1400, Vienna, Austria
| | - Manuel Curto
- Institute for Integrative Nature Conservation Research, University of Natural Resources and Life Sciences Vienna (BOKU), Gregor-Mendel-Straße 33, 1180, Vienna, Austria; MARE Marine and Environmental Sciences Centre, Faculdade de Ciências, Universidade de Lisboa, Campo Grande, 1749-016, Lisboa, Portugal
| | - Benoît Geslin
- IMBE, Aix Marseille Université, Avignon Université, CNRS, Marseille, France
| | - Katharina Huchler
- Institute for Integrative Nature Conservation Research, University of Natural Resources and Life Sciences Vienna (BOKU), Gregor-Mendel-Straße 33, 1180, Vienna, Austria
| | - Michael C Orr
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Beijing, 100101, China
| | - Bärbel Pachinger
- Institute for Integrative Nature Conservation Research, University of Natural Resources and Life Sciences Vienna (BOKU), Gregor-Mendel-Straße 33, 1180, Vienna, Austria
| | | | - Harald Meimberg
- Institute for Integrative Nature Conservation Research, University of Natural Resources and Life Sciences Vienna (BOKU), Gregor-Mendel-Straße 33, 1180, Vienna, Austria
| |
|
59
|
Luan Y, Hu H, Liu C, Chen B, Liu X, Xu Y, Luo X, Chen J, Ye B, Huang F, Wang J, Duan C. A proof-of-concept study of an automated solution for clinical metagenomic next-generation sequencing. J Appl Microbiol 2021; 131:1007-1016. [PMID: 33440055 DOI: 10.1111/jam.15003] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 11/15/2020] [Revised: 01/06/2021] [Accepted: 01/11/2021] [Indexed: 11/29/2022]
Abstract
AIMS Metagenomic next-generation sequencing (mNGS) has been utilized for diagnosing infectious diseases. It is a culture-free and hypothesis-free nucleic acid test for diagnosing all pathogens with known genomic sequences, including bacteria, fungi, viruses and parasites. While this technique greatly expands the clinical capacity of pathogen detection, it is a second-line choice due to lengthy procedures and microbial contaminations introduced from wet-lab processes. As a result, we aimed to reduce the hands-on time and exogenous contaminations in mNGS. METHODS AND RESULTS We developed a device (NGSmaster) that automates the wet-lab workflow, including nucleic acid extraction, PCR-free library preparation and purification. It shortens the sample-to-results time to 16 and 18.5 h for DNA and RNA sequencing, respectively. We used it to test cultured bacteria for validation of the workflow and bioinformatic pipeline. We also compared PCR-free with PCR-based library prep and discovered no differences in microbial reads. Moreover, we analysed results by automation and manual testing and found that automation can significantly reduce microbial contaminations. Finally, we tested artificial and clinical samples and showed mNGS results were concordant with traditional culture. CONCLUSION NGSmaster can fulfil the microbiological diagnostic needs in a variety of sample types. SIGNIFICANCE AND IMPACT OF THE STUDY This study opens up an opportunity of performing in-house mNGS to reduce turnaround time and workload, instead of transferring potentially contagious specimens to a third-party laboratory.
Affiliation(s)
- Y Luan
- Department of Clinical Laboratory, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; RNA Biomedical Institute, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - H Hu
- Department of Clinical Laboratory, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; RNA Biomedical Institute, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - C Liu
- Matridx Biotechnology Co., Ltd, Hangzhou, China
| | - B Chen
- Matridx Biotechnology Co., Ltd, Hangzhou, China
| | - X Liu
- Department of Clinical Laboratory, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; RNA Biomedical Institute, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - Y Xu
- Department of Clinical Laboratory, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; RNA Biomedical Institute, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - X Luo
- Department of Clinical Laboratory, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; RNA Biomedical Institute, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| | - J Chen
- Matridx Biotechnology Co., Ltd, Hangzhou, China
| | - B Ye
- Matridx Biotechnology Co., Ltd, Hangzhou, China
| | - F Huang
- Matridx Biotechnology Co., Ltd, Hangzhou, China
| | - J Wang
- Matridx Biotechnology Co., Ltd, Hangzhou, China
| | - C Duan
- Department of Clinical Laboratory, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China; RNA Biomedical Institute, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
| |
|
60
|
Wang Z, Maluenda J, Giraut L, Vieille T, Lefevre A, Salthouse D, Radou G, Moulinas R, Astete S, D'Avezac P, Smith G, André C, Allemand JF, Bensimon D, Croquette V, Ouellet J, Hamilton G. Detection of genetic variation and base modifications at base-pair resolution on both DNA and RNA. Commun Biol 2021; 4:128. [PMID: 33514840 PMCID: PMC7846774 DOI: 10.1038/s42003-021-01648-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2020] [Accepted: 12/28/2020] [Indexed: 11/14/2022] Open
Abstract
Accurate decoding of nucleic acid variation is critical to understand the complexity and regulation of genome function. Here we use a single-molecule magnetic tweezer (MT) platform to identify sequence variation and map a range of important epigenetic base modifications with high sensitivity, specificity, and precision in the same single molecules of DNA or RNA. We have also developed a highly specific amplification-free CRISPR-Cas enrichment strategy to isolate genomic regions from native DNA. We demonstrate enrichment of DNA from both E. coli and the FMR1 5'UTR coming from cells derived from a Fragile X carrier. From these kilobase-length enriched molecules we could characterize the differential levels of adenine and cytosine base modifications on E. coli, and the repeat expansion length and methylation status of FMR1. Together these results demonstrate that our platform can detect a variety of genetic, epigenetic, and base modification changes concomitantly within the same single molecules.
Affiliation(s)
- Zhen Wang
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | | | | | | | | | | | - Gaël Radou
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | - Rémi Moulinas
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | - Sandra Astete
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | - Pol D'Avezac
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | - Geoff Smith
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | - Charles André
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | - Jean-François Allemand
- Laboratoire de physique de L'École normale supérieure de Paris, CNRS, ENS, Université PSL, Sorbonne Université, Université de Paris, Paris, 75005, France
- IBENS, Département de biologie, École normale supérieure, CNRS, INSERM, PSL Research University, 75005, Paris, France
| | - David Bensimon
- Laboratoire de physique de L'École normale supérieure de Paris, CNRS, ENS, Université PSL, Sorbonne Université, Université de Paris, Paris, 75005, France
- IBENS, Département de biologie, École normale supérieure, CNRS, INSERM, PSL Research University, 75005, Paris, France
- Department of Chemistry and Biochemistry, UCLA, 607 Charles E Young Drive East, Los Angeles, 90095, USA
| | - Vincent Croquette
- Laboratoire de physique de L'École normale supérieure de Paris, CNRS, ENS, Université PSL, Sorbonne Université, Université de Paris, Paris, 75005, France
- IBENS, Département de biologie, École normale supérieure, CNRS, INSERM, PSL Research University, 75005, Paris, France
- ESPCI Paris, PSL University, 10 rue Vauquelin, 75005, Paris, France
| | - Jimmy Ouellet
- Depixus SAS, 3/5 Impasse Reille, 75014, Paris, France
| | | |
|
61
|
Modlin SJ, Robinhold C, Morrissey C, Mitchell SN, Ramirez-Busby SM, Shmaya T, Valafar F. Exact mapping of Illumina blind spots in the Mycobacterium tuberculosis genome reveals platform-wide and workflow-specific biases. Microb Genom 2021; 7. [PMID: 33502304 PMCID: PMC8190613 DOI: 10.1099/mgen.0.000465] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022] Open
Abstract
Whole-genome sequencing (WGS) is fundamental to Mycobacterium tuberculosis basic research and many clinical applications. Coverage across Illumina-sequenced M. tuberculosis genomes is known to vary with sequence context, but this bias is poorly characterized. Here, through a novel application of phylogenomics that distinguishes genuine coverage bias from deletions, we discern Illumina ‘blind spots’ in the M. tuberculosis reference genome for seven sequencing workflows. We find blind spots to be widespread, affecting 529 genes, and provide their exact coordinates, enabling salvage of unaffected regions. Fifty-seven pe/ppe genes (the primary families assumed to exhibit Illumina bias) lack blind spots entirely, while the remaining pe/ppe genes account for 55.1 % of blind spots. Surprisingly, we find coverage bias persists in homopolymers as short as 6 bp, shorter tracts than previously reported. While G+C-rich regions challenge all Illumina sequencing workflows, a modified Nextera library preparation that amplifies DNA with a high-fidelity polymerase markedly attenuates coverage bias in G+C-rich and homopolymeric sequences, expanding the ‘Illumina-sequenceable’ genome. Through these findings, and by defining workflow-specific exclusion criteria, we spotlight effective strategies for handling bias in M. tuberculosis Illumina WGS. This empirical analysis framework may be used to systematically evaluate coverage bias in other species using existing sequencing data.
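The abstract names two sequence contexts associated with coverage bias: homopolymer tracts of 6 bp or longer and G+C-rich regions. The sketch below shows one simple way to locate such contexts in a sequence. It is not the authors' phylogenomic method; the function names and the example sequence are hypothetical.

```python
import re


def homopolymer_runs(seq, min_len=6):
    """Return (start, end, base) for each single-base run of at least min_len.

    Coordinates are 0-based with an exclusive end.
    """
    # One base followed by at least (min_len - 1) repeats of that same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical sequence: a 6-base G run (reported) and a 5-base A run (below the cutoff).
example = "ACGGGGGGTAAAAAT"
runs = homopolymer_runs(example)
gc = gc_fraction(example)
```

Applying `gc_fraction` to sliding windows along a reference would give a per-region G+C profile that could be compared against observed coverage.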
Affiliation(s)
- Samuel J Modlin
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, School of Public Health, San Diego State University, San Diego, CA 92182, USA
| | - Cassidy Robinhold
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, School of Public Health, San Diego State University, San Diego, CA 92182, USA
| | - Christopher Morrissey
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, School of Public Health, San Diego State University, San Diego, CA 92182, USA
| | - Scott N Mitchell
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, School of Public Health, San Diego State University, San Diego, CA 92182, USA
| | - Sarah M Ramirez-Busby
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, School of Public Health, San Diego State University, San Diego, CA 92182, USA
| | - Tal Shmaya
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, School of Public Health, San Diego State University, San Diego, CA 92182, USA
| | - Faramarz Valafar
- Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, School of Public Health, San Diego State University, San Diego, CA 92182, USA
| |
|
62
|
Togi S, Ura H, Niida Y. Optimization and Validation of Multimodular, Long-Range PCR-Based Next-Generation Sequencing Assays for Comprehensive Detection of Mutation in Tuberous Sclerosis Complex. J Mol Diagn 2021; 23:424-446. [PMID: 33486073 DOI: 10.1016/j.jmoldx.2020.12.009] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/23/2020] [Revised: 10/01/2020] [Accepted: 12/16/2020] [Indexed: 12/17/2022] Open
Abstract
The genetic diagnosis of tuberous sclerosis complex is difficult because of its broad spectrum of mutations. In addition to point mutations in coding regions, intragenic or chromosomal-level large deletions, deep intronic splicing mutations, and mosaic mutations represent a significant proportion of the mutations. In this study, multimodular, long-range PCR-based next-generation sequencing assays were optimized and validated using >100 samples with known TSC1 and TSC2 variants. Multiplex, long-range PCR covering the entire genomic region of both genes detected all 138 known variants; however, it also yielded false-positive results. Intragenic large deletions were detected with accurate breakpoint sequences. Chromosomal-level deletions were estimated by discordant allele segregation in the family and confirmed by DNA microarray. Deep intronic mutations were verified using a combination of long-range DNA PCR and full-length mRNA sequencing. DNA samples were mixed to simulate mosaic mutations, and most variants were detected but could not be distinguished from equivalently detected false-positive results. Repeated false-positive results were classified, and the strategy of selecting the common variants detected in the duplicate analysis and eliminating known false-positive results improved the sensitivity (85.2%) and positive predictive value (96.6%) of a 10% mosaic simulation. Long-range PCR-based next-generation sequencing is a highly versatile genetic test; however, confirmation tests remain necessary for clinical use because false-positive results cannot be completely eliminated from single experiments.
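The filtering strategy described in this abstract (keep only variants called in both duplicate analyses, then drop known recurrent false positives) and the two reported metrics can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names and every variant identifier are invented, and the toy numbers do not reproduce the published 85.2% and 96.6%.

```python
def filter_calls(replicate_a, replicate_b, known_false_positives):
    """Keep variants seen in both replicates, minus known false positives."""
    return (set(replicate_a) & set(replicate_b)) - set(known_false_positives)


def sensitivity(calls, truth):
    """True positives divided by all true variants: TP / (TP + FN)."""
    truth = set(truth)
    return len(set(calls) & truth) / len(truth)


def positive_predictive_value(calls, truth):
    """True positives divided by all reported calls: TP / (TP + FP)."""
    calls = set(calls)
    return len(calls & set(truth)) / len(calls)


# Hypothetical variant identifiers, for illustration only.
truth = {"v1", "v2", "v3", "v4"}
rep_a = {"v1", "v2", "v3", "fp1", "fp2"}
rep_b = {"v1", "v2", "v4", "fp1", "fp3"}
known_fp = {"fp1"}

kept = filter_calls(rep_a, rep_b, known_fp)
sens = sensitivity(kept, truth)
ppv = positive_predictive_value(kept, truth)
```

In this toy case, requiring agreement between replicates removes the one-off false positives (`fp2`, `fp3`) but also loses the true variants seen in only one replicate (`v3`, `v4`), which illustrates the trade-off between positive predictive value and sensitivity.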
Affiliation(s)
- Sumihito Togi
- Center for Clinical Genomics, Kanazawa Medical University, Uchinada, Japan
| | - Hiroki Ura
- Center for Clinical Genomics, Kanazawa Medical University, Uchinada, Japan
| | - Yo Niida
- Division of Genomic Medicine, Department of Advanced Medicine, Medical Research Institute, Kanazawa Medical University, Uchinada, Japan.
| |
|
63
|
van Dijk EL, Thermes C. A Small RNA-Seq Protocol with Less Bias and Improved Capture of 2'-O-Methyl RNAs. Methods Mol Biol 2021; 2298:153-167. [PMID: 34085244 DOI: 10.1007/978-1-0716-1374-0_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Abstract
The study of small RNAs (sRNAs) by next-generation sequencing (NGS) is challenged by bias issues during library preparation. Several types of sRNAs such as plant microRNAs (miRNAs) carry a 2'-O-methyl (2'-OMe) modification at their 3' terminal nucleotide. This modification adds another level of difficulty as it inhibits 3' adapter ligation. We previously demonstrated that modified versions of the "TruSeq (TS)" protocol have less bias and an improved detection of 2'-OMe RNAs. Here we describe in detail protocol "TS5," which showed the best overall performance. We also provide guidelines for bioinformatics analysis of the sequencing data.
Affiliation(s)
- Erwin L van Dijk
- Institute for Integrative Biology of the Cell, UMR9198, CNRS CEA Univ Paris-Sud, Université Paris-Saclay, Gif sur Yvette Cedex, France.
| | - Claude Thermes
- Institute for Integrative Biology of the Cell, UMR9198, CNRS CEA Univ Paris-Sud, Université Paris-Saclay, Gif sur Yvette Cedex, France
| |
|
64
|
Gao Y, Chen X, Qiao H, Ke Y, Qi H. Low-Bias Manipulation of DNA Oligo Pool for Robust Data Storage. ACS Synth Biol 2020; 9:3344-3352. [PMID: 33185422 DOI: 10.1021/acssynbio.0c00419] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/23/2023]
Abstract
In DNA data storage, the massive sequence complexity creates challenges in repeatable and efficient information readout. Here, we demonstrate that PCR creates significant DNA amplification biases owing to its inherent mechanisms of inefficient priming, product-as-template amplification, and a tendency to spread errors, which greatly hinder subsequent applications such as data retrieval in DNA-based storage. To mitigate the amplification bias, we adopted an isothermal DNA amplification scheme that combines strand displacement amplification (SDA) with magnetic bead (MB) DNA immobilization for robust, repeated, and low-bias amplification of a DNA oligo pool comprising over 100 thousand oligos, in a primer-free and low-error-spreading fashion. Furthermore, we introduced oligo pool normalization (OPN), a cost-effective and scalable method for normalizing an oligo pool, with which oligo pools comprising 256 to 1024 distinct oligos were readily adjusted to an improved Gini index. Therefore, we believe that the combination of SDA and OPN can provide an ideal amplification mechanism for a low-bias copy of a large oligo pool, which is of vital importance for successful data retrieval in DNA information storage.
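The abstract uses the Gini index as its measure of how evenly oligos are represented in a pool. As a rough illustration, the sketch below computes the standard Gini coefficient from per-oligo read counts; the abstract does not state which formulation the authors used, so the formula and the example counts here are assumptions.

```python
# Minimal sketch, assuming the standard Gini coefficient on per-oligo read counts.
# 0 means a perfectly even pool; values near 1 mean a few oligos dominate.

def gini(counts):
    xs = sorted(counts)              # ascending order is required by this formula
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


even_pool = [5, 5, 5, 5]      # hypothetical counts: every oligo equally abundant
skewed_pool = [0, 0, 0, 10]   # hypothetical counts: one oligo takes all reads
```

A lower value after normalization would indicate a more uniform pool, which is the direction of improvement the abstract describes.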
Affiliation(s)
- Yanmin Gao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, P. R. China
- Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300350, P. R. China
| | - Xin Chen
- Center for Applied Mathematics, Tianjin University, Tianjin 300350, P. R. China
| | - Hongyan Qiao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, P. R. China
- Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300350, P. R. China
| | - Yonggang Ke
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, Georgia 30322, United States
| | - Hao Qi
- School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, P. R. China
- Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300350, P. R. China
| |
|
65
|
Neelagandan N, Lamberti I, Carvalho HJF, Gobet C, Naef F. What determines eukaryotic translation elongation: recent molecular and quantitative analyses of protein synthesis. Open Biol 2020; 10:200292. [PMID: 33292102 PMCID: PMC7776565 DOI: 10.1098/rsob.200292] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/10/2020] [Accepted: 11/10/2020] [Indexed: 12/14/2022] Open
Abstract
Protein synthesis from mRNA is an energy-intensive and tightly controlled cellular process. Translation elongation is a well-coordinated, multifactorial step in translation that undergoes dynamic regulation owing to cellular state and environmental determinants. Recent studies involving genome-wide approaches have uncovered some crucial aspects of translation elongation including the mRNA itself and the nascent polypeptide chain. Additionally, these studies have fuelled quantitative and mathematical modelling of translation elongation. In this review, we provide a comprehensive overview of the key determinants of translation elongation. We discuss consequences of ribosome stalling or collision, and how the cells regulate translation in case of such events. Next, we review theoretical approaches and widely used mathematical models that have become an essential ingredient to interpret complex molecular datasets and study translation dynamics quantitatively. Finally, we review recent advances in live-cell reporter and related analysis techniques, to monitor the translation dynamics of single cells and single-mRNA molecules in real time.
Affiliation(s)
| | | | | | | | - Felix Naef
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland
| |
|
66
|
Lee N, Park MJ, Song W, Jeon K, Jeong S. Currently Applied Molecular Assays for Identifying ESR1 Mutations in Patients with Advanced Breast Cancer. Int J Mol Sci 2020; 21:ijms21228807. [PMID: 33233830 PMCID: PMC7699999 DOI: 10.3390/ijms21228807] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/27/2020] [Revised: 11/17/2020] [Accepted: 11/19/2020] [Indexed: 12/11/2022] Open
Abstract
Approximately 70% of breast cancers, the leading cause of cancer-related mortality worldwide, are positive for the estrogen receptor (ER). Treatment of patients with luminal subtypes is mainly based on endocrine therapy. However, ER positivity is reduced and ESR1 mutations play an important role in resistance to endocrine therapy, leading to advanced breast cancer. Various methodologies for the detection of ESR1 mutations have been developed; the most commonly used are next-generation sequencing (NGS)-based assays (50.0%), followed by droplet digital PCR (ddPCR) (45.5%). Regarding the sample type, tissue (50.0%) was more frequently used than plasma (27.3%). However, plasma (46.2%) became the most used sample type in 2016-2019, in contrast to 2012-2015 (22.2%). In 2016-2019, ddPCR (61.5%), rather than NGS (30.8%), became a more popular method than it was in 2012-2015. The easy accessibility, non-invasiveness, and demonstrated usefulness with high sensitivity of ddPCR using plasma have changed the trends. When using these assays, there should be a comprehensive understanding of the principles, advantages, vulnerability, and precautions for interpretation. In the future, advanced NGS platforms and modified ddPCR will benefit patients by facilitating treatment decisions efficiently based on information regarding ESR1 mutations.
Affiliation(s)
- Nuri Lee
- Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Korea; (N.L.); (M.-J.P.); (W.S.)
| | - Min-Jeong Park
- Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Korea; (N.L.); (M.-J.P.); (W.S.)
| | - Wonkeun Song
- Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Korea; (N.L.); (M.-J.P.); (W.S.)
| | - Kibum Jeon
- Department of Laboratory Medicine, Hangang Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Korea;
| | - Seri Jeong
- Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Korea; (N.L.); (M.-J.P.); (W.S.)
- Correspondence: ; Tel.: +82-845-5305
| |
|
67
|
Komarova N, Barkova D, Kuznetsov A. Implementation of High-Throughput Sequencing (HTS) in Aptamer Selection Technology. Int J Mol Sci 2020; 21:E8774. [PMID: 33233573 PMCID: PMC7699794 DOI: 10.3390/ijms21228774] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/24/2020] [Revised: 11/18/2020] [Accepted: 11/19/2020] [Indexed: 12/18/2022] Open
Abstract
Aptamers are nucleic acid ligands that bind specifically to a target of interest. Aptamers have gained in popularity due to their high potential for different applications in analysis, diagnostics, and therapeutics. The procedure called systematic evolution of ligands by exponential enrichment (SELEX) is used for aptamer isolation from large nucleic acid combinatorial libraries. The huge number of unique sequences implemented in the in vitro evolution in the SELEX process imposes the necessity of performing extensive sequencing of the selected nucleic acid pools. High-throughput sequencing (HTS) meets this demand of SELEX. Analysis of the data obtained from sequencing of the libraries produced during and after aptamer isolation provides an informative basis for precise aptamer identification and for examining the structure and function of nucleic acid ligands. This review discusses the technical aspects and the potential of the integration of HTS with SELEX.
Affiliation(s)
- Natalia Komarova
- Scientific-Manufacturing Complex Technological Centre, 1–7 Shokin Square, Zelenograd, 124498 Moscow, Russia; (D.B.); (A.K.)
| | | | | |
|
68
|
Lecomte E, Saleun S, Bolteau M, Guy-Duché A, Adjali O, Blouin V, Penaud-Budloo M, Ayuso E. The SSV-Seq 2.0 PCR-Free Method Improves the Sequencing of Adeno-Associated Viral Vector Genomes Containing GC-Rich Regions and Homopolymers. Biotechnol J 2020; 16:e2000016. [PMID: 33064875 DOI: 10.1002/biot.202000016] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/14/2020] [Revised: 09/29/2020] [Indexed: 11/08/2022]
Abstract
Adeno-associated viral vectors (AAV) are efficient engineered tools for delivering genetic material into host cells. The commercialization of AAV-based drugs must be accompanied by the development of appropriate quality control (QC) assays. Given the potential risk of co-transfer of oncogenic or immunogenic sequences with therapeutic vectors, accurate methods to assess the level of residual DNA in AAV vector stocks are particularly important. An assay based on high-throughput sequencing (HTS) to identify and quantify DNA species in recombinant AAV batches is developed. Here, it is shown that PCR amplification of regions that have a local GC content >90% and include successive mononucleotide stretches, such as the CAG promoter, can introduce bias during DNA library preparation, leading to drops in sequencing coverage. To circumvent this problem, SSV-Seq 2.0, a PCR-free protocol for sequencing AAV vector genomes containing such sequences, is developed. The PCR-free protocol improves the evenness of the rAAV genome coverage and consequently leads to a more accurate relative quantification of residual DNA. HTS-based assays provide a more comprehensive assessment of DNA impurities and AAV vector genome integrity than conventional QC tests based on real-time PCR and are useful methods to improve the safety and efficacy of these viral vectors.
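The abstract attributes coverage drops to regions with local GC content above 90% that also contain mononucleotide stretches. A minimal sketch of how such regions might be flagged in a vector sequence is given below. The window size, the minimum homopolymer length, the function names, and the toy sequence are illustrative assumptions; only the 90% GC threshold comes from the abstract.

```python
# Minimal sketch: flag high-GC windows and homopolymer runs in a DNA sequence.
# Window size and minimum run length are assumed values, not taken from the paper.
from itertools import groupby


def gc_fraction(seq):
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def high_gc_windows(seq, window=50, threshold=0.9):
    """Start positions (0-based) of windows whose GC fraction exceeds threshold."""
    return [
        start
        for start in range(len(seq) - window + 1)
        if gc_fraction(seq[start:start + window]) > threshold
    ]


def homopolymer_runs(seq, min_len=6):
    """List of (start, base, length) for runs of one base at least min_len long."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Toy sequence: AT-rich flanks around a 20-base G/C block.
toy = "AT" * 10 + "G" * 12 + "C" * 8 + "AT" * 10
gc_hits = high_gc_windows(toy, window=20, threshold=0.9)
runs = homopolymer_runs(toy, min_len=6)
```

Positions flagged by both functions would be candidates for the kind of PCR-induced coverage bias the abstract describes, and therefore for a PCR-free library preparation.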
Affiliation(s)
- Emilie Lecomte
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| | - Sylvie Saleun
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| | - Mathieu Bolteau
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| | - Aurélien Guy-Duché
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| | - Oumeya Adjali
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| | - Véronique Blouin
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| | - Magalie Penaud-Budloo
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| | - Eduard Ayuso
- INSERM UMR1089, Translational Gene Therapy Laboratory, University of Nantes, Centre Hospitalier Universitaire of Nantes, Nantes, 44200, France
| |
|
69
|
Zagorski D, Hartmann M, Bertrand YJK, Paštová L, Slavíková R, Josefiová J, Fehrer J. Characterization and Dynamics of Repeatomes in Closely Related Species of Hieracium (Asteraceae) and Their Synthetic and Apomictic Hybrids. FRONTIERS IN PLANT SCIENCE 2020; 11:591053. [PMID: 33224172 PMCID: PMC7667050 DOI: 10.3389/fpls.2020.591053] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/03/2020] [Accepted: 10/09/2020] [Indexed: 05/05/2023]
Abstract
The repetitive content of the plant genome (repeatome) often represents its largest fraction and is frequently correlated with its size. Transposable elements (TEs), the main component of the repeatome, are an important driver in the genome diversification due to their fast-evolving nature. Hybridization and polyploidization events are hypothesized to induce massive bursts of TEs resulting, among other effects, in an increase of copy number and genome size. Little is known about the repeatome dynamics following hybridization and polyploidization in plants that reproduce by apomixis (asexual reproduction via seeds). To address this, we analyzed the repeatomes of two diploid parental species, Hieracium intybaceum and H. prenanthoides (sexual), their diploid F1 synthetic and their natural triploid hybrids (H. pallidiflorum and H. picroides, apomictic). Using low-coverage next-generation sequencing (NGS) and a graph-based clustering approach, we detected high overall similarity across all major repeatome categories between the parental species, despite their large phylogenetic distance. Medium and highly abundant repetitive elements comprise ∼70% of Hieracium genomes; most prevalent were Ty3/Gypsy chromovirus Tekay and Ty1/Copia Maximus-SIRE elements. No TE bursts were detected, neither in synthetic nor in natural hybrids, as TE abundance generally followed theoretical expectations based on parental genome dosage. Slight over- and under-representation of TE cluster abundances reflected individual differences in genome size. However, in comparative analyses, apomicts displayed an overabundance of pararetrovirus clusters not observed in synthetic hybrids. Substantial deviations were detected in rDNAs and satellite repeats, but these patterns were sample specific. rDNA and satellite repeats (three of them were newly developed as cytogenetic markers) were localized on chromosomes by fluorescence in situ hybridization (FISH). 
In a few cases, low-abundant repeats (5S rDNA and certain satellites) showed some discrepancy between NGS data and FISH results, which is due partly to the bias of low-coverage sequencing and partly to low amounts of the satellite repeats or their sequence divergence. Overall, satellite DNA (including rDNA) was markedly affected by hybridization, but independent of the ploidy or reproductive mode of the progeny, whereas bursts of TEs did not play an important role in the evolutionary history of Hieracium.
|
70
|
Tegally H, San JE, Giandhari J, de Oliveira T. Unlocking the efficiency of genomics laboratories with robotic liquid-handling. BMC Genomics 2020; 21:729. [PMID: 33081689 PMCID: PMC7576741 DOI: 10.1186/s12864-020-07137-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 04/13/2020] [Accepted: 10/11/2020] [Indexed: 02/08/2023] Open
Abstract
In research and clinical genomics laboratories today, sample preparation is the bottleneck of experiments, particularly when it comes to high-throughput next generation sequencing (NGS). More genomics laboratories are now considering liquid-handling automation to make the sequencing workflow more efficient and cost effective. The question remains as to its suitability and return on investment. A number of points need to be carefully considered before introducing robots into biological laboratories. Here, we describe the state-of-the-art technology of both sophisticated and do-it-yourself (DIY) robotic liquid-handlers and provide a practical review of the motivation, implications and requirements of laboratory automation for genome sequencing experiments.
Affiliation(s)
- Houriiyah Tegally
- Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, K-RITH Tower Building, Nelson R Mandela School of Medicine, University of KwaZulu-Natal, 719 Umbilo Road, Durban, South Africa.
| | - James Emmanuel San
- Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, K-RITH Tower Building, Nelson R Mandela School of Medicine, University of KwaZulu-Natal, 719 Umbilo Road, Durban, South Africa
| | - Jennifer Giandhari
- Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, K-RITH Tower Building, Nelson R Mandela School of Medicine, University of KwaZulu-Natal, 719 Umbilo Road, Durban, South Africa
| | - Tulio de Oliveira
- Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, K-RITH Tower Building, Nelson R Mandela School of Medicine, University of KwaZulu-Natal, 719 Umbilo Road, Durban, South Africa.
- Department of Global Health, University of Washington, 908 Jefferson Street, 13th Floor, Seattle, WA, 98104, USA.
| |
|
71
|
OneStopRNAseq: A Web Application for Comprehensive and Efficient Analyses of RNA-Seq Data. Genes (Basel) 2020; 11:genes11101165. [PMID: 33023248 PMCID: PMC7650687 DOI: 10.3390/genes11101165] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/22/2020] [Revised: 09/22/2020] [Accepted: 09/29/2020] [Indexed: 01/21/2023] Open
Abstract
Over the past decade, a large amount of RNA sequencing (RNA-seq) data were deposited in public repositories, and more are being produced at an unprecedented rate. However, there are few open source tools with point-and-click interfaces that are versatile and offer streamlined comprehensive analysis of RNA-seq datasets. To maximize the capitalization of these vast public resources and facilitate the analysis of RNA-seq data by biologists, we developed a web application called OneStopRNAseq for the one-stop analysis of RNA-seq data. OneStopRNAseq has user-friendly interfaces and offers workflows for common types of RNA-seq data analyses, such as comprehensive data-quality control, differential analysis of gene expression, exon usage, alternative splicing, transposable element expression, allele-specific gene expression quantification, and gene set enrichment analysis. Users only need to select the desired analyses and genome build, and provide a Gene Expression Omnibus (GEO) accession number or Dropbox links to sequence files, alignment files, gene-expression-count tables, or rank files with the corresponding metadata. Our pipeline facilitates the comprehensive and efficient analysis of private and public RNA-seq data.
|
72
|
Sediment-associated bacterial community and predictive functionalities are influenced by choice of 16S ribosomal RNA hypervariable region(s): An amplicon-based diversity study. Genomics 2020; 112:4968-4979. [PMID: 32911024 DOI: 10.1016/j.ygeno.2020.09.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/07/2020] [Revised: 08/15/2020] [Accepted: 09/03/2020] [Indexed: 11/22/2022]
Abstract
Meta-omics approaches such as high-throughput sequencing of 16S hypervariable region(s) [HVR(s)] are extensively applied for profiling microbial communities. Several studies have deciphered the influence of HVR(s) on bacterial diversity; most of these were devoted to human body habitats. The extent to which the targeted HVR(s) influence the diversity estimates of environmental samples is rather unclear. Here, we evaluated the performance of five widely used universal primer pairs spanning the V1-V3, V3-V4, V4, V5-V6 and V7-V9 HVRs to characterize bacterial diversity and predictive functionality of complex marine sediments. The results revealed that the V4 and V5-V6 HVRs yielded higher species richness than the others, while V1-V3 and V7-V9 failed to detect Bacteroidetes and Planctomycetes. Further, PICRUSt analysis showed that the selected HVR(s) also had a significant impact on the predictive functional profile. In conclusion, this study showed that HVR selection has a profound effect on overall results and thus should be made with utmost caution.
|
73
|
Beule L, Karlovsky P. Improved normalization of species count data in ecology by scaling with ranked subsampling (SRS): application to microbial communities. PeerJ 2020; 8:e9593. [PMID: 32832266 PMCID: PMC7409812 DOI: 10.7717/peerj.9593] [Citation(s) in RCA: 85] [Impact Index Per Article: 21.3] [Received: 03/16/2020] [Accepted: 07/01/2020] [Indexed: 11/20/2022] Open
Abstract
Background Analysis of species count data in ecology often requires normalization to an identical sample size. Rarefying (random subsampling without replacement), which is the current standard method for normalization, has been widely criticized for its poor reproducibility and potential distortion of the community structure. In the context of microbiome count data, researchers explicitly advised against the use of rarefying. Here we introduce a normalization method for species count data called scaling with ranked subsampling (SRS) and demonstrate its suitability for the analysis of microbial communities. Methods SRS consists of two steps. In the scaling step, the counts for all species or operational taxonomic units (OTUs) are divided by a scaling factor chosen in such a way that the sum of scaled counts equals the selected total number of counts Cmin. The relative frequencies of all OTUs remain unchanged. In the subsequent ranked subsampling step, non-integer count values are converted into integers by an algorithm that minimizes subsampling error with regard to the population structure (relative frequencies of species or OTUs) while keeping the total number of counts equal to Cmin. SRS and rarefying were compared by normalizing a test library representing a soil bacterial community. Common parameters of biodiversity and population structure (Shannon index H’, species richness, species composition, and relative abundances of OTUs) were determined for libraries normalized to different sizes by rarefying as well as SRS with 10,000 replications each. An implementation of SRS in R is available for download (https://doi.org/10.20387/BONARES-2657-1NP3). Results SRS showed greater reproducibility and preserved OTU frequencies and alpha diversity better than rarefying. The variance in Shannon diversity increased with the reduction of the library size after rarefying but remained zero for SRS. 
Relative abundances of OTUs strongly varied among libraries generated by rarefying, whereas libraries normalized by SRS showed only negligible variation. Bray–Curtis index of dissimilarity among replicates of the same library normalized by rarefying revealed a large variation in species composition, which reached complete dissimilarity (not a single OTU shared) among some libraries rarefied to a small size. The dissimilarity among replicated libraries normalized by SRS remained negligibly low at each library size. The variance in dissimilarity increased with the decreasing library size after rarefying, whereas it remained either zero or negligibly low after SRS. Conclusions Normalization of OTU or species counts by scaling with ranked subsampling preserves the original community structure by minimizing subsampling errors. We therefore propose SRS for the normalization of biological count data.
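The two SRS steps described in the Methods part of this abstract (scale all counts so they sum to Cmin, then convert the non-integer values to integers while keeping the total at Cmin) can be sketched as follows. This is a simplified illustration, not the authors' R implementation: the abstract does not spell out how ties between equal fractional parts are resolved, so the deterministic tie-breaking used here (larger integer part first, then input order) is an assumption, as are the function name and the example counts.

```python
# Minimal sketch of scaling with ranked subsampling (SRS), under stated assumptions.
# Tie-breaking is deterministic here and is NOT taken from the source abstract.
from fractions import Fraction


def srs_sketch(counts, c_min):
    total = sum(counts)
    if total == 0 or c_min > total:
        raise ValueError("c_min must not exceed the library size")

    # Step 1 (scaling): exact rational arithmetic keeps relative frequencies unchanged.
    scaled = [Fraction(c * c_min, total) for c in counts]

    # Step 2 (ranked subsampling): keep integer parts, then hand out the
    # remaining counts to the OTUs with the largest fractional parts.
    result = [s.numerator // s.denominator for s in scaled]
    remaining = c_min - sum(result)
    order = sorted(
        range(len(counts)),
        key=lambda i: (-(scaled[i] - result[i]), -result[i], i),
    )
    for i in order[:remaining]:
        result[i] += 1
    return result


library = [50, 30, 15, 5]            # hypothetical OTU counts
normalized = srs_sketch(library, 10)
```

Because no random draw is involved in this sketch, repeated runs on the same library give identical output, which mirrors the zero variance in Shannon diversity that the abstract reports for SRS.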
Affiliation(s)
- Lukas Beule
- Molecular Phytopathology and Mycotoxin Research, Georg-August Universität Göttingen, Göttingen, Germany
| | - Petr Karlovsky
- Molecular Phytopathology and Mycotoxin Research, Georg-August Universität Göttingen, Göttingen, Germany
| |
|
74
|
Alvarez-Suarez DE, Tovar H, Hernández-Lemus E, Orjuela M, Sadowinski-Pine S, Cabrera-Muñoz L, Camacho J, Favari L, Hernández-Angeles A, Ponce-Castañeda MV. Discovery of a transcriptomic core of genes shared in 8 primary retinoblastoma with a novel detection score analysis. J Cancer Res Clin Oncol 2020; 146:2029-2040. [PMID: 32474753 DOI: 10.1007/s00432-020-03266-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/01/2020] [Accepted: 05/14/2020] [Indexed: 01/03/2023]
Abstract
PURPOSE Expression microarrays are a powerful technology that allows large-scale analysis of RNA profiles in a tissue; these platforms include underexploited detection score outputs. We developed an algorithm using the detection score to generate a detection profile of shared elements in retinoblastoma as well as to determine its transcriptomic size and structure. METHODS We analyzed eight briefly cultured primary retinoblastomas with the Human Transcriptome Array 2.0 (HTA2.0). Transcript and gene detection scores were determined using the Detection Above Background (DABG) algorithm. We used unsupervised and supervised computational tools to analyze detected and undetected elements; WebGestalt was used to explore functions encoded by genes in relevant clusters, and we performed experimental validation. RESULTS We found a core cluster with 7,513 genes detected and shared by all samples, 4,321 genes in a cluster that was commonly absent, and 7,681 genes variably detected across the samples, accounting for tumor heterogeneity. Relevant pathways identified in the core cluster relate to cell cycle, RNA transport, and DNA replication. We performed a kinome analysis of the core cluster and found 4 potential therapeutic kinase targets. Through analysis of the variably detected genes, we discovered 123 differentially expressed transcripts between bilateral and unilateral cases. CONCLUSIONS This novel analytical approach allowed us to determine the retinoblastoma transcriptomic size, a shared active transcriptomic core among the samples, potential therapeutic target kinases shared by all samples, and transcripts related to intertumor heterogeneity, and to determine transcriptomic profiles without the need for control tissues. This approach is useful for analyzing other cancer or tissue types.
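The three clusters reported in this abstract (genes detected in all samples, genes absent in all samples, and genes variably detected) suggest a simple partition by per-sample detection calls. The sketch below illustrates that idea only; the abstract does not give the detection cutoff or the data layout, so the p-value threshold of 0.05, the dictionary structure, and the gene names are hypothetical.

```python
# Minimal sketch: partition genes by per-sample detection calls.
# The cutoff (alpha) and the input layout are assumptions, not taken from the paper.

def partition_by_detection(pvalues, alpha=0.05):
    """pvalues maps gene -> list of per-sample detection p-values."""
    core, absent, variable = set(), set(), set()
    for gene, per_sample in pvalues.items():
        detected = [p < alpha for p in per_sample]
        if all(detected):
            core.add(gene)        # detected in every sample
        elif not any(detected):
            absent.add(gene)      # detected in no sample
        else:
            variable.add(gene)    # detected in some samples only
    return core, absent, variable


# Hypothetical detection p-values for three genes across three samples.
example = {
    "GENE_A": [0.001, 0.01, 0.02],
    "GENE_B": [0.50, 0.70, 0.90],
    "GENE_C": [0.01, 0.30, 0.04],
}
core, absent, variable = partition_by_detection(example)
```

One appeal of this kind of partition, as the abstract notes, is that it needs no control tissue: each gene is classified from detection calls within the tumor samples alone.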
Affiliation(s)
- Diana E Alvarez-Suarez
- Medical Research Unit in Infectious Diseases, Hospital de Pediatría, CMN SXXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- Pharmacology Department, CINVESTAV, Mexico City, Mexico
| | - Hugo Tovar
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
| | - Manuela Orjuela
- Epidemiology Department, Columbia University, Columbia, NY, USA
| | - Stanislaw Sadowinski-Pine
- Pathology Department, Hospital Infantil de México Federico Gómez, Secretaría de Salud, Mexico City, Mexico
| | - Lourdes Cabrera-Muñoz
- Pathology Department, Hospital Infantil de México Federico Gómez, Secretaría de Salud, Mexico City, Mexico
| | | | | | - Adriana Hernández-Angeles
- Medical Research Unit in Infectious Diseases, Hospital de Pediatría, CMN SXXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
| | - M Verónica Ponce-Castañeda
- Medical Research Unit in Infectious Diseases, Hospital de Pediatría, CMN SXXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico.
| |
|
75
|
Martín-Alonso S, Frutos-Beltrán E, Menéndez-Arias L. Reverse Transcriptase: From Transcriptomics to Genome Editing. Trends Biotechnol 2020; 39:194-210. [PMID: 32653101 DOI: 10.1016/j.tibtech.2020.06.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/19/2020] [Revised: 06/10/2020] [Accepted: 06/15/2020] [Indexed: 01/01/2023]
Abstract
Reverse transcriptases (RTs) are enzymes that can generate a complementary strand of DNA (cDNA) from RNA. Coupled with PCR, RTs have been widely used to detect RNAs and to clone expressed genes. Classical retroviral RTs have been improved by protein engineering. These enzymes and newly characterized RTs are key elements in the development of next-generation sequencing techniques that are now being applied to the study of transcriptomics. In addition, engineered RTs fused to a CRISPR/Cas9 nickase have recently shown great potential as tools to manipulate eukaryotic genomes. In this review, we discuss the properties and uses of wild type and engineered RTs in biotechnological applications, from conventional RT-PCR to recently introduced prime editing.
Affiliation(s)
- Samara Martín-Alonso
- Centro de Biología Molecular 'Severo Ochoa' (Consejo Superior de Investigaciones Científicas and Universidad Autónoma de Madrid), c/ Nicolás Cabrera 1, Campus de Cantoblanco-UAM, 28049 Madrid, Spain
| | - Estrella Frutos-Beltrán
- Centro de Biología Molecular 'Severo Ochoa' (Consejo Superior de Investigaciones Científicas and Universidad Autónoma de Madrid), c/ Nicolás Cabrera 1, Campus de Cantoblanco-UAM, 28049 Madrid, Spain
| | - Luis Menéndez-Arias
- Centro de Biología Molecular 'Severo Ochoa' (Consejo Superior de Investigaciones Científicas and Universidad Autónoma de Madrid), c/ Nicolás Cabrera 1, Campus de Cantoblanco-UAM, 28049 Madrid, Spain. @cbm.csic.es
| |
|
76
|
Zaheed O, Samson J, Dean K. A bioinformatics approach to identify novel long, non-coding RNAs in breast cancer cell lines from an existing RNA-sequencing dataset. Noncoding RNA Res 2020; 5:48-59. [PMID: 32206740 PMCID: PMC7078458 DOI: 10.1016/j.ncrna.2020.02.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/27/2019] [Revised: 02/18/2020] [Accepted: 02/18/2020] [Indexed: 01/17/2023] Open
Abstract
Breast cancer research has traditionally centred on genomic alterations, hormone receptor status and changes in cancer-related proteins to provide new avenues for targeted therapies. Due to advances in next generation sequencing technologies, there has been the emergence of long, non-coding RNAs (lncRNAs) as regulators of normal cellular events, with links to various disease states, including breast cancer. Here we describe our bioinformatic analyses of a previously published RNA sequencing (RNA-seq) dataset to identify lncRNAs with altered expression levels in a subset of breast cancer cell lines. Using a previously published RNA-seq dataset of 675 cancer cell lines, a subset of 18 cell lines was selected for our analyses that included 16 breast cancer lines, one ductal carcinoma in situ line and one normal-like breast epithelial cell line. Principal component analysis demonstrated correlation with well-established categorisation methods of breast cancer (i.e. luminal A/B, HER2 enriched and basal-like A/B). Through detailed comparison of differentially expressed lncRNAs in each breast cancer sub-type with normal-like breast epithelial cells, we identified 15 lncRNAs with consistently altered expression, including three uncharacterised lncRNAs. Utilising data from The Cancer Genome Atlas (TCGA) and The Genotype Tissue Expression (GTEx) project via Gene Expression Profiling Interactive Analysis (GEPIA2), we assessed clinical relevance of several identified lncRNAs with invasive breast cancer. Lastly, we determined the relative expression level of six lncRNAs across a spectrum of breast cancer cell lines to experimentally confirm the findings of our bioinformatic analyses. Overall, we show that the use of existing RNA-seq datasets, if re-analysed with modern bioinformatic tools, can provide a valuable resource to identify lncRNAs that could have important biological roles in oncogenesis and tumour progression.
Affiliation(s)
- Kellie Dean
- School of Biochemistry and Cell Biology, Western Gateway Building, University College Cork, Cork, T12XF62, Ireland
77
Wilson-Sánchez D, Lup SD, Sarmiento-Mañús R, Ponce MR, Micol JL. Next-generation forward genetic screens: using simulated data to improve the design of mapping-by-sequencing experiments in Arabidopsis. Nucleic Acids Res 2020; 47:e140. [PMID: 31544937 PMCID: PMC6868388 DOI: 10.1093/nar/gkz806] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/30/2019] [Revised: 09/07/2019] [Accepted: 09/10/2019] [Indexed: 12/25/2022] Open
Abstract
Forward genetic screens have successfully identified many genes and continue to be powerful tools for dissecting biological processes in Arabidopsis and other model species. Next-generation sequencing technologies have revolutionized the time-consuming process of identifying the mutations that cause a phenotype of interest. However, due to the cost of such mapping-by-sequencing experiments, special attention should be paid to experimental design and technical decisions so that the read data allow the desired mutation to be mapped. Here, we simulated different mapping-by-sequencing scenarios. We first evaluated which short-read technology was best suited for analyzing gene-rich genomic regions in Arabidopsis and determined the minimum sequencing depth required to confidently call single nucleotide variants. We also designed ways to discriminate mutagenesis-induced mutations from background single nucleotide polymorphisms in mutants isolated in Arabidopsis non-reference lines. In addition, we simulated bulked segregant mapping populations for identifying point mutations and monitored how the size of the mapping population and the sequencing depth affect mapping precision. Finally, we provide the computational basis of a protocol that we already used to map T-DNA insertions with paired-end Illumina-like reads, using very low sequencing depths and pooling several mutants together; this approach can also be used with single-end reads as well as to map any other insertional mutagen. All these simulations proved useful for designing experiments that allowed us to map several mutations in Arabidopsis.
Affiliation(s)
- David Wilson-Sánchez
- Instituto de Bioingeniería, Universidad Miguel Hernández, Campus de Elche, 03202 Elche, Spain
- Samuel Daniel Lup
- Instituto de Bioingeniería, Universidad Miguel Hernández, Campus de Elche, 03202 Elche, Spain
- Raquel Sarmiento-Mañús
- Instituto de Bioingeniería, Universidad Miguel Hernández, Campus de Elche, 03202 Elche, Spain
- María Rosa Ponce
- Instituto de Bioingeniería, Universidad Miguel Hernández, Campus de Elche, 03202 Elche, Spain
- José Luis Micol
- Instituto de Bioingeniería, Universidad Miguel Hernández, Campus de Elche, 03202 Elche, Spain
78
Martín-Alonso S, Álvarez M, Nevot M, Martínez MÁ, Menéndez-Arias L. Defective Strand-Displacement DNA Synthesis Due to Accumulation of Thymidine Analogue Resistance Mutations in HIV-2 Reverse Transcriptase. ACS Infect Dis 2020; 6:1140-1153. [PMID: 32129987 DOI: 10.1021/acsinfecdis.9b00512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]
Abstract
Retroviral reverse transcriptases (RTs) have the ability to carry out strand displacement DNA synthesis in the absence of accessory proteins. Although studies with RTs and other DNA polymerases suggest that fingers subdomain residues participate in strand displacement, molecular determinants of this activity are still unknown. A mutant human immunodeficiency virus type 2 (HIV-2) RT (M41L/D67N/K70R/S215Y) with low strand displacement activity was identified after screening a panel of purified enzymes, including several antiretroviral drug-resistant HIV-1 and HIV-2 RTs. In HIV-1, resistance to zidovudine and other thymidine analogues is conferred by different combinations of M41L, D67N, K70R, L210W, T215F/Y, and K219E/Q (designated as thymidine analogue resistance-associated mutations (TAMs)). However, those changes are rarely selected in HIV-2. We show that the strand displacement activity of HIV-2ROD mutants M41L/S215Y and D67N/K70R was only slightly reduced compared to the wild-type RT. In contrast, mutants D67N/K70R/S215Y and M41L/D67N/K70R/S215Y were the most defective RTs in reactions carried out with nicked and gapped substrates. Moreover, these enzymes showed the lowest nucleotide incorporation rates in assays carried out with strand displacement substrates. Unlike in HIV-2, substitutions M41L/T215Y and D67N/K70R/T215Y/K219Q had no effect on the strand displacement activity of HIV-1BH10 RT. The strand displacement efficiencies of HIV-2ROD RTs were consistent with the lower replication capacity of HIV-2 strains bearing the four major TAMs in their RT. Our results highlight the role of the fingers subdomain in strand displacement. These findings might be important for the development of strand-displacement defective RTs.
Affiliation(s)
- Samara Martín-Alonso
- Centro de Biología Molecular “Severo Ochoa” (Consejo Superior de Investigaciones Científicas and Universidad Autónoma de Madrid), c/Nicolás Cabrera 1, Campus de Cantoblanco-UAM, 28049 Madrid, Spain
- Mar Álvarez
- Centro de Biología Molecular “Severo Ochoa” (Consejo Superior de Investigaciones Científicas and Universidad Autónoma de Madrid), c/Nicolás Cabrera 1, Campus de Cantoblanco-UAM, 28049 Madrid, Spain
- María Nevot
- Laboratori de Retrovirologia, Fundació irsiCaixa, Hospital Universitari Germans Trias i Pujol, Badalona, 08916 Barcelona, Spain
- Miguel Á. Martínez
- Laboratori de Retrovirologia, Fundació irsiCaixa, Hospital Universitari Germans Trias i Pujol, Badalona, 08916 Barcelona, Spain
- Luis Menéndez-Arias
- Centro de Biología Molecular “Severo Ochoa” (Consejo Superior de Investigaciones Científicas and Universidad Autónoma de Madrid), c/Nicolás Cabrera 1, Campus de Cantoblanco-UAM, 28049 Madrid, Spain
79
Creydt M, Fischer M. Food authentication in real life: How to link nontargeted approaches with routine analytics? Electrophoresis 2020; 41:1665-1679. [PMID: 32249434 DOI: 10.1002/elps.202000030] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/31/2020] [Revised: 03/19/2020] [Accepted: 03/23/2020] [Indexed: 12/20/2022]
Abstract
In times of increasing globalization and the resulting complexity of trade flows, securing food quality is an increasing challenge. The development of analytical methods for checking the integrity and, thus, the safety of food is one of the central questions for actors from science, politics, and industry. Targeted methods, for the detection of a few selected analytes, still play the most important role in routine analysis. In the past 5 years, nontargeted methods that do not aim at individual analytes but at analyte profiles that are as comprehensive as possible have increasingly come into focus. Instead of investigating individual chemical structures, data patterns are collected, evaluated and, depending on the problem, fed into databases that can be used for further nontargeted approaches. Alternatively, individual markers can be extracted and transferred to targeted methods. Such an approach requires (i) the availability of authentic reference material, (ii) the corresponding high-resolution laboratory infrastructure, and (iii) extensive expertise in processing and storing very large amounts of data. Probably due to the requirements mentioned above, only a few methods have really established themselves in routine analysis. This review article focuses on the establishment of nontargeted methods in routine laboratories. Challenges are summarized and possible solutions are presented.
Affiliation(s)
- Marina Creydt
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Hamburg, Germany
- Markus Fischer
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Hamburg, Germany
80
Nieuwenhuis TO, Yang SY, Verma RX, Pillalamarri V, Arking DE, Rosenberg AZ, McCall MN, Halushka MK. Consistent RNA sequencing contamination in GTEx and other data sets. Nat Commun 2020; 11:1933. [PMID: 32321923 PMCID: PMC7176728 DOI: 10.1038/s41467-020-15821-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 09/19/2019] [Accepted: 03/23/2020] [Indexed: 01/15/2023] Open
Abstract
A challenge of next generation sequencing is read contamination. We use Genotype-Tissue Expression (GTEx) datasets and technical metadata along with RNA-seq datasets from other studies to understand factors that contribute to contamination. Here we report, of 48 analyzed tissues in GTEx, 26 have variant co-expression clusters of four highly expressed and pancreas-enriched genes (PRSS1, PNLIP, CLPS, and/or CELA3A). Fourteen additional highly expressed genes from other tissues also indicate contamination. Sample contamination is strongly associated with a sample being sequenced on the same day as a tissue that natively expresses those genes. Discrepant SNPs across four contaminating genes validate the contamination. Low-level contamination affects ~40% of samples and leads to numerous eQTL assignments in inappropriate tissues among these 18 genes. This type of contamination occurs widely, impacting bulk and single cell (scRNA-seq) data set analysis. In conclusion, highly expressed, tissue-enriched genes basally contaminate GTEx and other datasets impacting analyses.
Affiliation(s)
- Tim O Nieuwenhuis
- Department of Pathology, Johns Hopkins University SOM, Baltimore, MD, 21205, USA
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University SOM, Baltimore, MD, 21205, USA
- Stephanie Y Yang
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University SOM, Baltimore, MD, 21205, USA
- Rohan X Verma
- Department of Pathology, Johns Hopkins University SOM, Baltimore, MD, 21205, USA
- Vamsee Pillalamarri
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University SOM, Baltimore, MD, 21205, USA
- Dan E Arking
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University SOM, Baltimore, MD, 21205, USA
- Avi Z Rosenberg
- Department of Pathology, Johns Hopkins University SOM, Baltimore, MD, 21205, USA
- Matthew N McCall
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, 14642, USA
- Marc K Halushka
- Department of Pathology, Johns Hopkins University SOM, Baltimore, MD, 21205, USA.
81
Hammer SE, Leopold M, Prawits LM, Mair KH, Schwartz JC, Hammond JA, Ravens S, Gerner W, Saalmüller A. Development of a RACE-based RNA-Seq approach to characterize the T-cell receptor repertoire of porcine γδ T cells. DEVELOPMENTAL AND COMPARATIVE IMMUNOLOGY 2020; 105:103575. [PMID: 31846687 DOI: 10.1016/j.dci.2019.103575] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/24/2019] [Revised: 12/13/2019] [Accepted: 12/13/2019] [Indexed: 06/10/2023]
Abstract
Recent data suggest that porcine γδ T cells exhibit a degree of functional plasticity similar to that of human and murine γδ T cells. Due to the high frequency of TCR-γδ+ cells in blood and secondary lymphatic organs, the pig is an attractive model to study these cells, especially their combined features of the innate and the adaptive immune system. Using a 5' RACE-like approach, we translated a human/murine NGS library preparation strategy to capture full-length V-(D)-J TRG and TRD clonotypes in swine. After oligo(dT) primed conversion of input RNA, the cDNA population was enriched for full-length V(D)J TCR transcripts with porcine-specific primers including Illumina adaptor sequences as overhangs for Illumina MiSeq analysis. After quality control and processing by FastQC and ea-utils, porcine TRG and TRD sequences were mapped against the human IMGT reference directory. Porcine blood-derived CD2+ and CD2− TCR-γδ+ cells exhibited two distinct clonotypes Vγ11JγP1 (74.6%) and Vγ10JγP1 (57.7%), respectively. Despite the high TCR-δ diversity among CD2+ cells (39 clonotypes), both subsets shared the same abundant Vδ1DδxJδ4 clonotype at approximately identical frequencies (CD2+: 31.2%; CD2−: 37.0%). The flexible nature of this approach will facilitate the assessment of organ-specific phenotypes of γδ T cell subsets alongside their respective TCR diversity at single cell resolution.
Affiliation(s)
- Sabine E Hammer
- Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria.
- Melanie Leopold
- Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria
- Lisa-Maria Prawits
- Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria
- Kerstin H Mair
- Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria; CD Laboratory for an Optimized Prediction of Vaccination Success in Pigs, Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria
- Sarina Ravens
- Institute of Immunology, Hannover Medical School, Hannover, Germany
- Wilhelm Gerner
- Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria; CD Laboratory for an Optimized Prediction of Vaccination Success in Pigs, Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria
- Armin Saalmüller
- Institute of Immunology, Department of Pathobiology, University of Veterinary Medicine, Vienna, Austria
82
Sato MP, Ogura Y, Nakamura K, Nishida R, Gotoh Y, Hayashi M, Hisatsune J, Sugai M, Takehiko I, Hayashi T. Comparison of the sequencing bias of currently available library preparation kits for Illumina sequencing of bacterial genomes and metagenomes. DNA Res 2020; 26:391-398. [PMID: 31364694 PMCID: PMC6796507 DOI: 10.1093/dnares/dsz017] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 04/05/2019] [Accepted: 07/17/2019] [Indexed: 01/23/2023] Open
Abstract
In bacterial genome and metagenome sequencing, Illumina sequencers are most frequently used due to their high throughput capacity, and multiple library preparation kits have been developed for Illumina platforms. Here, we systematically analysed and compared the sequencing bias generated by currently available library preparation kits for Illumina sequencing. Our analyses revealed that a strong sequencing bias is introduced in low-GC regions by the Nextera XT kit. The level of bias introduced is dependent on the level of GC content; stronger bias is generated as the GC content decreases. Other analysed kits did not introduce this strong sequencing bias. The GC content-associated sequencing bias introduced by Nextera XT was more remarkable in metagenome sequencing of a mock bacterial community and seriously affected estimation of the relative abundance of low-GC species. The results of our analyses highlight the importance of selecting proper library preparation kits according to the purposes and targets of sequencing, particularly in metagenome sequencing, where a wide range of microbial species with various degrees of GC content is present. Our data also indicate that special attention should be paid to which library preparation kit was used when analysing and interpreting publicly available metagenomic data.
Affiliation(s)
- Mitsuhiko P Sato
- Department of Bacteriology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan
- Yoshitoshi Ogura
- Department of Bacteriology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan
- Keiji Nakamura
- Department of Bacteriology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan
- Ruriko Nishida
- Department of Bacteriology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan; Department of Medicine and Biosystemic Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan
- Yasuhiro Gotoh
- Department of Bacteriology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan
- Masahiro Hayashi
- Division of Anaerobe Research, Life Science Research Center, Gifu University, Gifu, Gifu, Japan; Center for Conservation of Microbial Genetic Resource, Gifu University, Gifu, Gifu, Japan
- Junzo Hisatsune
- Project Research Center for Nosocomial Infectious Diseases, Hiroshima University, Hiroshima, Hiroshima, Japan; Department of Bacteriology, Graduate School of Biomedical and Health Sciences, Hiroshima University, Hiroshima, Hiroshima, Japan; Antimicrobial Resistance Research Center, National Institute of Infectious Diseases, Tokyo, Japan
- Motoyuki Sugai
- Project Research Center for Nosocomial Infectious Diseases, Hiroshima University, Hiroshima, Hiroshima, Japan; Department of Bacteriology, Graduate School of Biomedical and Health Sciences, Hiroshima University, Hiroshima, Hiroshima, Japan; Antimicrobial Resistance Research Center, National Institute of Infectious Diseases, Tokyo, Japan
- Itoh Takehiko
- Department of Biological Information, Tokyo Institute of Technology, Tokyo, Japan
- Tetsuya Hayashi
- Department of Bacteriology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan
83
The Tempo and Mode of Angiosperm Mitochondrial Genome Divergence Inferred from Intraspecific Variation in Arabidopsis thaliana. G3-GENES GENOMES GENETICS 2020; 10:1077-1086. [PMID: 31964685 PMCID: PMC7056966 DOI: 10.1534/g3.119.401023] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/23/2022]
Abstract
The mechanisms of sequence divergence in angiosperm mitochondrial genomes have long been enigmatic. In particular, it is difficult to reconcile the rapid divergence of intergenic regions that can make non-coding sequences almost unrecognizable even among close relatives with the unusually high levels of sequence conservation found in genic regions. It has been hypothesized that different mutation and repair mechanisms act on genic and intergenic sequences or alternatively that mutational input is relatively constant but that selection has strikingly different effects on these respective regions. To test these alternative possibilities, we analyzed mtDNA divergence within Arabidopsis thaliana, including variants from the 1001 Genomes Project and changes accrued in published mutation accumulation (MA) lines. We found that base-substitution frequencies are relatively similar for intergenic regions and synonymous sites in coding regions, whereas indel and nonsynonymous substitution rates are greatly depressed in coding regions, supporting a conventional model in which mutation/repair mechanisms are consistent throughout the genome but differentially filtered by selection. Most types of sequence and structural changes were undetectable in 10-generation MA lines, but we found significant shifts in relative copy number across mtDNA regions for lines grown under stressed vs. benign conditions. We confirmed quantitative variation in copy number across the A. thaliana mitogenome using both whole-genome sequencing and droplet digital PCR, further undermining the classic but oversimplified model of a circular angiosperm mtDNA structure. Our results suggest that copy number variation is one of the most fluid features of angiosperm mitochondrial genomes.
84
Comparison of Mendeliome exome capture kits for use in clinical diagnostics. Sci Rep 2020; 10:3235. [PMID: 32094380 PMCID: PMC7039898 DOI: 10.1038/s41598-020-60215-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/16/2019] [Accepted: 02/10/2020] [Indexed: 02/06/2023] Open
Abstract
Next generation sequencing has disrupted genetic testing, allowing far more scope in the tests applied. The appropriate sections of the genome to be tested can now be readily selected, from single mutations to whole-genome sequencing. One product offering within this spectrum is focused exomes, targeting ~5,000 genes known to be implicated in human disease. These are designed to offer a flexible platform offering high diagnostic yield with a reduction in sequencing requirement compared to whole exome sequencing. Here, we have undertaken sequencing of control DNA samples and compare two kits, the Illumina TruSight One and the Agilent SureSelect Focused Exome. Characteristics of the kits are comprehensively evaluated. Despite the larger design region of the Agilent kit, we find that the Illumina kit performs better in terms of gene coverage, as well as coverage of clinically relevant loci. We provide exhaustive coverage statistics for each kit to aid the assessment of their suitability and provide read data for control DNA samples to allow for bioinformatic benchmarking by users developing pipelines for these data.
85
Browne PD, Nielsen TK, Kot W, Aggerholm A, Gilbert MTP, Puetz L, Rasmussen M, Zervas A, Hansen LH. GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms. Gigascience 2020; 9:giaa008. [PMID: 32052832 PMCID: PMC7016772 DOI: 10.1093/gigascience/giaa008] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 07/16/2019] [Revised: 11/25/2019] [Accepted: 01/14/2020] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Metagenomic sequencing is a well-established tool in the modern biosciences. While it promises unparalleled insights into the genetic content of the biological samples studied, conclusions drawn are at risk from biases inherent to the DNA sequencing methods, including inaccurate abundance estimates as a function of genomic guanine-cytosine (GC) contents. RESULTS We explored such GC biases across many commonly used platforms in experiments sequencing multiple genomes (with mean GC contents ranging from 28.9% to 62.4%) and metagenomes. GC bias profiles varied among different library preparation protocols and sequencing platforms. We found that our workflows using MiSeq and NextSeq were hindered by major GC biases, with problems becoming increasingly severe outside the 45-65% GC range, leading to a falsely low coverage in GC-rich and especially GC-poor sequences, where genomic windows with 30% GC content had >10-fold less coverage than windows close to 50% GC content. We also showed that GC content correlates tightly with coverage biases. The PacBio and HiSeq platforms also evidenced similar profiles of GC biases to each other, which were distinct from those seen in the MiSeq and NextSeq workflows. The Oxford Nanopore workflow was not afflicted by GC bias. CONCLUSIONS These findings indicate potential sources of difficulty, arising from GC biases, in genome sequencing that could be pre-emptively addressed with methodological optimizations provided that the GC biases inherent to the relevant workflow are understood. Furthermore, it is recommended that a more critical approach be taken in quantitative abundance estimates in metagenomic studies. In the future, metagenomic studies should take steps to account for the effects of GC bias before drawing conclusions, or they should use a demonstrably unbiased workflow.
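The kind of comparison this abstract describes (coverage of genomic windows grouped by their GC content) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `gc_fraction` and `coverage_by_gc_bin`, the 10-percentage-point bins, and the toy input are all assumptions made for illustration.

```python
from collections import defaultdict


def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T) in a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def coverage_by_gc_bin(windows, bin_width_pct: int = 10) -> dict:
    """Mean coverage per GC bin.

    `windows` is an iterable of (window_sequence, mean_coverage) pairs
    (a hypothetical input format). Keys of the result are the lower edge
    of each GC bin, in percent.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, coverage in windows:
        gc_pct = round(gc_fraction(seq) * 100)  # integer percent avoids float binning errors
        bin_start = (gc_pct // bin_width_pct) * bin_width_pct
        sums[bin_start] += coverage
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sums}


if __name__ == "__main__":
    toy_windows = [
        ("AAAAAAAGGG", 2.0),   # 30% GC
        ("ATATATAGCG", 4.0),   # 30% GC
        ("AAAAAGGGGG", 20.0),  # 50% GC
    ]
    print(coverage_by_gc_bin(toy_windows))
```

Dividing each bin's mean coverage by the mean of the bin nearest 50% GC would give a relative coverage profile comparable in spirit to the fold differences quoted in the abstract.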
Affiliation(s)
- Patrick Denis Browne
- Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C, 1871, Denmark
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, Roskilde, 4000, Denmark
- Tue Kjærgaard Nielsen
- Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C, 1871, Denmark
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, Roskilde, 4000, Denmark
- Witold Kot
- Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C, 1871, Denmark
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, Roskilde, 4000, Denmark
- Anni Aggerholm
- Department of Hematology, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, Aarhus N, 8200, Denmark
- M Thomas P Gilbert
- The GLOBE Institute, Faculty of Health and Biomedical Sciences, University of Copenhagen, Blegdamsvej 3B, Copenhagen N, 2200, Denmark
- Lara Puetz
- The GLOBE Institute, Faculty of Health and Biomedical Sciences, University of Copenhagen, Blegdamsvej 3B, Copenhagen N, 2200, Denmark
- Morten Rasmussen
- Department of Genetics, School of Medicine, Stanford University, 291 Campus Drive, Stanford, CA 94305-5051, USA
- Athanasios Zervas
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, Roskilde, 4000, Denmark
- Lars Hestbjerg Hansen
- Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C, 1871, Denmark
- Department of Environmental Science, Aarhus University, Frederiksborgvej 399, Roskilde, 4000, Denmark
86
Czech A. Deep sequencing of tRNA's 3'-termini sheds light on CCA-tail integrity and maturation. RNA (NEW YORK, N.Y.) 2020; 26:199-208. [PMID: 31719125 PMCID: PMC6961547 DOI: 10.1261/rna.072330.119] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/18/2019] [Accepted: 11/07/2019] [Indexed: 06/10/2023]
Abstract
The 3'-termini of tRNA are the point of amino acid linkage and thus crucial for their function in delivering amino acids to the ribosome and other enzymes. Therefore, to provide tRNA functionality, cells have to ensure the integrity of the 3'-terminal CCA-tail, which is generated during maturation by the 3'-trailer processing machinery and maintained by the CCA-adding enzyme. We developed a new tRNA sequencing method that is specifically tailored to assess the 3'-termini of E. coli tRNA. Intriguingly, we found a significant fraction of tRNAs with damaged CCA-tails under exponential growth conditions and, surprisingly, this fraction decreased upon transition into stationary phase. Interestingly, tRNAs bearing guanine as a discriminator base are generally unaffected by CCA-tail damage. In addition, we showed tRNA species-specific 3'-trailer processing patterns and reproduced in vitro findings on preferences of the maturation enzyme RNase T in vivo.
Affiliation(s)
- Andreas Czech
- Institute of Biochemistry and Molecular Biology, Chemistry Department, University of Hamburg, 20146 Hamburg, Germany
87
Milon N, Fuentes Rojas JL, Castinel A, Bigot L, Bouwmans G, Baudelle K, Boutonnet A, Gibert A, Bouchez O, Donnadieu C, Ginot F, Bancaud A. A tunable filter for high molecular weight DNA selection and linked-read sequencing. LAB ON A CHIP 2020; 20:175-184. [PMID: 31796946 DOI: 10.1039/c9lc00965e] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
In third generation sequencing, the production of quality data requires the selection of molecules longer than ∼20 kbp, but the size selection threshold of most purification technologies is smaller than this target. Here, we describe a technology operated in a capillary with a tunable selection threshold in the range of 3 to 40 kbp controlled by an electric field. We demonstrate that the selection cut-off is sharp, the purification yield is high, and the purification throughput is scalable. We also provide an analytical model that the actuation settings of the filter. The selection of high molecular weight genomic DNA from the melon Cucumis melo L., a diploid organism of ∼0.45 Gbp, is then reported. Linked-read sequencing data show that the N50 phase block size, which scores the correct representation of two chromosomes, is enhanced by a factor of 2 after size selection, establishing the relevance and versatility of our technology.
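For readers unfamiliar with the N50 statistic mentioned here, the following is a minimal Python sketch of the usual definition: the length L such that blocks of length L or longer account for at least half of the total length. The function name `n50` and the example lengths are illustrative assumptions, not values from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of block (or read/contig) lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50 is
    the length at which the running total first reaches half of the overall sum.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical phase block sizes, in kbp
    print(n50([2, 3, 4, 5, 6]))
```

Under this definition, a two-fold increase in N50 phase block size means that the block length at the halfway point of the cumulative total has doubled.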
Affiliation(s)
- Nicolas Milon
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400, Toulouse, France, and Adelis Technologies, 478 Rue de la Découverte, 31670 Labège, France
- Adrien Castinel
- INRA, US 1426 GeT-PlaGe, INRA Auzeville, F-31326, Castanet-Tolosan Cedex, France
- Laurent Bigot
- Univ. Lille, CNRS, UMR 8523 - PhLAM - Physique des Lasers Atomes et Molécules, F-59000 Lille, France
| | - Géraud Bouwmans
- Univ. Lille, CNRS, UMR 8523 - PhLAM - Physique des Lasers Atomes et Molécules, F-59000 Lille, France
| | - Karen Baudelle
- Univ. Lille, CNRS, UMR 8523 - PhLAM - Physique des Lasers Atomes et Molécules, F-59000 Lille, France
| | - Audrey Boutonnet
- Adelis Technologies, 478 Rue de la Découverte, 31670 Labège, France
| | - Audrey Gibert
- INRA, US 1426 GeT-PlaGe, INRA Auzeville, F-31326, Castanet-Tolosan Cedex, France
| | - Olivier Bouchez
- INRA, US 1426 GeT-PlaGe, INRA Auzeville, F-31326, Castanet-Tolosan Cedex, France
| | - Cécile Donnadieu
- INRA, US 1426 GeT-PlaGe, INRA Auzeville, F-31326, Castanet-Tolosan Cedex, France
| | - Frédéric Ginot
- Adelis Technologies, 478 Rue de la Découverte, 31670 Labège, France
| | - Aurélien Bancaud
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400, Toulouse, France.
| |
Collapse
|
88
|
Pereira R, Oliveira J, Sousa M. Bioinformatics and Computational Tools for Next-Generation Sequencing Analysis in Clinical Genetics. J Clin Med 2020; 9:E132. [PMID: 31947757 PMCID: PMC7019349 DOI: 10.3390/jcm9010132] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 11/18/2019] [Revised: 12/15/2019] [Accepted: 12/30/2019] [Indexed: 12/13/2022] Open
Abstract
Clinical genetics has an important role in the healthcare system in providing a definitive diagnosis for many rare syndromes. It can also influence genetic prevention, disease prognosis and the selection of the best care/treatment options for patients. Next-generation sequencing (NGS) has transformed clinical genetics, making it possible to analyze hundreds of genes at an unprecedented speed and at a lower price compared with conventional Sanger sequencing. Despite the growing literature concerning NGS in a clinical setting, this review aims to fill the gap that exists among (bio)informaticians, molecular geneticists and clinicians, by presenting a general overview of the NGS technology and workflow. First, we will review the current NGS platforms, focusing on the two main platforms, Illumina and Ion Torrent, and discussing the major strong points and weaknesses intrinsic to each platform. Next, the NGS analytical bioinformatic pipelines are dissected, giving some emphasis to the algorithms commonly used to generate and process data and to analyze sequence variants. Finally, the main challenges around NGS bioinformatics are placed in perspective for future developments. Even with the huge achievements made in NGS technology and bioinformatics, further improvements in bioinformatic algorithms are still required to deal with complex and genetically heterogeneous disorders.
Affiliation(s)
- Rute Pereira
- Laboratory of Cell Biology, Department of Microscopy, Institute of Biomedical Sciences Abel Salazar (ICBAS), University of Porto (UP), 4050-313 Porto, Portugal
- Biology and Genetics of Reproduction Unit, Multidisciplinary Unit for Biomedical Research (UMIB), ICBAS-UP, 4050-313 Porto, Portugal
- Jorge Oliveira
- Biology and Genetics of Reproduction Unit, Multidisciplinary Unit for Biomedical Research (UMIB), ICBAS-UP, 4050-313 Porto, Portugal
- UnIGENe and CGPP–Centre for Predictive and Preventive Genetics-Institute for Molecular and Cell Biology (IBMC), i3S-Institute for Research and Innovation in Health-UP, 4200-135 Porto, Portugal
- Mário Sousa
- Laboratory of Cell Biology, Department of Microscopy, Institute of Biomedical Sciences Abel Salazar (ICBAS), University of Porto (UP), 4050-313 Porto, Portugal
- Biology and Genetics of Reproduction Unit, Multidisciplinary Unit for Biomedical Research (UMIB), ICBAS-UP, 4050-313 Porto, Portugal
89
Favero VO, Carvalho RH, Motta VM, Leite ABC, Coelho MRR, Xavier GR, Rumjanek NG, Urquiaga S. Bradyrhizobium as the Only Rhizobial Inhabitant of Mung Bean (Vigna radiata) Nodules in Tropical Soils: A Strategy Based on Microbiome for Improving Biological Nitrogen Fixation Using Bio-Products. Front Plant Sci 2020; 11:602645. [PMID: 33510747 PMCID: PMC7835340 DOI: 10.3389/fpls.2020.602645] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/03/2020] [Accepted: 12/14/2020] [Indexed: 05/07/2023]
Abstract
The mung bean has great potential under tropical conditions given its high grain protein content. Additionally, its ability to benefit from biological nitrogen fixation (BNF) through association with native rhizobia inhabiting the nodule microbiome gives it a large degree of independence from nitrogen fertilizers. Soil microbial communities, which are influenced by biogeographical factors and soil properties, represent a source of rhizobacteria capable of stimulating plant growth. The objective of this study is to support the selection of beneficial bacteria that form positive interactions with mung bean plants cultivated in tropical soils, as part of a seed inoculation program for increasing grain yield based on BNF and other mechanisms. Two mung bean genotypes (Camaleão and Esmeralda) were cultivated in 10 soil samples. The nodule microbiome was characterized by next-generation sequencing (Illumina MiSeq 16S rRNA sequencing). More than 99% of nodule sequences showed similarity with the genus Bradyrhizobium, the only rhizobial genus present in nodules in our study. Higher bacterial diversity of soil samples collected in agribusiness areas (MW_MT-I, II or III) was associated with the Esmeralda genotype, while an organic agroecosystem soil sample (SE_RJ-V) showed the highest bacterial diversity independent of genotype. Furthermore, OTUs close to Bradyrhizobium elkanii dominated in all soil samples except the sample from the organic agroecosystem, where only B. japonicum was present. The bacterial community of mung bean nodules is mainly influenced by soil pH, K, Ca, and P. A difference between the two genotypes in nodule colonization by OTUs close to the genus Pseudomonas was also detected. Although they represented a small fraction, around 0.1% of the total, Pseudomonas OTUs were retrieved only from nodules of the Esmeralda genotype, suggesting a different trait regarding specificity between macro- and micro-symbionts.
The microbiome analysis will guide the next steps in the development of an inoculant for mung bean aiming to promote plant growth and grain yield, composed either of an efficient Bradyrhizobium strain on its own or of one co-inoculated with a Pseudomonas strain. Considering the results achieved, the assessment of microbial ecology parameters is a potent coadjuvant capable of accelerating the inoculant development process and improving the benefits that soil microorganisms provide to the crop.
Affiliation(s)
- Norma Gouvêa Rumjanek
- Embrapa Agrobiology, Seropédica, Rio de Janeiro, Brazil
- *Correspondence: Norma Gouvêa Rumjanek
90
Wang TT, Abelson S, Zou J, Li T, Zhao Z, Dick JE, Shlush LI, Pugh TJ, Bratman SV. High efficiency error suppression for accurate detection of low-frequency variants. Nucleic Acids Res 2019; 47:e87. [PMID: 31127310 PMCID: PMC6735726 DOI: 10.1093/nar/gkz474] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/23/2018] [Revised: 04/28/2019] [Accepted: 05/16/2019] [Indexed: 12/30/2022] Open
Abstract
Detection of cancer-associated somatic mutations has broad applications for oncology and precision medicine. However, this becomes challenging when cancer-derived DNA is in low abundance, such as in impure tissue specimens or in circulating cell-free DNA. Next-generation sequencing (NGS) is particularly prone to technical artefacts that can limit the accuracy for calling low-allele-frequency mutations. State-of-the-art methods to improve detection of low-frequency mutations often employ unique molecular identifiers (UMIs) for error suppression; however, these methods are highly inefficient as they depend on redundant sequencing to assemble consensus sequences. Here, we present a novel strategy to enhance the efficiency of UMI-based error suppression by retaining single reads (singletons) that can participate in consensus assembly. This 'Singleton Correction' methodology outperformed other UMI-based strategies in efficiency, leading to greater sensitivity with high specificity in a cell line dilution series. Significant benefits were seen with Singleton Correction at sequencing depths ≤16 000×. We validated the utility and generalizability of this approach in a cohort of >300 individuals whose peripheral blood DNA was subjected to hybrid capture sequencing at ∼5000× depth. Singleton Correction can be incorporated into existing UMI-based error suppression workflows to boost mutation detection accuracy, thus improving the cost-effectiveness and clinical impact of NGS.
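To make the efficiency problem described in this abstract concrete, the following sketch groups reads by UMI, builds a majority-vote consensus for families with more than one read, and sets aside singletons, which a conventional consensus-only workflow would discard. It is a generic illustration, not the authors' Singleton Correction method: the abstract does not specify how singletons are rescued, so no rescue step is shown. The names `consensus` and `collapse_by_umi` and the toy reads are hypothetical, and reads are assumed to be already aligned and of equal length.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Majority base at each position across equal-length reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*seqs)
    )


def collapse_by_umi(reads):
    """Split (umi, sequence) pairs into consensus families and singletons.

    Families with two or more reads are collapsed to a consensus sequence;
    families with exactly one read are returned separately so the caller
    can decide whether to discard or try to rescue them.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensuses, singletons = {}, {}
    for umi, seqs in families.items():
        if len(seqs) > 1:
            consensuses[umi] = consensus(seqs)
        else:
            singletons[umi] = seqs[0]
    return consensuses, singletons


# Toy input: three reads share one UMI (one carries an error), one is alone.
reads = [
    ("AAT", "ACGT"),
    ("AAT", "ACGT"),
    ("AAT", "ACCT"),  # sequencing error at position 3, outvoted by the family
    ("GGC", "TTGA"),  # singleton: lost if only multi-read families are kept
]
consensuses, singletons = collapse_by_umi(reads)
discarded_fraction = len(singletons) / len(reads)
```

In this toy input a quarter of the reads would be thrown away by a consensus-only workflow, which is the kind of loss that retaining singletons is meant to reduce.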
Affiliation(s)
- Ting Ting Wang
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Sagi Abelson
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Jinfeng Zou
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Tiantian Li
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Zhen Zhao
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- John E Dick
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Liran I Shlush
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Trevor J Pugh
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Scott V Bratman
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Radiation Oncology, University of Toronto, Toronto, Ontario, Canada
91
Rapid, multiplexed, whole genome and plasmid sequencing of foodborne pathogens using long-read nanopore technology. Sci Rep 2019; 9:16350. [PMID: 31704961 PMCID: PMC6841976 DOI: 10.1038/s41598-019-52424-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 02/24/2019] [Accepted: 10/16/2019] [Indexed: 12/11/2022] Open
Abstract
U.S. public health agencies have employed next-generation sequencing (NGS) as a tool to quickly identify foodborne pathogens during outbreaks. Although established short-read NGS technologies are known to provide highly accurate data, long-read sequencing is still needed to resolve highly repetitive genomic regions and genomic arrangement, and to close the sequences of bacterial chromosomes and plasmids. Here, we report the use of long-read nanopore sequencing to simultaneously sequence the entire chromosome and plasmid of Salmonella enterica subsp. enterica serovar Bareilly and Escherichia coli O157:H7. We developed a rapid and random sequencing approach coupled with de novo genome assembly within a customized data analysis workflow that uses publicly available tools. In sequencing runs as short as four hours, using the MinION instrument, we obtained full-length genomes with an average identity of 99.87% for Salmonella Bareilly and 99.89% for E. coli in comparison to the respective MiSeq references. These nanopore-only assemblies provided readily available information on serotype, virulence factors, and antimicrobial resistance genes. We also demonstrate the potential of nanopore sequencing assemblies for rapid preliminary phylogenetic inference. Nanopore sequencing provides additional advantages, such as very low capital investment and footprint, and a shorter turnaround time (10 hours for library preparation and sequencing) compared to other NGS technologies.
92
Sessegolo C, Cruaud C, Da Silva C, Cologne A, Dubarry M, Derrien T, Lacroix V, Aury JM. Transcriptome profiling of mouse samples using nanopore sequencing of cDNA and RNA molecules. Sci Rep 2019; 9:14908. [PMID: 31624302 PMCID: PMC6797730 DOI: 10.1038/s41598-019-51470-9] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 03/20/2019] [Accepted: 09/28/2019] [Indexed: 01/27/2023] Open
Abstract
Our vision of DNA transcription and splicing has changed dramatically with the introduction of short-read sequencing. These high-throughput sequencing technologies promised to unravel the complexity of any transcriptome. Generally, gene expression levels are well captured using these technologies, but caveats remain due to the limited read length and the fact that RNA molecules have to be reverse transcribed before sequencing. Oxford Nanopore Technologies has recently launched a portable sequencer which offers the possibility of sequencing long reads and, most importantly, RNA molecules. Here we generated a full mouse transcriptome from brain and liver using the Oxford Nanopore device. As a comparison, we sequenced RNA (RNA-Seq) and cDNA (cDNA-Seq) molecules using both long- and short-read technologies and tested the TeloPrime preparation kit, dedicated to the enrichment of full-length transcripts. Using spike-in data, we confirmed that expression levels are efficiently captured by cDNA-Seq using short reads. More importantly, Oxford Nanopore RNA-Seq tends to be more efficient, while cDNA-Seq appears to be more biased. We further show that the cDNA library preparation of the Nanopore protocol induces read truncation for transcripts containing internal runs of T’s. This bias is marked for runs of at least 15 T’s, but is already detectable for runs of at least 9 T’s and therefore concerns more than 20% of expressed transcripts in mouse brain and liver. Finally, we outline the bioinformatics challenges that remain for quantification at the transcript level, especially when reads are not full-length. Accurate quantification of repeat-associated genes such as processed pseudogenes also remains difficult, and we show that current mapping protocols which map reads to the genome largely over-estimate their expression, at the expense of their parent gene.
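The two thresholds quoted in this abstract (runs of at least 9 and at least 15 T's) lend themselves to a simple screen of transcript sequences. The sketch below is an assumption-laden illustration rather than the authors' pipeline: the function names `longest_t_run` and `truncation_risk` and the category labels are hypothetical, and the code does not distinguish internal runs from runs at the transcript ends.

```python
import re


def longest_t_run(sequence):
    """Length of the longest uninterrupted run of T (or U) in a sequence."""
    normalized = sequence.upper().replace("U", "T")
    runs = re.findall(r"T+", normalized)
    return max((len(run) for run in runs), default=0)


def truncation_risk(sequence, detectable=9, marked=15):
    """Label a transcript using the run-length thresholds from the abstract.

    Returns "marked" for a run of at least `marked` T's, "detectable" for a
    run of at least `detectable` T's, and "none" otherwise.
    """
    run_length = longest_t_run(sequence)
    if run_length >= marked:
        return "marked"
    if run_length >= detectable:
        return "detectable"
    return "none"


# Hypothetical transcript fragments, for illustration only.
examples = {
    "short_run": "ACGTTTACG",
    "nine_t": "ACG" + "T" * 9 + "GCA",
    "fifteen_t": "ACG" + "T" * 15 + "GCA",
}
labels = {name: truncation_risk(seq) for name, seq in examples.items()}
```

Applied to a transcript FASTA, the share of sequences labelled "detectable" or "marked" would give a rough count of transcripts that might be affected by the truncation bias described in the abstract.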
Affiliation(s)
- Camille Sessegolo
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France; EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Corinne Cruaud
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Corinne Da Silva
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Audric Cologne
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France; EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Marion Dubarry
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
- Thomas Derrien
- Univ Rennes, CNRS, IGDR (Institut de génétique et développement de Rennes) - UMR 6290, F-35000, Rennes, France
- Vincent Lacroix
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France; EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
- Jean-Marc Aury
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
93
Komarova N, Kuznetsov A. Inside the Black Box: What Makes SELEX Better? Molecules 2019; 24:E3598. [PMID: 31591283 PMCID: PMC6804172 DOI: 10.3390/molecules24193598] [Citation(s) in RCA: 71] [Impact Index Per Article: 14.2] [Received: 09/13/2019] [Revised: 10/04/2019] [Accepted: 10/04/2019] [Indexed: 02/07/2023] Open
Abstract
Aptamers are small oligonucleotides that are capable of binding specifically to a target, with impressive potential for analysis, diagnostics, and therapeutics applications. Aptamers are isolated from large nucleic acid combinatorial libraries using an iterative selection process called SELEX (Systematic Evolution of Ligands by EXponential enrichment). Since being implemented 30 years ago, the SELEX protocol has undergone many modifications and improvements, but it remains a laborious, time-consuming, and costly method, and the results are not always successful. Each step in the aptamer selection protocol can influence its results. This review discusses key technical points of the SELEX procedure and their influence on the outcome of aptamer selection.
Affiliation(s)
- Natalia Komarova
- Scientific-Manufacturing Complex Technological Centre, 1-7 Shokin Square, Zelenograd, Moscow 124498, Russia
- Alexander Kuznetsov
- Scientific-Manufacturing Complex Technological Centre, 1-7 Shokin Square, Zelenograd, Moscow 124498, Russia
94
Santoro AE, Kellom M, Laperriere SM. Contributions of single-cell genomics to our understanding of planktonic marine archaea. Philos Trans R Soc Lond B Biol Sci 2019; 374:20190096. [PMID: 31587640 DOI: 10.1098/rstb.2019.0096] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/19/2022] Open
Abstract
Single-cell genomics has transformed many fields of biology, marine microbiology included. Here, we consider the impact of single-cell genomics on a specific group of marine microbes: the planktonic marine archaea. Despite single-cell enabled discoveries of novel metabolic function in the marine thaumarchaea, population-level investigations are hindered by an overall lower than expected recovery of thaumarchaea in single-cell studies. Metagenome-assembled genomes have so far been a more useful method for accessing genome-resolved insights into the Marine Group II euryarchaea. Future progress in the application of single-cell genomics to archaeal biology in the ocean would benefit from more targeted sorting approaches, and a more systematic investigation of potential biases against archaea in single-cell workflows including cell lysis, genome amplification and genome screening. This article is part of a discussion meeting issue 'Single cell ecology'.
Affiliation(s)
- A E Santoro
- Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, CA 93106-9620, USA
- M Kellom
- Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, CA 93106-9620, USA
- S M Laperriere
- Department of Ecology, Evolution, and Marine Biology, University of California, Santa Barbara, CA 93106-9620, USA
95
Euclide PT, McKinney GJ, Bootsma M, Tarsa C, Meek MH, Larson WA. Attack of the PCR clones: Rates of clonality have little effect on RAD‐seq genotype calls. Mol Ecol Resour 2019; 20:66-78. [DOI: 10.1111/1755-0998.13087] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/02/2019] [Revised: 08/12/2019] [Accepted: 08/16/2019] [Indexed: 12/11/2022]
Affiliation(s)
- Peter T. Euclide
- Wisconsin Cooperative Fishery Research Unit, College of Natural Resources, University of Wisconsin‐Stevens Point, Stevens Point, WI, USA
- Garrett J. McKinney
- School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA
- Matthew Bootsma
- Wisconsin Cooperative Fishery Research Unit, College of Natural Resources, University of Wisconsin‐Stevens Point, Stevens Point, WI, USA
- Charlene Tarsa
- Department of Integrative Biology and AgBio Research, Michigan State University, East Lansing, MI, USA
- Mariah H. Meek
- Department of Integrative Biology and AgBio Research, Michigan State University, East Lansing, MI, USA
- Wesley A. Larson
- U.S. Geological Survey, Wisconsin Cooperative Fishery Research Unit, College of Natural Resources, University of Wisconsin‐Stevens Point, Stevens Point, WI, USA
96
Lightbody G, Haberland V, Browne F, Taggart L, Zheng H, Parkes E, Blayney JK. Review of applications of high-throughput sequencing in personalized medicine: barriers and facilitators of future progress in research and clinical application. Brief Bioinform 2019; 20:1795-1811. [PMID: 30084865 PMCID: PMC6917217 DOI: 10.1093/bib/bby051] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 01/30/2018] [Revised: 05/01/2018] [Indexed: 12/28/2022] Open
Abstract
There has been an exponential growth in the performance and output of sequencing technologies (omics data), with full genome sequencing now producing gigabases of reads on a daily basis. These data may hold the promise of personalized medicine, leading to routinely available sequencing tests that can guide patient treatment decisions. In the era of high-throughput sequencing (HTS), computational considerations, data governance and clinical translation are the greatest rate-limiting steps. To ensure that the analysis, management and interpretation of such extensive omics data is exploited to its full potential, key factors, including sample sourcing, technology selection and computational expertise and resources, need to be considered, leading to an integrated set of high-performance tools and systems. This article provides an up-to-date overview of the evolution of HTS and the accompanying tools, infrastructure and data management approaches that are emerging in this space, which, if used within a multidisciplinary context, may ultimately facilitate the development of personalized medicine.
Affiliation(s)
- Gaye Lightbody
- School of Computing, Ulster University, Newtownabbey, UK
- Valeriia Haberland
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Fiona Browne
- School of Computing, Ulster University, Newtownabbey, UK
- Huiru Zheng
- School of Computing, Ulster University, Newtownabbey, UK
- Eileen Parkes
- Centre for Cancer Research & Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University, Belfast, UK
- Jaine K Blayney
- Centre for Cancer Research & Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University, Belfast, UK
97
Andrusch A, Dabrowski PW, Klenner J, Tausch SH, Kohl C, Osman AA, Renard BY, Nitsche A. PAIPline: pathogen identification in metagenomic and clinical next generation sequencing samples. Bioinformatics 2019; 34:i715-i721. [PMID: 30423069 PMCID: PMC6129269 DOI: 10.1093/bioinformatics/bty595] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 01/11/2023] Open
Abstract
Motivation: Next generation sequencing (NGS) has provided researchers with a powerful tool to characterize metagenomic and clinical samples in research and diagnostic settings. NGS allows an open view into samples, useful for pathogen detection in an unbiased fashion and without prior hypothesis about possible causative agents. However, NGS datasets for pathogen detection come with different obstacles, such as a very unfavorable ratio of pathogen to host reads. In addition to frequent false positives and irrelevant organisms, such as contaminants, tools are often challenged by samples with low pathogen loads and might not report organisms present below a certain threshold. Furthermore, some metagenomic profiling tools focus on only one particular set of pathogens, for example bacteria. Results: We present PAIPline, a bioinformatics pipeline specifically designed to address problems associated with detecting pathogens in diagnostic samples. PAIPline particularly focuses on user-friendliness and encapsulates all necessary steps, from preprocessing to resolution of ambiguous reads and filtering up to visualization, in a single tool. In contrast to existing tools, PAIPline is more specific while maintaining sensitivity. This is shown in a comparative evaluation where PAIPline was benchmarked alongside other well-known metagenomic profiling tools on previously published well-characterized datasets. Additionally, as part of an international cooperation project, PAIPline was applied to an outbreak sample of hemorrhagic fevers of then unknown etiology. The presented results show that PAIPline can serve as a robust, reliable, user-friendly, adaptable and generalizable stand-alone software for diagnostics from NGS samples and as a stepping stone for further downstream analyses. Availability and implementation: PAIPline is freely available under https://gitlab.com/rki_bioinformatics/paipline.
Affiliation(s)
- Andreas Andrusch
- Highly Pathogenic Viruses (ZBS1), Robert Koch Institute, Berlin, Germany
- Jeanette Klenner
- Highly Pathogenic Viruses (ZBS1), Robert Koch Institute, Berlin, Germany
- Simon H Tausch
- Highly Pathogenic Viruses (ZBS1), Robert Koch Institute, Berlin, Germany
- Claudia Kohl
- Highly Pathogenic Viruses (ZBS1), Robert Koch Institute, Berlin, Germany
- Andreas Nitsche
- Highly Pathogenic Viruses (ZBS1), Robert Koch Institute, Berlin, Germany
98
Directional high-throughput sequencing of RNAs without gene-specific primers. Biotechniques 2019; 65:219-223. [PMID: 30284935 DOI: 10.2144/btn-2018-0082] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/23/2022] Open
Abstract
Ribosomal RNA analysis is a useful tool for characterization of microbial communities. However, the lack of broad-range primers has hampered the simultaneous analysis of eukaryotic and prokaryotic members by amplicon sequencing. We present a complete workflow for directional, primer-independent sequencing of size-selected small subunit ribosomal RNA fragments. The library preparation protocol includes gel extraction of the target RNA, ligation of an RNA oligo to the 5'-end of the target, and cDNA synthesis with a tailed random-hexamer primer and further barcoding. The sequencing results of a phytoplankton mock community showed a highly similar profile to the biomass indicators. This method has universal potential for microbiome studies, and is compatible with the 5'-end sequencing of other RNA types with minimum library preparation costs.
99
McGirr JA, Martin CH. Hybrid gene misregulation in multiple developing tissues within a recent adaptive radiation of Cyprinodon pupfishes. PLoS One 2019; 14:e0218899. [PMID: 31291291 PMCID: PMC6619667 DOI: 10.1371/journal.pone.0218899] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/28/2019] [Accepted: 06/11/2019] [Indexed: 12/24/2022] Open
Abstract
Genetic incompatibilities constitute the final stages of reproductive isolation and speciation, but little is known about incompatibilities that occur within recent adaptive radiations among closely related diverging populations. Crossing divergent species to form hybrids can break up coadapted variation, resulting in genetic incompatibilities within developmental networks shaping divergent adaptive traits. We crossed two closely related sympatric Cyprinodon pupfish species (a dietary generalist and a specialized molluscivore) and measured expression levels in their F1 hybrids to identify regulatory variation underlying the novel craniofacial morphology found in this recent microendemic adaptive radiation. We extracted mRNA from eight-day-old whole-larvae tissue and from craniofacial tissues dissected from 17–20-day-old larvae to compare gene expression between a total of seven F1 hybrids and 24 individuals from parental species populations. We found 3.9% of genes differentially expressed between generalists and molluscivores in whole-larvae tissues and 0.6% of genes differentially expressed in craniofacial tissue. We found that 2.1% of genes were misregulated in whole-larvae hybrids whereas 19.1% of genes were misregulated in hybrid craniofacial tissues, after correcting for sequencing biases. We also measured allele-specific expression across 15,429 heterozygous sites to identify putative compensatory regulatory mechanisms underlying differential expression between generalists and molluscivores. Together, our results highlight the importance of considering misregulation as an early indicator of genetic incompatibilities in the context of rapidly diverging adaptive radiations and suggest that compensatory regulatory divergence drives hybrid gene misregulation in developing tissues that give rise to novel craniofacial traits.
Affiliation(s)
- Joseph A. McGirr
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Christopher H. Martin
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, California, United States of America
100
Haynes E, Jimenez E, Pardo MA, Helyar SJ. The future of NGS (Next Generation Sequencing) analysis in testing food authenticity. Food Control 2019. [DOI: 10.1016/j.foodcont.2019.02.010] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Indexed: 02/06/2023]