1
Namias A, Sahlin K, Makoundou P, Bonnici I, Sicard M, Belkhir K, Weill M. Nanopore sequencing of PCR products enables multicopy gene family reconstruction. Comput Struct Biotechnol J 2023; 21:3656-3664. [PMID: 37533804 PMCID: PMC10393513 DOI: 10.1016/j.csbj.2023.07.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/08/2023] [Revised: 07/05/2023] [Accepted: 07/11/2023] [Indexed: 08/04/2023]
Abstract
The importance of gene amplifications in evolution is more and more recognized. Yet, tools to study multi-copy gene families are still scarce, and many such families are overlooked using common sequencing methods. Haplotype reconstruction is even harder for polymorphic multi-copy gene families. Here, we show that all variants (or haplotypes) of a multi-copy gene family present in a single genome, can be obtained using Oxford Nanopore Technologies sequencing of PCR products, followed by steps of mapping, SNP calling and haplotyping. As a proof of concept, we acquired the sequences of highly similar variants of the cidA and cidB genes present in the genome of the Wolbachia wPip, a bacterium infecting Culex pipiens mosquitoes. Our method relies on a wide database of cid genes, previously acquired by cloning and Sanger sequencing. We addressed problems commonly faced when using mapping approaches for multi-copy gene families with highly similar variants. In addition, we confirmed that PCR amplification causes frequent chimeras which have to be carefully considered when working on families of recombinant genes. We tested the robustness of the method using a combination of bioinformatics (read simulations) and molecular biology approaches (sequence acquisitions through cloning and Sanger sequencing, specific PCRs and digital droplet PCR). When different haplotypes present within a single genome cannot be reconstructed from short reads sequencing, this pipeline confers a high throughput acquisition, gives reliable results as well as insights of the relative copy numbers of the different variants.
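The abstract notes that the pipeline gives insight into the relative copy numbers of the variants. As a rough illustration only, and not the authors' implementation, the sketch below assumes that each read has already been assigned to one variant by the mapping and haplotyping steps; the function name `relative_copy_numbers` and the variant labels are hypothetical.

```python
from collections import Counter


def relative_copy_numbers(read_assignments):
    """Return each variant's share of the reads that were assigned to a variant.

    `read_assignments` is a hypothetical list with one label per read;
    None marks a read that could not be assigned.
    """
    counts = Counter(label for label in read_assignments if label is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {variant: n / total for variant, n in counts.items()}


# Hypothetical per-read haplotype labels (illustrative names, not real data).
example_reads = ["cidA_v1", "cidA_v1", "cidA_v2", None, "cidA_v1", "cidA_v2"]
shares = relative_copy_numbers(example_reads)
```

Read shares of this kind are at best a proxy for copy number, since PCR amplification can favour some templates and produce chimeras, which is presumably why the abstract also mentions digital droplet PCR among the validation approaches.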
Affiliation(s)
- Alice Namias
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Kristoffer Sahlin
  - Department of Mathematics, Science for Life Laboratory, Stockholm University, 10691 Stockholm, Sweden
- Patrick Makoundou
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Iago Bonnici
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Mathieu Sicard
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Khalid Belkhir
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Mylène Weill
  - ISEM, Université de Montpellier, CNRS, IRD, EPHE, Montpellier, France
2
Dadonaite B, Crawford KHD, Radford CE, Farrell AG, Yu TC, Hannon WW, Zhou P, Andrabi R, Burton DR, Liu L, Ho DD, Chu HY, Neher RA, Bloom JD. A pseudovirus system enables deep mutational scanning of the full SARS-CoV-2 spike. Cell 2023; 186:1263-1278.e20. [PMID: 36868218 PMCID: PMC9922669 DOI: 10.1016/j.cell.2023.02.001] [Citation(s) in RCA: 61] [Impact Index Per Article: 61.0] [Received: 10/13/2022] [Revised: 01/11/2023] [Accepted: 01/31/2023] [Indexed: 02/15/2023]
Abstract
A major challenge in understanding SARS-CoV-2 evolution is interpreting the antigenic and functional effects of emerging mutations in the viral spike protein. Here, we describe a deep mutational scanning platform based on non-replicative pseudotyped lentiviruses that directly quantifies how large numbers of spike mutations impact antibody neutralization and pseudovirus infection. We apply this platform to produce libraries of the Omicron BA.1 and Delta spikes. These libraries each contain ∼7,000 distinct amino acid mutations in the context of up to ∼135,000 unique mutation combinations. We use these libraries to map escape mutations from neutralizing antibodies targeting the receptor-binding domain, N-terminal domain, and S2 subunit of spike. Overall, this work establishes a high-throughput and safe approach to measure how ∼10^5 combinations of mutations affect antibody neutralization and spike-mediated infection. Notably, the platform described here can be extended to the entry proteins of many other viruses.
Affiliation(s)
- Bernadeta Dadonaite
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Katharine H D Crawford
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, WA 98109, USA
- Caelan E Radford
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, WA 98109, USA
- Ariana G Farrell
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Timothy C Yu
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, WA 98109, USA
- William W Hannon
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, WA 98109, USA
- Panpan Zhou
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
- Raiees Andrabi
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
- Dennis R Burton
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA; Ragon Institute of Massachusetts General Hospital, MIT & Harvard, Cambridge, MA 02139, USA
- Lihong Liu
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
- David D Ho
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA; Department of Microbiology and Immunology, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA; Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA
- Helen Y Chu
  - University of Washington, Department of Medicine, Division of Allergy and Infectious Diseases, Seattle, WA, USA
- Richard A Neher
  - Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Jesse D Bloom
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98195, USA
3
Dadonaite B, Crawford KHD, Radford CE, Farrell AG, Yu TC, Hannon WW, Zhou P, Andrabi R, Burton DR, Liu L, Ho DD, Neher RA, Bloom JD. A pseudovirus system enables deep mutational scanning of the full SARS-CoV-2 spike. bioRxiv 2022:2022.10.13.512056. [PMID: 36263061 PMCID: PMC9580381 DOI: 10.1101/2022.10.13.512056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
A major challenge in understanding SARS-CoV-2 evolution is interpreting the antigenic and functional effects of emerging mutations in the viral spike protein. Here we describe a new deep mutational scanning platform based on non-replicative pseudotyped lentiviruses that directly quantifies how large numbers of spike mutations impact antibody neutralization and pseudovirus infection. We demonstrate this new platform by making libraries of the Omicron BA.1 and Delta spikes. These libraries each contain ~7000 distinct amino-acid mutations in the context of up to ~135,000 unique mutation combinations. We use these libraries to map escape mutations from neutralizing antibodies targeting the receptor binding domain, N-terminal domain, and S2 subunit of spike. Overall, this work establishes a high-throughput and safe approach to measure how ~10^5 combinations of mutations affect antibody neutralization and spike-mediated infection. Notably, the platform described here can be extended to the entry proteins of many other viruses.
Affiliation(s)
- Bernadeta Dadonaite
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
- Katharine H D Crawford
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
  - Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, Washington, 98109, USA
- Caelan E Radford
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
  - Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA
- Ariana G Farrell
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
- Timothy C Yu
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
  - Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA
- William W Hannon
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
  - Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA
- Panpan Zhou
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
- Raiees Andrabi
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
- Dennis R Burton
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
  - IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA
  - Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
  - Ragon Institute of MGH, MIT & Harvard, Cambridge, MA 02139, USA
- Lihong Liu
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
- David D. Ho
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
  - Department of Microbiology and Immunology, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA
  - Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA
- Richard A. Neher
  - Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Jesse D Bloom
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
  - Howard Hughes Medical Institute, Seattle, WA, 98195, USA
4
Thippeshappa R, Polacino P, Chandrasekar SS, Truong K, Misra A, Aulicino PC, Hu SL, Kaushal D, Kimata JT. In vivo Serial Passaging of Human-Simian Immunodeficiency Virus Clones Identifies Characteristics for Persistent Viral Replication. Front Microbiol 2021; 12:779460. [PMID: 34867922 PMCID: PMC8636705 DOI: 10.3389/fmicb.2021.779460] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/18/2021] [Accepted: 10/22/2021] [Indexed: 12/11/2022]
Abstract
We previously reported that a human immunodeficiency virus type 1 with a simian immunodeficiency virus vif substitution (HSIV-vifNL4-3) could replicate in pigtailed macaques (PTMs), demonstrating that Vif is a species-specific tropism factor of primate lentiviruses. However, infections did not result in high-peak viremia or setpoint plasma viral loads, as observed during simian immunodeficiency virus (SIV) infection of PTMs. Here, we characterized variants isolated from one of the original infected animals with CD4 depletion after nearly 4 years of infection to identify determinants of increased replication fitness. In our studies, we found that the HSIV-vif clones did not express the HIV-1 Vpr protein due to interference from the vpx open reading frame (ORF) in singly spliced vpr mRNA. To examine whether these viral genes contribute to persistent viral replication, we generated infectious HSIV-vif clones expressing either the HIV-1 Vpr or SIV Vpx protein. And then to determine viral fitness determinants of HSIV-vif, we conducted three rounds of serial in vivo passaging in PTMs, starting with an initial inoculum containing a mixture of CXCR4-tropic [Vpr-HSIV-vifNL4-3 isolated at 196 (C/196) and 200 (C/200) weeks post-infection from a PTM with depressed CD4 counts] and CCR5-tropic HSIV (Vpr+ HSIV-vif derivatives based NL-AD8 and Bru-Yu2 and a Vpx expressing HSIV-vifYu2). Interestingly, all infected PTMs showed peak plasma viremia close to or above 10^5 copies/ml and persistent viral replication for more than 20 weeks. Infectious molecular clones (IMCs) recovered from the passage 3 PTM (HSIV-P3 IMCs) included mutations required for HIV-1 Vpr expression and those mutations encoded by the CXCR4-tropic HSIV-vifNL4-3 isolate C/196. The data indicate that the viruses selected during long-term infection acquired HIV-1 Vpr expression, suggesting the importance of Vpr for in vivo pathogenesis. Further passaging of HSIV-P3 IMCs in vivo may generate pathogenic variants with higher replication capacity, which will be a valuable resource as challenge virus in vaccine and cure studies.
Affiliation(s)
- Rajesh Thippeshappa
  - Disease Intervention and Prevention Program, Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, United States
- Patricia Polacino
  - Washington National Primate Research Center, University of Washington, Seattle, WA, United States
- Shaswath S Chandrasekar
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, United States
- Khanghy Truong
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, United States
- Anisha Misra
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, United States
- Paula C Aulicino
  - Laboratorio de Biología Celular y Retrovirus, Hospital de Pediatría "Juan P. Garrahan"-CONICET, Buenos Aires, Argentina
- Shiu-Lok Hu
  - Washington National Primate Research Center, University of Washington, Seattle, WA, United States; Department of Pharmaceutics, University of Washington, Seattle, WA, United States
- Deepak Kaushal
  - Host-Pathogen Interactions Program, Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, United States
- Jason T Kimata
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, United States
5
Cheng Y, Grueber C, Hogg CJ, Belov K. Improved high-throughput MHC typing for non-model species using long-read sequencing. Mol Ecol Resour 2021; 22:862-876. [PMID: 34551192 PMCID: PMC9293008 DOI: 10.1111/1755-0998.13511] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/31/2021] [Revised: 08/26/2021] [Accepted: 09/06/2021] [Indexed: 11/29/2022]
Abstract
The major histocompatibility complex (MHC) plays a critical role in the vertebrate immune system. Accurate MHC typing is critical to understanding not only host fitness and disease susceptibility, but also the mechanisms underlying host‐pathogen co‐evolution. However, due to the high degree of gene duplication and diversification of MHC genes, it is often technically challenging to accurately characterise MHC genetic diversity in non‐model species. Here we conducted a systematic review to identify common issues associated with current widely used MHC typing approaches. Then to overcome these challenges, we developed a long‐read based MHC typing method along with a new analysis pipeline. Our approach enables the sequencing of fully phased MHC alleles spanning all key functional domains and the separation of highly similar alleles as well as the removal of technical artefacts such as PCR heteroduplexes and chimeras. Using this approach, we performed population‐scale MHC typing in the Tasmanian devil (Sarcophilus harrisii), revealing previously undiscovered MHC functional diversity in this endangered species. Our new method provides a better solution for addressing research questions that require high MHC typing accuracy. Since the method is not limited by species or the number of genes analysed, it will be applicable for studying not only the MHC but also other complex gene families.
Affiliation(s)
- Yuanyuan Cheng
  - School of Life and Environmental Sciences, The University of Sydney, Sydney, New South Wales, Australia
- Catherine Grueber
  - School of Life and Environmental Sciences, The University of Sydney, Sydney, New South Wales, Australia
- Carolyn J Hogg
  - School of Life and Environmental Sciences, The University of Sydney, Sydney, New South Wales, Australia; San Diego Zoo Wildlife Alliance, San Diego, California, USA
- Katherine Belov
  - School of Life and Environmental Sciences, The University of Sydney, Sydney, New South Wales, Australia
6
Taylor MK, Williams EP, Wongsurawat T, Jenjaroenpun P, Nookaew I, Jonsson CB. Amplicon-Based, Next-Generation Sequencing Approaches to Characterize Single Nucleotide Polymorphisms of Orthohantavirus Species. Front Cell Infect Microbiol 2020; 10:565591. [PMID: 33163416 PMCID: PMC7591466 DOI: 10.3389/fcimb.2020.565591] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/25/2020] [Accepted: 09/08/2020] [Indexed: 12/26/2022]
Abstract
Whole-genome sequencing (WGS) of viruses from patient or environmental samples can provide tremendous insight into the epidemiology, drug resistance or evolution of a virus. However, we face two common hurdles in obtaining robust sequence information; the low copy number of viral genomes in specimens and the error introduced by WGS techniques. To optimize detection and minimize error in WGS of hantaviruses, we tested four amplification approaches and different amplicon pooling methods for library preparation and examined these preparations using two sequencing platforms, Illumina MiSeq and Oxford Nanopore Technologies MinION. First, we tested and optimized primers used for whole segment PCR or one kilobase amplicon amplification for even coverage using RNA isolated from the supernatant of virus-infected cells. Once optimized we assessed two sources of total RNA, virus-infected cells and supernatant from the virus-infected cells, with four variations of primer pooling for amplicons, and six different amplification approaches. We show that 99-100% genome coverage was obtained using a one-step RT-PCR reaction with one forward and reverse primer. Using a two-step RT-PCR with three distinct tiling approaches for the three genomic segments (vRNAs), we optimized primer pooling approaches for PCR amplification to achieve a greater number of aligned reads, average depth of genome, and genome coverage. The single nucleotide polymorphisms identified from MiSeq and MinION sequencing suggested intrinsic mutation frequencies of ~10^-5 to 10^-7 per genome and 10^-4 to 10^-5 per genome, respectively. We noted no difference in the coverage or accuracy when comparing WGS results with amplicons amplified from RNA extracted from infected cells or supernatant of these infected cells. Our results show that high-throughput diagnostics requiring the identification of hantavirus species or strains can be performed using MiSeq or MinION using a one-step approach. However, the two-step MiSeq approach outperformed the MinION in coverage depth and accuracy, and hence would be superior for assessment of genomes for epidemiology or evolutionary questions using the methods developed herein.
Affiliation(s)
- Mariah K. Taylor
  - Department of Microbiology, Immunology and Biochemistry, The University of Tennessee Health Science Center, Memphis, TN, United States
- Evan P. Williams
  - Department of Microbiology, Immunology and Biochemistry, The University of Tennessee Health Science Center, Memphis, TN, United States
- Thidathip Wongsurawat
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Piroon Jenjaroenpun
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Intawat Nookaew
  - Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Colleen B. Jonsson
  - Department of Microbiology, Immunology and Biochemistry, The University of Tennessee Health Science Center, Memphis, TN, United States
7
Streamlined Subpopulation, Subtype, and Recombination Analysis of HIV-1 Half-Genome Sequences Generated by High-Throughput Sequencing. mSphere 2020; 5:5/5/e00551-20. [PMID: 33055255 PMCID: PMC7565892 DOI: 10.1128/msphere.00551-20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
The highly recombinogenic nature of human immunodeficiency virus type 1 (HIV-1) leads to recombination and emergence of quasispecies. It is important to reliably identify subpopulations to understand the complexity of a viral population for drug resistance surveillance and vaccine development. High-throughput sequencing (HTS) provides improved resolution over Sanger sequencing for the analysis of heterogeneous viral subpopulations. However, current methods of analysis of HTS reads are unable to fully address accurate population reconstruction. Hence, there is a dire need for a more sensitive, accurate, user-friendly, and cost-effective method to analyze viral quasispecies. For this purpose, we have improved the HIVE-hexahedron algorithm that we previously developed with in silico short sequences to analyze raw HTS short reads. The significance of this study is that our standalone algorithm enables a streamlined analysis of quasispecies, subtype, and recombination patterns from long HIV-1 genome regions without the need of additional sequence analysis tools. Distinct viral populations and recombination patterns identified by HIVE-hexahedron are further validated by comparison with sequences obtained by single genome sequencing (SGS). High-throughput sequencing (HTS) has been widely used to characterize HIV-1 genome sequences. There are no algorithms currently that can directly determine genotype and quasispecies population using short HTS reads generated from long genome sequences without additional software. To establish a robust subpopulation, subtype, and recombination analysis workflow, we amplified the HIV-1 3′-half genome from plasma samples of 65 HIV-1-infected individuals and sequenced the entire amplicon (∼4,500 bp) by HTS. With direct analysis of raw reads using HIVE-hexahedron, we showed that 48% of samples harbored 2 to 13 subpopulations. 
We identified various subtypes (17 A1s, 4 Bs, 27 Cs, 6 CRF02_AGs, and 11 unique recombinant forms) and defined recombinant breakpoints of 10 recombinants. These results were validated with viral genome sequences generated by single genome sequencing (SGS) or the analysis of consensus sequence of the HTS reads. The HIVE-hexahedron workflow is more sensitive and accurate than just evaluating the consensus sequence and also more cost-effective than SGS.
8
Determining the Suitability of MinION's Direct RNA and DNA Amplicon Sequencing for Viral Subtype Identification. Viruses 2020; 12:v12080801. [PMID: 32722480 PMCID: PMC7472323 DOI: 10.3390/v12080801] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/08/2020] [Revised: 07/22/2020] [Accepted: 07/23/2020] [Indexed: 12/21/2022]
Abstract
The MinION sequencer is increasingly being used for the detection and outbreak surveillance of pathogens due to its rapid throughput. For RNA viruses, MinION's new direct RNA sequencing is the next significant development. Direct RNA sequencing studies are currently limited and comparisons of its diagnostic performance relative to different DNA sequencing approaches are lacking as a result. We sought to address this gap and sequenced six subtypes from the mycovirus CHV-1 using MinION's direct RNA sequencing and DNA sequencing based on a targeted viral amplicon. Reads from both techniques could correctly identify viral presence and species using BLAST, though direct RNA reads were more frequently misassigned to closely related CHV species. De novo consensus sequences were error prone but suitable for viral species identification. However, subtype identification was less accurate from both reads and consensus sequences. This is due to the high sequencing error rate and the limited sequence divergence between some CHV-1 subtypes. Importantly, neither RNA nor amplicon sequencing reads could be used to obtain reliable intra-host variants. Overall, both sequencing techniques were suitable for virus detection, though limitations are present due to the error rate of MinION reads.
9
Omelina ES, Ivankin AV, Letiagina AE, Pindyurin AV. Optimized PCR conditions minimizing the formation of chimeric DNA molecules from MPRA plasmid libraries. BMC Genomics 2019; 20:536. [PMID: 31291895 PMCID: PMC6620194 DOI: 10.1186/s12864-019-5847-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 01/28/2023]
Abstract
Background: Massively parallel reporter assays (MPRAs) enable high-throughput functional evaluation of various DNA regulatory elements and their mutant variants. The assays are based on construction of highly diverse plasmid libraries containing two variable fragments, a region of interest (a sequence under study; ROI) and a barcode (BC) used to uniquely tag each ROI, which are separated by a constant spacer sequence. The sequences of BC–ROI combinations present in the libraries may be either known a priori or not. In the latter case, it is necessary to identify these combinations before performing functional experiments. Typically, this is done by PCR amplification of the BC–ROI regions with flanking primers, followed by next-generation sequencing (NGS) of the products. However, chimeric DNA molecules formed on templates with identical spacer fragment during the amplification process may substantially hamper the identification of genuine BC–ROI combinations, and as a result lower the performance of the assays. Results: To identify settings that minimize formation of chimeric products we tested a number of PCR amplification parameters, such as conventional and emulsion types of PCR, one- or two-round amplification strategies, amount of DNA template, number of PCR cycles, and the duration of the extension step. Using specific MPRA libraries as templates, we found that the two-round amplification of the BC–ROI regions with a very low initial template amount, an elongated extension step, and a specific number of PCR cycles result in as low as 0.30 and 0.32% of chimeric products for emulsion and conventional PCR approaches, respectively. Conclusions: We have identified PCR parameters that ensure synthesis of specific (non-chimeric) products from highly diverse MPRA plasmid libraries. In addition, we found that there is a negligible difference in performance of emulsion and conventional PCR approaches performed with the identified settings.
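The percentages of chimeric products reported in this abstract can be pictured with a small calculation. The sketch below is only an illustration under a stated assumption, namely that the genuine barcode-to-ROI pairing is known a priori (for instance from a defined control library); it is not the authors' analysis code, and the names `chimera_fraction`, `genuine`, and the barcode and ROI labels are hypothetical.

```python
def chimera_fraction(observed_pairs, genuine_roi_by_barcode):
    """Fraction of classifiable reads whose barcode is paired with the wrong ROI.

    observed_pairs: iterable of (barcode, roi) tuples, one per sequenced read.
    genuine_roi_by_barcode: dict mapping each known barcode to its true ROI
    (assumed to be known in advance for this sketch).
    """
    classified = 0
    chimeric = 0
    for barcode, roi in observed_pairs:
        expected = genuine_roi_by_barcode.get(barcode)
        if expected is None:
            continue  # unknown barcode: the read cannot be classified
        classified += 1
        if roi != expected:
            chimeric += 1
    return chimeric / classified if classified else 0.0


# Hypothetical toy data: one of three classifiable reads is a chimera.
genuine = {"BC01": "ROI_A", "BC02": "ROI_B"}
reads = [("BC01", "ROI_A"), ("BC02", "ROI_B"), ("BC01", "ROI_B"), ("BC03", "ROI_A")]
rate = chimera_fraction(reads, genuine)
```

When the BC–ROI combinations are not known in advance, which the abstract notes can be the case, a direct lookup like this is not possible and the genuine pairs would first have to be inferred from the sequencing data.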
Affiliation(s)
- Anton V Ivankin
  - Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russia
- Anna E Letiagina
  - Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russia
  - Novosibirsk State University, Novosibirsk, Russia
- Alexey V Pindyurin
  - Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russia
  - Novosibirsk State University, Novosibirsk, Russia

10
Liu CC, Ji H. PCR Amplification Strategies Towards Full-length HIV-1 Genome Sequencing. Curr HIV Res 2019; 16:98-105. [PMID: 29943704 DOI: 10.2174/1570162x16666180626152252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/15/2018] [Revised: 05/05/2018] [Accepted: 06/20/2018] [Indexed: 11/22/2022]
Abstract
The advent of next-generation sequencing has enabled greater resolution of viral diversity and improved feasibility of full viral genome sequencing allowing routine HIV-1 full genome sequencing in both research and diagnostic settings. Regardless of the sequencing platform selected, successful PCR amplification of the HIV-1 genome is essential for sequencing template preparation. As such, full HIV-1 genome amplification is a crucial step in dictating the successful and reliable sequencing downstream. Here we reviewed existing PCR protocols leading to HIV-1 full genome sequencing. In addition to the discussion on basic considerations on relevant PCR design, the advantages as well as the pitfalls of the published protocols were reviewed.
Affiliation(s)
- Chao Chun Liu
  - National Microbiology Laboratory at JC Wilt Infectious Diseases Research Center, Public Health Agency of Canada, Winnipeg, Canada
- Hezhao Ji
  - National Microbiology Laboratory at JC Wilt Infectious Diseases Research Center, Public Health Agency of Canada, Winnipeg, Canada
  - Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada

11
Song H, Ou W, Feng Y, Zhang J, Li F, Hu J, Peng H, Xing H, Ma L, Tan Q, Li D, Wang L, Wu B, Shao Y. Disparate impact on CD4 T cell count by two distinct HIV-1 phylogenetic clusters from the same clade. Proc Natl Acad Sci U S A 2019; 116:239-244. [PMID: 30559208 PMCID: PMC6320496 DOI: 10.1073/pnas.1814714116] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 11/18/2022] Open
Abstract
HIV-1 evolved into various genetic subtypes and circulating recombinant forms (CRFs) in the global epidemic. The same subtype or CRF is usually considered to have a similar phenotype. CRF01_AE, one of the world's major CRFs, has been reported to be associated with a higher prevalence of CXCR4 (X4) viruses and faster CD4 decline. However, the underlying mechanisms remain unclear. We identified eight phylogenetic clusters of CRF01_AE in China and hypothesized that they may have different phenotypes. In the National HIV Molecular Epidemiology Survey, we discovered that people infected by CRF01_AE cluster 4 had significantly lower CD4 counts (391 vs. 470, P < 0.0001) and a higher prevalence of X4-using viruses (17.1% vs. 4.4%, P < 0.0001) compared with those infected by cluster 5. In an MSM cohort, X4-using viruses were only isolated from seroconvertors in cluster 4, which was associated with a low CD4 count within the first year of infection (141 vs. 440, P = 0.003). Using a coreceptor binding model, we identified unique V3 signatures in cluster 4 that favor CXCR4 use. We demonstrate that the HIV-1 phenotype and pathogenicity can be determined at the phylogenetic cluster level in the same subtype. Since its initial spread to humans from chimpanzees, estimated to have occurred in the first half of the 20th century, HIV-1 has continued to undergo rapid evolution in larger and more diverse populations. The divergent phenotype evolution of two major CRF01_AE clusters highlights the importance of monitoring the genetic evolution and phenotypic shift of HIV-1 to provide early warning of the appearance of more pathogenic strains.
Affiliation(s)
- Hongshuo Song
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Weidong Ou
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Yi Feng
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Junli Zhang
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Fan Li
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Jing Hu
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Hong Peng
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Hui Xing
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Liying Ma
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
- Qiuxiang Tan
  - CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 201203 Shanghai, China
- Dongliang Li
  - Chaoyang Center for Disease Control and Prevention, 100021 Beijing, China
- Lijuan Wang
  - Chaoyang Center for Disease Control and Prevention, 100021 Beijing, China
- Beili Wu
  - CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 201203 Shanghai, China
- Yiming Shao
  - State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, 102206 Beijing, China
  - Center of Infectious Diseases, Peking University, 100191 Beijing, China
  - The First Affiliated Hospital, School of Medicine, Zhejiang University, 310003 Hangzhou, China

12
Peng W, Li X, Wang C, Cao H, Cui Z. Metagenome complexity and template length are the main causes of bias in PCR-based bacteria community analysis. J Basic Microbiol 2018; 58:987-997. [PMID: 30091475 DOI: 10.1002/jobm.201800265] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 06/02/2018] [Revised: 07/14/2018] [Accepted: 07/25/2018] [Indexed: 12/25/2022]
Abstract
Multitemplate PCR is widely used to study microbial community diversity. Although such studies have established the abundance of different groups within many natural ecosystems, these reports are limited by uncertainties such as bias and artifacts in the PCR. The bias introduced by the simultaneous amplification of specific genes from complex mixtures of templates remains poorly understood. In this study, factors leading to bias in multitemplate PCR of bacterial communities were examined and optimized. Comparisons of PCR cycle parameters, DNA polymerases, PCR primer degeneracy, and the GC content of 16S rRNA gene fragments revealed that annealing temperature and DNA structure are the predominant factors contributing to the observed bias. Pre-digestion of metagenomic DNA with the restriction enzyme Sau3A I and a decreased annealing temperature reduced the bias significantly. The application of these optimized conditions to the ten-species model community in a soil sample verified the validity of these treatments.
Affiliation(s)
- Wentao Peng
  - Key Laboratory of Microbiological Engineering of Agricultural Environment of MOA, College of Life Sciences, Nanjing Agricultural University, Nanjing, Jiangsu, People's Republic of China
- Xiangmin Li
  - Nanjing Institute of Environmental Sciences, Ministry of Environmental Protection of the People's Republic of China, Nanjing, People's Republic of China
- Chuang Wang
  - Key Laboratory of Microbiological Engineering of Agricultural Environment of MOA, College of Life Sciences, Nanjing Agricultural University, Nanjing, Jiangsu, People's Republic of China
- Hui Cao
  - Key Laboratory of Microbiological Engineering of Agricultural Environment of MOA, College of Life Sciences, Nanjing Agricultural University, Nanjing, Jiangsu, People's Republic of China
- Zhongli Cui
  - Key Laboratory of Microbiological Engineering of Agricultural Environment of MOA, College of Life Sciences, Nanjing Agricultural University, Nanjing, Jiangsu, People's Republic of China

13
Abstract
Genetic reference panels are widely used to map complex, quantitative traits in model organisms. We have generated new high-resolution genetic maps of 259 mouse inbred strains from recombinant inbred strain panels (C57BL/6J × DBA/2J, ILS/IbgTejJ × ISS/IbgTejJ, and C57BL/6J × A/J) and chromosome substitution strain panels (C57BL/6J-Chr#<A/J>, C57BL/6J-Chr#<PWD/Ph>, and C57BL/6J-Chr#<MSM/Ms>). We genotyped all samples using the Affymetrix Mouse Diversity Array with an average intermarker spacing of 4.3 kb. The new genetic maps provide increased precision in the localization of recombination breakpoints compared to the previous maps. Although the strains were presumed to be fully inbred, we found residual heterozygosity in 40% of individual mice from five of the six panels. We also identified de novo deletions and duplications, in homozygous or heterozygous state, ranging in size from 21 kb to 8.4 Mb. Almost two-thirds (46 out of 76) of these deletions overlap exons of protein coding genes and may have phenotypic consequences. Twenty-nine putative gene conversions were identified in the chromosome substitution strains. We find that gene conversions are more likely to occur in regions where the homologous chromosomes are more similar. The raw genotyping data and genetic maps of these strain panels are available at http://churchill-lab.jax.org/website/MDA.
14
Jiang WZ, Henry IM, Lynagh PG, Comai L, Cahoon EB, Weeks DP. Significant enhancement of fatty acid composition in seeds of the allohexaploid, Camelina sativa, using CRISPR/Cas9 gene editing. Plant Biotechnology Journal 2017; 15:648-657. [PMID: 27862889 PMCID: PMC5399004 DOI: 10.1111/pbi.12663] [Citation(s) in RCA: 169] [Impact Index Per Article: 24.1] [Received: 05/05/2016] [Revised: 10/27/2016] [Accepted: 11/07/2016] [Indexed: 05/02/2023]
Abstract
The CRISPR/Cas9 nuclease system is a powerful and flexible tool for genome editing, and novel applications of this system are being developed rapidly. Here, we used CRISPR/Cas9 to target the FAD2 gene in Arabidopsis thaliana and in the closely related emerging oil seed plant, Camelina sativa, with the goal of improving seed oil composition. We successfully obtained Camelina seeds in which oleic acid content was increased from 16% to over 50% of the fatty acid composition. These increases were associated with significant decreases in the less desirable polyunsaturated fatty acids, linoleic acid (i.e. a decrease from ~16% to <4%) and linolenic acid (a decrease from ~35% to <10%). These changes result in oils that are superior on multiple levels: they are healthier, more oxidatively stable and better suited for production of certain commercial chemicals, including biofuels. As expected, A. thaliana T2 and T3 generation seeds exhibiting these types of altered fatty acid profiles were homozygous for disrupted FAD2 alleles. In the allohexaploid, Camelina, guide RNAs were designed that simultaneously targeted all three homoeologous FAD2 genes. This strategy that significantly enhanced oil composition in T3 and T4 generation Camelina seeds was associated with a combination of germ-line mutations and somatic cell mutations in FAD2 genes in each of the three Camelina subgenomes.
Affiliation(s)
- Wen Zhi Jiang
  - Department of Biochemistry and Center for Plant Science Innovation, University of Nebraska, Lincoln, NE, USA
- Isabelle M. Henry
  - Department of Plant Biology and UC Davis Genome Center, University of California, Davis, CA, USA
- Peter G. Lynagh
  - Department of Plant Biology and UC Davis Genome Center, University of California, Davis, CA, USA
- Luca Comai
  - Department of Plant Biology and UC Davis Genome Center, University of California, Davis, CA, USA
- Edgar B. Cahoon
  - Department of Biochemistry and Center for Plant Science Innovation, University of Nebraska, Lincoln, NE, USA
- Donald P. Weeks
  - Department of Biochemistry and Center for Plant Science Innovation, University of Nebraska, Lincoln, NE, USA

15
Islam MF, Watanabe A, Wong L, Lazarou C, Vizeacoumar FS, Abuhussein O, Hill W, Uppalapati M, Geyer CR, Vizeacoumar FJ. Enhancing the throughput and multiplexing capabilities of next generation sequencing for efficient implementation of pooled shRNA and CRISPR screens. Sci Rep 2017; 7:1040. [PMID: 28432350 PMCID: PMC5430825 DOI: 10.1038/s41598-017-01170-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/24/2016] [Accepted: 03/20/2017] [Indexed: 11/11/2022] Open
Abstract
Next generation sequencing is becoming the method of choice for functional genomic studies that use pooled shRNA or CRISPR libraries. A key challenge in sequencing these mixed-oligo libraries is that they are highly susceptible to hairpin and/or heteroduplex formation. This results in polyclonal, low quality, and incomplete reads and reduces sequencing throughput. Unfortunately, this challenge is significantly magnified in low-to-medium throughput bench-top sequencers as failed reads significantly perturb the maximization of sequence coverage and multiplexing capabilities. Here, we report a methodology that can be adapted to maximize the coverage on a bench-top, Ion PGM System for smaller shRNA libraries with high efficiency. This ligation-based, half-shRNA sequencing strategy minimizes failed sequences and is also equally amenable to high-throughput sequencers for increased multiplexing. Towards this, we also demonstrate that our strategy to reduce heteroduplex formation improves multiplexing capabilities of pooled CRISPR screens using Illumina NextSeq 500. Overall, our method will facilitate sequencing of pooled shRNA or CRISPR libraries from genomic DNA and maximize sequence coverage.
Affiliation(s)
- Md Fahmid Islam
  - Department of Biochemistry, University of Saskatchewan, Saskatoon, S7N 5E5, Canada
- Atsushi Watanabe
  - Department of Pathology, University of Saskatchewan, Saskatoon, S7N 0W8, Canada
  - Department of Hematology, Nephrology and Rheumatology, Graduate School of Medicine, Akita University, Akita, Japan
- Lai Wong
  - Department of Biochemistry, University of Saskatchewan, Saskatoon, S7N 5E5, Canada
- Conor Lazarou
  - Department of Pathology, University of Saskatchewan, Saskatoon, S7N 0W8, Canada
- Omar Abuhussein
  - College of Pharmacy and Nutrition, University of Saskatchewan, Saskatoon, S7N 5C9, Canada
- Wayne Hill
  - Department of Pathology, University of Saskatchewan, Saskatoon, S7N 0W8, Canada
- Maruti Uppalapati
  - Department of Pathology, University of Saskatchewan, Saskatoon, S7N 0W8, Canada
- C Ronald Geyer
  - Department of Pathology, University of Saskatchewan, Saskatoon, S7N 0W8, Canada
- Franco J Vizeacoumar
  - Department of Pathology, University of Saskatchewan, Saskatoon, S7N 0W8, Canada
  - College of Pharmacy and Nutrition, University of Saskatchewan, Saskatoon, S7N 5C9, Canada
  - Cancer Research, Saskatchewan Cancer Agency, 107 Wiggins Road, Saskatoon, S7N 5E5, Canada

16
Boltz VF, Rausch J, Shao W, Hattori J, Luke B, Maldarelli F, Mellors JW, Kearney MF, Coffin JM. Ultrasensitive single-genome sequencing: accurate, targeted, next generation sequencing of HIV-1 RNA. Retrovirology 2016; 13:87. [PMID: 27998286 PMCID: PMC5175307 DOI: 10.1186/s12977-016-0321-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 03/09/2016] [Accepted: 11/29/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Although next generation sequencing (NGS) offers the potential for studying virus populations in unprecedented depth, PCR error, amplification bias and recombination during library construction have limited its use to population sequencing and measurements of unlinked allele frequencies. Here we report a method, termed ultrasensitive Single-Genome Sequencing (uSGS), for NGS library construction and analysis that eliminates PCR errors and recombinants, and generates single-genome sequences of the same quality as the "gold-standard" of HIV-1 single-genome sequencing assay but with more than 100-fold greater depth.
RESULTS Primer ID tagged cDNA was synthesized from mixtures of cloned BH10 wild-type and mutant HIV-1 transcripts containing ten drug resistance mutations. First, the resultant cDNA was divided and NGS libraries were generated in parallel using two methods: uSGS and a method applying long PCR primers to attach the NGS adaptors (LP-PCR-1). Second, cDNA was divided and NGS libraries were generated in parallel comparing 3 methods: uSGS and 2 methods adapted from more recent reports using variations of the long PCR primers to attach the adaptors (LP-PCR-2 and LP-PCR-3). Consistently, the uSGS method amplified a greater proportion of cDNAs, averaging 30% compared to 13% for LP-PCR-1, 21% for LP-PCR-2 and 14% for LP-PCR-3. Most importantly, when the uSGS sequences were binned according to their primer IDs, 94% of the bins did not contain PCR recombinant sequences versus only 55, 75 and 65% for LP-PCR-1, 2 and 3, respectively. Finally, when uSGS was applied to plasma samples from HIV-1 infected donors, both frequent and rare variants were detected in each sample and neighbor-joining trees revealed clusters of genomes driven by the linkage of these mutations, showing the lack of PCR recombinants in the datasets.
CONCLUSIONS The uSGS assay can be used for accurate detection of rare variants and for identifying linkage of rare alleles associated with HIV-1 drug resistance. In addition, the method allows accurate in-depth analyses of the complex genetic relationships of viral populations in vivo.
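The "fraction of primer ID bins without PCR recombinants" reported above can be illustrated with a short sketch. This is not the authors' analysis code: it assumes each read has already been reduced to its alleles at a set of marker sites, that the wild-type and mutant templates differ at every marker site, and that any read mixing the two parental patterns counts as a recombinant. The function names and toy data (four marker sites rather than ten) are hypothetical.

```python
from collections import defaultdict

def classify(alleles, wild_type, mutant):
    """Label a read by its alleles at the marker sites."""
    if alleles == wild_type:
        return "wild_type"
    if alleles == mutant:
        return "mutant"
    # Every site matches one parent, but the read is neither pure parent.
    if all(a in (w, m) for a, w, m in zip(alleles, wild_type, mutant)):
        return "recombinant"
    return "other"  # e.g. a sequencing or PCR substitution error

def recombinant_free_fraction(reads, wild_type, mutant):
    """reads: iterable of (primer_id, alleles). Returns fraction of clean bins."""
    bins = defaultdict(list)
    for primer_id, alleles in reads:
        bins[primer_id].append(classify(alleles, wild_type, mutant))
    clean = sum(1 for labels in bins.values() if "recombinant" not in labels)
    return clean / len(bins) if bins else 0.0

# Hypothetical toy data.
wild_type, mutant = "AAAA", "GGGG"
reads = [
    ("id1", "AAAA"), ("id1", "AAAA"),
    ("id2", "GGGG"), ("id2", "AAGG"),   # AAGG mixes both parents
    ("id3", "GGGG"),
    ("id4", "AAAA"),
]
clean_fraction = recombinant_free_fraction(reads, wild_type, mutant)
print(f"bins without recombinants: {clean_fraction:.0%}")
```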
Affiliation(s)
- Valerie F Boltz
  - HIV Dynamics and Replication Program, CCR, National Cancer Institute, NIH, Translational Research Unit, 105 Boyles Street, Building 535 Room 111, Frederick, MD, 21702-1201, USA
- Jason Rausch
  - HIV Dynamics and Replication Program, CCR, National Cancer Institute, NIH, Translational Research Unit, 105 Boyles Street, Building 535 Room 111, Frederick, MD, 21702-1201, USA
- Wei Shao
  - Frederick National Laboratory for Cancer Research, Advanced Biomedical Computing Center, Leidos Biomedical Research, Inc, Frederick, MD, USA
- Junko Hattori
  - HIV Dynamics and Replication Program, CCR, National Cancer Institute, NIH, Translational Research Unit, 105 Boyles Street, Building 535 Room 111, Frederick, MD, 21702-1201, USA
- Brian Luke
  - Frederick National Laboratory for Cancer Research, Advanced Biomedical Computing Center, Leidos Biomedical Research, Inc, Frederick, MD, USA
- Frank Maldarelli
  - HIV Dynamics and Replication Program, CCR, National Cancer Institute, NIH, Translational Research Unit, 105 Boyles Street, Building 535 Room 111, Frederick, MD, 21702-1201, USA
- John W Mellors
  - Division of Infectious Disease, University of Pittsburgh, Pittsburgh, PA, USA
- Mary F Kearney
  - HIV Dynamics and Replication Program, CCR, National Cancer Institute, NIH, Translational Research Unit, 105 Boyles Street, Building 535 Room 111, Frederick, MD, 21702-1201, USA
- John M Coffin
  - Department of Molecular Biology and Microbiology, Tufts University, Boston, MA, USA

17
Davidsson M, Diaz-Fernandez P, Schwich OD, Torroba M, Wang G, Björklund T. A novel process of viral vector barcoding and library preparation enables high-diversity library generation and recombination-free paired-end sequencing. Sci Rep 2016; 6:37563. [PMID: 27874090 PMCID: PMC5118689 DOI: 10.1038/srep37563] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/20/2016] [Accepted: 10/31/2016] [Indexed: 12/29/2022] Open
Abstract
Detailed characterization and mapping of oligonucleotide function in vivo is generally a very time-consuming effort that only allows for hypothesis-driven subsampling of the full sequence to be analysed. Recent advances in deep sequencing, together with highly efficient parallel oligonucleotide synthesis and cloning techniques, have, however, opened up entirely new ways to map genetic function in vivo. Here we present a novel, optimized protocol for the generation of universally applicable, barcode-labelled plasmid libraries. The libraries are designed to enable the production of viral vector preparations assessing coding or non-coding RNA function in vivo. When generating high-diversity libraries, it is a challenge to achieve efficient cloning, unambiguous barcoding and detailed characterization using low-cost sequencing technologies. With the presented protocol, a diversity of above 3 million uniquely barcoded adeno-associated viral (AAV) plasmids can be achieved in a single reaction through a process achievable in any molecular biology laboratory. This approach opens up a multitude of in vivo assessments, from the evaluation of enhancer and promoter regions to the optimization of genome editing. The generated plasmid libraries are also useful for validation of sequencing clustering algorithms, and we here validate the newly presented message passing clustering process named Starcode.
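To make the idea of barcode clustering concrete, the following is a deliberately simplified sketch of collapsing sequencing-error variants into their most abundant neighbour. It is not Starcode and does not reproduce its message passing algorithm; the greedy strategy, the Hamming-distance threshold, and the names `hamming` and `collapse_barcodes` are assumptions made purely for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def collapse_barcodes(barcodes, max_dist=1):
    """Greedily merge rare barcodes into an abundant barcode within max_dist.

    Barcodes are visited from most to least frequent; each one either joins
    the first existing cluster centre within max_dist or founds a new cluster.
    Returns a dict mapping cluster centre -> total read count.
    """
    counts = Counter(barcodes)
    clusters = {}
    for bc, n in counts.most_common():
        parent = next(
            (c for c in clusters
             if len(c) == len(bc) and hamming(c, bc) <= max_dist),
            None,
        )
        if parent is None:
            clusters[bc] = n          # new cluster centre
        else:
            clusters[parent] += n     # absorbed as a likely error variant
    return clusters

# Hypothetical toy data: two true barcodes plus one error variant of each.
observed = ["ACGTAC"] * 5 + ["ACGTAA"] + ["TTGCAT"] * 3 + ["TTGCAA"]
clusters = collapse_barcodes(observed)
print(clusters)
```

A greedy pass like this is order-dependent and ignores insertions and deletions, which is one reason dedicated tools are used for real libraries.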
Affiliation(s)
- Marcus Davidsson
  - Molecular Neuromodulation, Department of Experimental Medical Science, Lund University, 221 84 Lund, Sweden
- Paula Diaz-Fernandez
  - Molecular Neuromodulation, Department of Experimental Medical Science, Lund University, 221 84 Lund, Sweden
- Oliver D Schwich
  - Molecular Neuromodulation, Department of Experimental Medical Science, Lund University, 221 84 Lund, Sweden
- Marcos Torroba
  - Molecular Neuromodulation, Department of Experimental Medical Science, Lund University, 221 84 Lund, Sweden
- Gang Wang
  - Molecular Neuromodulation, Department of Experimental Medical Science, Lund University, 221 84 Lund, Sweden
- Tomas Björklund
  - Molecular Neuromodulation, Department of Experimental Medical Science, Lund University, 221 84 Lund, Sweden

18
A Comprehensive Analysis of Primer IDs to Study Heterogeneous HIV-1 Populations. J Mol Biol 2015; 428:238-250. [PMID: 26711506 DOI: 10.1016/j.jmb.2015.12.012] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 08/07/2015] [Revised: 11/25/2015] [Accepted: 12/16/2015] [Indexed: 01/01/2023]
Abstract
Determining the composition of viral populations is becoming increasingly important in the field of medical virology. While recently developed computational tools for viral haplotype analysis allow for correcting sequencing errors, they do not always allow for the removal of errors occurring in the upstream experimental protocol, such as PCR errors. Primer IDs (pIDs) are one method to address this problem by harnessing redundant template resampling for error correction. By using a reference mixture of five HIV-1 strains, we show how pIDs can be useful for estimating key experimental parameters, such as the substitution rate of the PCR process and the reverse transcription (RT) error rate. In addition, we introduce a hidden Markov model for determining the recombination rate of the RT PCR process. We found no strong sequence-specific bias in pID abundances (the same RT efficiencies as compared to commonly used short, specific RT primers) and no effects of pIDs on the estimated distribution of the reference viruses.
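The error-correction principle described here, resampling each template redundantly and then collapsing reads that share a pID, can be sketched as a per-position majority vote. This is an illustration under stated assumptions rather than the method of the paper: reads are assumed to be pre-aligned and of equal length within a pID group, the `min_reads` threshold of 3 is arbitrary, and the function name `consensus_by_pid` is hypothetical.

```python
from collections import Counter, defaultdict

def consensus_by_pid(reads, min_reads=3):
    """Build one majority-vote consensus sequence per primer ID.

    reads: iterable of (pid, sequence); sequences sharing a pID are assumed
           to be aligned and of equal length.
    pIDs supported by fewer than min_reads reads are discarded, since a
    majority vote over so few reads cannot reliably remove errors.
    """
    groups = defaultdict(list)
    for pid, seq in reads:
        groups[pid].append(seq)

    consensus = {}
    for pid, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        # zip(*seqs) yields one tuple of bases per alignment column.
        consensus[pid] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus

# Hypothetical toy data: p1 and p2 each carry one erroneous read.
reads = [
    ("p1", "ACGT"), ("p1", "ACGT"), ("p1", "ACTT"),
    ("p2", "GGGG"), ("p2", "GGGA"), ("p2", "GGGG"),
    ("p3", "TTTT"),                      # too few reads, dropped
]
corrected = consensus_by_pid(reads)
print(corrected)
```

Errors introduced before the first round of copying (for example during reverse transcription) appear in every read of a group and therefore survive this vote, which is why the abstract treats the RT error rate as a separate parameter.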
19
Hans JB, Haubner A, Arandjelovic M, Bergl RA, Fünfstück T, Gray M, Morgan DB, Robbins MM, Sanz C, Vigilant L. Characterization of MHC class II B polymorphism in multiple populations of wild gorillas using non-invasive samples and next-generation sequencing. Am J Primatol 2015; 77:1193-206. [PMID: 26283172 DOI: 10.1002/ajp.22458] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/23/2015] [Revised: 07/08/2015] [Accepted: 08/03/2015] [Indexed: 01/03/2023]
Abstract
Genes encoded by the major histocompatibility complex (MHC) are crucial for the recognition and presentation of antigens to the immune system. In contrast to their closest relatives, chimpanzees and humans, much less is known about variation in gorillas at these loci. This study explored the exon 2 variation of -DPB1, -DQB1, and -DRB genes in 46 gorillas from four populations while simultaneously evaluating the feasibility of using fecal samples for high-throughput MHC genotyping. By applying strict similarity- and frequency-based analysis, we found, despite our modest sample size, a total of 18 alleles that have not been described previously, thereby illustrating the potential for efficient and highly accurate MHC genotyping from non-invasive DNA samples. We emphasize the importance of controlling for multiple potential sources of error when applying this massively parallel short-read sequencing technology to PCR products generated from low concentration DNA extracts. We observed pronounced differences in MHC variation between species, subspecies and populations that are consistent with both the ancient and recent demographic histories experienced by gorillas.
Affiliation(s)
- Jörg B Hans
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Anne Haubner
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Mimi Arandjelovic
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Richard A Bergl
  - North Carolina Zoological Park, Asheboro, North Carolina, USA
- Maryke Gray
  - International Gorilla Conservation Program, Kigali, Rwanda
- Martha M Robbins
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Linda Vigilant
  - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany