1
Zhu Y, Watson C, Safonova Y, Pennell M, Bankevich A. Assessing Assembly Errors in Immunoglobulin Loci: A Comprehensive Evaluation of Long-read Genome Assemblies Across Vertebrates. bioRxiv 2024:2024.07.19.604360. [PMID: 39091785 PMCID: PMC11291089 DOI: 10.1101/2024.07.19.604360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Long-read sequencing technologies have revolutionized genome assembly, producing near-complete chromosome assemblies for numerous organisms, which are invaluable to research in many fields. However, regions with complex repetitive structure continue to represent a challenge for genome assembly algorithms, particularly in areas with high heterozygosity. Robust and comprehensive solutions for the assessment of assembly accuracy and completeness in these regions do not exist. In this study we focus on the assembly of biomedically important antibody-encoding immunoglobulin (IG) loci, which are characterized by complex duplications and repeat structures. High-quality full-length assemblies for these loci are critical for resolving haplotype-level annotations of IG genes, without which functional and evolutionary studies of antibody immunity across vertebrates are not tractable. To address these challenges, we developed a pipeline, "CloseRead", that generates multiple assembly verification metrics for analysis and visualization. These metrics expand upon those of existing quality assessment tools and specifically target complex and highly heterozygous regions. Using CloseRead, we systematically assessed the accuracy and completeness of IG loci in publicly available assemblies of 74 vertebrate species, identifying problematic regions. We also demonstrated that inspecting assembly graphs for problematic regions can both identify the root cause of assembly errors and illuminate solutions for improving erroneous assemblies. For a subset of species, we were able to correct assembly errors through targeted reassembly. Together, our analysis demonstrated the utility of assembly assessment in improving the completeness and accuracy of IG loci across species.
Affiliation(s)
- Yixin Zhu
  - Department of Quantitative and Computational Biology and Biological Sciences, University of Southern California, Los Angeles, CA, United States
- Corey Watson
  - Department of Biochemistry and Molecular Biology, University of Louisville School of Medicine, Louisville, KY, United States
- Yana Safonova
  - Department of Computer Science and Engineering, Pennsylvania State University, PA, United States
- Matt Pennell
  - Department of Quantitative and Computational Biology and Biological Sciences, University of Southern California, Los Angeles, CA, United States
- Anton Bankevich
  - Department of Computer Science and Engineering, Pennsylvania State University, PA, United States
2
Collins AM, Ohlin M, Corcoran M, Heather JM, Ralph D, Law M, Martínez-Barnetche J, Ye J, Richardson E, Gibson WS, Rodriguez OL, Peres A, Yaari G, Watson CT, Lees WD. AIRR-C IG Reference Sets: curated sets of immunoglobulin heavy and light chain germline genes. Front Immunol 2024; 14:1330153. [PMID: 38406579 PMCID: PMC10884231 DOI: 10.3389/fimmu.2023.1330153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 12/27/2023] [Indexed: 02/27/2024] Open
Abstract
Introduction: Analysis of an individual's immunoglobulin (IG) gene repertoire requires the use of high-quality germline gene reference sets. When sets only contain alleles supported by strong evidence, AIRR sequencing (AIRR-seq) data analysis is more accurate, and studies of the evolution of IG genes, their allelic variants and the expressed immune repertoire are therefore facilitated. Methods: The Adaptive Immune Receptor Repertoire Community (AIRR-C) IG Reference Sets have been developed by including only human IG heavy and light chain alleles that have been confirmed by evidence from multiple high-quality sources. To further improve AIRR-seq analysis, some alleles have been extended to deal with short 3' or 5' truncations that can lead them to be overlooked by alignment utilities. To avoid other challenges for analysis programs, exact paralogs (e.g. IGHV1-69*01 and IGHV1-69D*01) are only represented once in each set, though alternative sequence names are noted in accompanying metadata. Results and discussion: The Reference Sets include fewer than half the previously recognised IG alleles (e.g. just 198 IGHV sequences), and also include a number of novel alleles: 8 IGHV alleles, 2 IGKV alleles and 5 IGLV alleles. Despite their smaller sizes, erroneous calls were eliminated, and excellent coverage was achieved when a set of repertoires comprising over 4 million V(D)J rearrangements from 99 individuals was analyzed using the Sets. The version-tracked AIRR-C IG Reference Sets are freely available at the OGRDB website (https://ogrdb.airr-community.org/germline_sets/Human) and will be regularly updated to include newly observed and previously reported sequences that can be confirmed by new high-quality data.
Affiliation(s)
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Mats Ohlin
  - Department of Immunotechnology, and SciLifeLab, Lund University, Lund, Sweden
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- James M. Heather
  - Mass General Cancer Center, Massachusetts General Hospital, Charlestown, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
- Duncan Ralph
  - Fred Hutchinson Cancer Research Center, Seattle, WA, United States
- Mansun Law
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, United States
- Jesus Martínez-Barnetche
  - Centro de Investigación Sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, Mexico
- Jian Ye
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Eve Richardson
  - La Jolla Institute for Immunology, San Diego, CA, United States
- William S. Gibson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Oscar L. Rodriguez
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Ayelet Peres
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Gur Yaari
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- William D. Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, London, United Kingdom
  - Human-Centered Computing and Information Science, Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
3
Mikelov A, Nefediev G, Tashkeev A, Rodriguez OL, Ortmans DA, Skatova V, Izraelson M, Davydov A, Poslavsky S, Rahmouni S, Watson CT, Chudakov D, Boyd SD, Bolotin D. Ultrasensitive allele inference from immune repertoire sequencing data with MiXCR. bioRxiv 2023:2023.10.10.561703. [PMID: 38014266 PMCID: PMC10680553 DOI: 10.1101/2023.10.10.561703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Allelic variability in the adaptive immune receptor loci, which harbor the gene segments that encode B cell and T cell receptors (BCR/TCR), has been shown to be of critical importance for immune responses to pathogens and vaccines. In recent years, B cell and T cell receptor repertoire sequencing (Rep-Seq) has become widespread in immunology research, making it the most readily available source of information about allelic diversity in immunoglobulin (IG) and T cell receptor (TR) loci in different populations. Here we present a novel algorithm for extra-sensitive and specific variable (V) and joining (J) gene allele inference and genotyping, allowing reconstruction of individual high-quality gene segment libraries. The approach can be applied for inferring allelic variants from peripheral blood lymphocyte BCR and TCR repertoire sequencing data, including hypermutated isotype-switched BCR sequences, thus allowing high-throughput genotyping and novel allele discovery from a wide variety of existing datasets. The developed algorithm is a part of the MiXCR software (https://mixcr.com) and can be incorporated into any pipeline utilizing upstream processing with MiXCR. We demonstrate the accuracy of this approach using Rep-Seq paired with long-read genomic sequencing data, comparing it to a widely used algorithm, TIgGER. We applied the algorithm to a large set of IG heavy chain (IGH) Rep-Seq data from 450 donors of ancestrally diverse population groups, and to the largest reported full-length TCR alpha and beta chain (TRA; TRB) Rep-Seq dataset, representing 134 individuals. This allowed us to assess the genetic diversity of genes within the IGH, TRA and TRB loci in different populations and demonstrate the connection between antibody repertoire gene usage and the number of allelic variants present in the population. Finally, we established a database of allelic variants of V and J genes inferred from Rep-Seq data and their population frequencies with free public access at https://vdj.online.
4
Dhande IS, Zhu Y, Joshi AS, Hicks MJ, Braun MC, Doris PA. Polygenic genetic variation affecting antibody formation underlies hypertensive renal injury in the stroke-prone spontaneously hypertensive rat. Am J Physiol Renal Physiol 2023; 325:F317-F327. [PMID: 37439198 PMCID: PMC10511163 DOI: 10.1152/ajprenal.00058.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 07/07/2023] [Accepted: 07/07/2023] [Indexed: 07/14/2023] Open
Abstract
During development of the spontaneously hypertensive rat (SHR), several distinct but closely related lines were generated. Most lines are resistant to hypertensive renal disease. However, the SHR-A3 line (stroke-prone SHR) experiences end-organ injury (EOI) and provides a model of injury susceptibility that can be used to uncover genetic causation. In the present study, we generated a congenic line in which three distinct disease loci in SHR-A3 are concurrently replaced with homologous loci from an injury-resistant SHR line (SHR-B2). Verification that all three loci were homozygously replaced in this triple congenic line [SHR-A3(Trip B2)] while the genetic background of SHR-A3 was fully retained was obtained by whole genome sequencing. Congenic genome substitution was without effect on systolic blood pressure [198.9 ± 3.34 mmHg, mean ± SE, SHR-A3(Trip B2) = 194.7 ± 2.55 mmHg]. Measures of renal injury (albuminuria, histological injury scores, and urinary biomarker levels) were reduced in SHR-A3(Trip B2) animals, even though only 4.5 Mbases of the 2.8 Gbases of the SHR-B2 genome (0.16% of the genome) was transferred into the congenic line. The gene content of the three congenic loci and the functional effects of gene polymorphism within suggest a role of immunoglobulin in EOI pathogenesis. To prove the role of antibodies in EOI in SHR-A3, we generated an SHR-A3 line in which expression from the immunoglobulin heavy chain gene was knocked out (SHR-A3-IGHKO). Animals in the SHR-A3-IGHKO line lack B cells and immunoglobulin, but the hypertensive phenotype is not affected. Renal injury, however, was reduced in this line, confirming a pathogenic role for immunoglobulin in hypertensive EOI in this model of heritable risk. NEW & NOTEWORTHY: Here, we used a polygenic animal model of hypertensive renal disease to show that genetic variation affecting antibody formation underlies hypertensive renal disease. We proved the genetic thesis by generating an immunoglobulin knockout in the susceptible animal model.
Affiliation(s)
- Isha S Dhande
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, Texas, United States
- Yaming Zhu
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, Texas, United States
- Aniket S Joshi
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, Texas, United States
- M John Hicks
  - Department of Pathology and Immunology, Baylor College of Medicine, Houston, Texas, United States
- Michael C Braun
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States
  - Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, Texas, United States
- Peter A Doris
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, Texas, United States
5
Corcoran M, Chernyshev M, Mandolesi M, Narang S, Kaduk M, Ye K, Sundling C, Färnert A, Kreslavsky T, Bernhardsson C, Larena M, Jakobsson M, Karlsson Hedestam GB. Archaic humans have contributed to large-scale variation in modern human T cell receptor genes. Immunity 2023; 56:635-652.e6. [PMID: 36796364 DOI: 10.1016/j.immuni.2023.01.026] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/12/2022] [Revised: 11/21/2022] [Accepted: 01/23/2023] [Indexed: 02/18/2023]
Abstract
Human T cell receptors (TCRs) are critical for mediating immune responses to pathogens and tumors and regulating self-antigen recognition. Yet, variations in the genes encoding TCRs remain insufficiently defined. Detailed analysis of expressed TCR alpha, beta, gamma, and delta genes in 45 donors from four human populations (African, East Asian, South Asian, and European) revealed 175 additional TCR variable and junctional alleles. Most of these contained coding changes and were present at widely differing frequencies in the populations, a finding confirmed using DNA samples from the 1000 Genomes Project. Importantly, we identified three Neanderthal-derived, introgressed TCR regions including a highly divergent TRGV4 variant, which mediated altered butyrophilin-like molecule 3 (BTNL3) ligand reactivity and was frequent in all modern Eurasian population groups. Our results demonstrate remarkable variation in TCR genes in both individuals and populations, providing a strong incentive for including allelic variation in studies of TCR function in human biology.
Affiliation(s)
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Mark Chernyshev
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Marco Mandolesi
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Sanjana Narang
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Mateusz Kaduk
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Kewei Ye
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Christopher Sundling
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden; Department of Infectious Diseases, Karolinska University Hospital, 171 76 Stockholm, Sweden
- Anna Färnert
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden; Department of Infectious Diseases, Karolinska University Hospital, 171 76 Stockholm, Sweden
- Taras Kreslavsky
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Carolina Bernhardsson
  - Department of Organismal Biology, Human Evolution, Norbyvägen 18C, 752 63 Uppsala, Sweden
- Maximilian Larena
  - Department of Organismal Biology, Human Evolution, Norbyvägen 18C, 752 63 Uppsala, Sweden
- Mattias Jakobsson
  - Department of Organismal Biology, Human Evolution, Norbyvägen 18C, 752 63 Uppsala, Sweden
6
Gibson WS, Rodriguez OL, Shields K, Silver CA, Dorgham A, Emery M, Deikus G, Sebra R, Eichler EE, Bashir A, Smith ML, Watson CT. Characterization of the immunoglobulin lambda chain locus from diverse populations reveals extensive genetic variation. Genes Immun 2023; 24:21-31. [PMID: 36539592 PMCID: PMC10041605 DOI: 10.1038/s41435-022-00188-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 07/22/2022] [Revised: 10/07/2022] [Accepted: 10/13/2022] [Indexed: 12/24/2022]
Abstract
Immunoglobulins (IGs), crucial components of the adaptive immune system, are encoded by three genomic loci. However, the complexity of the IG loci severely limits the effective use of short read sequencing, limiting our knowledge of population diversity in these loci. We leveraged existing long read whole-genome sequencing (WGS) data, fosmid technology, and IG targeted single-molecule, real-time (SMRT) long-read sequencing (IG-Cap) to create haplotype-resolved assemblies of the IG Lambda (IGL) locus from 6 ethnically diverse individuals. In addition, we generated 10 diploid assemblies of IGL from a diverse cohort of individuals utilizing IG-Cap. From these 16 individuals, we identified significant allelic diversity, including 36 novel IGLV alleles. In addition, we observed highly elevated single nucleotide variation (SNV) in IGLV genes relative to IGL intergenic and genomic background SNV density. By comparing SNV calls between our high quality assemblies and existing short read datasets from the same individuals, we show a high propensity for false-positives in the short read datasets. Finally, for the first time, we nucleotide-resolved common 5-10 Kb duplications in the IGLC region that contain functional IGLJ and IGLC genes. Together these data represent a significant advancement in our understanding of genetic variation and population diversity in the IGL locus.
Affiliation(s)
- William S Gibson
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Oscar L Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Kaitlyn Shields
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Catherine A Silver
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Abdullah Dorgham
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Matthew Emery
  - Icahn Institute of Genomics Technology and Data Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Gintaras Deikus
  - Icahn Institute of Genomics Technology and Data Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Robert Sebra
  - Icahn Institute of Genomics Technology and Data Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Sema4, Stamford, CT, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Ali Bashir
  - Google Accelerated Science Team, Google Inc, Mountain View, CA, USA
- Melissa L Smith
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
7
Narang S, Kaduk M, Chernyshev M, Karlsson Hedestam GB, Corcoran MM. Adaptive immune receptor genotyping using the corecount program. Front Immunol 2023; 14:1125884. [PMID: 37114042 PMCID: PMC10126697 DOI: 10.3389/fimmu.2023.1125884] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/16/2022] [Accepted: 02/27/2023] [Indexed: 04/29/2023] Open
Abstract
We present a new Rep-Seq analysis tool called corecount, for analyzing genotypic variation in immunoglobulin (IG) and T cell receptor (TCR) genes. corecount is highly efficient at identifying V alleles, including those that are infrequently used in expressed repertoires and those containing 3' end variation, which are otherwise refractory to reliable identification during germline inference from expressed libraries. Furthermore, corecount facilitates accurate D and J gene genotyping. The output is highly reproducible and facilitates the comparison of genotypes from multiple individuals, such as those from clinical cohorts. Here, we applied corecount to the genotypic analysis of IgM libraries from 16 individuals. To demonstrate the accuracy of corecount, we Sanger sequenced all the heavy chain IG alleles (65 IGHV, 27 IGHD and 7 IGHJ) from one individual from whom we also produced two independent IgM Rep-seq datasets. Genomic analysis revealed that 5 known IGHV and 2 IGHJ sequences are truncated in current reference databases. This dataset of genomically validated alleles and IgM libraries from the same individual provides a useful resource for benchmarking other bioinformatic programs that involve V, D and J assignments and germline inference, and may facilitate the development of AIRR-Seq analysis tools that can benefit from the availability of more comprehensive reference databases.
8
Katayama Y, Yokota R, Akiyama T, Kobayashi TJ. Machine Learning Approaches to TCR Repertoire Analysis. Front Immunol 2022; 13:858057. [PMID: 35911778 PMCID: PMC9334875 DOI: 10.3389/fimmu.2022.858057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] Open
Abstract
Sparked by the development of genome sequencing technology, the quantity and quality of data handled in immunological research have been changing dramatically. Various data and database platforms are now driving the rapid progress of machine learning for immunological data analysis. Of various topics in immunology, T cell receptor repertoire analysis is one of the most important targets of machine learning for assessing the state and abnormalities of immune systems. In this paper, we review recent repertoire analysis methods based on machine learning and deep learning and discuss their prospects.
Affiliation(s)
- Yotaro Katayama
  - Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
- Ryo Yokota
  - National Research Institute of Police Science, Kashiwa, Chiba, Japan
- Taishin Akiyama
  - Laboratory for Immune Homeostasis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Graduate School of Medical Life Science, Yokohama City University, Yokohama, Japan
- Tetsuya J. Kobayashi
  - Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
  - Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
9
Jackson KJL, Kos JT, Lees W, Gibson WS, Smith ML, Peres A, Yaari G, Corcoran M, Busse CE, Ohlin M, Watson CT, Collins AM. A BALB/c IGHV Reference Set, Defined by Haplotype Analysis of Long-Read VDJ-C Sequences From F1 (BALB/c x C57BL/6) Mice. Front Immunol 2022; 13:888555. [PMID: 35720344 PMCID: PMC9205180 DOI: 10.3389/fimmu.2022.888555] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Accepted: 04/28/2022] [Indexed: 11/13/2022] Open
Abstract
The immunoglobulin genes of inbred mouse strains that are commonly used in models of antibody-mediated human diseases are poorly characterized. This compromises data analysis. To infer the immunoglobulin genes of BALB/c mice, we used long-read SMRT sequencing to amplify VDJ-C sequences from F1 (BALB/c x C57BL/6) hybrid animals. Strain variations were identified in the Ighm and Ighg2b genes, and analysis of VDJ rearrangements led to the inference of 278 germline IGHV alleles. 169 alleles are not present in the C57BL/6 genome reference sequence. To establish a set of expressed BALB/c IGHV germline gene sequences, we computationally retrieved IGHV haplotypes from the IgM dataset. Haplotyping led to the confirmation of 162 BALB/c IGHV gene sequences. A musIGHV398 pseudogene variant also appears to be present in the BALB/cByJ substrain, while a functional musIGHV398 gene is highly expressed in the BALB/cJ substrain. Only four of the BALB/c alleles were also observed in the C57BL/6 haplotype. The full set of inferred BALB/c sequences has been used to establish a BALB/c IGHV reference set, hosted at https://ogrdb.airr-community.org. We assessed whether assemblies from the Mouse Genome Project (MGP) are suitable for the determination of the genes of the IGH loci. Only 37 (43.5%) of the 85 confirmed IMGT-named BALB/c IGHV and 33 (42.9%) of the 77 confirmed non-IMGT IGHV were found in a search of the MGP BALB/cJ genome assembly. This suggests that current MGP assemblies are unsuitable for the comprehensive documentation of germline IGHVs and more efforts will be needed to establish strain-specific reference sets.
Affiliation(s)
- Justin T. Kos
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- William Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, United Kingdom
- William S. Gibson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Melissa Laird Smith
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Ayelet Peres
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Gur Yaari
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Christian E. Busse
  - Division of B Cell Immunology, German Cancer Research Center, Heidelberg, Germany
- Mats Ohlin
  - Department of Immunotechnology, Lund University, Lund, Sweden
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW, Australia
10
Rao Q, Xu R, Wan Q. Immunoglobulin heavy chain gene rearrangement in heavy chain deposition disease suggests it is a plasma cell disease: a case report. J Int Med Res 2022; 50:3000605221086428. [PMID: 35301906 PMCID: PMC8943313 DOI: 10.1177/03000605221086428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022] Open
Abstract
Heavy chain deposition disease (HCDD) is characterized by the deposition of truncated monoclonal immunoglobulin heavy chains along glomerular basement membranes. Truncated heavy chains are thought to be associated with plasma cell disease (PCD), but previous bone marrow cytology tests showed that only 30% of HCDD cases are related to PCDs. We report the first known use of immunoglobulin heavy chain (IGH) gene rearrangement to diagnose a patient with γ3-HCDD, although bone marrow morphology test identified no abnormalities. Our findings provide strong evidence for a correlation between PCDs and HCDD, which could help understand the genetic background underlying abnormal heavy chains and assess disease prognosis. Further, concordant with previous findings, bortezomib-based chemotherapy had a good therapeutic effect in our patient. We summarize the experience of diagnosing and treating a case of HCDD, and combine this with a literature review to further explore the correlation between PCDs and HCDD, which has important clinical value.
Affiliation(s)
- Qingqing Rao
  - Graduate School of Shenzhen University, Shenzhen 518060, Guangdong, China; Department of Nephrology, Shenzhen Second People's Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen 518000, Guangdong, China
- Ricong Xu
  - Department of Nephrology, Shenzhen Second People's Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen 518000, Guangdong, China
- Qijun Wan
  - Department of Nephrology, Shenzhen Second People's Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen 518000, Guangdong, China
11
Lefranc MP, Lefranc G. IMGT® Homo sapiens IG and TR Loci, Gene Order, CNV and Haplotypes: New Concepts as a Paradigm for Jawed Vertebrates Genome Assemblies. Biomolecules 2022; 12:381. [PMID: 35327572 PMCID: PMC8945572 DOI: 10.3390/biom12030381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/25/2022] [Revised: 02/21/2022] [Accepted: 02/24/2022] [Indexed: 02/04/2023] Open
Abstract
IMGT®, the international ImMunoGeneTics information system®, created in 1989, by Marie-Paule Lefranc (Université de Montpellier and CNRS), marked the advent of immunoinformatics, a new science which emerged at the interface between immunogenetics and bioinformatics for the study of the adaptive immune responses. IMGT® is based on a standardized nomenclature of the immunoglobulin (IG) and T cell receptor (TR) genes and alleles from fish to humans and on the IMGT unique numbering for the variable (V) and constant (C) domains of the immunoglobulin superfamily (IgSF) of vertebrates and invertebrates, and for the groove (G) domain of the major histocompatibility (MH) and MH superfamily (MhSF) proteins. IMGT® comprises 7 databases, 17 tools and more than 25,000 pages of web resources for sequences, genes and structures, based on the IMGT Scientific chart rules generated from the IMGT-ONTOLOGY axioms and concepts. IMGT® reference directories are used for the analysis of the NGS high-throughput expressed IG and TR repertoires (natural, synthetic and/or bioengineered) and for bridging sequences, two-dimensional (2D) and three-dimensional (3D) structures. This manuscript focuses on the IMGT®Homo sapiens IG and TR loci, gene order, copy number variation (CNV) and haplotypes new concepts, as a paradigm for jawed vertebrates genome assemblies.
Affiliation(s)
- Marie-Paule Lefranc
- IMGT®, The International ImMunoGeneTics Information System®, Laboratoire d’Immuno Génétique Moléculaire (LIGM), Institut de Génétique Humaine (IGH), Université de Montpellier (UM), Centre National de la Recherche Scientifique (CNRS), UMR 9002 CNRS-UM, 141 rue de la Cardonille, CEDEX 5, 34396 Montpellier, France
- Gérard Lefranc
- IMGT®, The International ImMunoGeneTics Information System®, Laboratoire d’Immuno Génétique Moléculaire (LIGM), Institut de Génétique Humaine (IGH), Université de Montpellier (UM), Centre National de la Recherche Scientifique (CNRS), UMR 9002 CNRS-UM, 141 rue de la Cardonille, CEDEX 5, 34396 Montpellier, France
12
Heather JM, Spindler MJ, Alonso M, Shui Y, Millar DG, Johnson D, Cobbold M, Hata A. Stitchr: stitching coding TCR nucleotide sequences from V/J/CDR3 information. Nucleic Acids Res 2022; 50:e68. [PMID: 35325179 PMCID: PMC9262623 DOI: 10.1093/nar/gkac190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Revised: 02/18/2022] [Accepted: 03/09/2022] [Indexed: 11/17/2022]
Abstract
The study and manipulation of T cell receptors (TCRs) is central to multiple fields across basic and translational immunology research. Produced by V(D)J recombination, TCRs are often only recorded in the literature and data repositories as a combination of their V and J gene symbols, plus their hypervariable CDR3 amino acid sequence. However, numerous applications require full-length coding nucleotide sequences. Here we present Stitchr, a software tool developed to specifically address this limitation. Given minimal V/J/CDR3 information, Stitchr produces complete coding sequences representing a fully spliced TCR cDNA. Due to its modular design, Stitchr can be used for TCR engineering using either published germline or novel/modified variable and constant region sequences. Sequences produced by Stitchr were validated by synthesizing and transducing TCR sequences into Jurkat cells, recapitulating the expected antigen specificity of the parental TCR. Using a companion script, Thimble, we demonstrate that Stitchr can process a million TCRs in under ten minutes using a standard desktop personal computer. By systematizing the production and modification of TCR sequences, we propose that Stitchr will increase the speed, repeatability, and reproducibility of TCR research. Stitchr is available on GitHub.
Affiliation(s)
- James M Heather
- To whom correspondence should be addressed. Tel: +1 617 724 0104;
- David G Millar
- Massachusetts General Hospital Cancer Center, Charlestown, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Mark Cobbold
- Massachusetts General Hospital Cancer Center, Charlestown, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Aaron N Hata
- Correspondence may also be addressed to Aaron N. Hata. Tel: +1 617 724 3442;
13
Babrak L, Marquez S, Busse CE, Lees WD, Miho E, Ohlin M, Rosenfeld AM, Stervbo U, Watson CT, Schramm CA. Adaptive Immune Receptor Repertoire (AIRR) Community Guide to TR and IG Gene Annotation. Methods Mol Biol 2022; 2453:279-296. [PMID: 35622332 PMCID: PMC9761530 DOI: 10.1007/978-1-0716-2115-8_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
High-throughput sequencing of adaptive immune receptor repertoires (AIRR, i.e., IG and TR) has revolutionized the ability to carry out large-scale experiments to study the adaptive immune response. Since the method was first introduced in 2009, AIRR sequencing (AIRR-Seq) has been applied to survey the immune state of individuals, identify antigen-specific or immune-state-associated signatures of immune responses, study the development of the antibody immune response, and guide the development of vaccines and antibody therapies. Recent advancements in the technology include sequencing at the single-cell level and in parallel with gene expression, which allows the introduction of multi-omics approaches to understand in detail the adaptive immune response. Analyzing AIRR-seq data can prove challenging even with high-quality sequencing, in part due to the many steps involved and the need to parameterize each step. In this chapter, we outline key factors to consider when preprocessing raw AIRR-Seq data and annotating the genetic origins of the rearranged receptors. We also highlight a number of common difficulties with common AIRR-seq data processing and provide strategies to address them.
Affiliation(s)
- Lmar Babrak
- Institute of Biomedical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
- Susanna Marquez
- Department of Pathology, Yale School of Medicine, New Haven, CT, USA
- Christian E Busse
- Division of B Cell Immunology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- William D Lees
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, UK
- Enkelejda Miho
- Institute of Biomedical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- aiNET GmbH, Basel, Switzerland
- Mats Ohlin
- Department of Immunotechnology, Lund University, Lund, Sweden
- Aaron M Rosenfeld
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Ulrik Stervbo
- Center for Translational Medicine, Immunology, and Transplantation, Medical Department I, Marien Hospital Herne, University Hospital of the Ruhr-University Bochum, Herne, Germany
- Immundiagnostik, Marien Hospital Herne, University Hospital of the Ruhr-University Bochum, Herne, Germany
- Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Chaim A Schramm
- Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA.
14
Manso T, Folch G, Giudicelli V, Jabado-Michaloud J, Kushwaha A, Nguefack Ngoune V, Georga M, Papadaki A, Debbagh C, Pégorier P, Bertignac M, Hadi-Saljoqi S, Chentli I, Cherouali K, Aouinti S, El Hamwi A, Albani A, Elazami Elhassani M, Viart B, Goret A, Tran A, Sanou G, Rollin M, Duroux P, Kossida S. IMGT® databases, related tools and web resources through three main axes of research and development. Nucleic Acids Res 2021; 50:D1262-D1272. [PMID: 34875068 PMCID: PMC8728119 DOI: 10.1093/nar/gkab1136] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/14/2021] [Revised: 10/26/2021] [Accepted: 11/28/2021] [Indexed: 11/15/2022]
Abstract
IMGT®, the international ImMunoGeneTics information system®, http://www.imgt.org/, is at the forefront of the immunogenetics and immunoinformatics fields with more than 30 years of experience. IMGT® makes available databases and tools to the scientific community pertaining to the adaptive immune response, based on the IMGT-ONTOLOGY. We focus on the recent features of the IMGT® databases, tools, reference directories and web resources, within the three main axes of IMGT® research and development. Axis I consists in understanding the adaptive immune response, by deciphering the identification and characterization of the immunoglobulin (IG) and T cell receptor (TR) genes in jawed vertebrates. It is the starting point of the two other axes, namely the analysis and exploration of the expressed IG and TR repertoires based on comparison with IMGT reference directories in normal and pathological situations (Axis II) and the analysis of amino acid changes and functions of 2D and 3D structures of antibody and TR engineering (Axis III).
Affiliation(s)
- Taciana Manso
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Géraldine Folch
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Véronique Giudicelli
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Joumana Jabado-Michaloud
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Anjana Kushwaha
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Viviane Nguefack Ngoune
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Maria Georga
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Ariadni Papadaki
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Chahrazed Debbagh
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Perrine Pégorier
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Morgane Bertignac
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Saida Hadi-Saljoqi
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Imène Chentli
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Karima Cherouali
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Safa Aouinti
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Amar El Hamwi
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Alexandre Albani
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Merouane Elazami Elhassani
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Benjamin Viart
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Agathe Goret
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Anna Tran
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Gaoussou Sanou
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Maël Rollin
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Patrice Duroux
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
- Sofia Kossida
- IMGT®, the international ImMunoGeneTics Information System®, Scientific Research National Center (CNRS), Institute of Human Genetics (IGH), University of Montpellier (UM), Montpellier, France
15
Huang Y, Thörnqvist L, Ohlin M. Computational Inference, Validation, and Analysis of 5'UTR-Leader Sequences of Alleles of Immunoglobulin Heavy Chain Variable Genes. Front Immunol 2021; 12:730105. [PMID: 34671351 PMCID: PMC8521166 DOI: 10.3389/fimmu.2021.730105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2021] [Accepted: 09/06/2021] [Indexed: 12/05/2022]
Abstract
Upstream and downstream sequences of immunoglobulin genes may affect the expression of such genes. However, these sequences are rarely studied or characterized in most studies of immunoglobulin repertoires. Inference from large, rearranged immunoglobulin transcriptome data sets offers an opportunity to define the upstream regions (5'-untranslated regions and leader sequences). We have now established a new data pre-processing procedure to eliminate artifacts caused by a 5'-RACE library generation process, reanalyzed a previously studied data set defining human immunoglobulin heavy chain genes, and identified novel upstream regions, as well as previously identified upstream regions that may have been identified in error. Upstream sequences were also identified for a set of previously uncharacterized germline gene alleles. Several novel upstream region variants were validated, for instance by their segregation to a single haplotype in heterozygotic subjects. SNPs representing several sequence variants were identified from population data. Finally, based on the outcomes of the analysis, we define a set of testable hypotheses with respect to the placement of particular alleles in complex IGHV locus haplotypes, and discuss the evolutionary relatedness of particular heavy chain variable genes based on sequences of their upstream regions.
Affiliation(s)
- Mats Ohlin
- Department of Immunotechnology, Lund University, Lund, Sweden
16
Dhande IS, Doris PA. Genomics and Inflammation in Cardiovascular Disease. Compr Physiol 2021; 11:2433-2454. [PMID: 34570903 DOI: 10.1002/cphy.c200032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Chronic cardiovascular diseases are associated with inflammatory responses within the blood vessels and end organs. The origin of this inflammation has not been certain, and neither is its relationship to disease clear. There is a need to determine whether this association is causal or coincidental to the processes leading to cardiovascular disease. These processes are themselves complex: many cardiovascular diseases arise in conjunction with the presence of sustained elevation of blood pressure. Inflammatory processes have been linked to hypertension, and causality has been suggested. Evidence of causality poses the difficult challenge of linking the integrated and multifaceted biology of blood pressure regulation with vascular function and complex elements of immune system function. These include both innate and adaptive immunity, as well as interactions between the host immune system and the omnipresent microorganisms that are encountered in the environment and that colonize and exist in commensal relationship with the host. Progress has been made in this task and has drawn on experimental approaches in animals, many of which have focused on hypertension occurring with prolonged infusion of angiotensin II. These laboratory studies are complemented by studies that seek to inform disease mechanism by examining the genomic basis of heritable disease susceptibility in human populations. In this realm too, evidence has emerged that implicates genetic variation affecting immunity in disease pathogenesis. In this article, we survey the genetic and genomic evidence linking high blood pressure and its end-organ injuries to immune system function and examine evidence that genomic factors can influence disease risk. © 2021 American Physiological Society. Compr Physiol 11:1-22, 2021.
Affiliation(s)
- Isha S Dhande
- Center for Human Genetics, Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Peter A Doris
- Center for Human Genetics, Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, Texas, USA
17
Mikocziova I, Greiff V, Sollid LM. Immunoglobulin germline gene variation and its impact on human disease. Genes Immun 2021; 22:205-217. [PMID: 34175903 PMCID: PMC8234759 DOI: 10.1038/s41435-021-00145-5] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 04/12/2021] [Revised: 06/01/2021] [Accepted: 06/10/2021] [Indexed: 02/06/2023]
Abstract
Immunoglobulins (Ig) play an important role in the immune system both when expressed as antigen receptors on the cell surface of B cells and as antibodies secreted into extracellular fluids. The advent of high-throughput sequencing methods has enabled the investigation of human Ig repertoires at unprecedented depth. This has led to the discovery of many previously unreported germline Ig alleles. Moreover, it is becoming clear that convergent and stereotypic antibody responses are common where different individuals recognise defined antigenic epitopes with the use of the same Ig V genes. Thus, germline V gene variation is increasingly being linked to the differential capacity of generating an effective immune response, which might lead to varying disease susceptibility. Here, we review recent evidence of how germline variation in Ig genes impacts the Ig repertoire and its subsequent effects on the adaptive immune response in vaccination, infection, and autoimmunity.
Affiliation(s)
- Ivana Mikocziova
- Department of Immunology, University of Oslo, Oslo, Norway
- K. G. Jebsen Centre for Coeliac Disease Research, University of Oslo and Oslo University Hospital, Oslo, Norway
- Victor Greiff
- Department of Immunology, University of Oslo, Oslo, Norway
- Ludvig M Sollid
- Department of Immunology, University of Oslo, Oslo, Norway.
- K. G. Jebsen Centre for Coeliac Disease Research, University of Oslo and Oslo University Hospital, Oslo, Norway.
18
Khatri I, Berkowska MA, van den Akker EB, Teodosio C, Reinders MJT, van Dongen JJM. Population matched (pm) germline allelic variants of immunoglobulin (IG) loci: Relevance in infectious diseases and vaccination studies in human populations. Genes Immun 2021; 22:172-186. [PMID: 34120151 PMCID: PMC8196923 DOI: 10.1038/s41435-021-00143-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/01/2021] [Revised: 05/12/2021] [Accepted: 06/01/2021] [Indexed: 02/05/2023]
Abstract
Immunoglobulin (IG) loci harbor inter-individual allelic variants in many different germline IG variable, diversity and joining genes of the IG heavy (IGH), kappa (IGK) and lambda (IGL) loci, which together form the genetic basis of the highly diverse antigen-specific B-cell receptors. These allelic variants can be shared between or be specific to human populations. The current immunogenetics resources gather the germline alleles but lack their population specificity, which poses limitations for disease-association studies related to immune responses in different human populations. Therefore, we systematically identified germline alleles from 26 different human populations around the world, profiled by "1000 Genomes" data. We identified 409 IGHV, 179 IGKV, and 199 IGLV germline alleles supported by at least seven haplotypes. The diversity of germline alleles is the highest in Africans. Remarkably, the variants in the identified novel alleles show strikingly conserved patterns, the same as found in other IG databases, suggesting over-time evolutionary selection processes. We could relate the genetic variants to population-specific immune responses, e.g. IGHV1-69 for flu in Africans. The population matched IG (pmIG) resource will enhance our understanding of the SHM-related B-cell receptor selection processes in (infectious) diseases and vaccination within and between different human populations.
Affiliation(s)
- Indu Khatri
- Department Immunology, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Erik B van den Akker
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Department Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Cristina Teodosio
- Department Immunology, Leiden University Medical Center, Leiden, The Netherlands
- Marcel J T Reinders
- Leiden Computational Biology Center, Leiden University Medical Center, Leiden, The Netherlands
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
19
Abstract
Immunogenomics studies have been largely limited to individuals of European ancestry, restricting the ability to identify variation in human adaptive immune responses across populations. Inclusion of a greater diversity of individuals in immunogenomics studies will substantially enhance our understanding of human immunology.
20
Arnaout RA, Prak ETL, Schwab N, Rubelt F. The Future of Blood Testing Is the Immunome. Front Immunol 2021; 12:626793. [PMID: 33790897 PMCID: PMC8005722 DOI: 10.3389/fimmu.2021.626793] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 11/06/2020] [Accepted: 01/19/2021] [Indexed: 12/13/2022]
Abstract
It is increasingly clear that an extraordinarily diverse range of clinically important conditions—including infections, vaccinations, autoimmune diseases, transplants, transfusion reactions, aging, and cancers—leave telltale signatures in the millions of V(D)J-rearranged antibody and T cell receptor [TR per the Human Genome Organization (HUGO) nomenclature but more commonly known as TCR] genes collectively expressed by a person’s B cells (antibodies) and T cells. We refer to these as the immunome. Because of its diversity and complexity, the immunome provides singular opportunities for advancing personalized medicine by serving as the substrate for a highly multiplexed, near-universal blood test. Here we discuss some of these opportunities, the current state of immunome-based diagnostics, and highlight some of the challenges involved. We conclude with a call to clinicians, researchers, and others to join efforts with the Adaptive Immune Receptor Repertoire Community (AIRR-C) to realize the diagnostic potential of the immunome.
Affiliation(s)
- Ramy A Arnaout
- Department of Pathology and Division of Clinical Informatics, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States; Department of Pathology, Harvard Medical School, Boston, MA, United States
- Eline T Luning Prak
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Nicholas Schwab
- Department of Neurology and Institute of Translational Neurology, University of Muenster, Muenster, Germany
- Florian Rubelt
- Roche Sequencing Solutions, Pleasanton, CA, United States
21
Ohlin M. Poorly Expressed Alleles of Several Human Immunoglobulin Heavy Chain Variable Genes are Common in the Human Population. Front Immunol 2021; 11:603980. [PMID: 33717051 PMCID: PMC7943739 DOI: 10.3389/fimmu.2020.603980] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/08/2020] [Accepted: 12/08/2020] [Indexed: 12/23/2022]
Abstract
Extensive diversity has been identified in the human heavy chain immunoglobulin locus, including allelic variation, gene duplication, and insertion/deletion events. Several genes have been suggested to be deleted in many haplotypes. Such findings have commonly been based on inference of the germline repertoire from data sets covering antibody heavy chain encoding transcripts. The inference process operates under conditions that may limit identification of genes transcribed at low levels. The presence of rare transcripts that would indicate the existence of poorly expressed alleles in haplotypes that otherwise appear to have deleted these genes has been assessed in the present study. Alleles IGHV1-2*05, IGHV1-3*02, IGHV4-4*01, and IGHV7-4-1*01 were all identified as being expressed from multiple haplotypes, but only at low levels, haplotypes that by inference often appeared not to express these genes at all. These genes are thus not as commonly deleted as previously thought. An assessment of the 5' untranslated region (up to and including the TATA-box), the signal peptide-encoding part of the gene, and the 3'-heptamer suggests that the alleles have no or minimal sequence difference in these regions in comparison to highly expressed alleles. This suggests that they may be able to participate in immunoglobulin gene rearrangement, transcription and translation. However, all four poorly expressed alleles harbor unusual sequence variants within their coding region that may compromise the functionality of the encoded products, thereby limiting their incorporation into the immunoglobulin repertoire. Transcripts based on IGHV7-4-1*01 that had undergone somatic hypermutation and class switch had mutated the codon that encoded the unusual residue in framework region 3 (cysteine 92; located far from the antigen binding site). This finding further supports the poor compatibility of this unusual residue in a fully functional protein product.
Indications of a linkage disequilibrium were identified as IGHV1-2*05 and IGHV4-4*01 co-localized to the same haplotypes. Furthermore, transcripts of two of the poorly expressed alleles (IGHV1-3*02 and IGHV4-4*01) mostly do not encode in-frame, functional products, suggesting that these alleles might be essentially non-functional. It is proposed that the functionality status of immunoglobulin genes should also include assessment of their ability to encode functional protein products.
Affiliation(s)
- Mats Ohlin
- Department of Immunotechnology, Lund University, Lund, Sweden
22
Rhesus and cynomolgus macaque immunoglobulin heavy-chain genotyping yields comprehensive databases of germline VDJ alleles. Immunity 2021; 54:355-366.e4. [PMID: 33484642 DOI: 10.1016/j.immuni.2020.12.018] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 06/29/2020] [Revised: 10/19/2020] [Accepted: 12/30/2020] [Indexed: 12/20/2022]
Abstract
Definition of the specific germline immunoglobulin (Ig) alleles present in an individual is a critical first step to delineate the ontogeny and evolution of antigen-specific antibody responses. Rhesus and cynomolgus macaques are important animal models for pre-clinical studies, with four main sub-groups being used: Indian- and Chinese-origin rhesus macaques and Mauritian and Indonesian cynomolgus macaques. We applied the Ig gene inference tool IgDiscover and performed extensive Sanger sequencing-based genomic validation to define germline VDJ alleles in these 4 sub-groups, comprising 45 macaques in total. There was allelic overlap between Chinese- and Indian-origin rhesus macaques and also between the two macaque species, which is consistent with substantial admixture. The island-restricted Mauritian cynomolgus population displayed the lowest number of alleles of the sub-groups, yet maintained high individual allelic diversity. These comprehensive databases of germline IGH alleles for rhesus and cynomolgus macaques provide a resource toward the study of B cell responses in these important pre-clinical models.
23
Collins AM, Peres A, Corcoran MM, Watson CT, Yaari G, Lees WD, Ohlin M. Commentary on Population matched (pm) germline allelic variants of immunoglobulin (IG) loci: relevance in infectious diseases and vaccination studies in human populations. Genes Immun 2021; 22:335-338. [PMID: 34667305 PMCID: PMC8674141 DOI: 10.1038/s41435-021-00152-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/26/2021] [Revised: 09/29/2021] [Accepted: 10/05/2021] [Indexed: 12/11/2022]
Affiliation(s)
- Andrew M. Collins
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Ayelet Peres
- Bioengineering, Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel; Bar Ilan Institute of Nanotechnologies and Advanced Materials, Bar Ilan University, Ramat Gan, Israel
- Martin M. Corcoran
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- Corey T. Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Gur Yaari
- Bioengineering, Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel; Bar Ilan Institute of Nanotechnologies and Advanced Materials, Bar Ilan University, Ramat Gan, Israel
- William D. Lees
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, UK
- Mats Ohlin
- Department of Immunotechnology, Lund University, Lund, Sweden
| |
Collapse
|
24
|
Scott JK, Breden F. The adaptive immune receptor repertoire community as a model for FAIR stewardship of big immunology data. Current Opinion in Systems Biology 2020; 24:71-77. [PMID: 33073065 PMCID: PMC7547575 DOI: 10.1016/j.coisb.2020.10.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/25/2022]
Abstract
Systems biology involves network-oriented, computational approaches to modeling biological systems through analysis of big biological data. To contribute maximally to scientific progress, big biological data should be FAIR: findable, accessible, interoperable, and reusable. Here, we describe high-throughput sequencing data that characterize the vast diversity of B- and T-cell clones comprising the adaptive immune receptor repertoire (AIRR-seq data) and its contribution to our understanding of COVID-19 (coronavirus disease 19). We describe the accomplishments of the AIRR community, a grass-roots network of interdisciplinary laboratory scientists, bioinformaticians, and policy wonks, in creating and publishing standards, software and repositories for AIRR-seq data based on the FAIR principles.
Affiliation(s)
- Jamie K Scott
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- Felix Breden
  - Department of Biological Sciences, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
|
25
|
Collins AM, Yaari G, Shepherd AJ, Lees W, Watson CT. Germline immunoglobulin genes: Disease susceptibility genes hidden in plain sight? Current Opinion in Systems Biology 2020; 24:100-108. [PMID: 37008538 PMCID: PMC10062056 DOI: 10.1016/j.coisb.2020.10.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 02/06/2023]
Abstract
Immunoglobulin genes are rarely considered as disease susceptibility genes despite their obvious and central contributions to immune function. This appears to be a consequence of historical views on antibody repertoire formation that no longer stand, and of difficulties that until recently surrounded the documentation of the suite of antibody genes in any individual. If these important genes are to be accessible to GWAS studies, allelic variation within the human population needs to be better documented, and a curated set of genomic variations associated with antibody genes needs to be formulated. Repertoire studies arising from the COVID-19 pandemic provide an opportunity to meet these needs, and may provide insights into the profound variability that is seen in outcomes to this infection.
|
26
|
Abstract
Advances in reading, writing, and editing DNA are providing unprecedented insights into the complexity of immunological systems. This combination of systems and synthetic biology methods is enabling the quantitative and precise understanding of molecular recognition in adaptive immunity, thus providing a framework for reprogramming immune responses for translational medicine. In this review, we will highlight state-of-the-art methods such as immune repertoire sequencing, immunoinformatics, and immunogenomic engineering and their application toward adaptive immunity. We showcase novel and interdisciplinary approaches that have the promise of transforming the design and breadth of molecular and cellular immunotherapies.
Affiliation(s)
- Lucia Csepregi
  - Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Roy A. Ehling
  - Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Bastian Wagner
  - Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Sai T. Reddy
  - Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
|
27
|
Norman RA, Ambrosetti F, Bonvin AMJJ, Colwell LJ, Kelm S, Kumar S, Krawczyk K. Computational approaches to therapeutic antibody design: established methods and emerging trends. Brief Bioinform 2020; 21:1549-1567. [PMID: 31626279 PMCID: PMC7947987 DOI: 10.1093/bib/bbz095] [Citation(s) in RCA: 113] [Impact Index Per Article: 28.3] [Received: 04/25/2019] [Revised: 06/07/2019] [Accepted: 07/05/2019] [Indexed: 12/31/2022]
Abstract
Antibodies are proteins that recognize the molecular surfaces of potentially noxious molecules to mount an adaptive immune response or, in the case of autoimmune diseases, molecules that are part of healthy cells and tissues. Due to their binding versatility, antibodies are currently the largest class of biotherapeutics, with five monoclonal antibodies ranked in the top 10 blockbuster drugs. Computational advances in protein modelling and design can have a tangible impact on antibody-based therapeutic development. Antibody-specific computational protocols currently benefit from an increasing volume of data provided by next generation sequencing and application to related drug modalities based on traditional antibodies, such as nanobodies. Here we present a structured overview of available databases, methods and emerging trends in computational antibody analysis and contextualize them towards the engineering of candidate antibody therapeutics.
|
28
|
Smakaj E, Babrak L, Ohlin M, Shugay M, Briney B, Tosoni D, Galli C, Grobelsek V, D'Angelo I, Olson B, Reddy S, Greiff V, Trück J, Marquez S, Lees W, Miho E. Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences. Bioinformatics 2020; 36:1731-1739. [PMID: 31873728 PMCID: PMC7075533 DOI: 10.1093/bioinformatics/btz845] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 06/21/2019] [Revised: 10/21/2019] [Accepted: 12/19/2019] [Indexed: 01/01/2023]
Abstract
Summary: Antibody repertoires reveal insights into the biology of the adaptive immune system and empower diagnostics and therapeutics. There are currently multiple tools available for the annotation of antibody sequences. All downstream analyses such as choosing lead drug candidates depend on the correct annotation of these sequences; however, a thorough comparison of the performance of these tools has not been investigated. Here, we benchmark the performance of commonly used immunoinformatic tools, i.e. IMGT/HighV-QUEST, IgBLAST and MiXCR, in terms of reproducibility of annotation output, accuracy and speed using simulated and experimental high-throughput sequencing datasets. We analyzed changes in IMGT reference germline database in the last 10 years in order to assess the reproducibility of the annotation output. We found that only 73/183 (40%) V, D and J human genes were shared between the reference germline sets used by the tools. We found that the annotation results differed between tools. In terms of alignment accuracy, MiXCR had the highest average frequency of gene mishits, 0.02 mishit frequency and IgBLAST the lowest, 0.004 mishit frequency. Reproducibility in the output of complementarity determining three regions (CDR3 amino acids) ranged from 4.3% to 77.6% with preprocessed data. In addition, run time of the tools was assessed: MiXCR was the fastest tool for number of sequences processed per unit of time. These results indicate that immunoinformatic analyses greatly depend on the choice of bioinformatics tool. Our results support informed decision-making to immunoinformaticians based on repertoire composition and sequencing platforms.
Availability and implementation: All tools utilized in the paper are free for academic use.
Supplementary information: Supplementary data are available at Bioinformatics online.
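As a quick arithmetic check on the figures quoted in this abstract, the following minimal sketch recomputes the shared-gene percentage and the ratio between the two reported mishit frequencies. The variable names are illustrative only; the four numbers are taken directly from the abstract, and nothing here reproduces the authors' benchmarking code.

```python
# Figures quoted in the abstract (variable names are illustrative).
shared_genes = 73        # V, D and J genes shared between reference sets
total_genes = 183        # total V, D and J genes considered
mixcr_mishit = 0.02      # reported average mishit frequency for MiXCR
igblast_mishit = 0.004   # reported average mishit frequency for IgBLAST

# Fraction of genes shared between the reference germline sets.
shared_fraction = shared_genes / total_genes

# How many times more frequent mishits were for MiXCR than for IgBLAST.
mishit_ratio = mixcr_mishit / igblast_mishit

print(f"Shared germline genes: {shared_fraction:.1%}")
print(f"MiXCR / IgBLAST mishit ratio: {mishit_ratio:.1f}")
```

The shared fraction rounds to the 40% stated in the abstract, and the two mishit frequencies differ by a factor of about five.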
Affiliation(s)
- Erand Smakaj
  - Institute of Biomedical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz 4132, Switzerland
- Lmar Babrak
  - Institute of Biomedical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz 4132, Switzerland
- Mats Ohlin
  - Department of Immunotechnology, Lund University, Lund 223, Sweden
- Mikhail Shugay
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
- Bryan Briney
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Deniz Tosoni
  - Institute of Biomedical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz 4132, Switzerland
- Christopher Galli
  - Institute of Biomedical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz 4132, Switzerland
- Vendi Grobelsek
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
- Igor D'Angelo
  - One Amgen Center Drive, Amgen, Inc., Therapeutic Discovery/Molecular Engineering, Thousand Oaks, CA 91320, USA
- Branden Olson
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
  - Department of Statistics, University of Washington, Seattle, WA 98195, USA
- Sai Reddy
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
- Victor Greiff
  - Department of Immunology, University of Oslo, Oslo 0372, Norway
- Johannes Trück
  - Paediatric Immunology, Children's Research Center, University Children's Hospital, University of Zurich, Zurich 8032, Switzerland
- Susanna Marquez
  - Department of Pathology, Yale School of Medicine, New Haven, CT 06511, USA
- William Lees
  - Department of Biological Sciences and Institute of Structural and Molecular Biology, Birkbeck College, University of London, London WC1E 7HX, UK
- Enkelejda Miho
  - Institute of Biomedical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz 4132, Switzerland
  - aiNET GmbH, Switzerland Innovation Park Basel Area AG, Basel 4057, Switzerland
|
29
|
Ghraichy M, Galson JD, Kovaltsuk A, von Niederhäusern V, Pachlopnik Schmid J, Recher M, Jauch AJ, Miho E, Kelly DF, Deane CM, Trück J. Maturation of the Human Immunoglobulin Heavy Chain Repertoire With Age. Front Immunol 2020; 11:1734. [PMID: 32849618 PMCID: PMC7424015 DOI: 10.3389/fimmu.2020.01734] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 04/06/2020] [Accepted: 06/29/2020] [Indexed: 01/01/2023]
Abstract
B cells play a central role in adaptive immune processes, mainly through the production of antibodies. The maturation of the B cell system with age is poorly studied. We extensively investigated age-related alterations of naïve and antigen-experienced immunoglobulin heavy chain (IgH) repertoires. The most significant changes were observed in the first 10 years of life, and were characterized by altered immunoglobulin gene usage and an increased frequency of mutated antibodies structurally diverging from their germline precursors. Older age was associated with an increased usage of downstream IgH constant region genes and fewer antibodies with self-reactive properties. As mutations accumulated with age, the frequency of germline-encoded self-reactive antibodies decreased, indicating a possible beneficial role of self-reactive B cells in the developing immune system. Our results suggest a continuous process of change through childhood across a broad range of parameters characterizing IgH repertoires and stress the importance of using well-selected, age-appropriate controls in IgH studies.
Affiliation(s)
- Marie Ghraichy
  - Division of Immunology, University Children's Hospital, University of Zurich, Zurich, Switzerland
  - Children's Research Center, University of Zurich, Zurich, Switzerland
- Jacob D Galson
  - Children's Research Center, University of Zurich, Zurich, Switzerland
  - Alchemab Therapeutics Ltd, London, United Kingdom
- Valentin von Niederhäusern
  - Division of Immunology, University Children's Hospital, University of Zurich, Zurich, Switzerland
  - Children's Research Center, University of Zurich, Zurich, Switzerland
- Jana Pachlopnik Schmid
  - Division of Immunology, University Children's Hospital, University of Zurich, Zurich, Switzerland
  - Children's Research Center, University of Zurich, Zurich, Switzerland
- Mike Recher
  - Immunodeficiency Laboratory, Department of Biomedicine, University and University Hospital of Basel, Basel, Switzerland
- Annaïse J Jauch
  - Immunodeficiency Laboratory, Department of Biomedicine, University and University Hospital of Basel, Basel, Switzerland
- Enkelejda Miho
  - Institute of Medical Engineering and Medical Informatics, University of Applied Sciences and Arts Northwestern Switzerland FHNW, Muttenz, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - aiNET GmbH, Basel, Switzerland
- Dominic F Kelly
  - Oxford Vaccine Group, Department of Paediatrics, University of Oxford, Oxford, United Kingdom
  - Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom
- Charlotte M Deane
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
- Johannes Trück
  - Division of Immunology, University Children's Hospital, University of Zurich, Zurich, Switzerland
  - Children's Research Center, University of Zurich, Zurich, Switzerland
|
30
|
Ott JA, Harrison J, Flajnik MF, Criscitiello MF. Nurse shark T-cell receptors employ somatic hypermutation preferentially to alter alpha/delta variable segments associated with alpha constant region. Eur J Immunol 2020; 50:1307-1320. [PMID: 32346855 DOI: 10.1002/eji.201948495] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/10/2019] [Revised: 03/02/2020] [Accepted: 04/24/2020] [Indexed: 12/25/2022]
Abstract
In addition to canonical TCR and BCR, cartilaginous fish assemble noncanonical TCR that employ various B-cell components. For example, shark T cells associate alpha (TCR-α) or delta (TCR-δ) constant (C) regions with Ig heavy chain (H) variable (V) segments or TCR-associated Ig-like V (TAILV) segments to form chimeric IgV-TCR, and combine TCRδC with both Ig-like and TCR-like V segments to form the doubly rearranging NAR-TCR. Activation-induced (cytidine) deaminase-catalyzed somatic hypermutation (SHM), typically used for B-cell affinity maturation, also is used by TCR-α during selection in the shark thymus presumably to salvage failing receptors. Here, we found that the use of SHM by nurse shark TCR varies depending on the particular V segment or C region used. First, SHM significantly alters alpha/delta V (TCRαδV) segments using TCR αC but not δC. Second, mutation to IgHV segments associated with TCR δC was reduced compared to mutation to TCR αδV associated with TCR αC. Mutation was present but limited in V segments of all other TCR chains including NAR-TCR. Unexpectedly, we found preferential rearrangement of the noncanonical IgHV-TCRδC over canonical TCR αδV-TCRδC receptors. The differential use of SHM may reveal how activation-induced (cytidine) deaminase targets V regions.
Affiliation(s)
- Jeannine A Ott
  - Comparative Immunogenetics Laboratory, Department of Veterinary Pathobiology, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX, USA
- Jenna Harrison
  - Comparative Immunogenetics Laboratory, Department of Veterinary Pathobiology, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX, USA
- Martin F Flajnik
  - Department of Microbiology and Immunology, School of Medicine, University of Maryland at Baltimore, Baltimore, MD, USA
- Michael F Criscitiello
  - Comparative Immunogenetics Laboratory, Department of Veterinary Pathobiology, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX, USA
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Texas A&M University, College Station, TX, USA
|
31
|
Lees W, Busse CE, Corcoran M, Ohlin M, Scheepers C, Matsen FA, Yaari G, Watson CT, Collins A, Shepherd AJ. OGRDB: a reference database of inferred immune receptor genes. Nucleic Acids Res 2020; 48:D964-D970. [PMID: 31566225 PMCID: PMC6943078 DOI: 10.1093/nar/gkz822] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 08/11/2019] [Revised: 09/05/2019] [Accepted: 09/16/2019] [Indexed: 12/20/2022]
Abstract
High-throughput sequencing of the adaptive immune receptor repertoire (AIRR-seq) is providing unprecedented insights into the immune response to disease and into the development of immune disorders. The accurate interpretation of AIRR-seq data depends on the existence of comprehensive germline gene reference sets. Current sets are known to be incomplete and unrepresentative of the degree of polymorphism and diversity in human and animal populations. A key issue is the complexity of the genomic regions in which they lie, which, because of the presence of multiple repeats, insertions and deletions, have not proved tractable with short-read whole genome sequencing. Recently, tools and methods for inferring such gene sequences from AIRR-seq datasets have become available, and a community approach has been developed for the expert review and publication of such inferences. Here, we present OGRDB, the Open Germline Receptor Database (https://ogrdb.airr-community.org), a public resource for the submission, review and publication of previously unknown receptor germline sequences together with supporting evidence.
Affiliation(s)
- William Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, University of London, London WC1E 7HX, UK
- Christian E Busse
  - Division of B Cell Immunology, German Cancer Research Center, 69120 Heidelberg, Germany
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Box 280, 171 77 Stockholm, Sweden
- Mats Ohlin
  - Department of Immunotechnology, Lund University, Medicon Village, S-223 81 Lund, Sweden
- Cathrine Scheepers
  - Center for HIV and STIs, National Institute for Communicable Diseases of the National Health Laboratory Service, Sandringam, Gauteng 2131, South Africa
  - Antibody Immunity Research Unit, School of Pathology, University of the Witwatersrand, Johannesburg 2050, South Africa
- Frederick A Matsen
  - Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024, USA
- Gur Yaari
  - Faculty of Engineering, Bar Ilan University, Ramat Gan 5290002, Israel
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY 40202, USA
- Andrew Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia
- Adrian J Shepherd
  - Institute of Structural and Molecular Biology, Birkbeck College, University of London, London WC1E 7HX, UK
|
32
|
Inter- and intraspecies comparison of phylogenetic fingerprints and sequence diversity of immunoglobulin variable genes. Immunogenetics 2020; 72:279-294. [PMID: 32367185 DOI: 10.1007/s00251-020-01164-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/24/2020] [Accepted: 04/13/2020] [Indexed: 10/24/2022]
Abstract
Protection and neutralization of a vast array of pathogens is accomplished by the tremendous diversity of the B cell receptor (BCR) repertoire. For jawed vertebrates, this diversity is initiated via the somatic recombination of immunoglobulin (Ig) germline elements. While it is clear that the number of these germline segments differs from species to species, the extent of cross-species sequence diversity remains largely uncharacterized. Here we use extensive computational and statistical methods to investigate the sequence diversity and evolutionary relationship between Ig variable (V), diversity (D), and joining (J) germline segments across nine commonly studied species ranging from zebrafish to human. Metrics such as guanine-cytosine (GC) content showed low redundancy across Ig germline genes within a given species. Other comparisons, including amino acid motifs, evolutionary selection, and sequence diversity, revealed species-specific properties. Additionally, we showed that the germline-encoded diversity differs across antibody (recombined V-D-J) repertoires of various B cell subsets. To facilitate future comparative immunogenomics analysis, we created VDJgermlines, an R package that contains the germline sequences from multiple species. Our study informs strategies for the humanization and engineering of therapeutic antibodies.
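One of the metrics named in this abstract, guanine-cytosine (GC) content, is simple to compute. The sketch below is a minimal illustration of that metric only: the function name and the toy sequences are hypothetical, the sequences are not real germline segments, and this is not code from the VDJgermlines package.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical toy fragments, invented for illustration only.
toy_segments = {
    "toy_V": "CAGGTGCAGCTG",
    "toy_J": "ACTACTTTGACT",
}

for name, seq in toy_segments.items():
    print(name, round(gc_content(seq), 3))
```

Comparing such values across the germline segments of one species is one way to see whether a metric separates the segments or is largely redundant among them.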
|
33
|
Bhardwaj V, Franceschetti M, Rao R, Pevzner PA, Safonova Y. Automated analysis of immunosequencing datasets reveals novel immunoglobulin D genes across diverse species. PLoS Comput Biol 2020; 16:e1007837. [PMID: 32339161 PMCID: PMC7295240 DOI: 10.1371/journal.pcbi.1007837] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/20/2019] [Revised: 06/15/2020] [Accepted: 04/01/2020] [Indexed: 12/30/2022]
Abstract
Immunoglobulin genes are formed through V(D)J recombination, which joins the variable (V), diversity (D), and joining (J) germline genes. Since variations in germline genes have been linked to various diseases, personalized immunogenomics focuses on finding alleles of germline genes across various patients. Although reconstruction of V and J genes is a well-studied problem, the more challenging task of reconstructing D genes remained open until the IgScout algorithm was developed in 2019. In this work, we address limitations of IgScout by developing a probabilistic MINING-D algorithm for D gene reconstruction, apply it to hundreds of immunosequencing datasets from multiple species, and validate the newly inferred D genes by analyzing diverse whole genome sequencing datasets and haplotyping heterozygous V genes. Antibodies provide specific binding to an enormous range of antigens and represent a key component of the adaptive immune system. Immunosequencing has emerged as a method of choice for generating millions of reads that sample antibody repertoires and provides insights into monitoring immune response to disease and vaccination. Most of the previous immunogenomics studies rely on the reference germline genes in the immunoglobulin locus rather than the germline genes in a specific patient. This approach is deficient since the set of known germline genes is incomplete (particularly for non-European humans and non-human species) and contains alleles that resulted from sequencing and annotation errors. The problem of de novo inference of diversity (D) genes from immunosequencing data remained open until the IgScout algorithm was developed in 2019. We address limitations of IgScout by developing a probabilistic MINING-D algorithm for D gene reconstruction and infer multiple D genes across multiple species that are not present in standard databases.
Affiliation(s)
- Vinnu Bhardwaj
  - Electrical and Computer Engineering Department, University of California San Diego, San Diego, California, United States of America
- Massimo Franceschetti
  - Electrical and Computer Engineering Department, University of California San Diego, San Diego, California, United States of America
- Ramesh Rao
  - Electrical and Computer Engineering Department, University of California San Diego, San Diego, California, United States of America
  - Qualcomm Institute, University of California San Diego, San Diego, California, United States of America
- Pavel A. Pevzner
  - Computer Science and Engineering Department, University of California San Diego, San Diego, California, United States of America
- Yana Safonova
  - Computer Science and Engineering Department, University of California San Diego, San Diego, California, United States of America
  - Center for Information Theory and Applications, University of California San Diego, San Diego, California, United States of America
|
34
|
Ford M, Haghshenas E, Watson CT, Sahinalp SC. Genotyping and Copy Number Analysis of Immunoglobin Heavy Chain Variable Genes Using Long Reads. iScience 2020; 23:100883. [PMID: 32109676 PMCID: PMC7044747 DOI: 10.1016/j.isci.2020.100883] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/15/2019] [Revised: 11/08/2019] [Accepted: 01/29/2020] [Indexed: 11/22/2022]
Abstract
One of the remaining challenges to describing an individual's genetic variation lies in the highly heterogeneous and complex genomic regions that impede the use of classical reference-guided mapping and assembly approaches. One such region is the Immunoglobulin heavy chain locus (IGH), which is critical for the development of antibodies and the adaptive immune system. We describe ImmunoTyper, the first PacBio-based genotyping and copy number calling tool specifically designed for IGH V genes (IGHV). We demonstrate that ImmunoTyper's multi-stage clustering and combinatorial optimization approach represents the most comprehensive IGHV genotyping approach published to date, through validation using gold-standard IGH reference sequence. This preliminary work establishes the feasibility of fine-grained genotype and copy number analysis using error-prone long reads in complex multi-gene loci and opens the door for in-depth investigation into IGHV heterogeneity using accessible and increasingly common whole-genome sequence.
Affiliation(s)
- Michael Ford
  - School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Ehsan Haghshenas
  - School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville 40292, USA
- S Cenk Sahinalp
  - Cancer Data Science Laboratory, National Cancer Institute, Bethesda, MD 20892, USA
|
35
|
Omer A, Shemesh O, Peres A, Polak P, Shepherd AJ, Watson C, Boyd SD, Collins AM, Lees W, Yaari G. VDJbase: an adaptive immune receptor genotype and haplotype database. Nucleic Acids Res 2020; 48:D1051-D1056. [PMID: 31602484 PMCID: PMC6943044 DOI: 10.1093/nar/gkz872] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/11/2019] [Revised: 09/19/2019] [Accepted: 10/01/2019] [Indexed: 12/14/2022]
Abstract
VDJbase is a publicly available database that offers easy searching of data describing the complete sets of gene sequences (genotypes and haplotypes) inferred from adaptive immune receptor repertoire sequencing datasets. VDJbase is designed to act as a resource that will allow the scientific community to explore the genetic variability of the immunoglobulin (Ig) and T cell receptor (TR) gene loci. It can also assist in the investigation of Ig- and TR-related genetic predispositions to diseases. Our database includes web-based query and online tools to assist in visualization and analysis of the genotype and haplotype data. It enables users to detect those alleles and genes that are significantly over-represented in a particular population, in terms of genotype, haplotype and gene expression. The database website can be freely accessed at https://www.vdjbase.org/, and no login is required. The data and code use creative common licenses and are freely downloadable from https://bitbucket.org/account/user/yaarilab/projects/GPHP.
Affiliation(s)
- Aviv Omer
  - Bioengineering, Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel
- Or Shemesh
  - Bioengineering, Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel
- Ayelet Peres
  - Bioengineering, Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel
- Pazit Polak
  - Bioengineering, Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel
- Adrian J Shepherd
  - Institute of Structural and Molecular Biology, Birkbeck, University of London, London, UK
- Corey T Watson
  - University of Louisville School of Medicine, Biochemistry and Molecular Genetics, Louisville, KY 40292, USA
- Scott D Boyd
  - Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Andrew M Collins
  - School of Biotechnology and Biomolecular Sciences, University of NSW, Kensington, Sydney, NSW 2052, Australia
- William Lees
  - Institute of Structural and Molecular Biology, Birkbeck, University of London, London, UK
- Gur Yaari
  - Bioengineering, Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel
|
36
|
Abstract
The origins of the various elements in the human antibody repertoire have been and still are subject to considerable uncertainty. Uncertainty in respect of whether the various elements have always served a specific defense function or whether they were co-opted from other organismal roles to form a crude naïve repertoire that then became more complex as combinatorial mechanisms were added. Estimates of the current size of the human antibody naïve repertoire are also widely debated with numbers anywhere from 10 million members, based on experimentally derived numbers, to in excess of one thousand trillion members or more, based on the different sequences derived from theoretical combinatorial calculations. There are questions that are relevant at both ends of this number spectrum. At the lower bound it could be questioned whether this is an insufficient repertoire size to counter all the potential antigen-bearing pathogens. At the upper bound the question is rather simpler: How can any individual interrogate such an astronomical number of antibody-bearing B cells in a timeframe that is meaningful? This review evaluates the evolutionary aspects of the adaptive immune system, the calculations that lead to the large repertoire estimates, some of the experimental evidence pointing to a more restricted repertoire whose variation appears to derive from convergent 'structure and specificity features', and includes a theoretical model that seems to support it. Finally, a solution that may reconcile the size difference anomaly, which is still a hot subject of debate, is suggested.
|
37
|
Cowell LG. The Diagnostic, Prognostic, and Therapeutic Potential of Adaptive Immune Receptor Repertoire Profiling in Cancer. Cancer Res 2019; 80:643-654. [PMID: 31888887 DOI: 10.1158/0008-5472.can-19-1457] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 05/10/2019] [Revised: 10/14/2019] [Accepted: 12/17/2019] [Indexed: 11/16/2022]
Abstract
Lymphocytes play a critical role in antitumor immune responses. They are directly targeted by some therapies, and the composition and spatial organization of intratumor T-cell populations are prognostic in some cancer types. A better understanding of lymphocyte population dynamics over the course of disease and in response to therapy is urgently needed to guide therapy decisions and to develop new therapy targets. Deep sequencing of the repertoire of antigen receptor-encoding genes expressed in a lymphocyte population has become a widely used approach for profiling the population's immune status. Lymphocyte antigen receptor repertoire deep sequencing data can be used to assess the clonal richness and diversity of lymphocyte populations; to track clone members over time, between tissues, and across lymphocyte subsets; to detect clonal expansion; and to detect the recruitment of new clones into a tissue. Repertoire sequencing is thus a critical complement to other methods of lymphocyte and immune profiling in cancer. This review describes the current state of knowledge based on repertoire sequencing studies conducted on human cancer patients, with a focus on studies of the T-cell receptor beta chain locus. The review then outlines important questions left unanswered and suggests future directions for the field.
Affiliation(s)
- Lindsay G Cowell
- Department of Population and Data Sciences, Department of Immunology, UT Southwestern Medical Center, Dallas, Texas.
| |
|
38
|
Busse CE, Jackson KJL, Watson CT, Collins AM. A Proposed New Nomenclature for the Immunoglobulin Genes of Mus musculus. Front Immunol 2019; 10:2961. [PMID: 31921202 PMCID: PMC6930147 DOI: 10.3389/fimmu.2019.02961] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/11/2019] [Accepted: 12/03/2019] [Indexed: 01/26/2023] Open
Abstract
Mammalian immunoglobulin (IG) genes are found in complex loci that contain hundreds of highly similar pseudogenes, functional genes and repetitive elements, which has made their investigation particularly challenging. High-throughput sequencing has provided new avenues for the investigation of these loci, and has recently been applied to study the IG genes of important inbred mouse strains, revealing unexpected differences between their IG loci. This demonstrated that the structural differences are of such magnitude that they call into question the merits of the current mouse IG gene nomenclatures. Three nomenclatures for the mouse IG heavy chain locus (Igh) are presently in use, and they are all positional nomenclatures using the C57BL/6 genome reference sequence as their template. The continued use of these nomenclatures requires that genes of other inbred strains be confidently identified as allelic variants of C57BL/6 genes, but this is clearly impossible. The unusual breeding histories of inbred mouse strains mean that, regardless of the genetics of wild mice, no single ancestral origin for the IG loci exists for laboratory mice. Here we present a general discussion of the challenges this presents for any IG nomenclature. Furthermore, we describe principles that could be followed in the formulation of a solution to these challenges. Finally, we propose a non-positional nomenclature that accords with the guidelines of the International Mouse Nomenclature Committee, and outline strategies that can be adopted to meet the nomenclature challenges if three systems are to give way to a new one.
Affiliation(s)
- Christian E Busse
- Division of B Cell Immunology, German Cancer Research Center, Heidelberg, Germany
| | - Katherine J L Jackson
- Immunology Division, The Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | - Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, United States
| | - Andrew M Collins
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
| |
|
39
|
Watson CT, Kos JT, Gibson WS, Newman L, Deikus G, Busse CE, Smith ML, Jackson KJ, Collins AM. A comparison of immunoglobulin IGHV, IGHD and IGHJ genes in wild-derived and classical inbred mouse strains. Immunol Cell Biol 2019; 97:888-901. [PMID: 31441114 DOI: 10.1111/imcb.12288] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/08/2019] [Revised: 08/05/2019] [Accepted: 08/20/2019] [Indexed: 01/20/2023]
Abstract
The genomes of classical inbred mouse strains include genes derived from all three major subspecies of the house mouse, Mus musculus. We recently posited that genetic diversity in the immunoglobulin heavy chain (IGH) gene loci of C57BL/6 and BALB/c mice reflects differences in subspecies origin. To investigate this hypothesis, we conducted high-throughput sequencing of IGH gene rearrangements to document IGH variable (IGHV), joining (IGHJ) and diversity (IGHD) genes in four inbred wild-derived mouse strains (CAST/EiJ, LEWES/EiJ, MSM/MsJ and PWD/PhJ) and a single disease model strain (NOD/ShiLtJ), collectively representing genetic backgrounds of several major mouse subspecies. A total of 341 germline IGHV sequences were inferred in the wild-derived strains, including 247 not curated in the international ImMunoGeneTics information system. By contrast, 83/84 inferred NOD IGHV genes had previously been observed in C57BL/6 mice. Variability among the strains examined was observed for only a single IGHJ gene, involving a description of a novel allele. By contrast, unexpected variation was found in the IGHD gene loci, with four previously unreported IGHD gene sequences being documented. Very few IGHV sequences of C57BL/6 and BALB/c mice were shared with strains representing major subspecies, suggesting that their IGH loci may be complex mosaics of genes of disparate origins. This suggests a similar level of diversity is likely present in the IGH loci of other classical inbred strains. This must now be documented if we are to properly understand interstrain variation in models of antibody-mediated disease.
Affiliation(s)
- Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
| | - Justin T Kos
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
| | - William S Gibson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
| | - Leah Newman
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Gintaras Deikus
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Christian E Busse
- Division of B Cell Immunology, German Cancer Research Center, 69120, Heidelberg, Germany
| | - Melissa L Smith
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Katherine J L Jackson
- Immunology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia
| | - Andrew M Collins
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
| |
|
40
|
Zhang J, Ji Z, Smith KN. Analysis of TCR β CDR3 sequencing data for tracking anti-tumor immunity. Methods Enzymol 2019; 629:443-464. [PMID: 31727253 DOI: 10.1016/bs.mie.2019.08.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/14/2022]
Abstract
Anti-tumor T cells are the soldiers in the body's war against cancer. Effector T cells can detect and eliminate cells expressing their cognate antigen via activation through engagement of the T cell receptor (TCR) with its cognate peptide:MHC complex. Owing to the recent success of immunotherapy in the treatment of many different cancer types, research efforts have shifted toward identifying and tracking anti-tumor T cell responses upon treatment in cancer patients. While traditional methods, such as ELISpot and flow cytometric intracellular staining have had limited success, likely owing to the inability to get viable biospecimens or the lower magnitude of tumor-specific T cell responses relative to virus-specific responses, new techniques that utilize next generation sequencing enable T cell response tracking independent of cytokine production or cell viability. The TCR, which confers T cell antigen-specificity, can be used as a molecular barcode to track T cell clonotypic dynamics across biological compartments and over time in cancer patients undergoing treatment. Because this method does not require viable cells, these T cell clonotypes can also be tracked in archival tumor tissue and flash frozen cell pellets. While exciting, quantitative TCR sequencing (TCRseq) technologies have been met with the conundrum of how to properly analyze and interpret the data. Here we provide a comprehensive guide on how to acquire, analyze, and interpret TCRseq data, as well as special considerations that should be taken prior to experimental setup.
Affiliation(s)
- Jiajia Zhang
- The Bloomberg-Kimmel Institute for Cancer Immunotherapy, Johns Hopkins School of Medicine, Baltimore, MD, United States; The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, United States
| | - Zhicheng Ji
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
| | - Kellie N Smith
- The Bloomberg-Kimmel Institute for Cancer Immunotherapy, Johns Hopkins School of Medicine, Baltimore, MD, United States; The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, United States.
| |
|
41
|
Davis MM, Boyd SD. Recent progress in the analysis of αβT cell and B cell receptor repertoires. Curr Opin Immunol 2019; 59:109-114. [PMID: 31326777 PMCID: PMC7075470 DOI: 10.1016/j.coi.2019.05.012] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/20/2019] [Accepted: 05/28/2019] [Indexed: 01/10/2023]
Abstract
T cell receptors (TCRs) and B cell receptors (BCRs) are vertebrate evolution's best answer to the threat of microbial pathogens that can evolve much faster than ourselves. These antigen receptors are generated during T cell or B cell development by combinatorial rearrangement of germline genome V, D and J gene segments, and with junctional residues capable of enormous diversity. For decades the complexity of these receptor repertoires has limited their analysis, but advances in DNA sequencing technology and an array of complementary tools have now made their study much more tractable, filling a major gap in our ability to understand immunology as a system. Here, we summarize the recent approaches and discoveries that are enabling these advances, with some suggestions as to what may lie ahead.
Affiliation(s)
- Mark M Davis
- Institute for Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA; Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA, USA; The Howard Hughes Medical Institute, Chevy Chase, MD, USA.
| | - Scott D Boyd
- Institute for Immunity, Transplantation, and Infection, Stanford University School of Medicine, Stanford, CA, USA; The Sean N. Parker Center for Allergy and Asthma Research at Stanford University, Stanford, CA, USA; Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| |
|
42
|
Safonova Y, Pevzner PA. De novo Inference of Diversity Genes and Analysis of Non-canonical V(DD)J Recombination in Immunoglobulins. Front Immunol 2019; 10:987. [PMID: 31134072 PMCID: PMC6516046 DOI: 10.3389/fimmu.2019.00987] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/17/2019] [Accepted: 04/16/2019] [Indexed: 12/03/2022] Open
Abstract
The V(D)J recombination forms the immunoglobulin genes by joining the variable (V), diversity (D), and joining (J) germline genes. Since variations in germline genes have been linked to various diseases, personalized immunogenomics aims at finding alleles of germline genes across various patients. Although recent studies described algorithms for de novo inference of V and J genes from immunosequencing data, they stopped short of solving a more difficult problem of reconstructing D genes that form the highly divergent CDR3 regions and provide the most important contribution to the antigen binding. We present the IgScout algorithm for de novo D gene reconstruction and apply it to reveal new alleles of human D genes and previously unknown D genes in camel, an important model organism in immunology. We further analyze non-canonical V(DD)J recombination that results in unusually long CDR3s with tandem fused IGHD genes and thus expands the diversity of the antibody repertoires. We demonstrate that tandem CDR3s represent a consistent and functional feature of all analyzed immunosequencing datasets, reveal ultra-long CDR3s, and shed light on the mechanism responsible for their formation.
Affiliation(s)
- Yana Safonova
- Center for Information Theory and Applications, University of California, San Diego, San Diego, CA, United States
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, United States
| |
|
43
|
Vázquez Bernat N, Corcoran M, Hardt U, Kaduk M, Phad GE, Martin M, Karlsson Hedestam GB. High-Quality Library Preparation for NGS-Based Immunoglobulin Germline Gene Inference and Repertoire Expression Analysis. Front Immunol 2019; 10:660. [PMID: 31024532 PMCID: PMC6459949 DOI: 10.3389/fimmu.2019.00660] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 12/20/2018] [Accepted: 03/11/2019] [Indexed: 12/13/2022] Open
Abstract
Next generation sequencing (NGS) of immunoglobulin (Ig) repertoires (Rep-seq) enables examination of the adaptive immune system at an unprecedented level. Applications include studies of expressed repertoires, gene usage, somatic hypermutation levels, Ig lineage tracing and identification of genetic variation within the Ig loci through inference methods. All these applications require starting libraries that allow the generation of sequence data with low error rate and optimal representation of the expressed repertoire. Here, we provide detailed protocols for the production of libraries suitable for human Ig germline gene inference and Ig repertoire studies. Various parameters used in the process were tested in order to demonstrate factors that are critical to obtain high quality libraries. We demonstrate an improved 5'RACE technique that reduces the length constraints of Illumina MiSeq based Rep-seq analysis but allows for the acquisition of sequences upstream of Ig V genes, useful for primer design. We then describe a 5' multiplex method for library preparation, which yields full length V(D)J sequences suitable for genotype identification and novel gene inference. We provide comprehensive sets of primers targeting IGHV, IGKV, and IGLV genes. Using the optimized protocol, we produced IgM, IgG, IgK, and IgL libraries and analyzed them using the germline inference tool IgDiscover to identify expressed germline V alleles. This process additionally uncovered three IGHV, one IGKV, and six IGLV novel alleles in a single individual, which are absent from the IMGT reference database, highlighting the need for further study of Ig genetic variation. The library generation protocols presented here enable a robust means of analyzing expressed Ig repertoires, identifying novel alleles and producing individualized germline gene databases from humans.
Affiliation(s)
- Néstor Vázquez Bernat
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
| | - Martin Corcoran
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
| | - Uta Hardt
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Division of Rheumatology, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
| | - Mateusz Kaduk
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
| | - Ganesh E. Phad
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
| | - Marcel Martin
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
| | | |
|