1
Engelbrecht E, Rodriguez OL, Watson CT. Addressing Technical Pitfalls in Pursuit of Molecular Factors That Mediate Immunoglobulin Gene Regulation. Journal of Immunology (Baltimore, Md. : 1950) 2024; 213:651-662. [PMID: 39007649 PMCID: PMC11333172 DOI: 10.4049/jimmunol.2400131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 06/13/2024] [Indexed: 07/16/2024]
Abstract
The expressed Ab repertoire is a critical determinant of immune-related phenotypes. Ab-encoding transcripts are distinct from other expressed genes because they are transcribed from somatically rearranged gene segments. Human Abs are composed of two identical H and L chain polypeptides derived from genes in the IGH locus and one of two L chain loci. The combinatorial diversity that results from Ab gene rearrangement and the pairing of different H and L chains contributes to the immense diversity of the baseline Ab repertoire. During rearrangement, Ab gene selection is mediated by factors that influence chromatin architecture, promoter/enhancer activity, and V(D)J recombination. Interindividual variation in the composition of the Ab repertoire associates with germline variation in IGH, implicating polymorphism in Ab gene regulation. Determining how IGH variants directly mediate gene regulation will require integration of these variants with other functional genomic datasets. In this study, we argue that standard approaches using short reads have limited utility for characterizing regulatory regions in IGH at haplotype resolution. Using simulated and chromatin immunoprecipitation sequencing reads, we define features of IGH that limit use of short reads and a single reference genome, namely 1) the highly duplicated nature of the DNA sequence in IGH and 2) structural polymorphisms that are frequent in the population. We demonstrate that personalized diploid references enhance performance of short-read data for characterizing mappable portions of the locus, while also showing that long-read profiling tools will ultimately be needed to fully resolve functional impacts of IGH germline variation on expressed Ab repertoires.
Affiliation(s)
- Eric Engelbrecht
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY
- Oscar L Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY
2
Engelbrecht E, Rodriguez OL, Shields K, Schultze S, Tieri D, Jana U, Yaari G, Lees WD, Smith ML, Watson CT. Resolving haplotype variation and complex genetic architecture in the human immunoglobulin kappa chain locus in individuals of diverse ancestry. Genes Immun 2024; 25:297-306. [PMID: 38844673 PMCID: PMC11327106 DOI: 10.1038/s41435-024-00279-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 05/21/2024] [Accepted: 05/24/2024] [Indexed: 08/17/2024]
Abstract
Immunoglobulins (IGs), critical components of the human immune system, are composed of heavy and light protein chains encoded at three genomic loci. The IG Kappa (IGK) chain locus consists of two large, inverted segmental duplications. The complexity of the IG loci has hindered use of standard high-throughput methods for characterizing genetic variation within these regions. To overcome these limitations, we use long-read sequencing to create haplotype-resolved IGK assemblies in an ancestrally diverse cohort (n = 36), representing the first comprehensive description of IGK haplotype variation. We identify extensive locus polymorphism, including novel single nucleotide variants (SNVs) and novel structural variants harboring functional IGKV genes. Among 47 functional IGKV genes, we identify 145 alleles, 67 of which were not previously curated. We report inter-population differences in allele frequencies for 10 IGKV genes, including alleles unique to specific populations within this dataset. We identify haplotypes carrying signatures of gene conversion that associate with SNV enrichment in the IGK distal region, and a haplotype with an inversion spanning the proximal and distal regions. These data provide a critical resource of curated genomic reference information from diverse ancestries, laying a foundation for advancing our understanding of population-level genetic variation in the IGK locus.
Affiliation(s)
- Eric Engelbrecht
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Oscar L Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Kaitlyn Shields
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Steven Schultze
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- David Tieri
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Uddalok Jana
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Gur Yaari
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
  - Institute of Nanotechnology and Advanced Materials, Bar Ilan University, Ramat Gan, Israel
- William D Lees
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Melissa L Smith
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
3
Collins AM, Ohlin M, Corcoran M, Heather JM, Ralph D, Law M, Martínez-Barnetche J, Ye J, Richardson E, Gibson WS, Rodriguez OL, Peres A, Yaari G, Watson CT, Lees WD. AIRR-C IG Reference Sets: curated sets of immunoglobulin heavy and light chain germline genes. Front Immunol 2024; 14:1330153. [PMID: 38406579 PMCID: PMC10884231 DOI: 10.3389/fimmu.2023.1330153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 12/27/2023] [Indexed: 02/27/2024] [Open Access]
Abstract
Introduction: Analysis of an individual's immunoglobulin (IG) gene repertoire requires the use of high-quality germline gene reference sets. When sets only contain alleles supported by strong evidence, AIRR sequencing (AIRR-seq) data analysis is more accurate and studies of the evolution of IG genes, their allelic variants and the expressed immune repertoire are therefore facilitated. Methods: The Adaptive Immune Receptor Repertoire Community (AIRR-C) IG Reference Sets have been developed by including only human IG heavy and light chain alleles that have been confirmed by evidence from multiple high-quality sources. To further improve AIRR-seq analysis, some alleles have been extended to deal with short 3' or 5' truncations that can lead them to be overlooked by alignment utilities. To avoid other challenges for analysis programs, exact paralogs (e.g. IGHV1-69*01 and IGHV1-69D*01) are only represented once in each set, though alternative sequence names are noted in accompanying metadata. Results and discussion: The Reference Sets include less than half the previously recognised IG alleles (e.g. just 198 IGHV sequences), and also include a number of novel alleles: 8 IGHV alleles, 2 IGKV alleles and 5 IGLV alleles. Despite their smaller sizes, erroneous calls were eliminated, and excellent coverage was achieved when a set of repertoires comprising over 4 million V(D)J rearrangements from 99 individuals was analyzed using the Sets. The version-tracked AIRR-C IG Reference Sets are freely available at the OGRDB website (https://ogrdb.airr-community.org/germline_sets/Human) and will be regularly updated to include newly observed and previously reported sequences that can be confirmed by new high-quality data.
Affiliation(s)
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Mats Ohlin
  - Department of Immunotechnology, and SciLifeLab, Lund University, Lund, Sweden
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- James M. Heather
  - Mass General Cancer Center, Massachusetts General Hospital, Charlestown, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
- Duncan Ralph
  - Fred Hutchinson Cancer Research Center, Seattle, WA, United States
- Mansun Law
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, United States
- Jesus Martínez-Barnetche
  - Centro de Investigación Sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, Mexico
- Jian Ye
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Eve Richardson
  - La Jolla Institute for Immunology, San Diego, CA, United States
- William S. Gibson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Oscar L. Rodriguez
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Ayelet Peres
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Gur Yaari
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- William D. Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, London, United Kingdom
  - Human-Centered Computing and Information Science, Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
4
Lees WD, Christley S, Peres A, Kos JT, Corrie B, Ralph D, Breden F, Cowell LG, Yaari G, Corcoran M, Karlsson Hedestam GB, Ohlin M, Collins AM, Watson CT, Busse CE. AIRR community curation and standardised representation for immunoglobulin and T cell receptor germline sets. Immunoinformatics (Amsterdam, Netherlands) 2023; 10:100025. [PMID: 37388275 PMCID: PMC10310305 DOI: 10.1016/j.immuno.2023.100025] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/01/2023]
Abstract
Analysis of an individual's immunoglobulin or T cell receptor gene repertoire can provide important insights into immune function. High-quality analysis of adaptive immune receptor repertoire sequencing data depends upon accurate and relatively complete germline sets, but current sets are known to be incomplete. Established processes for the review and systematic naming of receptor germline genes and alleles require specific evidence and data types, but the discovery landscape is rapidly changing. To exploit the potential of emerging data, and to provide the field with improved state-of-the-art germline sets, an intermediate approach is needed that will allow the rapid publication of consolidated sets derived from these emerging sources. These sets must use a consistent naming scheme and allow refinement and consolidation into genes as new information emerges. Name changes should be minimised, but, where changes occur, the naming history of a sequence must be traceable. Here we outline the current issues and opportunities for the curation of germline IG/TR genes and present a forward-looking data model for building out more robust germline sets that can dovetail with current established processes. We describe interoperability standards for germline sets, and an approach to transparency based on principles of findability, accessibility, interoperability, and reusability.
Affiliation(s)
- William D. Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, London, England
  - Human-Centered Computing and Information Science, Institute for Systems and Computer Engineering Technology and Science, Porto, Portugal
- Scott Christley
  - Peter O’Donnell Jr. School of Public Health, UT Southwestern Medical Center, Dallas, TX, USA
- Ayelet Peres
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Justin T. Kos
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, KY, USA
- Brian Corrie
  - Department of Biological Sciences, Simon Fraser University, Burnaby, BC, Canada
- Duncan Ralph
  - Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Felix Breden
  - Department of Biological Sciences, Simon Fraser University, Burnaby, BC, Canada
- Lindsay G. Cowell
  - Peter O’Donnell Jr. School of Public Health, Department of Immunology, School of Biomedical Sciences, UT Southwestern Medical Center, Dallas, TX, USA
- Gur Yaari
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Mats Ohlin
  - Department of Immunotechnology and SciLifeLab, Lund University, Lund, Sweden
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, KY, USA
- Christian E. Busse
  - Division of B Cell Immunology, German Cancer Research Center, Heidelberg, Germany
5
Corcoran M, Chernyshev M, Mandolesi M, Narang S, Kaduk M, Ye K, Sundling C, Färnert A, Kreslavsky T, Bernhardsson C, Larena M, Jakobsson M, Karlsson Hedestam GB. Archaic humans have contributed to large-scale variation in modern human T cell receptor genes. Immunity 2023; 56:635-652.e6. [PMID: 36796364 DOI: 10.1016/j.immuni.2023.01.026] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/12/2022] [Revised: 11/21/2022] [Accepted: 01/23/2023] [Indexed: 02/18/2023]
Abstract
Human T cell receptors (TCRs) are critical for mediating immune responses to pathogens and tumors and regulating self-antigen recognition. Yet, variations in the genes encoding TCRs remain insufficiently defined. Detailed analysis of expressed TCR alpha, beta, gamma, and delta genes in 45 donors from four human populations (African, East Asian, South Asian, and European) revealed 175 additional TCR variable and junctional alleles. Most of these contained coding changes and were present at widely differing frequencies in the populations, a finding confirmed using DNA samples from the 1000 Genomes Project. Importantly, we identified three Neanderthal-derived, introgressed TCR regions including a highly divergent TRGV4 variant, which mediated altered butyrophilin-like molecule 3 (BTNL3) ligand reactivity and was frequent in all modern Eurasian population groups. Our results demonstrate remarkable variation in TCR genes in both individuals and populations, providing a strong incentive for including allelic variation in studies of TCR function in human biology.
Affiliation(s)
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Mark Chernyshev
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Marco Mandolesi
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Sanjana Narang
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Mateusz Kaduk
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Kewei Ye
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Christopher Sundling
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden; Department of Infectious Diseases, Karolinska University Hospital, 171 76 Stockholm, Sweden
- Anna Färnert
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden; Department of Infectious Diseases, Karolinska University Hospital, 171 76 Stockholm, Sweden
- Taras Kreslavsky
  - Department of Medicine, Solna, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Carolina Bernhardsson
  - Department of Organismal Biology, Human Evolution, Norbyvägen 18C, 752 63 Uppsala, Sweden
- Maximilian Larena
  - Department of Organismal Biology, Human Evolution, Norbyvägen 18C, 752 63 Uppsala, Sweden
- Mattias Jakobsson
  - Department of Organismal Biology, Human Evolution, Norbyvägen 18C, 752 63 Uppsala, Sweden
6
Warrender AK, Pan J, Pudney CR, Arcus VL, Kelton W. Constant domain polymorphisms influence monoclonal antibody stability and dynamics. Protein Sci 2023; 32:e4589. [PMID: 36759959 PMCID: PMC9951194 DOI: 10.1002/pro.4589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 02/02/2023] [Accepted: 02/06/2023] [Indexed: 02/11/2023]
Abstract
The constant regions of clinical monoclonal antibodies are derived from a select number of allotypes found in IgG subclasses. Despite a long-term acknowledgment that this diversity may impact both antibody function and developability, there is a lack of data on the stability of variants carrying these mutations. Here, we generated a panel of IgG1, IgG2, and IgG3 antibodies with 32 unique constant region alleles and performed a systematic comparison of stability using red edge excitation shift (REES). This technique exploits the fluorescent properties of tryptophan residues to measure antibody structural dynamics which predict flexibility and the propensity to unfold. Our REES measurements revealed broad stability differences between subclasses with IgG3 possessing the poorest overall stability. Further interrogation of differences between variants within each subclass enabled the high-resolution profiling of individual allotype stabilities. Crucially, these observed differences were not found to be linked to N297-linked glycan heterogeneity. Our work demonstrates diverse stabilities (and dynamics) for a range of naturally occurring constant domain alleles and the utility of REES as a method for rapid and sensitive antibody stability profiling, requiring only laboratory spectrophotometry equipment.
Affiliation(s)
- Annmaree K Warrender
  - Te Huataki Waiora School of Health, University of Waikato, Hamilton, New Zealand
- Jolyn Pan
  - Te Aka Mātuatua School of Science, University of Waikato, Hamilton, New Zealand
- Chris R Pudney
  - Department of Biology and Biochemistry, University of Bath, Bath, UK
- Vickery L Arcus
  - Te Aka Mātuatua School of Science, University of Waikato, Hamilton, New Zealand
- William Kelton
  - Te Huataki Waiora School of Health, University of Waikato, Hamilton, New Zealand; Te Aka Mātuatua School of Science, University of Waikato, Hamilton, New Zealand
7
Pennell M, Rodriguez OL, Watson CT, Greiff V. The evolutionary and functional significance of germline immunoglobulin gene variation. Trends Immunol 2023; 44:7-21. [PMID: 36470826 DOI: 10.1016/j.it.2022.11.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/25/2022] [Accepted: 11/07/2022] [Indexed: 12/04/2022]
Abstract
The recombination between immunoglobulin (IG) gene segments determines an individual's naïve antibody repertoire and, consequently, (auto)antigen recognition. Emerging evidence suggests that mammalian IG germline variation impacts humoral immune responses associated with vaccination, infection, and autoimmunity, from the molecular level of epitope specificity up to profound changes in the architecture of antibody repertoires. These links between IG germline variants and immunophenotype raise the question of the evolutionary causes and consequences of diversity within IG loci. We discuss why the extreme diversity in IG loci remains a mystery, why resolving this is important for the design of more effective vaccines and therapeutics, and how recent evidence from multiple lines of inquiry may help us do so.
Affiliation(s)
- Matt Pennell
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA; Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
- Oscar L Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Victor Greiff
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
8
Narang S, Kaduk M, Chernyshev M, Karlsson Hedestam GB, Corcoran MM. Adaptive immune receptor genotyping using the corecount program. Front Immunol 2023; 14:1125884. [PMID: 37114042 PMCID: PMC10126697 DOI: 10.3389/fimmu.2023.1125884] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/16/2022] [Accepted: 02/27/2023] [Indexed: 04/29/2023] [Open Access]
Abstract
We present a new Rep-Seq analysis tool called corecount, for analyzing genotypic variation in immunoglobulin (IG) and T cell receptor (TCR) genes. corecount is highly efficient at identifying V alleles, including those that are infrequently used in expressed repertoires and those that contain 3' end variation and are otherwise refractory to reliable identification during germline inference from expressed libraries. Furthermore, corecount facilitates accurate D and J gene genotyping. The output is highly reproducible and facilitates the comparison of genotypes from multiple individuals, such as those from clinical cohorts. Here, we applied corecount to the genotypic analysis of IgM libraries from 16 individuals. To demonstrate the accuracy of corecount, we Sanger sequenced all the heavy chain IG alleles (65 IGHV, 27 IGHD and 7 IGHJ) from one individual from whom we also produced two independent IgM Rep-seq datasets. Genomic analysis revealed that 5 known IGHV and 2 IGHJ sequences are truncated in current reference databases. This dataset of genomically validated alleles and IgM libraries from the same individual provides a useful resource for benchmarking other bioinformatic programs that involve V, D and J assignments and germline inference, and may facilitate the development of AIRR-Seq analysis tools that can benefit from the availability of more comprehensive reference databases.
9
Rodriguez OL, Silver CA, Shields K, Smith ML, Watson CT. Targeted long-read sequencing facilitates phased diploid assembly and genotyping of the human T cell receptor alpha, delta, and beta loci. Cell Genomics 2022; 2:100228. [PMID: 36778049 PMCID: PMC9903726 DOI: 10.1016/j.xgen.2022.100228] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/02/2022] [Revised: 08/25/2022] [Accepted: 11/05/2022] [Indexed: 12/02/2022]
Abstract
T cell receptors (TCRs) recognize peptide fragments presented by the major histocompatibility complex (MHC) and are critical to T cell-mediated immunity. Recent data have indicated that genetic diversity within TCR-encoding gene regions is underexplored, limiting understanding of the impact of TCR loci polymorphisms on TCR function in disease, even though TCR repertoire signatures (1) are heritable and (2) associate with disease phenotypes. To address this, we developed a targeted long-read sequencing approach to generate highly accurate haplotype resolved assemblies of the TCR beta (TRB) and alpha/delta (TRA/D) loci, facilitating the genotyping of all variant types, including structural variants. We validate our approach using two mother-father-child trios and 5 unrelated donors representing multiple populations. This resulted in improved genotyping accuracy and the discovery of 84 undocumented V, D, J, and C alleles, demonstrating the utility of this framework for improving our understanding of TCR diversity and function in disease.
Affiliation(s)
- Oscar L. Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Catherine A. Silver
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Kaitlyn Shields
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Melissa L. Smith
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA (corresponding author)
10
Ford MKB, Hari A, Rodriguez O, Xu J, Lack J, Oguz C, Zhang Y, Weber S, Magliocco M, Barnett J, Xirasagar S, Samuel S, Imberti L, Bonfanti P, Biondi A, Dalgard CL, Chanock S, Rosen L, Holland S, Su H, Notarangelo L, Vishkin U, Watson CT, Sahinalp SC. ImmunoTyper-SR: A computational approach for genotyping immunoglobulin heavy chain variable genes using short-read data. Cell Syst 2022; 13:808-816.e5. [PMID: 36265467 PMCID: PMC10084889 DOI: 10.1016/j.cels.2022.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 07/20/2022] [Accepted: 08/22/2022] [Indexed: 01/26/2023]
Abstract
The human immunoglobulin heavy chain (IGH) locus on chromosome 14 includes more than 40 functional copies of the variable gene (IGHV), which are critical for the structure of antibodies that identify and neutralize pathogenic invaders as a part of the adaptive immune system. Because of its highly repetitive sequence composition, the IGH locus has been particularly difficult to assemble or genotype when using standard short-read sequencing technologies. Here, we introduce ImmunoTyper-SR, an algorithmic tool for the genotyping and CNV analysis of the germline IGHV genes on Illumina whole-genome sequencing (WGS) data using a combinatorial optimization formulation that resolves ambiguous read mappings. We validated ImmunoTyper-SR on 12 individuals whose IGHV allele composition had been independently validated, and assessed concordance between WGS replicates from nine individuals. We then applied ImmunoTyper-SR on 585 COVID patients to investigate the associations between IGHV alleles and anti-type I IFN autoantibodies, which were previously associated with COVID-19 severity.
Affiliation(s)
- Ananth Hari
  - National Cancer Institute, NIH, Bethesda, MD, USA; Department of Electrical Engineering, University of Maryland, College Park, MD, USA
- Oscar Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
- Junyan Xu
  - National Cancer Institute, NIH, Bethesda, MD, USA
- Justin Lack
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Cihan Oguz
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Yu Zhang
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Sarah Weber
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Mary Magliocco
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Jason Barnett
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Sandhya Xirasagar
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Smilee Samuel
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Luisa Imberti
  - Diagnostic Department, ASST Spedali Civili di Brescia, Brescia, Italy
- Paolo Bonfanti
  - University of Milano-Bicocca, Fondazione MBBM, Monza, Italy
- Andrea Biondi
  - University of Milano-Bicocca, Fondazione MBBM, Monza, Italy
- Clifton L Dalgard
  - Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Lindsey Rosen
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Steven Holland
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Helen Su
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Luigi Notarangelo
  - National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD, USA
- Uzi Vishkin
  - Department of Electrical Engineering, University of Maryland, College Park, MD, USA
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, USA
11
Jackson KJL, Kos JT, Lees W, Gibson WS, Smith ML, Peres A, Yaari G, Corcoran M, Busse CE, Ohlin M, Watson CT, Collins AM. A BALB/c IGHV Reference Set, Defined by Haplotype Analysis of Long-Read VDJ-C Sequences From F1 (BALB/c x C57BL/6) Mice. Front Immunol 2022; 13:888555. [PMID: 35720344 PMCID: PMC9205180 DOI: 10.3389/fimmu.2022.888555] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Accepted: 04/28/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
The immunoglobulin genes of inbred mouse strains that are commonly used in models of antibody-mediated human diseases are poorly characterized. This compromises data analysis. To infer the immunoglobulin genes of BALB/c mice, we used long-read SMRT sequencing to amplify VDJ-C sequences from F1 (BALB/c x C57BL/6) hybrid animals. Strain variations were identified in the Ighm and Ighg2b genes, and analysis of VDJ rearrangements led to the inference of 278 germline IGHV alleles. 169 alleles are not present in the C57BL/6 genome reference sequence. To establish a set of expressed BALB/c IGHV germline gene sequences, we computationally retrieved IGHV haplotypes from the IgM dataset. Haplotyping led to the confirmation of 162 BALB/c IGHV gene sequences. A musIGHV398 pseudogene variant also appears to be present in the BALB/cByJ substrain, while a functional musIGHV398 gene is highly expressed in the BALB/cJ substrain. Only four of the BALB/c alleles were also observed in the C57BL/6 haplotype. The full set of inferred BALB/c sequences has been used to establish a BALB/c IGHV reference set, hosted at https://ogrdb.airr-community.org. We assessed whether assemblies from the Mouse Genome Project (MGP) are suitable for the determination of the genes of the IGH loci. Only 37 (43.5%) of the 85 confirmed IMGT-named BALB/c IGHV and 33 (42.9%) of the 77 confirmed non-IMGT IGHV were found in a search of the MGP BALB/cJ genome assembly. This suggests that current MGP assemblies are unsuitable for the comprehensive documentation of germline IGHVs and more efforts will be needed to establish strain-specific reference sets.
Affiliation(s)
- Justin T. Kos
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- William Lees
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, United Kingdom
- William S. Gibson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Melissa Laird Smith
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Ayelet Peres
- Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Gur Yaari
- Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Martin Corcoran
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Christian E. Busse
- Division of B Cell Immunology, German Cancer Research Center, Heidelberg, Germany
- Mats Ohlin
- Department of Immunotechnology, Lund University, Lund, Sweden
- Corey T. Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Andrew M. Collins
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW, Australia
|
12
|
Slabodkin A, Chernigovskaya M, Mikocziova I, Akbar R, Scheffer L, Pavlović M, Bashour H, Snapkov I, Mehta BB, Weber CR, Gutierrez-Marcos J, Sollid LM, Haff IH, Sandve GK, Robert PA, Greiff V. Individualized VDJ recombination predisposes the available Ig sequence space. Genome Res 2021; 31:2209-2224. [PMID: 34815307 PMCID: PMC8647828 DOI: 10.1101/gr.275373.121] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/19/2021] [Accepted: 10/20/2021] [Indexed: 11/25/2022]
Abstract
The process of recombination between variable (V), diversity (D), and joining (J) immunoglobulin (Ig) gene segments determines an individual's naive Ig repertoire and, consequently, (auto)antigen recognition. VDJ recombination follows probabilistic rules that can be modeled statistically. So far, it remains unknown whether VDJ recombination rules differ between individuals. If these rules differed, identical (auto)antigen-specific Ig sequences would be generated with individual-specific probabilities, signifying that the available Ig sequence space is individual specific. We devised a sensitivity-tested distance measure that enables inter-individual comparison of VDJ recombination models. We discovered, accounting for several sources of noise as well as allelic variation in Ig sequencing data, that not only unrelated individuals but also human monozygotic twins and even inbred mice possess statistically distinguishable immunoglobulin recombination models. This suggests that, in addition to genetic, there is also nongenetic modulation of VDJ recombination. We demonstrate that population-wide individualized VDJ recombination can result in orders of magnitude of difference in the probability to generate (auto)antigen-specific Ig sequences. Our findings have implications for immune receptor-based individualized medicine approaches relevant to vaccination, infection, and autoimmunity.
Affiliation(s)
- Andrei Slabodkin
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Maria Chernigovskaya
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Ivana Mikocziova
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Rahmad Akbar
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Lonneke Scheffer
- Department of Informatics, University of Oslo, 0373 Oslo, Norway
- Milena Pavlović
- Department of Informatics, University of Oslo, 0373 Oslo, Norway
- Habib Bashour
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, United Kingdom
- Igor Snapkov
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Brij Bhushan Mehta
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Cédric R Weber
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Ludvig M Sollid
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Philippe A Robert
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
- Victor Greiff
- Department of Immunology and Oslo University Hospital, University of Oslo, 0372 Oslo, Norway
|
13
|
Reply to the Commentary on population matched (pm) germline allelic variants of immunoglobulin (IG) loci: relevance in infectious diseases and vaccination studies in human populations. Genes Immun 2021; 22:339-342. [PMID: 34876659 PMCID: PMC8674126 DOI: 10.1038/s41435-021-00155-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Revised: 10/27/2021] [Accepted: 11/02/2021] [Indexed: 11/24/2022]
|