1
Rodriguez OL, Safonova Y, Silver CA, Shields K, Gibson WS, Kos JT, Tieri D, Ke H, Jackson KJL, Boyd SD, Smith ML, Marasco WA, Watson CT. Genetic variation in the immunoglobulin heavy chain locus shapes the human antibody repertoire. Nat Commun 2023; 14:4419. [PMID: 37479682 PMCID: PMC10362067 DOI: 10.1038/s41467-023-40070-x] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 08/29/2022] [Accepted: 07/11/2023] [Indexed: 07/23/2023]
Abstract
Variation in the antibody response has been linked to differential outcomes in disease, and suboptimal vaccine and therapeutic responsiveness, the determinants of which have not been fully elucidated. Countering models that presume antibodies are generated largely by stochastic processes, we demonstrate that polymorphisms within the immunoglobulin heavy chain locus (IGH) impact the naive and antigen-experienced antibody repertoire, indicating that genetics predisposes individuals to mount qualitatively and quantitatively different antibody responses. We pair recently developed long-read genomic sequencing methods with antibody repertoire profiling to comprehensively resolve IGH genetic variation, including novel structural variants, single nucleotide variants, and genes and alleles. We show that IGH germline variants determine the presence and frequency of antibody genes in the expressed repertoire, including those enriched in functional elements linked to V(D)J recombination, and overlapping disease-associated variants. These results illuminate the power of leveraging IGH genetics to better understand the regulation, function, and dynamics of the antibody response in disease.
Affiliation(s)
- Oscar L Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Yana Safonova
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Catherine A Silver
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Kaitlyn Shields
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- William S Gibson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Justin T Kos
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- David Tieri
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Hanzhong Ke
  - Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Scott D Boyd
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Melissa L Smith
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Wayne A Marasco
  - Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
2
Pushparaj P, Nicoletto A, Castro Dopico X, Sheward DJ, Kim S, Ekström S, Murrell B, Corcoran M, Karlsson Hedestam GB. Frequent use of IGHV3-30-3 in SARS-CoV-2 neutralizing antibody responses. Front Virol 2023; 3:1128253. [PMID: 37041983 PMCID: PMC7614418 DOI: 10.3389/fviro.2023.1128253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
Abstract
The antibody response to SARS-CoV-2 shows biased immunoglobulin heavy chain variable (IGHV) gene usage, allowing definition of genetic signatures for some classes of neutralizing antibodies. We investigated IGHV gene usage frequencies by sorting spike-specific single memory B cells from individuals infected with SARS-CoV-2 early in the pandemic. From two study participants and 703 spike-specific B cells, the most used genes were IGHV1-69, IGHV3-30-3, and IGHV3-30. Here, we focused on the IGHV3-30 group of genes and an IGHV3-30-3-using ultrapotent neutralizing monoclonal antibody, CAB-F52, which displayed broad neutralizing activity also in its germline-reverted form. IGHV3-30-3 is encoded by a region of the IGH locus that is highly variable at both the allelic and structural levels. Using personalized IG genotyping, we found that 4 of 14 study participants lacked the IGHV3-30-3 gene on both chromosomes, raising the question if other, highly similar IGHV genes could substitute for IGHV3-30-3 in persons lacking this gene. In the context of CAB-F52, we found that none of the tested IGHV3-33 alleles, but several IGHV3-30 alleles could substitute for IGHV3-30-3, suggesting functional redundancy between the highly homologous IGHV3-30 and IGHV3-30-3 genes for this antibody.
Affiliation(s)
- Pradeepa Pushparaj
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Andrea Nicoletto
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Xaquin Castro Dopico
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Daniel J. Sheward
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Sungyong Kim
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Simon Ekström
  - Department of Biomedical Engineering, Lund University, Lund, Sweden
- Ben Murrell
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Gunilla B. Karlsson Hedestam
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
  - Correspondence: Gunilla B. Karlsson Hedestam
3
Clarke T, Du P, Kumar S, Okitsu SL, Schuette M, An Q, Zhang J, Tzvetkov E, Jensen MA, Niewold TB, Ferre EMN, Nardone J, Lionakis MS, Vlach J, DeMartino J, Bender AT. Autoantibody repertoire characterization provides insight into the pathogenesis of monogenic and polygenic autoimmune diseases. Front Immunol 2023; 14:1106537. [PMID: 36845162 PMCID: PMC9955420 DOI: 10.3389/fimmu.2023.1106537] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/23/2022] [Accepted: 01/16/2023] [Indexed: 02/12/2023]
Abstract
Autoimmune diseases vary in the magnitude and diversity of autoantibody profiles, and these differences may be a consequence of different types of breaks in tolerance. Here, we compared the disparate autoimmune diseases autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), systemic lupus erythematosus (SLE), and Sjogren's syndrome (SjS) to gain insight into the etiology of breaks in tolerance triggering autoimmunity. APECED was chosen as a prototypical monogenic disease with organ-specific pathology while SjS and SLE represent polygenic autoimmunity with focal or systemic disease. Using protein microarrays for autoantibody profiling, we found that APECED patients develop a focused but highly reactive set of shared mostly anti-cytokine antibodies, while SLE patients develop broad and less expanded autoantibody repertoires against mostly intracellular autoantigens. SjS patients had few autoantibody specificities with the highest shared reactivities observed against Ro-52 and La. RNA-seq B-cell receptor analysis revealed that APECED samples have fewer, but highly expanded, clonotypes compared with SLE samples containing a diverse, but less clonally expanded, B-cell receptor repertoire. Based on these data, we propose a model whereby the presence of autoreactive T-cells in APECED allows T-dependent B-cell responses against autoantigens, while SLE is driven by breaks in peripheral B-cell tolerance and extrafollicular B-cell activation. These results highlight differences in the autoimmunity observed in several monogenic and polygenic disorders and may be generalizable to other autoimmune diseases.
Affiliation(s)
- Thomas Clarke
  - TIP Immunology, EMD Serono, Billerica, MA, United States
- Pan Du
  - TIP Immunology, EMD Serono, Billerica, MA, United States
- Mark Schuette
  - Protein Engineering and Antibody Technologies, Merck KGaA, Darmstadt, Germany
- Qi An
  - TIP Immunology, EMD Serono, Billerica, MA, United States
- Jinyang Zhang
  - TIP Immunology, EMD Serono, Billerica, MA, United States
- Mark A. Jensen
  - Department of Immunology, Division of Rheumatology, Mayo Clinic, Rochester, MN, United States
- Timothy B. Niewold
  - Department of Immunology, Division of Rheumatology, Mayo Clinic, Rochester, MN, United States
- Elise M. N. Ferre
  - Fungal Pathogenesis Section, Laboratory of Clinical Immunology and Microbiology (LCIM), National Institute of Allergy and Infectious Diseases (NIAID), Bethesda, MD, United States
- Julie Nardone
  - TIP Immunology, EMD Serono, Billerica, MA, United States
- Michail S. Lionakis
  - Fungal Pathogenesis Section, Laboratory of Clinical Immunology and Microbiology (LCIM), National Institute of Allergy and Infectious Diseases (NIAID), Bethesda, MD, United States
- Jaromir Vlach
  - TIP Immunology, EMD Serono, Billerica, MA, United States
4
Omer A, Peres A, Rodriguez OL, Watson CT, Lees W, Polak P, Collins AM, Yaari G. T cell receptor beta germline variability is revealed by inference from repertoire data. Genome Med 2022; 14:2. [PMID: 34991709 PMCID: PMC8740489 DOI: 10.1186/s13073-021-01008-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/17/2021] [Accepted: 12/08/2021] [Indexed: 12/12/2022]
Abstract
BACKGROUND: T and B cell receptor (TCR, BCR) repertoires constitute the foundation of adaptive immunity. Adaptive immune receptor repertoire sequencing (AIRR-seq) is a common approach to study immune system dynamics. Understanding the genetic factors influencing the composition and dynamics of these repertoires is of major scientific and clinical importance. The chromosomal loci encoding for the variable regions of TCRs and BCRs are challenging to decipher due to repetitive elements and undocumented structural variants.

METHODS: To confront this challenge, AIRR-seq-based methods have recently been developed for B cells, enabling genotype and haplotype inference and discovery of undocumented alleles. However, this approach relies on complete coverage of the receptors' variable regions, whereas most T cell studies sequence a small fraction of that region. Here, we adapted a B cell pipeline for undocumented alleles, genotype, and haplotype inference for full and partial AIRR-seq TCR data sets. The pipeline also deals with gene assignment ambiguities, which is especially important in the analysis of data sets of partial sequences.

RESULTS: From the full and partial AIRR-seq TCR data sets, we identified 39 undocumented polymorphisms in T cell receptor Beta V (TRBV) and 31 undocumented 5' UTR sequences. A subset of these inferences was also observed using independent genomic approaches. We found that a single nucleotide polymorphism differentiating between the two documented T cell receptor Beta D2 (TRBD2) alleles is strongly associated with dramatic changes in the expressed repertoire.

CONCLUSIONS: We reveal a rich picture of germline variability and demonstrate how a single nucleotide polymorphism dramatically affects the composition of the whole repertoire. Our findings provide a basis for annotation of TCR repertoires for future basic and clinical studies.
Affiliation(s)
- Aviv Omer
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, 5290002, Israel
  - Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, Ramat Gan, 5290002, Israel
- Ayelet Peres
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, 5290002, Israel
  - Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, Ramat Gan, 5290002, Israel
- Oscar L Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
- William Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, UK
- Pazit Polak
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, 5290002, Israel
  - Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, Ramat Gan, 5290002, Israel
- Andrew M Collins
  - School of Biotechnology and Biomedical Sciences, University of New South Wales, Sydney, Australia
- Gur Yaari
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, 5290002, Israel
  - Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, Ramat Gan, 5290002, Israel
5
Rodriguez OL, Gibson WS, Parks T, Emery M, Powell J, Strahl M, Deikus G, Auckland K, Eichler EE, Marasco WA, Sebra R, Sharp AJ, Smith ML, Bashir A, Watson CT. A Novel Framework for Characterizing Genomic Haplotype Diversity in the Human Immunoglobulin Heavy Chain Locus. Front Immunol 2020; 11:2136. [PMID: 33072076 PMCID: PMC7539625 DOI: 10.3389/fimmu.2020.02136] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 06/10/2020] [Accepted: 08/06/2020] [Indexed: 02/06/2023]
Abstract
An incomplete ascertainment of genetic variation within the highly polymorphic immunoglobulin heavy chain locus (IGH) has hindered our ability to define genetic factors that influence antibody-mediated processes. Due to locus complexity, standard high-throughput approaches have failed to accurately and comprehensively capture IGH polymorphism. As a result, the locus has only been fully characterized two times, severely limiting our knowledge of human IGH diversity. Here, we combine targeted long-read sequencing with a novel bioinformatics tool, IGenotyper, to fully characterize IGH variation in a haplotype-specific manner. We apply this approach to eight human samples, including a haploid cell line and two mother-father-child trios, and demonstrate the ability to generate high-quality assemblies (>98% complete and >99% accurate), genotypes, and gene annotations, identifying 2 novel structural variants and 15 novel IGH alleles. We show multiplexing allows for scaling of the approach without impacting data quality, and that our genotype call sets are more accurate than short-read (>35% increase in true positives and >97% decrease in false-positives) and array/imputation-based datasets. This framework establishes a desperately needed foundation for leveraging IG genomic data to study population-level variation in antibody-mediated immunity, critical for bettering our understanding of disease risk, and responses to vaccines and therapeutics.
Affiliation(s)
- Oscar L Rodriguez
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- William S Gibson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Tom Parks
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Matthew Emery
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- James Powell
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Maya Strahl
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Gintaras Deikus
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Kathryn Auckland
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, United States
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, United States
- Wayne A Marasco
  - Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Department of Medicine, Harvard Medical School, Boston, MA, United States
- Robert Sebra
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  - Icahn Institute of Data Science and Genomic Technology, New York, NY, United States
- Andrew J Sharp
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Melissa L Smith
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
  - Icahn Institute of Data Science and Genomic Technology, New York, NY, United States
- Ali Bashir
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
6
Inter- and intraspecies comparison of phylogenetic fingerprints and sequence diversity of immunoglobulin variable genes. Immunogenetics 2020; 72:279-294. [PMID: 32367185 DOI: 10.1007/s00251-020-01164-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/24/2020] [Accepted: 04/13/2020] [Indexed: 10/24/2022]
Abstract
Protection and neutralization of a vast array of pathogens is accomplished by the tremendous diversity of the B cell receptor (BCR) repertoire. For jawed vertebrates, this diversity is initiated via the somatic recombination of immunoglobulin (Ig) germline elements. While it is clear that the number of these germline segments differs from species to species, the extent of cross-species sequence diversity remains largely uncharacterized. Here we use extensive computational and statistical methods to investigate the sequence diversity and evolutionary relationship between Ig variable (V), diversity (D), and joining (J) germline segments across nine commonly studied species ranging from zebrafish to human. Metrics such as guanine-cytosine (GC) content showed low redundancy across Ig germline genes within a given species. Other comparisons, including amino acid motifs, evolutionary selection, and sequence diversity, revealed species-specific properties. Additionally, we showed that the germline-encoded diversity differs across antibody (recombined V-D-J) repertoires of various B cell subsets. To facilitate future comparative immunogenomics analysis, we created VDJgermlines, an R package that contains the germline sequences from multiple species. Our study informs strategies for the humanization and engineering of therapeutic antibodies.
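As a rough illustration of one metric named in this abstract, the GC content of a germline segment can be computed as in the minimal Python sketch below. The function name and the toy sequences are hypothetical and are not taken from the VDJgermlines package or from the paper; ambiguous bases are simply excluded from the denominator, which is an assumption of this sketch.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous A/C/G/T bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in "GC" for b in bases) / len(bases)


# Toy sequences, invented for illustration only (not real germline genes).
toy_segments = {"toy_V": "CAGGTGCAGC", "toy_J": "ACTACTTTGA"}
gc_by_segment = {name: gc_content(seq) for name, seq in toy_segments.items()}
```

Comparing such per-segment values within and across species is one simple way to look at the kind of redundancy the abstract refers to.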
7
Ralph DK, Matsen FA. Per-sample immunoglobulin germline inference from B cell receptor deep sequencing data. PLoS Comput Biol 2019; 15:e1007133. [PMID: 31329576 PMCID: PMC6675132 DOI: 10.1371/journal.pcbi.1007133] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 12/20/2017] [Revised: 08/01/2019] [Accepted: 05/28/2019] [Indexed: 11/26/2022]
Abstract
The collection of immunoglobulin genes in an individual's germline, which gives rise to B cell receptors via recombination, is known to vary significantly across individuals. In humans, for example, each individual has only a fraction of the several hundred known V alleles. Furthermore, the currently-accepted set of known V alleles is both incomplete (particularly for non-European samples), and contains a significant number of spurious alleles. The resulting uncertainty as to which immunoglobulin alleles are present in any given sample results in inaccurate B cell receptor sequence annotations, and in particular inaccurate inferred naive ancestors. In this paper we first show that the currently widespread practice of aligning each sequence to its closest match in the full set of IMGT alleles results in a very large number of spurious alleles that are not in the sample's true set of germline V alleles. We then describe a new method for inferring each individual's germline gene set from deep sequencing data, and show that it improves upon existing methods by making a detailed comparison on a variety of simulated and real data samples. This new method has been integrated into the partis annotation and clonal family inference package, available at https://github.com/psathyrella/partis, and is run by default without affecting overall run time.
Affiliation(s)
- Duncan K. Ralph
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Frederick A. Matsen
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
8
Luo S, Yu JA, Li H, Song YS. Worldwide genetic variation of the IGHV and TRBV immune receptor gene families in humans. Life Sci Alliance 2019; 2:e201800221. [PMID: 30808649 PMCID: PMC6391684 DOI: 10.26508/lsa.201800221] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 10/22/2018] [Revised: 02/14/2019] [Accepted: 02/14/2019] [Indexed: 12/31/2022]
Abstract
This article presents a comprehensive study of the IGHV and TRBV gene families in a globally diverse sample of humans and shows that the two gene families exhibit starkly different patterns of variation.

The immunoglobulin heavy variable (IGHV) and T cell beta variable (TRBV) loci are among the most complex and variable regions in the human genome. Generated through a process of gene duplication/deletion and diversification, these loci can vary extensively between individuals in copy number and contain genes that are highly similar, making their analysis technically challenging. Here, we present a comprehensive study of the functional gene segments in the IGHV and TRBV loci, quantifying their copy number and single-nucleotide variation in a globally diverse sample of 109 (IGHV) and 286 (TRBV) humans from over 100 populations. We find that the IGHV and TRBV gene families exhibit starkly different patterns of variation. In addition to providing insight into the different evolutionary paths of the IGHV and TRBV loci, our results are also important to the adaptive immune repertoire sequencing community, where the lack of frequencies of common alleles and copy number variants is hampering existing analytical pipelines.
Affiliation(s)
- Shishi Luo
  - Computer Science Division, University of California, Berkeley, Berkeley, CA, USA
  - Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
- Jane A Yu
  - Computer Science Division, University of California, Berkeley, Berkeley, CA, USA
- Heng Li
  - Department of Biostatistics, Harvard Medical School, Boston, MA, USA
- Yun S Song
  - Computer Science Division, University of California, Berkeley, Berkeley, CA, USA
  - Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
9
Collins AM, Watson CT. Immunoglobulin Light Chain Gene Rearrangements, Receptor Editing and the Development of a Self-Tolerant Antibody Repertoire. Front Immunol 2018; 9:2249. [PMID: 30349529 PMCID: PMC6186787 DOI: 10.3389/fimmu.2018.02249] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 06/15/2018] [Accepted: 09/10/2018] [Indexed: 11/13/2022]
Abstract
Discussion of the antibody repertoire usually emphasizes diversity, but a conspicuous feature of the light chain repertoire is its lack of diversity. The diversity of reported allelic variants of germline light chain genes is also limited, even in well-studied species. In this review, the implications of this lack of diversity are considered. We explore germline and rearranged light chain genes in a variety of species, with a particular focus on human and mouse genes. The importance of the number, organization and orientation of the genes for the control of repertoire development is discussed, and we consider how primary rearrangements and receptor editing together shape the expressed light chain repertoire. The resulting repertoire is dominated by just a handful of IGKV and IGLV genes. It has been hypothesized that an important function of the light chain is to guard against self-reactivity, and the role of secondary rearrangements in this process could explain the genomic organization of the light chain genes. It could also explain why the light chain repertoire is so limited. Heavy and light chain genes may have co-evolved to ensure that suitable light chain partners are usually available for each heavy chain that forms early in B cell development. We suggest that the co-evolved loci of the house mouse often became separated during the inbreeding of laboratory mice, resulting in new pairings of loci that are derived from different sub-species of the house mouse. A resulting vulnerability to self-reactivity could explain at least some mouse models of autoimmune disease.
Affiliation(s)
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
10
Breden F, Watson CT. Using High-Throughput Sequencing to Characterize the Development of the Antibody Repertoire During Infections: A Case Study of HIV-1. Adv Exp Med Biol 2018; 1053:245-263. [PMID: 29549643 DOI: 10.1007/978-3-319-72077-7_12] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022]
Abstract
High throughput sequencing (HTS) approaches have only recently been applied to describing the antibody/B-cell repertoire in fine detail, but these data sets have already become critical to the design of vaccines and therapeutics, and monitoring of cancer immunotherapy. As a case study, we describe the potential and present limitations of HTS studies of the Ab repertoire during infection with HIV-1. Most of the present studies restrict their analyses to lineages of specific bnAbs. We discuss future initiatives to expand this type of analysis to more complete repertoires and to improve comparing and sharing of these Ab repertoire data across studies and institutions.
Affiliation(s)
- Felix Breden
  - Department of Biological Sciences, Simon Fraser University, Burnaby, BC, Canada
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
11
On being the right size: antibody repertoire formation in the mouse and human. Immunogenetics 2017; 70:143-158. [DOI: 10.1007/s00251-017-1049-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 09/26/2017] [Accepted: 12/04/2017] [Indexed: 01/01/2023]
12
Kirik U, Greiff L, Levander F, Ohlin M. Parallel antibody germline gene and haplotype analyses support the validity of immunoglobulin germline gene inference and discovery. Mol Immunol 2017; 87:12-22. [DOI: 10.1016/j.molimm.2017.03.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/16/2016] [Revised: 03/07/2017] [Accepted: 03/08/2017] [Indexed: 12/18/2022]
13
Watson CT, Glanville J, Marasco WA. The Individual and Population Genetics of Antibody Immunity. Trends Immunol 2017; 38:459-470. [PMID: 28539189 PMCID: PMC5656258 DOI: 10.1016/j.it.2017.04.003] [Citation(s) in RCA: 87] [Impact Index Per Article: 10.9] [Received: 03/03/2017] [Revised: 04/06/2017] [Accepted: 04/10/2017] [Indexed: 12/12/2022]
Abstract
Antibodies (Abs) produced by immunoglobulin (IG) genes are the most diverse proteins expressed in humans. While part of this diversity is generated by recombination during B-cell development and mutations during affinity maturation, the germ-line IG loci are also diverse across human populations and ethnicities. Recently, proof-of-concept studies have demonstrated genotype–phenotype correlations between specific IG germ-line variants and the quality of Ab responses during vaccination and disease. However, the functional consequences of IG genetic variation in Ab function and immunological outcomes remain underexplored. In this opinion article, we outline interconnections between IG genomic diversity and Ab-expressed repertoires and structure. We further propose a strategy for integrating IG genotyping with functional Ab profiling data as a means to better predict and optimize humoral responses in genetically diverse human populations, with immediate implications for personalized medicine.

Genetic variation in human populations affects how individuals are able to mount functional antibody responses. Different alleles can encode convergent binding motifs that result in successful Ab responses against specific infections and vaccinations. Given the complexity of the IG loci and the diversity of the antibody repertoire, links between IG polymorphism and antibody repertoire variability have not been thoroughly explored. We present a strategy to mine genotype–repertoire–disease associations.
Collapse
Affiliation(s)
- Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA.
| | - Jacob Glanville
- Institute for Immunity, Transplantation and Infection, and Computational and Systems Immunology, Stanford University School of Medicine, Stanford, CA, USA.
| | - Wayne A Marasco
- Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA.
|
14
|
Watson CT, Matsen FA, Jackson KJL, Bashir A, Smith ML, Glanville J, Breden F, Kleinstein SH, Collins AM, Busse CE. Comment on “A Database of Human Immune Receptor Alleles Recovered from Population Sequencing Data”. J Immunol 2017; 198:3371-3373. [DOI: 10.4049/jimmunol.1700306] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 01/02/2023]
|
15
|
Luo S, Yu JA, Song YS. Estimating Copy Number and Allelic Variation at the Immunoglobulin Heavy Chain Locus Using Short Reads. PLoS Comput Biol 2016; 12:e1005117. [PMID: 27632220 PMCID: PMC5025152 DOI: 10.1371/journal.pcbi.1005117] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 01/26/2016] [Accepted: 08/23/2016] [Indexed: 11/28/2022]
Abstract
The study of genomic regions that contain gene copies and structural variation is a major challenge in modern genomics. Unlike variation involving single nucleotide changes, data on the variation of copy number is difficult to collect and few tools exist for analyzing the variation between individuals. The immunoglobulin heavy variable (IGHV) locus, which plays an integral role in the adaptive immune response, is an example of a complex genomic region that varies in gene copy number. Lack of standard methods to genotype this region prevents it from being included in association studies and is holding back the growing field of antibody repertoire analysis. Here we develop a method that takes short reads from high-throughput sequencing and outputs a genetic profile of the IGHV locus with the read coverage depth and a putative nucleotide sequence for each operationally defined gene cluster. Our operationally defined gene clusters aim to address a major challenge in studying the IGHV locus: the high sequence similarity between gene segments in different genomic locations. Tests on simulated data demonstrate that our approach can accurately determine the presence or absence of a gene cluster from reads as short as 70 bp. More detailed resolution on the copy number of gene clusters can be obtained from read coverage depth using longer reads (e.g., ≥ 100 bp). Detail at the nucleotide resolution of single copy genes (genes present in one copy per haplotype) can be determined with 250 bp reads. For IGHV genes with more than one copy, accurate nucleotide-resolution reconstruction is currently beyond the means of our approach. When applied to a family of European ancestry, our pipeline outputs genotypes that are consistent with the family pedigree, confirms existing multigene variants and suggests new copy number variants. 
This study paves the way for analyzing population-level patterns of variation in IGHV gene clusters in larger diverse datasets and for quantitatively handling regions of copy number variation in other structurally varying and complex loci.
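As a rough illustration of the depth-based idea described in this abstract, the sketch below estimates how many copies of a gene cluster are present per diploid genome by comparing its mean read coverage with that of a control region assumed to be present at two copies. The function name, the example depths, and the simple rounding rule are assumptions made for illustration; this is not the authors' pipeline.

```python
def estimate_copy_number(cluster_depth, diploid_reference_depth, ploidy=2):
    """Estimate copies of a gene cluster per diploid genome from read depth.

    Hypothetical helper: assumes coverage scales linearly with copy number
    and that the reference region is present at `ploidy` copies.
    """
    if diploid_reference_depth <= 0:
        raise ValueError("reference depth must be positive")
    if cluster_depth < 0:
        raise ValueError("cluster depth cannot be negative")
    # Expected coverage contributed by a single copy of any region.
    depth_per_copy = diploid_reference_depth / ploidy
    return round(cluster_depth / depth_per_copy)


# Made-up mean depths for three illustrative clusters (not real data).
example_depths = {"cluster_A": 30.4, "cluster_B": 0.6, "cluster_C": 61.0}
reference_depth = 30.0  # assumed depth of a two-copy control region

copy_numbers = {
    name: estimate_copy_number(depth, reference_depth)
    for name, depth in example_depths.items()
}
print(copy_numbers)
```

A real analysis would also need to handle reads that map ambiguously between highly similar gene segments, which is the problem the operationally defined gene clusters in the abstract are meant to address.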
Affiliation(s)
- Shishi Luo
- Computer Science Division, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
- Jane A. Yu
- Computer Science Division, University of California, Berkeley, Berkeley, California, United States of America
- Yun S. Song
- Computer Science Division, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
- Departments of Mathematics and Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
16
|
Steele EJ, Lloyd SS. Soma-to-germline feedback is implied by the extreme polymorphism at IGHV relative to MHC: The manifest polymorphism of the MHC appears greatly exceeded at Immunoglobulin loci, suggesting antigen-selected somatic V mutants penetrate Weismann's Barrier. Bioessays 2015; 37:557-69. [PMID: 25810320 DOI: 10.1002/bies.201400213] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 11/28/2014] [Revised: 02/15/2015] [Accepted: 02/24/2015] [Indexed: 01/22/2023]
Abstract
Soma-to-germline feedback is forbidden under the neo-Darwinian paradigm. Nevertheless, there is a growing realization it occurs frequently in immunoglobulin (Ig) variable (V) region genes. This is a surprising development. It arises from a most unlikely source in light of the exposure of co-author EJS to the haplotype data of RL Dawkins and others on the polymorphism of the Major Histocompatibility Complex, which is generally assumed to be the most polymorphic region in the genome (spanning ∼4 Mb). The comparison between the magnitude of MHC polymorphism with estimates for the human heavy chain immunoglobulin V locus (spanning ∼1 Mb), suggests IGHV could be many orders of magnitude more polymorphic than the MHC. This conclusion needs airing in the literature as it implies generational churn and soma-to-germline gene feedback. Pedigree-based experimental strategies to resolve the IGHV issue are outlined.
Affiliation(s)
- Edward J Steele
- C.Y. O'Connor ERADE Village Foundation, Piara Waters, WA, Australia
|
17
|
Watson C, Steinberg K, Huddleston J, Warren R, Malig M, Schein J, Willsey AJ, Joy J, Scott J, Graves TA, Wilson R, Holt R, Eichler E, Breden F. Complete haplotype sequence of the human immunoglobulin heavy-chain variable, diversity, and joining genes and characterization of allelic and copy-number variation. Am J Hum Genet 2013; 92:530-46. [PMID: 23541343 DOI: 10.1016/j.ajhg.2013.03.004] [Citation(s) in RCA: 172] [Impact Index Per Article: 14.3] [Received: 08/28/2012] [Revised: 01/08/2013] [Accepted: 03/06/2013] [Indexed: 01/02/2023]
Abstract
The immunoglobulin heavy-chain locus (IGH) encodes variable (IGHV), diversity (IGHD), joining (IGHJ), and constant (IGHC) genes and is responsible for antibody heavy-chain biosynthesis, which is vital to the adaptive immune response. Programmed V-(D)-J somatic rearrangement and the complex duplicated nature of the locus have impeded attempts to reconcile its genomic organization based on traditional B-lymphocyte derived genetic material. As a result, sequence descriptions of germline variation within IGHV are lacking, haplotype inference using traditional linkage disequilibrium methods has been difficult, and the human genome reference assembly is missing several expressed IGHV genes. By using a hydatidiform mole BAC clone resource, we present the most complete haplotype of IGHV, IGHD, and IGHJ gene regions derived from a single chromosome, representing an alternate assembly of ∼1 Mbp of high-quality finished sequence. From this we add 101 kbp of previously uncharacterized sequence, including functional IGHV genes, and characterize four large germline copy-number variants (CNVs). In addition to this germline reference, we identify and characterize eight CNV-containing haplotypes from a panel of nine diploid genomes of diverse ethnic origin, discovering previously unmapped IGHV genes and an additional 121 kbp of insertion sequence. We genotype four of these CNVs by using PCR in 425 individuals from nine human populations. We find that all four are highly polymorphic and show considerable evidence of stratification (Fst = 0.3-0.5), with the greatest differences observed between African and Asian populations. These CNVs exhibit weak linkage disequilibrium with SNPs from two commercial arrays in most of the populations tested.
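To give a sense of what an Fst in the reported 0.3-0.5 range means, the sketch below computes Wright's fixation index for a single biallelic variant from per-population allele frequencies, using Fst = (H_T - H_S) / H_T with equal population weights. The function name and the frequencies are hypothetical, and the estimator used in the study may differ from this textbook form.

```python
def wright_fst(allele_freqs):
    """Wright's F_ST for one biallelic variant across populations.

    Hypothetical helper: `allele_freqs` holds the frequency of one allele
    (for example, a CNV insertion haplotype) in each population, and all
    populations are weighted equally.
    """
    if not allele_freqs:
        raise ValueError("at least one population is required")
    if any(p < 0 or p > 1 for p in allele_freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    n = len(allele_freqs)
    mean_p = sum(allele_freqs) / n
    # Expected heterozygosity if all populations were pooled.
    h_total = 2 * mean_p * (1 - mean_p)
    if h_total == 0:
        return 0.0  # variant is fixed everywhere, so there is no structure
    # Mean expected heterozygosity within each population.
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / n
    return (h_total - h_within) / h_total


# Made-up frequencies of one allele in two populations (not study data).
fst = wright_fst([0.2, 0.8])
print(round(fst, 2))
```

With these made-up inputs the pooled heterozygosity is 0.5 and the mean within-population heterozygosity is 0.32, giving an Fst of 0.36.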
|
18
|
Watson CT, Breden F. The immunoglobulin heavy chain locus: genetic variation, missing data, and implications for human disease. Genes Immun 2012; 13:363-73. [PMID: 22551722 DOI: 10.1038/gene.2012.12] [Citation(s) in RCA: 123] [Impact Index Per Article: 9.5] [Indexed: 01/30/2023]
Abstract
The immunoglobulin (IG) loci consist of repeated and highly homologous sets of genes of different types, variable (V), diversity (D) and junction (J), that rearrange in developing B cells to produce an individual's highly variable repertoire of expressed antibodies, designed to bind to a vast array of pathogens. This repeated structure makes these loci susceptible to a high frequency of insertion and deletion events through evolutionary time, and also makes them difficult to characterize at the genomic level or assay with high-throughput techniques. Given the central role of antibodies in the adaptive immune system, it is not surprising that early candidate gene approaches showed that germline polymorphisms in these regions correlated with susceptibility to both infectious and autoimmune diseases. However, more recent studies, particularly those using high-throughput genome-wide arrays, have failed to implicate these loci in disease. In this review of the IG heavy chain variable gene cluster (IGHV), we examine how poorly we understand the distribution of haplotype variation in this genomic region, and we argue that this lack of information may mask candidate loci in the IGHV gene cluster as causative factors for infectious and autoimmune diseases.
Affiliation(s)
- C T Watson
- Department of Biological Sciences, Simon Fraser University, Burnaby, BC, Canada.
|
19
|
Kidd MJ, Chen Z, Wang Y, Jackson KJ, Zhang L, Boyd SD, Fire AZ, Tanaka MM, Gaëta BA, Collins AM. The inference of phased haplotypes for the immunoglobulin H chain V region gene loci by analysis of VDJ gene rearrangements. J Immunol 2011; 188:1333-40. [PMID: 22205028 DOI: 10.4049/jimmunol.1102097] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.7] [Indexed: 01/30/2023]
Abstract
The existence of many highly similar genes in the lymphocyte receptor gene loci makes them difficult to investigate, and the determination of phased "haplotypes" has been particularly problematic. However, V(D)J gene rearrangements provide an opportunity to infer the association of Ig genes along the chromosomes. The chromosomal distribution of H chain genes in an Ig genotype can be inferred through analysis of VDJ rearrangements in individuals who are heterozygous at points within the IGH locus. We analyzed VDJ rearrangements from 44 individuals for whom sufficient unique rearrangements were available to allow comprehensive genotyping. Nine individuals were identified who were heterozygous at the IGHJ6 locus and for whom sufficient suitable VDJ rearrangements were available to allow comprehensive haplotyping. Each of the 18 resulting IGHV│IGHD│IGHJ haplotypes was unique. Apparent deletion polymorphisms were seen that involved as many as four contiguous, functional IGHV genes. Two deletion polymorphisms involving multiple contiguous IGHD genes were also inferred. Three previously unidentified gene duplications were detected, where two sequences recognized as allelic variants of a single gene were both inferred to be on a single chromosome. Phased genomic data brings clarity to the study of the contribution of each gene to the available repertoire of rearranged VDJ genes. Analysis of rearrangement frequencies suggests that particular genes may have substantially different yet predictable propensities for rearrangement within different haplotypes. Together with data highlighting the extent of haplotypic variation within the population, this suggests that there may be substantial variability in the available Ab repertoires of different individuals.
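The inference described in this abstract rests on the fact that a V(D)J rearrangement joins segments from the same chromosome, so in an individual heterozygous at IGHJ6 each IGHJ6 allele marks one chromosome. The sketch below groups IGHV alleles by the IGHJ6 allele they are found rearranged with. The function name, the `min_count` threshold, and the toy rearrangements are assumptions for illustration and do not reproduce the authors' procedure.

```python
from collections import Counter, defaultdict


def infer_haplotypes(rearrangements, min_count=2):
    """Group V alleles by the IGHJ6 allele they are rearranged with.

    Hypothetical helper: `rearrangements` is an iterable of
    (v_allele, j_allele) pairs, one per unique VDJ rearrangement.
    A V allele is assigned to a haplotype only if it is seen with that
    J allele at least `min_count` times, to tolerate misassigned reads.
    """
    counts = defaultdict(Counter)
    for v_allele, j_allele in rearrangements:
        counts[j_allele][v_allele] += 1
    return {
        j_allele: sorted(v for v, n in v_counts.items() if n >= min_count)
        for j_allele, v_counts in counts.items()
    }


# Toy rearrangements from a made-up IGHJ6*02 / IGHJ6*03 heterozygote.
toy = [
    ("IGHV1-69*01", "IGHJ6*02"),
    ("IGHV1-69*01", "IGHJ6*02"),
    ("IGHV3-23*01", "IGHJ6*02"),
    ("IGHV3-23*01", "IGHJ6*02"),
    ("IGHV1-69*06", "IGHJ6*03"),
    ("IGHV1-69*06", "IGHJ6*03"),
    ("IGHV3-23*01", "IGHJ6*03"),
    ("IGHV3-23*01", "IGHJ6*03"),
    ("IGHV4-39*01", "IGHJ6*03"),  # seen once, so below the threshold
]
haplotypes = infer_haplotypes(toy)
print(haplotypes)
```

In this toy case the two IGHV1-69 alleles separate onto different haplotypes, while IGHV3-23*01 appears on both. A V gene that is consistently absent from one haplotype would be a candidate deletion polymorphism of the kind the abstract reports.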
Affiliation(s)
- Marie J Kidd
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington, Sydney, New South Wales 2052, Australia
|
20
|
Pramanik S, Cui X, Wang HY, Chimge NO, Hu G, Shen L, Gao R, Li H. Segmental duplication as one of the driving forces underlying the diversity of the human immunoglobulin heavy chain variable gene region. BMC Genomics 2011; 12:78. [PMID: 21272357 PMCID: PMC3042411 DOI: 10.1186/1471-2164-12-78] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 06/23/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022]
Abstract
Background: Segmental duplication and deletion were implicated for a region containing the human immunoglobulin heavy chain variable (IGHV) gene segments, 1.9III/hv3005 (possible allelic variants of IGHV3-30) and hv3019b9 (a possible allelic variant of IGHV3-33). However, very little is known about the ranges of the duplication and the polymorphic region. This is mainly because of the difficulty associated with distinguishing between allelic and paralogous sequences in the IGHV region containing extensive repetitive sequences. Inability to separate the two parental haploid genomes in the subjects is another serious barrier. To address these issues, unique DNA sequence tags evenly distributed within and flanking the duplicated region implicated by the previous studies were selected. The selected tags in single sperm from six unrelated healthy donors were amplified by multiplex PCR followed by microarray detection. In this way, individual haplotypes of different parental origins in the sperm donors could be analyzed separately and precisely. The identified polymorphic region was further analyzed at the nucleotide sequence level using sequences from the three human genomic sequence assemblies in the database.
Results: A large polymorphic region was identified using the selected sequence tags. Four of the 12 haplotypes were shown to contain consecutively undetectable tags spanning in a variable range. Detailed analysis of sequences from the genomic sequence assemblies revealed two large duplicate sequence blocks of 24,696 bp and 24,387 bp, respectively, and an incomplete copy of 961 bp in this region. It contains up to 13 IGHV gene segments depending on haplotypes. A polymorphic region was found to be located within the duplicated blocks. The variants of this polymorphism unusually diverged at the nucleotide sequence level and in IGHV gene segment number, composition and organization, indicating a limited selection pressure in general. However, the divergence level within the gene segments is significantly different from that in the intergenic regions indicating that these regions may have been subject to different selection pressures and that the IGHV gene segments in this region are functionally important.
Conclusions: Non-reciprocal genetic rearrangements associated with large duplicate sequence blocks could substantially contribute to the IGHV region diversity. Since the resulting polymorphisms may affect the number, composition and organization of the gene segments in this region, it may have significant impact on the function of the IGHV gene segment repertoire, antibody diversity, and therefore, the immune system. Because one of the gene segments, 3-30 (1.9III), is associated with autoimmune diseases, it could be of diagnostic significance to learn about the variants in the haplotypes by using the multiplex haplotype analysis system used in the present study with DNA sequence tags specific for the variants of all gene segments in this region.
Affiliation(s)
- Sreemanta Pramanik
- Department of Molecular Genetics, Microbiology, and Immunology, University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, NJ 08854, USA
|
21
|
IGHV4-39 deletion polymorphism does not associate with risk or outcome of multiple sclerosis. J Neuroimmunol 2010; 225:164-6. [DOI: 10.1016/j.jneuroim.2010.04.012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/05/2010] [Revised: 04/19/2010] [Accepted: 04/20/2010] [Indexed: 12/28/2022]
|
22
|
van Zelm MC, Geertsema C, Nieuwenhuis N, de Ridder D, Conley ME, Schiff C, Tezcan I, Bernatowska E, Hartwig NG, Sanders EA, Litzman J, Kondratenko I, van Dongen JJ, van der Burg M. Gross deletions involving IGHM, BTK, or Artemis: a model for genomic lesions mediated by transposable elements. Am J Hum Genet 2008; 82:320-32. [PMID: 18252213 DOI: 10.1016/j.ajhg.2007.10.011] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.5] [Received: 07/20/2007] [Revised: 10/01/2007] [Accepted: 10/02/2007] [Indexed: 01/27/2023]
Abstract
Most genetic disruptions underlying human disease are microlesions, whereas gross lesions are rare with gross deletions being most frequently found (6%). Similar observations have been made in primary immunodeficiency genes, such as BTK, but for unknown reasons the IGHM and DCLRE1C (Artemis) gene defects frequently represent gross deletions (~60%). We characterized the gross deletion breakpoints in IGHM-, BTK-, and Artemis-deficient patients. The IGHM deletion breakpoints did not show involvement of recombination signal sequences or immunoglobulin switch regions. Instead, five IGHM, eight BTK, and five unique Artemis breakpoints were located in or near sequences derived from transposable elements (TE). The breakpoints of four out of five disrupted Artemis alleles were located in highly homologous regions, similar to Ig subclass deficiencies and Vh deletion polymorphisms. Nevertheless, these observations suggest a role for TEs in mediating gross deletions. The identified gross deletion breakpoints were mostly located in TE subclasses that were specifically overrepresented in the involved gene as compared to the average in the human genome. This concerned both long (LINE1) and short (Alu, MIR) interspersed elements, as well as LTR retrotransposons (ERV). Furthermore, a high total TE content (>40%) was associated with an increased frequency of gross deletions. Both findings were further investigated and confirmed in a total set of 20 genes disrupted in human disease. Thus, to our knowledge for the first time, we provide evidence that a high TE content, irrespective of the type of element, results in the increased incidence of gross deletions as gene disruption underlying human disease.
|