1
Collins AM, Ohlin M, Corcoran M, Heather JM, Ralph D, Law M, Martínez-Barnetche J, Ye J, Richardson E, Gibson WS, Rodriguez OL, Peres A, Yaari G, Watson CT, Lees WD. AIRR-C IG Reference Sets: curated sets of immunoglobulin heavy and light chain germline genes. Front Immunol 2024; 14:1330153. [PMID: 38406579 PMCID: PMC10884231 DOI: 10.3389/fimmu.2023.1330153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 12/27/2023] [Indexed: 02/27/2024] Open access
Abstract
Introduction: Analysis of an individual's immunoglobulin (IG) gene repertoire requires the use of high-quality germline gene reference sets. When sets contain only alleles supported by strong evidence, AIRR sequencing (AIRR-seq) data analysis is more accurate, and studies of the evolution of IG genes, their allelic variants and the expressed immune repertoire are therefore facilitated.
Methods: The Adaptive Immune Receptor Repertoire Community (AIRR-C) IG Reference Sets have been developed by including only human IG heavy and light chain alleles that have been confirmed by evidence from multiple high-quality sources. To further improve AIRR-seq analysis, some alleles have been extended to deal with short 3' or 5' truncations that can lead them to be overlooked by alignment utilities. To avoid other challenges for analysis programs, exact paralogs (e.g. IGHV1-69*01 and IGHV1-69D*01) are represented only once in each set, though alternative sequence names are noted in accompanying metadata.
Results and discussion: The Reference Sets include fewer than half the previously recognised IG alleles (e.g. just 198 IGHV sequences), and also include a number of novel alleles: 8 IGHV alleles, 2 IGKV alleles and 5 IGLV alleles. Despite their smaller sizes, erroneous calls were eliminated, and excellent coverage was achieved when a set of repertoires comprising over 4 million V(D)J rearrangements from 99 individuals was analyzed using the Sets. The version-tracked AIRR-C IG Reference Sets are freely available at the OGRDB website (https://ogrdb.airr-community.org/germline_sets/Human) and will be regularly updated to include newly observed and previously reported sequences that can be confirmed by new high-quality data.
Affiliation(s)
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Mats Ohlin
  - Department of Immunotechnology, and SciLifeLab, Lund University, Lund, Sweden
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- James M. Heather
  - Mass General Cancer Center, Massachusetts General Hospital, Charlestown, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
- Duncan Ralph
  - Fred Hutchinson Cancer Research Center, Seattle, WA, United States
- Mansun Law
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, United States
- Jesus Martínez-Barnetche
  - Centro de Investigación Sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, Mexico
- Jian Ye
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Eve Richardson
  - La Jolla Institute for Immunology, San Diego, CA, United States
- William S. Gibson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Oscar L. Rodriguez
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Ayelet Peres
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Gur Yaari
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- William D. Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, London, United Kingdom
  - Human-Centered Computing and Information Science, Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
2
Shi B, Dong X, Ma Q, Sun S, Ma L, Yu J, Wang X, Pan J, He X, Su D, Yao X. The Usage of Human IGHJ Genes Follows a Particular Non-random Selection: The Recombination Signal Sequence May Affect the Usage of Human IGHJ Genes. Front Genet 2020; 11:524413. [PMID: 33363565 PMCID: PMC7753069 DOI: 10.3389/fgene.2020.524413] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/20/2020] [Accepted: 11/06/2020] [Indexed: 12/02/2022] Open access
Abstract
The B cell receptor (BCR) heavy chain variable region is formed by germline V(D)J gene rearrangement according to the “12/23” rule and the “beyond 12/23” rule. The usage frequency of each V(D)J gene in the peripheral BCR repertoires is related to the initial recombination, self-tolerance selection, and the clonal proliferative response. However, their specific differences and possible mechanisms are still unknown. We analyzed in-frame and out-of-frame BCR-H repertoires from human samples with normal physiological and various pathological conditions by high-throughput sequencing. Our results showed that IGHJ gene frequency follows the previously reported pattern, in which IGHJ4 is used at high frequency (>40%), IGHJ6/IGHJ3/IGHJ5 are used at medium frequencies (10–20%), and IGHJ2/IGHJ1 are used at low frequency (<4%) under both normal physiological and various pathological conditions. However, our analysis of the recombination signal sequences suggested that the conserved nonamer and heptamer and certain 23 bp spacer lengths may affect the initial IGHD-IGHJ recombination, which results in different frequencies of IGHJ genes in the initial BCR-H repertoire. Based on this “initial repertoire,” we recommend re-evaluation and further investigation when analyzing the significance and mechanism of IGHJ gene frequency in self-tolerance selection and the clonal proliferative response.
Affiliation(s)
- Bin Shi
  - Department of Laboratory Medicine, Affiliated Hospital of Zunyi Medical University, Zunyi, China
  - School of Laboratory Medicine, Zunyi Medical University, Zunyi, China
- Xiaoheng Dong
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Qingqing Ma
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Suhong Sun
  - Department of Breast Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China
- Long Ma
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Jiang Yu
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Xiaomei Wang
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Juan Pan
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Xiaoyan He
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Danhua Su
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
- Xinsheng Yao
  - Department of Immunology, Center of Immunomolecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China
3
Watson CT, Kos JT, Gibson WS, Newman L, Deikus G, Busse CE, Smith ML, Jackson KJ, Collins AM. A comparison of immunoglobulin IGHV, IGHD and IGHJ genes in wild-derived and classical inbred mouse strains. Immunol Cell Biol 2019; 97:888-901. [PMID: 31441114 DOI: 10.1111/imcb.12288] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/08/2019] [Revised: 08/05/2019] [Accepted: 08/20/2019] [Indexed: 01/20/2023]
Abstract
The genomes of classical inbred mouse strains include genes derived from all three major subspecies of the house mouse, Mus musculus. We recently posited that genetic diversity in the immunoglobulin heavy chain (IGH) gene loci of C57BL/6 and BALB/c mice reflects differences in subspecies origin. To investigate this hypothesis, we conducted high-throughput sequencing of IGH gene rearrangements to document IGH variable (IGHV), joining (IGHJ) and diversity (IGHD) genes in four inbred wild-derived mouse strains (CAST/EiJ, LEWES/EiJ, MSM/MsJ and PWD/PhJ) and a single disease model strain (NOD/ShiLtJ), collectively representing genetic backgrounds of several major mouse subspecies. A total of 341 germline IGHV sequences were inferred in the wild-derived strains, including 247 not curated in the international ImMunoGeneTics information system. By contrast, 83/84 inferred NOD IGHV genes had previously been observed in C57BL/6 mice. Variability among the strains examined was observed for only a single IGHJ gene, involving a description of a novel allele. By contrast, unexpected variation was found in the IGHD gene loci, with four previously unreported IGHD gene sequences being documented. Very few IGHV sequences of C57BL/6 and BALB/c mice were shared with strains representing major subspecies, suggesting that their IGH loci may be complex mosaics of genes of disparate origins. This suggests a similar level of diversity is likely present in the IGH loci of other classical inbred strains. This must now be documented if we are to properly understand interstrain variation in models of antibody-mediated disease.
Affiliation(s)
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- Justin T Kos
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- William S Gibson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- Leah Newman
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Gintaras Deikus
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Christian E Busse
  - Division of B Cell Immunology, German Cancer Research Center, 69120, Heidelberg, Germany
- Melissa L Smith
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Katherine JL Jackson
  - Immunology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia
- Andrew M Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
4
Collins AM, Wang Y, Roskin KM, Marquis CP, Jackson KJL. The mouse antibody heavy chain repertoire is germline-focused and highly variable between inbred strains. Philos Trans R Soc Lond B Biol Sci 2016; 370:rstb.2014.0236. [PMID: 26194750 DOI: 10.1098/rstb.2014.0236] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Indexed: 01/25/2023] Open access
Abstract
The human and mouse antibody repertoires are formed by identical processes, but like all small animals, mice only have sufficient lymphocytes to express a small part of the potential antibody repertoire. In this study, we determined how the heavy chain repertoires of two mouse strains are generated. Analysis of IgM- and IgG-associated VDJ rearrangements generated by high-throughput sequencing confirmed the presence of 99 functional immunoglobulin heavy chain variable (IGHV) genes in the C57BL/6 genome, and inferred the presence of 164 IGHV genes in the BALB/c genome. Remarkably, only five IGHV sequences were common to both strains. Compared with humans, little N nucleotide addition was seen in the junctions of mouse VDJ genes. Germline human IgG-associated IGHV genes are rare, but many murine IgG-associated IGHV genes were unmutated. Together these results suggest that the expressed mouse repertoire is more germline-focused than the human repertoire. The apparently divergent germline repertoires of the mouse strains are discussed with reference to reports that inbred mouse strains carry blocks of genes derived from each of the three subspecies of the house mouse. We hypothesize that the germline genes of BALB/c and C57BL/6 mice may originally have evolved to generate distinct germline-focused antibody repertoires in the different mouse subspecies.
Affiliation(s)
- Andrew M Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia
- Yan Wang
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia
- Krishna M Roskin
  - Department of Pathology, School of Medicine, Stanford University, Stanford, CA 94305-5324, USA
- Christopher P Marquis
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia
- Katherine J L Jackson
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia
  - Department of Pathology, School of Medicine, Stanford University, Stanford, CA 94305-5324, USA