1
Collins AM, Ohlin M, Corcoran M, Heather JM, Ralph D, Law M, Martínez-Barnetche J, Ye J, Richardson E, Gibson WS, Rodriguez OL, Peres A, Yaari G, Watson CT, Lees WD. AIRR-C IG Reference Sets: curated sets of immunoglobulin heavy and light chain germline genes. Front Immunol 2024; 14:1330153. [PMID: 38406579 PMCID: PMC10884231 DOI: 10.3389/fimmu.2023.1330153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 12/27/2023] [Indexed: 02/27/2024] Open
Abstract
Introduction: Analysis of an individual's immunoglobulin (IG) gene repertoire requires the use of high-quality germline gene reference sets. When sets only contain alleles supported by strong evidence, AIRR sequencing (AIRR-seq) data analysis is more accurate, and studies of the evolution of IG genes, their allelic variants and the expressed immune repertoire are therefore facilitated.
Methods: The Adaptive Immune Receptor Repertoire Community (AIRR-C) IG Reference Sets have been developed by including only human IG heavy and light chain alleles that have been confirmed by evidence from multiple high-quality sources. To further improve AIRR-seq analysis, some alleles have been extended to deal with short 3' or 5' truncations that can lead them to be overlooked by alignment utilities. To avoid other challenges for analysis programs, exact paralogs (e.g. IGHV1-69*01 and IGHV1-69D*01) are only represented once in each set, though alternative sequence names are noted in accompanying metadata.
Results and discussion: The Reference Sets include fewer than half the previously recognised IG alleles (e.g. just 198 IGHV sequences), and also include a number of novel alleles: 8 IGHV alleles, 2 IGKV alleles and 5 IGLV alleles. Despite their smaller sizes, erroneous calls were eliminated, and excellent coverage was achieved when a set of repertoires comprising over 4 million V(D)J rearrangements from 99 individuals was analyzed using the Sets. The version-tracked AIRR-C IG Reference Sets are freely available at the OGRDB website (https://ogrdb.airr-community.org/germline_sets/Human) and will be regularly updated to include newly observed and previously reported sequences that can be confirmed by new high-quality data.
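The handling of exact paralogs described in the abstract above (one sequence kept per set, alternative names recorded in metadata) can be illustrated with a minimal sketch. This is not the AIRR-C curation code: the function name `collapse_exact_duplicates`, the record layout, and the short toy sequences are all assumptions made for illustration, and the sequences are placeholders rather than real germline sequences.

```python
def collapse_exact_duplicates(alleles):
    """Keep one record per distinct sequence and store extra names as aliases.

    `alleles` is an iterable of (name, sequence) pairs. The record layout
    returned here is hypothetical, not the AIRR-C or OGRDB metadata schema.
    """
    by_sequence = {}
    for name, sequence in alleles:
        key = sequence.upper()  # compare sequences case-insensitively
        if key in by_sequence:
            # Identical sequence already present: record the alternative name only
            by_sequence[key]["aliases"].append(name)
        else:
            by_sequence[key] = {"name": name, "sequence": key, "aliases": []}
    # dicts preserve insertion order, so the first-seen name is kept as primary
    return list(by_sequence.values())


# Toy placeholder sequences (NOT real germline sequences)
toy_alleles = [
    ("IGHV1-69*01", "caggtgcagctg"),
    ("IGHV1-69D*01", "CAGGTGCAGCTG"),  # exact paralog of the entry above
    ("IGHV1-69*02", "CAGGTCCAGCTG"),
]

reference_set = collapse_exact_duplicates(toy_alleles)
for record in reference_set:
    print(record["name"], record["aliases"])
```

With the toy input, the two identical sequences collapse into a single record whose alias list carries the second name, so an aligner sees each sequence only once while the alternative name remains recoverable.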
Affiliation(s)
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Mats Ohlin
  - Department of Immunotechnology, and SciLifeLab, Lund University, Lund, Sweden
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
- James M. Heather
  - Mass General Cancer Center, Massachusetts General Hospital, Charlestown, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
- Duncan Ralph
  - Fred Hutchinson Cancer Research Center, Seattle, WA, United States
- Mansun Law
  - Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, United States
- Jesus Martínez-Barnetche
  - Centro de Investigación Sobre Enfermedades Infecciosas, Instituto Nacional de Salud Pública, Cuernavaca, Morelos, Mexico
- Jian Ye
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
- Eve Richardson
  - La Jolla Institute for Immunology, San Diego, CA, United States
- William S. Gibson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Oscar L. Rodriguez
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- Ayelet Peres
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Gur Yaari
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, United States
- William D. Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, London, United Kingdom
  - Human-Centered Computing and Information Science, Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
2
Peres A, Lees WD, Rodriguez OL, Lee NY, Polak P, Hope R, Kedmi M, Collins AM, Ohlin M, Kleinstein S, Watson C, Yaari G. IGHV allele similarity clustering improves genotype inference from adaptive immune receptor repertoire sequencing data. Nucleic Acids Res 2023; 51:e86. [PMID: 37548401 PMCID: PMC10484671 DOI: 10.1093/nar/gkad603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Revised: 06/26/2023] [Accepted: 08/03/2023] [Indexed: 08/08/2023] Open
Abstract
In adaptive immune receptor repertoire analysis, determining the germline variable (V) allele associated with each T- and B-cell receptor sequence is a crucial step. This process is highly impacted by allele annotations. Aligning sequences, assigning them to specific germline alleles, and inferring individual genotypes are challenging when the repertoire is highly mutated, or sequence reads do not cover the whole V region. Here, we propose an alternative naming scheme for the V alleles, as well as a novel method to infer individual genotypes. We demonstrate the strengths of the two by comparing their outcomes to other genotype inference methods. We validate the genotype approach with independent genomic long-read data. The naming scheme is compatible with current annotation tools and pipelines. Analysis results can be converted from the proposed naming scheme to the nomenclature determined by the International Union of Immunological Societies (IUIS). Both the naming scheme and the genotype procedure are implemented in a freely available R package (PIgLET https://bitbucket.org/yaarilab/piglet). To allow researchers to further explore the approach on real data and to adapt it for their uses, we also created an interactive website (https://yaarilab.github.io/IGHV_reference_book).
Affiliation(s)
- Ayelet Peres
  - Faculty of Engineering, Bar Ilan University, 5290002 Ramat Gan, Israel
  - Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002 Ramat Gan, Israel
- William D Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, WC1E 7JE, UK
- Oscar L Rodriguez
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- Noah Y Lee
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, 06511, USA
  - Department of Pathology, Yale School of Medicine, New Haven, CT, 06520, USA
- Pazit Polak
  - Faculty of Engineering, Bar Ilan University, 5290002 Ramat Gan, Israel
  - Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002 Ramat Gan, Israel
- Ronen Hope
  - Faculty of Engineering, Bar Ilan University, 5290002 Ramat Gan, Israel
- Meirav Kedmi
  - Department of Pathology, Yale School of Medicine, New Haven, CT, 06520, USA
  - Division of Hematology and Bone Marrow Transplantation, Chaim Sheba Medical Center, Tel-Hashomer, 5262000, Israel
  - Sackler School of Medicine, Tel-Aviv University, Tel-Aviv, 69978, Israel
- Andrew M Collins
  - School of Biotechnology and Biomedical Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Mats Ohlin
  - Department of Immunotechnology, Lund University, Lund, 221 00, Sweden
- Steven H Kleinstein
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, 06511, USA
  - Department of Pathology, Yale School of Medicine, New Haven, CT, 06520, USA
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- Gur Yaari
  - Faculty of Engineering, Bar Ilan University, 5290002 Ramat Gan, Israel
  - Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002 Ramat Gan, Israel
3
Jackson KJL, Kos JT, Lees W, Gibson WS, Smith ML, Peres A, Yaari G, Corcoran M, Busse CE, Ohlin M, Watson CT, Collins AM. A BALB/c IGHV Reference Set, Defined by Haplotype Analysis of Long-Read VDJ-C Sequences From F1 (BALB/c x C57BL/6) Mice. Front Immunol 2022; 13:888555. [PMID: 35720344 PMCID: PMC9205180 DOI: 10.3389/fimmu.2022.888555] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Accepted: 04/28/2022] [Indexed: 11/13/2022] Open
Abstract
The immunoglobulin genes of inbred mouse strains that are commonly used in models of antibody-mediated human diseases are poorly characterized. This compromises data analysis. To infer the immunoglobulin genes of BALB/c mice, we used long-read SMRT sequencing to amplify VDJ-C sequences from F1 (BALB/c x C57BL/6) hybrid animals. Strain variations were identified in the Ighm and Ighg2b genes, and analysis of VDJ rearrangements led to the inference of 278 germline IGHV alleles. 169 alleles are not present in the C57BL/6 genome reference sequence. To establish a set of expressed BALB/c IGHV germline gene sequences, we computationally retrieved IGHV haplotypes from the IgM dataset. Haplotyping led to the confirmation of 162 BALB/c IGHV gene sequences. A musIGHV398 pseudogene variant also appears to be present in the BALB/cByJ substrain, while a functional musIGHV398 gene is highly expressed in the BALB/cJ substrain. Only four of the BALB/c alleles were also observed in the C57BL/6 haplotype. The full set of inferred BALB/c sequences has been used to establish a BALB/c IGHV reference set, hosted at https://ogrdb.airr-community.org. We assessed whether assemblies from the Mouse Genome Project (MGP) are suitable for the determination of the genes of the IGH loci. Only 37 (43.5%) of the 85 confirmed IMGT-named BALB/c IGHV and 33 (42.9%) of the 77 confirmed non-IMGT IGHV were found in a search of the MGP BALB/cJ genome assembly. This suggests that current MGP assemblies are unsuitable for the comprehensive documentation of germline IGHVs and more efforts will be needed to establish strain-specific reference sets.
Affiliation(s)
- Justin T. Kos
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- William Lees
  - Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, United Kingdom
- William S. Gibson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Melissa Laird Smith
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Ayelet Peres
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Gur Yaari
  - Faculty of Engineering, Bar Ilan University, Ramat Gan, Israel
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Christian E. Busse
  - Division of B Cell Immunology, German Cancer Research Center, Heidelberg, Germany
- Mats Ohlin
  - Department of Immunotechnology, Lund University, Lund, Sweden
- Corey T. Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
- Andrew M. Collins
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW, Australia
4
Yang X, Zhu Y, Chen S, Zeng H, Guan J, Wang Q, Lan C, Sun D, Yu X, Zhang Z. Novel Allele Detection Tool Benchmark and Application With Antibody Repertoire Sequencing Dataset. Front Immunol 2021; 12:739179. [PMID: 34764956 PMCID: PMC8576399 DOI: 10.3389/fimmu.2021.739179] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/10/2021] [Accepted: 10/11/2021] [Indexed: 11/29/2022] Open
Abstract
Detailed knowledge of the diverse immunoglobulin germline genes is critical for the study of humoral immunity. Hundreds of alleles have been discovered by analyzing antibody repertoire sequencing (Rep-seq or Ig-seq) data via multiple novel allele detection tools (NADTs). However, the performance of these NADTs on antibody sequences with intrinsic somatic hypermutations (SHMs) is unclear. Here, we developed a tool to simulate repertoires by integrating the full spectrum of features of an antibody repertoire, such as germline gene usage, junctional modification, position-specific SHM and clonal expansion, based on 2152 high-quality datasets. We then systematically evaluated these NADTs using both simulated and genuine Ig-seq datasets. Finally, we applied these NADTs to 687 Ig-seq datasets and identified 43 novel allele candidates (NACs) using defined criteria. Twenty-five alleles were validated by findings from other sources. In addition to the NACs detected, our simulation tool, the results of our comparison, and the streamlining of this process may benefit further humoral immunity studies via Ig-seq.
Affiliation(s)
- Xiujia Yang
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Yan Zhu
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Sen Chen
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Huikun Zeng
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Junjie Guan
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Qilong Wang
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Chunhong Lan
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Deqiang Sun
  - Department of Center Laboratory, The Fifth Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Xueqing Yu
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Division of Nephrology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Zhenhai Zhang
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
  - Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, China
5
Abstract
Immunogenomics studies have been largely limited to individuals of European ancestry, restricting the ability to identify variation in human adaptive immune responses across populations. Inclusion of a greater diversity of individuals in immunogenomics studies will substantially enhance our understanding of human immunology.
6
Adaptive immune receptor repertoires, an overview of this exciting field. Immunol Lett 2020; 221:49-55. [PMID: 32113899 DOI: 10.1016/j.imlet.2020.02.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/10/2019] [Revised: 02/19/2020] [Accepted: 02/26/2020] [Indexed: 12/30/2022]
Abstract
The adaptive immune response in jawed vertebrates relies on the huge diversity and specificity of the B cell and T cell antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR), respectively. The high level of diversity represented a barrier to a comprehensive analysis of the adaptive immune response before the emergence of high-throughput sequencing (HTS) technologies. The size and complexity of HTS data require the generation of novel computational and analytical approaches, which are transforming how adaptive immune responses are deciphered to understand the clonal dynamics and properties of antigen-specific B and T cells in response to different kinds of antigens. This exciting and rapidly evolving field is impacting not only human and clinical immunology but also comparative immunology. We are now closer to understanding the evolution of the adaptive immune response in jawed vertebrates. This review provides an overview of classical and current strategies developed to assess IG/TR diversity and their applications in basic and clinical immunology.
7
Schwartz GW, Shauli T, Linial M, Hershberg U. Serine substitutions are linked to codon usage and differ for variable and conserved protein regions. Sci Rep 2019; 9:17238. [PMID: 31754132 PMCID: PMC6872785 DOI: 10.1038/s41598-019-53452-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/28/2019] [Accepted: 11/01/2019] [Indexed: 11/11/2022] Open
Abstract
Serine is the only amino acid that is encoded by two disjoint codon sets (TCN & AGY) so that a tandem substitution of two nucleotides is required to switch between the two sets. We show that these codon sets underlie distinct substitution patterns at positions subject to purifying and diversifying selection. We found that in humans, positions that are conserved among ~100 vertebrates, and thus subjected to purifying selection, are enriched for substitutions involving serine (TCN, denoted S'), proline, and alanine (S'PA). In contrast, the less conserved positions are enriched for serine encoded with AGY codons (denoted S″), glycine and asparagine (GS″N). We tested this phenomenon in the HIV envelope glycoprotein (gp120), and the V-gene that encodes B-cell receptors/antibodies. These fast evolving proteins both have hypervariable positions, which are under diversifying selection, closely adjacent to highly conserved structural regions. In both instances, we identified an opposite abundance of two groups of serine substitutions, with enrichment of S'PA in the conserved positions, and GS″N in the hypervariable regions. Finally, we analyzed the substitutions across 60,000 individual human exomes to show that, when serine has a specific functional constraint of phosphorylation capability, S' codons are 32-fold less prone than S″ to substitutions to Threonine or Tyrosine that could potentially retain the phosphorylation site capacity. Combined, our results, which cover evolutionary signals at different temporal scales, demonstrate that through its encoding by two codon sets, serine allows for the existence of alternating substitution patterns within positions of functional maintenance versus sites of rapid diversification.
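The opening claim of the abstract above, that switching between the TCN and AGY serine codon sets needs a tandem substitution of two nucleotides, can be checked directly. The sketch below is illustrative only and is not taken from the paper; the variable and function names are assumptions. It enumerates both codon sets and computes the Hamming distance for every TCN/AGY pair.

```python
from itertools import product

# The two disjoint serine codon sets: TCN (N = any base) and AGY (Y = C or T)
TCN = ["TC" + n for n in "ACGT"]
AGY = ["AG" + y for y in "CT"]


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def differing_positions(a, b):
    """Zero-based codon positions at which the two codons differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


distances = {(s, t): hamming(s, t) for s, t in product(TCN, AGY)}
min_distance = min(distances.values())

# Every pair differs at positions 0 and 1 (T vs A, C vs G), which are adjacent
always_tandem = all(
    {0, 1} <= set(differing_positions(s, t)) for s, t in product(TCN, AGY)
)

print(min_distance)   # prints 2
print(always_tandem)  # prints True
```

Because the first two positions differ in every pair, no single-nucleotide change converts a TCN serine codon into an AGY serine codon; any single step from one set toward the other passes through a codon that does not encode serine.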
Affiliation(s)
- Gregory W Schwartz
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, USA
- Tair Shauli
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Michal Linial
  - Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Uri Hershberg
  - Drexel School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, USA
  - Department of Microbiology and Immunology, Drexel College of Medicine, Drexel University, Philadelphia, USA
  - Department of Human Biology, Faculty of Science, University of Haifa, Haifa, Israel
8
Watson CT, Kos JT, Gibson WS, Newman L, Deikus G, Busse CE, Smith ML, Jackson KJ, Collins AM. A comparison of immunoglobulin IGHV, IGHD and IGHJ genes in wild-derived and classical inbred mouse strains. Immunol Cell Biol 2019; 97:888-901. [PMID: 31441114 DOI: 10.1111/imcb.12288] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/08/2019] [Revised: 08/05/2019] [Accepted: 08/20/2019] [Indexed: 01/20/2023]
Abstract
The genomes of classical inbred mouse strains include genes derived from all three major subspecies of the house mouse, Mus musculus. We recently posited that genetic diversity in the immunoglobulin heavy chain (IGH) gene loci of C57BL/6 and BALB/c mice reflects differences in subspecies origin. To investigate this hypothesis, we conducted high-throughput sequencing of IGH gene rearrangements to document IGH variable (IGHV), joining (IGHJ) and diversity (IGHD) genes in four inbred wild-derived mouse strains (CAST/EiJ, LEWES/EiJ, MSM/MsJ and PWD/PhJ) and a single disease model strain (NOD/ShiLtJ), collectively representing genetic backgrounds of several major mouse subspecies. A total of 341 germline IGHV sequences were inferred in the wild-derived strains, including 247 not curated in the international ImMunoGeneTics information system. By contrast, 83/84 inferred NOD IGHV genes had previously been observed in C57BL/6 mice. Variability among the strains examined was observed for only a single IGHJ gene, involving a description of a novel allele. By contrast, unexpected variation was found in the IGHD gene loci, with four previously unreported IGHD gene sequences being documented. Very few IGHV sequences of C57BL/6 and BALB/c mice were shared with strains representing major subspecies, suggesting that their IGH loci may be complex mosaics of genes of disparate origins. This suggests a similar level of diversity is likely present in the IGH loci of other classical inbred strains. This must now be documented if we are to properly understand interstrain variation in models of antibody-mediated disease.
Affiliation(s)
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- Justin T Kos
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- William S Gibson
  - Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, 40202, USA
- Leah Newman
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Gintaras Deikus
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Christian E Busse
  - Division of B Cell Immunology, German Cancer Research Center, 69120, Heidelberg, Germany
- Melissa L Smith
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Katherine JL Jackson
  - Immunology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia
- Andrew M Collins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
9
Ralph DK, Matsen FA. Per-sample immunoglobulin germline inference from B cell receptor deep sequencing data. PLoS Comput Biol 2019; 15:e1007133. [PMID: 31329576 PMCID: PMC6675132 DOI: 10.1371/journal.pcbi.1007133] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 12/20/2017] [Revised: 08/01/2019] [Accepted: 05/28/2019] [Indexed: 11/26/2022] Open
Abstract
The collection of immunoglobulin genes in an individual's germline, which gives rise to B cell receptors via recombination, is known to vary significantly across individuals. In humans, for example, each individual has only a fraction of the several hundred known V alleles. Furthermore, the currently-accepted set of known V alleles is both incomplete (particularly for non-European samples), and contains a significant number of spurious alleles. The resulting uncertainty as to which immunoglobulin alleles are present in any given sample results in inaccurate B cell receptor sequence annotations, and in particular inaccurate inferred naive ancestors. In this paper we first show that the currently widespread practice of aligning each sequence to its closest match in the full set of IMGT alleles results in a very large number of spurious alleles that are not in the sample's true set of germline V alleles. We then describe a new method for inferring each individual's germline gene set from deep sequencing data, and show that it improves upon existing methods by making a detailed comparison on a variety of simulated and real data samples. This new method has been integrated into the partis annotation and clonal family inference package, available at https://github.com/psathyrella/partis, and is run by default without affecting overall run time.
Affiliation(s)
- Duncan K. Ralph
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Frederick A. Matsen
  - Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
10
Gadala-Maria D, Gidoni M, Marquez S, Vander Heiden JA, Kos JT, Watson CT, O'Connor KC, Yaari G, Kleinstein SH. Identification of Subject-Specific Immunoglobulin Alleles From Expressed Repertoire Sequencing Data. Front Immunol 2019; 10:129. [PMID: 30814994 PMCID: PMC6381938 DOI: 10.3389/fimmu.2019.00129] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 08/31/2018] [Accepted: 01/16/2019] [Indexed: 01/10/2023] Open
Abstract
The adaptive immune receptor repertoire (AIRR) contains information on an individual's immune past, present and potential in the form of the evolving sequences that encode the B cell receptor (BCR) repertoire. AIRR sequencing (AIRR-seq) studies rely on databases of known BCR germline variable (V), diversity (D), and joining (J) genes to detect somatic mutations in AIRR-seq data via comparison to the best-aligning database alleles. However, it has been shown that these databases are far from complete, leading to systematic misidentification of mutated positions in subsets of sample sequences. We previously presented TIgGER, a computational method to identify subject-specific V gene genotypes, including the presence of novel V gene alleles, directly from AIRR-seq data. However, the original algorithm was unable to detect alleles that differed by more than 5 single nucleotide polymorphisms (SNPs) from a database allele. Here we present and apply an improved version of the TIgGER algorithm which can detect alleles that differ by any number of SNPs from the nearest database allele, and can construct subject-specific genotypes with minimal prior information. TIgGER predictions are validated both computationally (using a leave-one-out strategy) and experimentally (using genomic sequencing), resulting in the addition of three new immunoglobulin heavy chain V (IGHV) gene alleles to the IMGT repertoire. Finally, we develop a Bayesian strategy to provide a confidence estimate associated with genotype calls. Altogether, these methods allow for much higher accuracy in germline allele assignment, an essential step in AIRR-seq studies.
Collapse
Affiliation(s)
- Daniel Gadala-Maria
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
| | - Moriah Gidoni
- Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
| | - Susanna Marquez
- Department of Pathology, Yale School of Medicine, Yale University, New Haven, CT, United States
| | - Jason A. Vander Heiden
- Department of Neurology, Yale School of Medicine, Yale University, New Haven, CT, United States
| | - Justin T. Kos
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
| | - Corey T. Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States
| | - Kevin C. O'Connor
- Department of Neurology, Yale School of Medicine, Yale University, New Haven, CT, United States
- Department of Immunobiology, Yale School of Medicine, Yale University, New Haven, CT, United States
| | - Gur Yaari
- Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
| | - Steven H. Kleinstein
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
- Department of Pathology, Yale School of Medicine, Yale University, New Haven, CT, United States
- Department of Immunobiology, Yale School of Medicine, Yale University, New Haven, CT, United States
|
11
|
Kaur H, Sain N, Mohanty D, Salunke DM. Deciphering evolution of immune recognition in antibodies. BMC STRUCTURAL BIOLOGY 2018; 18:19. [PMID: 30563492 PMCID: PMC6299584 DOI: 10.1186/s12900-018-0096-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/17/2018] [Accepted: 11/14/2018] [Indexed: 11/29/2022]
Abstract
Background Antibody, the primary effector molecule of the immune system, evolves after initial encounter with the antigen from a precursor form to a mature one to effectively deal with the antigen. Antibodies of a lineage diverge through antigen-directed isolated pathways of maturation to exhibit distinct recognition potential. In the context of evolution in immune recognition, diversity of antigen cannot be ignored. While there are reports on antibody lineage, structural perspective with respect to diverse recognition potential in a lineage has never been studied. Hence, it is crucial to evaluate how maturation leads to topological tailoring within a lineage enabling them to interact with significantly distinct antigens. Results A data-driven approach was undertaken for the study. Global experimental mouse and human antibody-antigen complex structures from PDB were compiled into a coherent database of germline-linked antibodies bound with distinct antigens. Structural analysis of all lineages showed variations in CDRs of both H and L chains. Observations of conformational adaptation made from analysis of static structures were further evaluated by characterizing dynamics of interaction in two lineages, mouse VH1–84 and human VH5–51. Sequence and structure analysis of the lineages explained that somatic mutations altered the geometries of individual antibodies with common structural constraints in some CDRs. Additionally, conformational landscape obtained from molecular dynamics simulations revealed that incoming pathogen led to further conformational divergence in the paratope (as observed across datasets) even while maintaining similar overall backbone topology. MM-GB/SA analysis showed binding energies to be in physiological range. Results of the study are coherent with experimental observations. Conclusions The findings of this study highlight basic structural principles shaping the molecular evolution of a lineage for significantly diverse antigens. 
Antibodies of a lineage follow different developmental pathways while preserving the imprint of the germline. From the study, it can be generalized that structural diversification of the paratope is an outcome of natural selection of a conformation from an available ensemble, which is further optimized for antigen interaction. The study establishes that starting from a common lineage, antibodies can mature to recognize a wide range of antigens. This hypothesis can be further tested and validated experimentally.
Affiliation(s)
- Harmeet Kaur
- Regional Centre for Biotechnology, Biotech Science Cluster, Faridabad, Haryana, 121001, India; Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
| | - Neetu Sain
- National Institute of Immunology, New Delhi, Delhi, 110067, India
| | - Debasisa Mohanty
- National Institute of Immunology, New Delhi, Delhi, 110067, India
| | - Dinakar M Salunke
- Regional Centre for Biotechnology, Biotech Science Cluster, Faridabad, Haryana, 121001, India; International Centre for Genetic Engineering and Biotechnology, New Delhi, Delhi, 110067, India.
|
12
|
Thörnqvist L, Ohlin M. Critical steps for computational inference of the 3'-end of novel alleles of immunoglobulin heavy chain variable genes - illustrated by an allele of IGHV3-7. Mol Immunol 2018; 103:1-6. [PMID: 30172112 DOI: 10.1016/j.molimm.2018.08.018] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/04/2018] [Revised: 08/10/2018] [Accepted: 08/18/2018] [Indexed: 01/16/2023]
Abstract
Sequencing of immunoglobulin germline gene loci is a challenging process, e.g. due to their repetitiveness and complexity, hence limiting the insight in the germline gene repertoire of humans and other species. Through next generation sequencing technology, it is possible to generate immunoglobulin transcript data sets large enough to computationally infer the germline genes from which the transcripts originate. Multiple tools for such inference have been developed and they can be used for construction of individual germline gene databases, and for discovery of new immunoglobulin germline genes and alleles. However, there are challenges associated with these methods, many of them related to the biological process through which immunoglobulin coding genes are generated. The junctional diversity introduced during rearrangement of the immunoglobulin heavy chain variable (IGHV), diversity and joining genes specifically complicates the inference of the junction regions, with implications for inference of the 3'-end of IGHV genes. With the aim of coping with such diversity, an inference software package may not be able to identify novel alleles harbouring a difference in these regions compared to their closest relatives in the starting database. In this study, we were able to computationally infer one such previously uncharacterized allele, IGHV3-7*02 A318G. However, this was possible only if a strategy was used in which different variants of IGHV3-7*02 were included in the inference-initiating database. Importantly, the presence of the novel allele, but not the standard IGHV3-7*02 sequence, in the genotype was strongly supported by the actual sequences that were assigned to the allele. We thus showed that the starting database used will impact the germline gene inference process, and that difference in the 3'-end of IGHV genes may remain undetected unless specific, non-standard procedures are used to address this matter. 
We suggest that inferred genes/alleles should be confirmed e.g. by examination of the nucleotide composition of the 3'-bases of the inference-supporting sequence reads.
Affiliation(s)
| | - Mats Ohlin
- Dept. of Immunotechnology, Lund University, Lund, Sweden.
|
13
|
Breden F, Watson CT. Using High-Throughput Sequencing to Characterize the Development of the Antibody Repertoire During Infections: A Case Study of HIV-1. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2018; 1053:245-263. [PMID: 29549643 DOI: 10.1007/978-3-319-72077-7_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/23/2022]
Abstract
High throughput sequencing (HTS) approaches have only recently been applied to describing the antibody/B-cell repertoire in fine detail, but these data sets have already become critical to the design of vaccines and therapeutics, and monitoring of cancer immunotherapy. As a case study, we describe the potential and present limitations of HTS studies of the Ab repertoire during infection with HIV-1. Most of the present studies restrict their analyses to lineages of specific bnAbs. We discuss future initiatives to expand this type of analysis to more complete repertoires and to improve comparing and sharing of these Ab repertoire data across studies and institutions.
Affiliation(s)
- Felix Breden
- Department of Biological Sciences, Simon Fraser University, Burnaby, BC, Canada.
| | - Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
|
14
|
Abstract
Probabilistic modeling is fundamental to the statistical analysis of complex data. In addition to forming a coherent description of the data-generating process, probabilistic models enable parameter inference about given datasets. This procedure is well developed in the Bayesian perspective, in which one infers probability distributions describing to what extent various possible parameters agree with the data. In this paper, we motivate and review probabilistic modeling for adaptive immune receptor repertoire data then describe progress and prospects for future work, from germline haplotyping to adaptive immune system deployment across tissues. The relevant quantities in immune sequence analysis include not only continuous parameters such as gene use frequency but also discrete objects such as B-cell clusters and lineages. Throughout this review, we unravel the many opportunities for probabilistic modeling in adaptive immune receptor analysis, including settings for which the Bayesian approach holds substantial promise (especially if one is optimistic about new computational methods). From our perspective, the greatest prospects for progress in probabilistic modeling for repertoires concern ancestral sequence estimation for B-cell receptor lineages, including uncertainty from germline genotype, rearrangement, and lineage development.
Affiliation(s)
- Branden Olson
- Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., Mail stop: M1-B514, Seattle, WA 98109-1024; phone: +1 206 667 7318
| | - Frederick A. Matsen
- Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., Mail stop: M1-B514, Seattle, WA 98109-1024; phone: +1 206 667 7318
|
15
|
Characterization of the naive murine antibody repertoire using unamplified high-throughput sequencing. PLoS One 2018; 13:e0190982. [PMID: 29320559 PMCID: PMC5761896 DOI: 10.1371/journal.pone.0190982] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 07/31/2017] [Accepted: 12/22/2017] [Indexed: 12/22/2022] Open
Abstract
Antibody specificity and diversity are generated through the enzymatic splicing of genomic gene segments within each B cell. Antibodies are heterodimers of heavy- and light-chains encoded on separate loci. We studied the antibody repertoire from pooled, splenic tissue of unimmunized, adult female C57BL/6J mice, using high-throughput sequencing (HTS) without amplification of antibody transcripts. We recovered over 90,000 heavy-chain and over 135,000 light-chain immunoglobulin sequences. Individual V-, D-, and J-gene segment usage was uniform among the three mouse pools, particularly in highly abundant gene segments, with low frequency V-gene segments not being detected in all pools. Despite the similar usage of individual gene segments, the repertoire of individual B-cell CDR3 amino acid sequences in each mouse pool was highly varied, affirming the combinatorial diversity in the B-cell pool that has been previously demonstrated. There also was some skewing in the V-gene segments that were detected depending on chromosomal location. This study presents a unique, non-primer biased glimpse of the conventionally housed, unimmunized antibody repertoire of the C57BL/6J mouse.
|
16
|
Nelson RS, Valadon P. A universal phage display system for the seamless construction of Fab libraries. J Immunol Methods 2017; 450:41-49. [DOI: 10.1016/j.jim.2017.07.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/12/2017] [Revised: 06/22/2017] [Accepted: 07/25/2017] [Indexed: 01/09/2023]
|
17
|
Castro R, Navelsaker S, Krasnov A, Du Pasquier L, Boudinot P. Describing the diversity of Ag specific receptors in vertebrates: Contribution of repertoire deep sequencing. DEVELOPMENTAL AND COMPARATIVE IMMUNOLOGY 2017; 75:28-37. [PMID: 28259700 DOI: 10.1016/j.dci.2017.02.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/02/2016] [Revised: 02/16/2017] [Accepted: 02/22/2017] [Indexed: 06/06/2023]
Abstract
During the last decades, gene and cDNA cloning identified TCR and Ig genes across vertebrates; genome sequencing of TCR and Ig loci in many species revealed the different organizations selected during evolution under the pressure of generating diverse repertoires of Ag receptors. By detecting clonotypes over a wide range of frequency, deep sequencing of Ig and TCR transcripts provides a new way to compare the structure of expressed repertoires in species of various sizes, at different stages of development, with different physiologies, and displaying multiple adaptations to the environment. In this review, we provide a short overview of the technologies currently used to produce global description of immune repertoires, describe how they have already been used in comparative immunology, and we discuss the future potential of such approaches. The development of these methodologies in new species holds promise for new discoveries concerning particular adaptations. As an example, understanding the development of adaptive immunity across metamorphosis in frogs has been made possible by such approaches. Repertoire sequencing is now widely used, not only in basic research but also in the context of immunotherapy and vaccination. Analysis of fish responses to pathogens and vaccines has already benefited from these methods. Finally, we also discuss potential advances based on repertoire sequencing of multigene families of immune sensors and effectors in invertebrates.
Affiliation(s)
- Rosario Castro
- Department of Animal Health, Faculty of Veterinary Sciences, Complutense University, Madrid, Spain
| | - Sofie Navelsaker
- Norwegian University of Life Sciences, Faculty of Veterinary Medicine, Department of Basic Sciences and Aquatic Medicine, Adamstuen Campus, Oslo 0454, Norway; Virologie et Immunologie Moléculaires, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France
| | - Pierre Boudinot
- Virologie et Immunologie Moléculaires, INRA, Université Paris-Saclay, 78350 Jouy-en-Josas, France.
|
18
|
Watson CT, Glanville J, Marasco WA. The Individual and Population Genetics of Antibody Immunity. Trends Immunol 2017; 38:459-470. [PMID: 28539189 PMCID: PMC5656258 DOI: 10.1016/j.it.2017.04.003] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Received: 03/03/2017] [Revised: 04/06/2017] [Accepted: 04/10/2017] [Indexed: 12/12/2022]
Abstract
Antibodies (Abs) produced by immunoglobulin (IG) genes are the most diverse proteins expressed in humans. While part of this diversity is generated by recombination during B-cell development and mutations during affinity maturation, the germ-line IG loci are also diverse across human populations and ethnicities. Recently, proof-of-concept studies have demonstrated genotype–phenotype correlations between specific IG germ-line variants and the quality of Ab responses during vaccination and disease. However, the functional consequences of IG genetic variation in Ab function and immunological outcomes remain underexplored. In this opinion article, we outline interconnections between IG genomic diversity and Ab-expressed repertoires and structure. We further propose a strategy for integrating IG genotyping with functional Ab profiling data as a means to better predict and optimize humoral responses in genetically diverse human populations, with immediate implications for personalized medicine. Genetic variation in human populations affects how individuals are able to mount functional antibody responses. Different alleles can encode convergent binding motifs that result in successful Ab responses against specific infections and vaccinations. Given the complexity of the IG loci and the diversity of the antibody repertoire, links between IG polymorphism and antibody repertoire variability have not been thoroughly explored. We present a strategy to mine genotype–repertoire–disease associations.
Affiliation(s)
- Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA.
| | - Jacob Glanville
- Institute for Immunity, Transplantation and Infection, and Computational and Systems Immunology, Stanford University School of Medicine, Stanford, CA, USA.
| | - Wayne A Marasco
- Department of Cancer Immunology and Virology, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA.
|
19
|
Parks T, Mirabel MM, Kado J, Auckland K, Nowak J, Rautanen A, Mentzer AJ, Marijon E, Jouven X, Perman ML, Cua T, Kauwe JK, Allen JB, Taylor H, Robson KJ, Deane CM, Steer AC, Hill AVS. Association between a common immunoglobulin heavy chain allele and rheumatic heart disease risk in Oceania. Nat Commun 2017; 8:14946. [PMID: 28492228 PMCID: PMC5437274 DOI: 10.1038/ncomms14946] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 09/15/2016] [Accepted: 02/15/2017] [Indexed: 12/19/2022] Open
Abstract
The indigenous populations of the South Pacific experience a high burden of rheumatic heart disease (RHD). Here we report a genome-wide association study (GWAS) of RHD susceptibility in 2,852 individuals recruited in eight Oceanian countries. Stratifying by ancestry, we analysed genotyped and imputed variants in Melanesians (607 cases and 1,229 controls) before follow-up of suggestive loci in three further ancestral groups: Polynesians, South Asians and Mixed or other populations (totalling 399 cases and 617 controls). We identify a novel susceptibility signal in the immunoglobulin heavy chain (IGH) locus centring on a haplotype of nonsynonymous variants in the IGHV4-61 gene segment corresponding to the IGHV4-61*02 allele. We show each copy of IGHV4-61*02 is associated with a 1.4-fold increase in the risk of RHD (odds ratio 1.43, 95% confidence interval 1.27–1.61, P = 4.1 × 10⁻⁹). These findings provide new insight into the role of germline variation in the IGH locus in disease susceptibility. Rheumatic heart disease (RHD) is a chronic auto-inflammatory reaction to group A streptococcal infection, and frequently occurs in individuals from the South Pacific. This study finds a novel association between an immunoglobulin heavy chain allele and risk of RHD in Pacific Islanders and South Asians.
Affiliation(s)
- Tom Parks
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
| | - Mariana M Mirabel
- Paris Centre de Recherche Cardiovasculaire, Institut National de la Santé et de la Recherche Médicale, Hôpital Européen Georges Pompidou, 56, rue Leblanc, 75908 Paris, France
| | - Joseph Kado
- Department of Paediatrics, Ministry of Health and Medical Services, Colonial War Memorial Hospital, Brown Street, Suva, Fiji; College of Medicine, Nursing & Health Sciences, Fiji National University, Brown Street, Suva, Fiji
| | - Kathryn Auckland
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
| | - Jaroslaw Nowak
- Department of Statistics, University of Oxford, Peter Medawar Building for Pathogen Research, Oxford OX1 3S, UK
| | - Anna Rautanen
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
| | - Alexander J Mentzer
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
| | - Eloi Marijon
- Paris Centre de Recherche Cardiovasculaire, Institut National de la Santé et de la Recherche Médicale, Hôpital Européen Georges Pompidou, 56, rue Leblanc, 75908 Paris, France; Faculté de Médecine Paris Descartes, Université Paris Descartes, 15, rue de l'école de médecine, 75006 Paris, France
| | - Xavier Jouven
- Paris Centre de Recherche Cardiovasculaire, Institut National de la Santé et de la Recherche Médicale, Hôpital Européen Georges Pompidou, 56, rue Leblanc, 75908 Paris, France; Faculté de Médecine Paris Descartes, Université Paris Descartes, 15, rue de l'école de médecine, 75006 Paris, France
| | - Mai Ling Perman
- College of Medicine, Nursing & Health Sciences, Fiji National University, Brown Street, Suva, Fiji
| | - Tuliana Cua
- Rheumatic Heart Disease Control Programme, Ministry of Health and Medical Services, Colonial War Memorial Hospital, Brown Street, Suva, Fiji
| | - John K Kauwe
- College of Life Sciences, Brigham Young University, 4146 Life Sciences Building, Provo, Utah 84602, USA
| | - John B Allen
- College of Life Sciences, Brigham Young University, 4146 Life Sciences Building, Provo, Utah 84602, USA
| | - Henry Taylor
- Rheumatic Heart Disease Control Programme, Samoa Ministry of Health, Moto'otua, Ifiifi Street, Apia, Samoa
| | - Kathryn J Robson
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK
| | - Charlotte M Deane
- Department of Statistics, University of Oxford, Peter Medawar Building for Pathogen Research, Oxford OX1 3S, UK
| | - Andrew C Steer
- Centre for International Child Health, University of Melbourne, 50 Flemington Road, Parkville, Melbourne, Victoria 3052, Australia; Murdoch Children's Research Institute, 50 Flemington Road, Parkville, Melbourne, Victoria 3052, Australia
| | - Adrian V S Hill
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
|
20
|
Watson CT, Matsen FA, Jackson KJL, Bashir A, Smith ML, Glanville J, Breden F, Kleinstein SH, Collins AM, Busse CE. Comment on “A Database of Human Immune Receptor Alleles Recovered from Population Sequencing Data”. THE JOURNAL OF IMMUNOLOGY 2017; 198:3371-3373. [DOI: 10.4049/jimmunol.1700306] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 01/02/2023]
|
21
|
Yu Y, Ceredig R, Seoighe C. A Database of Human Immune Receptor Alleles Recovered from Population Sequencing Data. THE JOURNAL OF IMMUNOLOGY 2017; 198:2202-2210. [DOI: 10.4049/jimmunol.1601710] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 10/05/2016] [Accepted: 01/03/2017] [Indexed: 01/05/2023]
|
22
|
Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity. Nat Commun 2016; 7:13642. [PMID: 27995928 PMCID: PMC5187446 DOI: 10.1038/ncomms13642] [Citation(s) in RCA: 138] [Impact Index Per Article: 17.3] [Received: 04/22/2016] [Accepted: 10/21/2016] [Indexed: 12/19/2022] Open
Abstract
Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross library comparisons and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between BALB/c and C57BL/6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool. Current databases of V genes for antibody repertoire have limitations. Here Corcoran et al. develop a computational approach named IgDiscover that can identify germline V gene sequences from expressed antibody repertoires to high specificity and completeness.
|
23
|
Assing K, Nielsen C, Jakobsen M, Scholze A, Nybo M, Soerensen G, Mortensen S, Vejen K, Barington T, Bistrup C. Evidence of perturbed germinal center dynamics, but preserved antibody diversity, in end-stage renal disease. IMMUNITY INFLAMMATION AND DISEASE 2016; 4:225-234. [PMID: 27957330 PMCID: PMC4879468 DOI: 10.1002/iid3.108] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/14/2016] [Revised: 04/11/2016] [Accepted: 04/15/2016] [Indexed: 11/08/2022]
Abstract
INTRODUCTION End-stage renal disease (ESRD) is associated with increased infectious susceptibility and with reduced vaccine responses consistent with compromised humoral immunity. Whether the compromised humoral immunity is due to reduced antibody diversity (reduced somatic hypermutation [SHM]) or altered germinal center (GC) dynamics is not known. The GC-derived chemokine CXCL13 as well as peripheral T follicular helper cells (pTFH) reflect GC dynamics, but have, similar to SHM, never been characterized in relation to ESRD. METHODS Serum CXCL13 was determined by ELISA. PTFH were flow-cytometrically defined as CD4+ CD45RA- CCR7+ CXCR5+ lymphocytes. Apoptotic lymphocyte subsets were in addition annexin V+. SHM was determined, by next-generation sequencing and bioinformatics, as nucleotide mutations within the IgG VH (comprising the important antigen-binding domains of IgG, CDR1, and CDR2). RESULTS Elevated CXCL13 levels characterized ESRD (n = 19; [median] 90 pg/ml, P < 0.01) (controls, n = 18; 62 pg/ml). ESRD pTFH frequencies (n = 19; 11.6% [of CD4+ memory T cells], P < 0.02*, *Bonferroni corrected) (controls, n = 22; 14.9%) and concentrations (n = 19; 0.03 × 10⁹/L, P < 0.02*) (controls, n = 22; 0.07 × 10⁹/L) were reduced. ESRD pTFH were more apoptotic (n = 9; 25.7%, P = 0.04*) (controls, n = 10; 15.9%). SHM did not discriminate between ESRD (n = 10; 7.4%, P = 0.21) and controls (n = 10; 8.4%). CONCLUSIONS Elevated CXCL13 levels, reduced pTFH levels, and increased pTFH apoptosis suggest that perturbed GC dynamics, and not reduced antibody diversity, underlie the diminished vaccine responses and the compromised humoral immunity in ESRD. However, largely preserved SHM provides a rationale for pursuing vaccination in relation to ESRD.
Affiliation(s)
- Kristian Assing
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
| | - Christian Nielsen
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
| | - Marianne Jakobsen
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
| | - Alexandra Scholze
- Clinical Research Unit, Department of Nephrology, Odense University Hospital, Odense, Denmark; Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
| | - Mads Nybo
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark
| | - Grete Soerensen
- Department of Nephrology, Odense University Hospital, Odense, Denmark
| | - Sussie Mortensen
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
| | - Knud Vejen
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
| | - Torben Barington
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
| | - Claus Bistrup
- Department of Nephrology, Odense University Hospital, Odense, Denmark
|
24
|
Luo S, Yu JA, Song YS. Estimating Copy Number and Allelic Variation at the Immunoglobulin Heavy Chain Locus Using Short Reads. PLoS Comput Biol 2016; 12:e1005117. [PMID: 27632220 PMCID: PMC5025152 DOI: 10.1371/journal.pcbi.1005117] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 01/26/2016] [Accepted: 08/23/2016] [Indexed: 11/28/2022] Open
Abstract
The study of genomic regions that contain gene copies and structural variation is a major challenge in modern genomics. Unlike variation involving single nucleotide changes, data on the variation of copy number is difficult to collect and few tools exist for analyzing the variation between individuals. The immunoglobulin heavy variable (IGHV) locus, which plays an integral role in the adaptive immune response, is an example of a complex genomic region that varies in gene copy number. Lack of standard methods to genotype this region prevents it from being included in association studies and is holding back the growing field of antibody repertoire analysis. Here we develop a method that takes short reads from high-throughput sequencing and outputs a genetic profile of the IGHV locus with the read coverage depth and a putative nucleotide sequence for each operationally defined gene cluster. Our operationally defined gene clusters aim to address a major challenge in studying the IGHV locus: the high sequence similarity between gene segments in different genomic locations. Tests on simulated data demonstrate that our approach can accurately determine the presence or absence of a gene cluster from reads as short as 70 bp. More detailed resolution on the copy number of gene clusters can be obtained from read coverage depth using longer reads (e.g., ≥ 100 bp). Detail at the nucleotide resolution of single copy genes (genes present in one copy per haplotype) can be determined with 250 bp reads. For IGHV genes with more than one copy, accurate nucleotide-resolution reconstruction is currently beyond the means of our approach. When applied to a family of European ancestry, our pipeline outputs genotypes that are consistent with the family pedigree, confirms existing multigene variants and suggests new copy number variants. 
This study paves the way for analyzing population-level patterns of variation in IGHV gene clusters in larger diverse datasets and for quantitatively handling regions of copy number variation in other structurally varying and complex loci.
Affiliation(s)
- Shishi Luo
- Computer Science Division, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
| | - Jane A. Yu
- Computer Science Division, University of California, Berkeley, Berkeley, California, United States of America
| | - Yun S. Song
- Computer Science Division, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
- Departments of Mathematics and Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
25
|
Collins AM, Wang Y, Roskin KM, Marquis CP, Jackson KJL. The mouse antibody heavy chain repertoire is germline-focused and highly variable between inbred strains. Philos Trans R Soc Lond B Biol Sci 2016; 370:rstb.2014.0236. [PMID: 26194750 DOI: 10.1098/rstb.2014.0236] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Indexed: 01/25/2023] Open
Abstract
The human and mouse antibody repertoires are formed by identical processes, but like all small animals, mice only have sufficient lymphocytes to express a small part of the potential antibody repertoire. In this study, we determined how the heavy chain repertoires of two mouse strains are generated. Analysis of IgM- and IgG-associated VDJ rearrangements generated by high-throughput sequencing confirmed the presence of 99 functional immunoglobulin heavy chain variable (IGHV) genes in the C57BL/6 genome, and inferred the presence of 164 IGHV genes in the BALB/c genome. Remarkably, only five IGHV sequences were common to both strains. Compared with humans, little N nucleotide addition was seen in the junctions of mouse VDJ genes. Germline human IgG-associated IGHV genes are rare, but many murine IgG-associated IGHV genes were unmutated. Together these results suggest that the expressed mouse repertoire is more germline-focused than the human repertoire. The apparently divergent germline repertoires of the mouse strains are discussed with reference to reports that inbred mouse strains carry blocks of genes derived from each of the three subspecies of the house mouse. We hypothesize that the germline genes of BALB/c and C57BL/6 mice may originally have evolved to generate distinct germline-focused antibody repertoires in the different mouse subspecies.
Affiliation(s)
- Andrew M Collins
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia
| | - Yan Wang
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia
| | - Krishna M Roskin
- Department of Pathology, School of Medicine, Stanford University, Stanford, CA 94305-5324, USA
| | - Christopher P Marquis
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia
| | - Katherine J L Jackson
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, 2052 NSW, Australia; Department of Pathology, School of Medicine, Stanford University, Stanford, CA 94305-5324, USA
| |
|
26
|
Abstract
New high-throughput DNA sequencing (HTS) technologies developed in the past decade have begun to be applied to the study of the complex gene rearrangements that encode human antibodies. This article first reviews the genetic features of Ig loci and the HTS technologies that have been applied to human repertoire studies, then discusses key choices for experimental design and data analysis in these experiments and the insights gained in immunological and infectious disease studies with the use of these approaches.
|
27
|
Efficient construct of a large and functional scFv yeast display library derived from the ascites B cells of ovarian cancer patients by three-fragment transformation-associated recombination. Appl Microbiol Biotechnol 2016; 100:4051-61. [PMID: 26782745 DOI: 10.1007/s00253-016-7303-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2015] [Revised: 01/04/2016] [Accepted: 01/06/2016] [Indexed: 10/22/2022]
Abstract
Over the past decade, yeast display technology has emerged as a powerful tool for the isolation of high-affinity immunoglobulin fragments with potential utility as clinical diagnostic and therapeutic reagents. Despite significant refinement of the various methodologies underpinning library construction and selections, certain aspects remain challenging and process limiting. We have sought to significantly improve the robustness of the single-chain Fv (scFv) library construction step by overcoming the technical inefficiencies frequently encountered during the PCR-mediated assembly of scFvs from the discrete heavy and light V-domain repertoires. Using a novel primer set designed to provide maximum amplification coverage of the known germ-line V-domain repertoire, we have exploited the potential of the in vivo homologous gap-repair apparatus of Saccharomyces cerevisiae to assemble intact scFvs directly from co-transformed PBMC-derived VH, VL, and linearized vector component fragments. We have successfully applied this three-fragment assembly strategy to construct a large (>10^9) scFv yeast display library from the ascites immune repertoire of ovarian cancer patients and validated the approach by applying FACS-based sorting to readily isolate scFvs that recognize various tumor marker antigens (TMAs). It is expected that this simplified construction method may find general utility, both for de novo scFv library construction and for subsequent combinatorial affinity maturation manipulations that require more than two fragments.
|
28
|
Yaari G, Kleinstein SH. Practical guidelines for B-cell receptor repertoire sequencing analysis. Genome Med 2015; 7:121. [PMID: 26589402 PMCID: PMC4654805 DOI: 10.1186/s13073-015-0243-2] [Citation(s) in RCA: 152] [Impact Index Per Article: 16.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
High-throughput sequencing of B-cell immunoglobulin repertoires is increasingly being applied to gain insights into the adaptive immune response in healthy individuals and in those with a wide range of diseases. Recent applications include the study of autoimmunity, infection, allergy, cancer and aging. As sequencing technologies continue to improve, these repertoire sequencing experiments are producing ever larger datasets, with tens- to hundreds-of-millions of sequences. These data require specialized bioinformatics pipelines to be analyzed effectively. Numerous methods and tools have been developed to handle different steps of the analysis, and integrated software suites have recently been made available. However, the field has yet to converge on a standard pipeline for data processing and analysis. Common file formats for data sharing are also lacking. Here we provide a set of practical guidelines for B-cell receptor repertoire sequencing analysis, starting from raw sequencing reads and proceeding through pre-processing, determination of population structure, and analysis of repertoire properties. These include methods for unique molecular identifiers and sequencing error correction, V(D)J assignment and detection of novel alleles, clonal assignment, lineage tree construction, somatic hypermutation modeling, selection analysis, and analysis of stereotyped or convergent responses. The guidelines presented here highlight the major steps involved in the analysis of B-cell repertoire sequencing data, along with recommendations on how to avoid common pitfalls.
Affiliation(s)
- Gur Yaari
- Bioengineering Program, Faculty of Engineering, Bar-Ilan University, 5290002, Ramat Gan, Israel.
| | - Steven H Kleinstein
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA; Departments of Pathology and Immunobiology, Yale University School of Medicine, New Haven, CT, 06520, USA.
| |
|
29
|
Snir O, Mesin L, Gidoni M, Lundin KEA, Yaari G, Sollid LM. Analysis of celiac disease autoreactive gut plasma cells and their corresponding memory compartment in peripheral blood using high-throughput sequencing. THE JOURNAL OF IMMUNOLOGY 2015; 194:5703-12. [PMID: 25972486 DOI: 10.4049/jimmunol.1402611] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/14/2014] [Accepted: 04/17/2015] [Indexed: 11/19/2022]
Abstract
Autoreactive IgA plasma cells (PCs) specific for the enzyme transglutaminase 2 (TG2) are abundant in the small intestine of patients with active celiac disease (CD), and their number drops in patients treated by dietary gluten elimination. Little is known about their characteristics and their role in the disease. In this study, using high-throughput sequencing of the IgH V region (IGHV) genes, we have studied features of TG2-specific PCs and their related B cell clones in peripheral blood. We found that TG2-specific PCs from both untreated and treated patients have acquired a lower number of somatic hypermutations and used a focused IGHV repertoire with overrepresentation of the IGHV3-48, IGHV4-59, IGHV5-10-1, and IGHV5-51 gene segments. Furthermore, these PCs were clonally expanded and showed signs of affinity maturation. Lineage trees demonstrated shared clones between gut PCs and blood memory B cells, primarily IgAs. Some trees also involved IgG cells, suggesting that anti-TG2 IgA and IgG responses are related. Similarly to TG2-specific PCs, clonally related memory IgA B cells of blood showed lower mutation rates with biased usage of IGHV3-48 and IGHV5-51. Such memory cells were rare in peripheral blood, yet detectable in most patients assessed by production of anti-TG2 Abs in vitro following stimulation of cells from patients who had been on a long-term gluten-free diet. Thus, the Ab response to TG2 in CD, while maintaining its IGHV gene usage, is dynamically regulated in response to gluten exposure with a low degree of maintenance at both PC and memory B cell levels in patients in remission.
Affiliation(s)
- Omri Snir
- Centre for Immune Regulation and Department of Immunology, University of Oslo and Oslo University Hospital, 0372 Oslo, Norway
| | - Luka Mesin
- Centre for Immune Regulation and Department of Immunology, University of Oslo and Oslo University Hospital, 0372 Oslo, Norway
| | - Moriah Gidoni
- Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan 52900, Israel
| | - Knut E A Lundin
- Centre for Immune Regulation and Department of Immunology, University of Oslo and Oslo University Hospital, 0372 Oslo, Norway; Department of Gastroenterology, Oslo University Hospital-Rikshospitalet, 0372 Oslo, Norway
| | - Gur Yaari
- Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan 52900, Israel
| | - Ludvig M Sollid
- Centre for Immune Regulation and Department of Immunology, University of Oslo and Oslo University Hospital, 0372 Oslo, Norway
| |
|
30
|
Scheepers C, Shrestha RK, Lambson BE, Jackson KJL, Wright IA, Naicker D, Goosen M, Berrie L, Ismail A, Garrett N, Abdool Karim Q, Abdool Karim SS, Moore PL, Travers SA, Morris L. Ability to develop broadly neutralizing HIV-1 antibodies is not restricted by the germline Ig gene repertoire. THE JOURNAL OF IMMUNOLOGY 2015; 194:4371-8. [PMID: 25825450 DOI: 10.4049/jimmunol.1500118] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/16/2015] [Accepted: 02/24/2015] [Indexed: 11/19/2022]
Abstract
The human Ig repertoire is vast, producing billions of unique Abs from a limited number of germline Ig genes. The IgH V region (IGHV) is central to Ag binding and consists of 48 functional genes. In this study, we analyzed whether HIV-1-infected individuals who develop broadly neutralizing Abs show a distinctive germline IGHV profile. Using both 454 and Illumina technologies, we sequenced the IGHV repertoire of 28 HIV-infected South African women from the Centre for the AIDS Programme of Research in South Africa (CAPRISA) 002 and 004 cohorts, 13 of whom developed broadly neutralizing Abs. Of the 259 IGHV alleles identified in this study, approximately half were not found in the International Immunogenetics Database (IMGT). This included 85 entirely novel alleles and 38 alleles that matched rearranged sequences in non-IMGT databases. Analysis of the rearranged H chain V region genes of mAbs isolated from seven of these women, as well as previously isolated broadly neutralizing Abs from other donors, provided evidence that at least eight novel or non-IMGT alleles contributed to functional Abs. Importantly, we found that, despite a wide range in the number of IGHV alleles in each individual, including alleles used by known broadly neutralizing Abs, there were no significant differences in germline IGHV repertoires between individuals who do and do not develop broadly neutralizing Abs. This study reports novel IGHV repertoires and highlights the importance of a fully comprehensive Ig database for germline gene usage prediction. Furthermore, these data suggest a lack of genetic bias in broadly neutralizing Ab development in HIV-1 infection, with positive implications for HIV vaccine design.
Affiliation(s)
- Cathrine Scheepers
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa; Division of Virology and Communicable Disease Surveillance, School of Pathology, University of the Witwatersrand, Johannesburg 2050, South Africa
| | - Ram K Shrestha
- South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville 7535, South Africa
| | - Bronwen E Lambson
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa; Division of Virology and Communicable Disease Surveillance, School of Pathology, University of the Witwatersrand, Johannesburg 2050, South Africa
| | - Katherine J L Jackson
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales 2052, Australia; Department of Pathology, School of Medicine, Stanford University, Stanford, CA 94305
| | - Imogen A Wright
- South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville 7535, South Africa
| | - Dshanta Naicker
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa
| | - Mark Goosen
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa
| | - Leigh Berrie
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa
| | - Arshad Ismail
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa
| | - Nigel Garrett
- Centre for the AIDS Programme of Research in South Africa, KwaZulu-Natal 4013, South Africa; Department of Infectious Diseases, Nelson R. Mandela School of Medicine, University of KwaZulu-Natal, 4041 Durban, South Africa
| | - Quarraisha Abdool Karim
- Centre for the AIDS Programme of Research in South Africa, KwaZulu-Natal 4013, South Africa; Department of Epidemiology, Columbia University, New York, NY 10032
| | - Salim S Abdool Karim
- Centre for the AIDS Programme of Research in South Africa, KwaZulu-Natal 4013, South Africa; Department of Epidemiology, Columbia University, New York, NY 10032
| | - Penny L Moore
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa; Division of Virology and Communicable Disease Surveillance, School of Pathology, University of the Witwatersrand, Johannesburg 2050, South Africa; Centre for the AIDS Programme of Research in South Africa, KwaZulu-Natal 4013, South Africa
| | - Simon A Travers
- South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville 7535, South Africa
| | - Lynn Morris
- Centre for HIV and Sexually Transmitted Infections, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa; Division of Virology and Communicable Disease Surveillance, School of Pathology, University of the Witwatersrand, Johannesburg 2050, South Africa; Centre for the AIDS Programme of Research in South Africa, KwaZulu-Natal 4013, South Africa
| |
|
31
|
Steele EJ, Lloyd SS. Soma-to-germline feedback is implied by the extreme polymorphism at IGHV relative to MHC: The manifest polymorphism of the MHC appears greatly exceeded at Immunoglobulin loci, suggesting antigen-selected somatic V mutants penetrate Weismann's Barrier. Bioessays 2015; 37:557-69. [PMID: 25810320 DOI: 10.1002/bies.201400213] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2014] [Revised: 02/15/2015] [Accepted: 02/24/2015] [Indexed: 01/22/2023]
Abstract
Soma-to-germline feedback is forbidden under the neo-Darwinian paradigm. Nevertheless, there is a growing realization it occurs frequently in immunoglobulin (Ig) variable (V) region genes. This is a surprising development. It arises from a most unlikely source in light of the exposure of co-author EJS to the haplotype data of RL Dawkins and others on the polymorphism of the Major Histocompatibility Complex, which is generally assumed to be the most polymorphic region in the genome (spanning ∼4 Mb). The comparison between the magnitude of MHC polymorphism with estimates for the human heavy chain immunoglobulin V locus (spanning ∼1 Mb), suggests IGHV could be many orders of magnitude more polymorphic than the MHC. This conclusion needs airing in the literature as it implies generational churn and soma-to-germline gene feedback. Pedigree-based experimental strategies to resolve the IGHV issue are outlined.
Affiliation(s)
- Edward J Steele
- C.Y. O'Connor ERADE Village Foundation, Piara Waters, WA, Australia
| | | |
|
32
|
Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles. Proc Natl Acad Sci U S A 2015; 112:E862-70. [PMID: 25675496 DOI: 10.1073/pnas.1417683112] [Citation(s) in RCA: 144] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Individual variation in germline and expressed B-cell immunoglobulin (Ig) repertoires has been associated with aging, disease susceptibility, and differential response to infection and vaccination. Repertoire properties can now be studied at large-scale through next-generation sequencing of rearranged Ig genes. Accurate analysis of these repertoire-sequencing (Rep-Seq) data requires identifying the germline variable (V), diversity (D), and joining (J) gene segments used by each Ig sequence. Current V(D)J assignment methods work by aligning sequences to a database of known germline V(D)J segment alleles. However, existing databases are likely to be incomplete and novel polymorphisms are hard to differentiate from the frequent occurrence of somatic hypermutations in Ig sequences. Here we develop a Tool for Ig Genotype Elucidation via Rep-Seq (TIgGER). TIgGER analyzes mutation patterns in Rep-Seq data to identify novel V segment alleles, and also constructs a personalized germline database containing the specific set of alleles carried by a subject. This information is then used to improve the initial V segment assignments from existing tools, like IMGT/HighV-QUEST. The application of TIgGER to Rep-Seq data from seven subjects identified 11 novel V segment alleles, including at least one in every subject examined. These novel alleles constituted 13% of the total number of unique alleles in these subjects, and impacted 3% of V(D)J segment assignments. These results reinforce the highly polymorphic nature of human Ig V genes, and suggest that many novel alleles remain to be discovered. The integration of TIgGER into Rep-Seq processing pipelines will increase the accuracy of V segment assignments, thus improving B-cell repertoire analyses.
|
33
|
Hansen TØ, Lange AB, Barington T. Sterile DJH rearrangements reveal that distance between gene segments on the human Ig H chain locus influences their ability to rearrange. THE JOURNAL OF IMMUNOLOGY 2015; 194:973-82. [PMID: 25556246 DOI: 10.4049/jimmunol.1401443] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
Rearrangement of the Ig locus occurs in two steps. First, a JH gene is rearranged to a D gene followed by a VH gene rearranging to the DJH rearrangement. By next generation sequencing, we analyzed 9969 unique DJH rearrangements and 5919 unique VHDJH rearrangements obtained from peripheral blood B cells from 110 healthy adult donors. We found that DJH rearrangements and nonproductive VHDJH rearrangements share many features but differ significantly in their use of D genes and propensity for somatic hypermutation. In D to JH gene rearrangements, the D genes proximal to the JH locus are used more frequently than JH locus distal D genes, whereas VH locus proximal D genes were observed more frequently in nonproductive VHDJH rearrangements. We further demonstrate that the distance between VH, D, and JH gene segments influences their ability to rearrange within the human Ig locus.
Affiliation(s)
- Tina Østergaard Hansen
- Department of Clinical Biochemistry, Roskilde University Hospital, DK-5000 Odense, Denmark
| | - Anders Blaabjerg Lange
- Maersk Mc-Kinney Moller Institute, Faculty of Sciences, University of Southern Denmark, DK-5000 Odense, Denmark
| | - Torben Barington
- Department of Clinical Immunology, Odense University Hospital, DK-5000 Odense, Denmark
| |
|
34
|
Watson CT, Steinberg KM, Graves TA, Warren RL, Malig M, Schein J, Wilson RK, Holt RA, Eichler EE, Breden F. Sequencing of the human IG light chain loci from a hydatidiform mole BAC library reveals locus-specific signatures of genetic diversity. Genes Immun 2015; 16:24-34. [PMID: 25338678 PMCID: PMC4304971 DOI: 10.1038/gene.2014.56] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2014] [Revised: 09/03/2014] [Accepted: 09/03/2014] [Indexed: 12/24/2022]
Abstract
Germline variation at immunoglobulin (IG) loci is critical for pathogen-mediated immunity, but establishing complete haplotype sequences in these regions has been problematic because of complex sequence architecture and diploid source DNA. We sequenced BAC clones from the effectively haploid human hydatidiform mole cell line, CHM1htert, across the light chain IG loci, kappa (IGK) and lambda (IGL), creating single haplotype representations of these regions. The IGL haplotype generated here is 1.25 Mb of contiguous sequence, including four novel IGLV alleles, one novel IGLC allele, and an 11.9-kb insertion. The CH17 IGK haplotype consists of two 644 kb proximal and 466 kb distal contigs separated by a large gap of unknown size; these assemblies added 49 kb of unique sequence extending into this gap. Our analysis also resulted in the characterization of seven novel IGKV alleles and a 16.7-kb region exhibiting signatures of interlocus sequence exchange between distal and proximal IGKV gene clusters. Genetic diversity in IGK/IGL was compared with that of the IG heavy chain (IGH) locus within the same haploid genome, revealing threefold (IGK) and sixfold (IGL) higher diversity in the IGH locus, potentially associated with increased levels of segmental duplication and the telomeric location of IGH.
Affiliation(s)
- C T Watson
- Department of Biological Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY USA
| | - K M Steinberg
- Department of Genome Sciences, University of Washington, Seattle, WA USA
- The Genome Institute, Washington University, St Louis, MO USA
| | - T A Graves
- The Genome Institute, Washington University, St Louis, MO USA
| | - R L Warren
- Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia Canada
| | - M Malig
- Department of Genome Sciences, University of Washington, Seattle, WA USA
| | - J Schein
- Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia Canada
| | - R K Wilson
- The Genome Institute, Washington University, St Louis, MO USA
| | - R A Holt
- Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia Canada
| | - E E Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA USA
- Howard Hughes Medical Institute, Seattle, WA USA
| | - F Breden
- Department of Biological Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
| |
|
35
|
Wang Y, Jackson KJL, Davies J, Chen Z, Gaeta BA, Rimmer J, Sewell WA, Collins AM. IgE-associated IGHV genes from venom and peanut allergic individuals lack mutational evidence of antigen selection. PLoS One 2014; 9:e89730. [PMID: 24586993 PMCID: PMC3934916 DOI: 10.1371/journal.pone.0089730] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2013] [Accepted: 01/22/2014] [Indexed: 11/18/2022] Open
Abstract
Antigen selection of B cells within the germinal center reaction generally leads to the accumulation of replacement mutations in the complementarity-determining regions (CDRs) of immunoglobulin genes. Studies of mutations in IgE-associated VDJ gene sequences have cast doubt on the role of antigen selection in the evolution of the human IgE response, and it may be that selection for high affinity antibodies is a feature of some but not all allergic diseases. The severity of IgE-mediated anaphylaxis is such that it could result from higher affinity IgE antibodies. We therefore investigated IGHV mutations in IgE-associated sequences derived from ten individuals with a history of anaphylactic reactions to bee or wasp venom or peanut allergens. IgG sequences, which more certainly experience antigen selection, served as a control dataset. A total of 6025 unique IgE and 5396 unique IgG sequences were generated using high throughput 454 pyrosequencing. The proportion of replacement mutations seen in the CDRs of the IgG dataset was significantly higher than that of the IgE dataset, and the IgE sequences showed little evidence of antigen selection. To exclude the possibility that 454 errors had compromised analysis, rigorous filtering of the datasets led to datasets of 90 core IgE sequences and 411 IgG sequences. These sequences were present as both forward and reverse reads, and so were most unlikely to include sequencing errors. The filtered datasets confirmed that antigen selection plays a greater role in the evolution of IgG sequences than of IgE sequences derived from the study participants.
Affiliation(s)
- Yan Wang
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
| | - Katherine J. L. Jackson
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
| | - Janet Davies
- The Lung and Allergy Research Centre, School of Medicine, The University of Queensland, Woolloongabba, Australia
| | - Zhiliang Chen
- School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
| | - Bruno A. Gaeta
- School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
| | | | - William A. Sewell
- Institute of Laboratory Medicine, St Vincent's Hospital, Darlinghurst, Australia and St Vincent's Clinical School, University of New South Wales, Darlinghurst, Australia
| | - Andrew M. Collins
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
| |
|
36
|
The promise and challenge of high-throughput sequencing of the antibody repertoire. Nat Biotechnol 2014; 32:158-68. [PMID: 24441474 PMCID: PMC4113560 DOI: 10.1038/nbt.2782] [Citation(s) in RCA: 478] [Impact Index Per Article: 47.8] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2013] [Accepted: 12/04/2013] [Indexed: 12/16/2022]
Abstract
Georgiou and colleagues discuss rapidly evolving methods for high-throughput sequencing of the antibody repertoire, and how the resulting data may be applied to answer basic and translational research questions. Efforts to determine the antibody repertoire encoded by B cells in the blood or lymphoid organs using high-throughput DNA sequencing technologies have been advancing at an extremely rapid pace and are transforming our understanding of humoral immune responses. Information gained from high-throughput DNA sequencing of immunoglobulin genes (Ig-seq) can be applied to detect B-cell malignancies with high sensitivity, to discover antibodies specific for antigens of interest, to guide vaccine development and to understand autoimmunity. Rapid progress in the development of experimental protocols and informatics analysis tools is helping to reduce sequencing artifacts, to achieve more precise quantification of clonal diversity and to extract the most pertinent biological information. That said, broader application of Ig-seq, especially in clinical settings, will require the development of a standardized experimental design framework that will enable the sharing and meta-analysis of sequencing data generated by different laboratories.
|
37
|
Jackson KJL, Kidd MJ, Wang Y, Collins AM. The shape of the lymphocyte receptor repertoire: lessons from the B cell receptor. Front Immunol 2013; 4:263. [PMID: 24032032 PMCID: PMC3759170 DOI: 10.3389/fimmu.2013.00263] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2013] [Accepted: 08/19/2013] [Indexed: 11/13/2022] Open
Abstract
Both the B cell receptor (BCR) and the T cell receptor (TCR) repertoires are generated through essentially identical processes of V(D)J recombination, exonuclease trimming of germline genes, and the random addition of non-template encoded nucleotides. The naïve TCR repertoire is constrained by thymic selection, and TCR repertoire studies have therefore focused strongly on the diversity of MHC-binding complementarity determining region (CDR) CDR3. The process of somatic point mutations has given B cell studies a major focus on variable (IGHV, IGLV, and IGKV) genes. This in turn has influenced how both the naïve and memory BCR repertoires have been studied. Diversity (D) genes are also more easily identified in BCR VDJ rearrangements than in TCR VDJ rearrangements, and this has allowed the processes and elements that contribute to the incredible diversity of the immunoglobulin heavy chain CDR3 to be analyzed in detail. This diversity can be contrasted with that of the light chain where a small number of polypeptide sequences dominate the repertoire. Biases in the use of different germline genes, in gene processing, and in the addition of non-template encoded nucleotides appear to be intrinsic to the recombination process, imparting "shape" to the repertoire of rearranged genes as a result of differences spanning many orders of magnitude in the probabilities that different BCRs will be generated. This may function to increase the precursor frequency of naïve B cells with important specificities, and the likely emergence of such B cell lineages upon antigen exposure is discussed with reference to public and private T cell clonotypes.
Affiliation(s)
- Katherine J. L. Jackson
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Marie J. Kidd
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Yan Wang
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
- Andrew M. Collins
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
|
38
|
Rubelt F, Sievert V, Knaust F, Diener C, Lim TS, Skriner K, Klipp E, Reinhardt R, Lehrach H, Konthur Z. Onset of immune senescence defined by unbiased pyrosequencing of human immunoglobulin mRNA repertoires. PLoS One 2012; 7:e49774. [PMID: 23226220 PMCID: PMC3511497 DOI: 10.1371/journal.pone.0049774] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 08/09/2012] [Accepted: 10/12/2012] [Indexed: 12/15/2022] Open
Abstract
The immune system protects us from foreign substances or pathogens by generating specific antibodies. The variety of immunoglobulin (Ig) paratopes for antigen recognition is a result of the V(D)J rearrangement mechanism, while a fast and efficient immune response is mediated by specific immunoglobulin isotypes obtained through class switch recombination (CSR). To better understand how antibody-based immune protection works and how it changes with age, the interdependency between these two parameters needs to be addressed. Here, we have performed an in-depth analysis of antibody repertoires of 14 healthy donors representing different gender and age groups. For this task, we developed a unique pyrosequencing approach, which is able to monitor the expression levels of all immunoglobulin V(D)J recombinations of all isotypes including subtypes in an unbiased and quantitative manner. Our results show that donors have individual immunoglobulin repertoires and cannot be clustered according to V(D)J recombination patterns, either by age or by gender. However, after incorporating isotype-specific analysis and CSR information into hierarchical clustering, the situation changes. For the first time the donors cluster according to age and separate into young adults and elderly donors (>50). As a direct consequence, this clustering defines the onset of immune senescence at the age of fifty and beyond. The observed age-dependent reduction of CSR ability offers a feasible explanation for why reduced efficacy of vaccination is seen in the elderly and implies that novel vaccine strategies for the elderly should include the "Golden Agers".
Affiliation(s)
- Florian Rubelt
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Faculty of Biology, Chemistry, and Pharmacy, Freie Universität Berlin, Berlin, Germany
- Volker Sievert
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Florian Knaust
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Christian Diener
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Theoretische Biophysik, Humboldt-Universität zu Berlin, Berlin, Germany
- Theam Soon Lim
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute for Research in Molecular Medicine, Universiti Sains Malaysia, Penang, Malaysia
- Karl Skriner
- Department of Rheumatology and Clinical Immunology, Charité – Universitätsmedizin Berlin, Berlin, Germany
- Edda Klipp
- Theoretische Biophysik, Humboldt-Universität zu Berlin, Berlin, Germany
- Richard Reinhardt
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Max Planck Genome Centre Cologne, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Hans Lehrach
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Zoltán Konthur
- Max Planck Institute for Molecular Genetics, Berlin, Germany
|
39
|
Statistical inference of the generation probability of T-cell receptors from sequence repertoires. Proc Natl Acad Sci U S A 2012; 109:16161-6. [PMID: 22988065 DOI: 10.1073/pnas.1212755109] [Citation(s) in RCA: 184] [Impact Index Per Article: 15.3] [Indexed: 11/18/2022] Open
Abstract
Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
|
40
|
Abstract
Rheumatologists see patients with a range of autoimmune diseases. Phenotyping these diseases for diagnosis, prognosis and selection of therapies is an ever increasing problem. Advances in multiplexed assay technology at the gene, protein, and cellular level have enabled the identification of 'actionable biomarkers'; that is, biological metrics that can inform clinical practice. Not only will such biomarkers yield insight into the development, remission, and exacerbation of a disease, they will undoubtedly improve diagnostic sensitivity and accuracy of classification, and ultimately guide treatment. This Review provides an introduction to these powerful technologies that could promote the identification of actionable biomarkers, including mass cytometry, protein arrays, and immunoglobulin and T-cell receptor high-throughput sequencing. In our opinion, these technologies should become part of routine clinical practice for the management of autoimmune diseases. The use of analytical tools to deconvolve the data obtained from use of these technologies is also presented here. These analyses are revealing a more comprehensive and interconnected view of the immune system than ever before and should have an important role in directing future treatment approaches for autoimmune diseases.
|
41
|
Watson CT, Breden F. The immunoglobulin heavy chain locus: genetic variation, missing data, and implications for human disease. Genes Immun 2012; 13:363-73. [PMID: 22551722 DOI: 10.1038/gene.2012.12] [Citation(s) in RCA: 130] [Impact Index Per Article: 10.8] [Indexed: 01/30/2023]
Abstract
The immunoglobulin (IG) loci consist of repeated and highly homologous sets of genes of different types, variable (V), diversity (D) and junction (J), that rearrange in developing B cells to produce an individual's highly variable repertoire of expressed antibodies, designed to bind to a vast array of pathogens. This repeated structure makes these loci susceptible to a high frequency of insertion and deletion events through evolutionary time, and also makes them difficult to characterize at the genomic level or assay with high-throughput techniques. Given the central role of antibodies in the adaptive immune system, it is not surprising that early candidate gene approaches showed that germline polymorphisms in these regions correlated with susceptibility to both infectious and autoimmune diseases. However, more recent studies, particularly those using high-throughput genome-wide arrays, have failed to implicate these loci in disease. In this review of the IG heavy chain variable gene cluster (IGHV), we examine how poorly we understand the distribution of haplotype variation in this genomic region, and we argue that this lack of information may mask candidate loci in the IGHV gene cluster as causative factors for infectious and autoimmune diseases.
Affiliation(s)
- C T Watson
- Department of Biological Sciences, Simon Fraser University, Burnaby, BC, Canada.
|
42
|
Prabakaran P, Chen W, Singarayan MG, Stewart CC, Streaker E, Feng Y, Dimitrov DS. Expressed antibody repertoires in human cord blood cells: 454 sequencing and IMGT/HighV-QUEST analysis of germline gene usage, junctional diversity, and somatic mutations. Immunogenetics 2012; 64:337-50. [PMID: 22200891 PMCID: PMC6953429 DOI: 10.1007/s00251-011-0595-8] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 09/12/2011] [Accepted: 12/05/2011] [Indexed: 12/16/2022]
Abstract
Human cord blood cell-derived IgM antibodies are important for the neonate immune responses and construction of germline-based immunoglobulin libraries. Several previous studies of a relatively small number of sequences found that they exhibit restrictions in the usage of germline genes and in the diversity of the variable heavy chain complementarity determining region 3 compared to adults. To further characterize such restrictions on a larger scale and to compare the early B-cell diversity to adult IgM repertoires, we performed 454 sequencing and IMGT/HighV-QUEST analysis of cord blood IG libraries from two babies and determined germline gene usage, V-D-J rearrangement, VHCDR3 diversity, and somatic mutations to characterize human neonate repertoire. Most of the germline subgroups were identified with frequencies comparable to those present in the adult IgM repertoire except for the IGHV1-2 gene that was preferentially expressed in the cord blood cells. The gene usage diversity contributed to 1,430 unique IGH V-D-J rearrangement patterns while the exonuclease trimming and N region addition at the V-D-J junctions along with gene diversity created a wide range of VHCDR3 with different lengths and sequence variability. We observed a lower degree of somatic mutations in the CDR and framework regions of antibodies from cord blood cells compared to adults. These results provide insights into the characteristics of human cord blood antibody repertoires, which have gene usage diversity and VHCDR3 lengths similar to that of the adult IgM repertoire but differ significantly in some of the gene usages, V-D-J rearrangements, junctional diversity, and somatic mutations.
Affiliation(s)
- Ponraj Prabakaran
- Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), Bldg 469, Rm 150B, Frederick, MD 21702, USA
|
43
|
Kidd MJ, Chen Z, Wang Y, Jackson KJ, Zhang L, Boyd SD, Fire AZ, Tanaka MM, Gaëta BA, Collins AM. The inference of phased haplotypes for the immunoglobulin H chain V region gene loci by analysis of VDJ gene rearrangements. The Journal of Immunology 2011; 188:1333-40. [PMID: 22205028 DOI: 10.4049/jimmunol.1102097] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.2] [Indexed: 01/30/2023]
Abstract
The existence of many highly similar genes in the lymphocyte receptor gene loci makes them difficult to investigate, and the determination of phased "haplotypes" has been particularly problematic. However, V(D)J gene rearrangements provide an opportunity to infer the association of Ig genes along the chromosomes. The chromosomal distribution of H chain genes in an Ig genotype can be inferred through analysis of VDJ rearrangements in individuals who are heterozygous at points within the IGH locus. We analyzed VDJ rearrangements from 44 individuals for whom sufficient unique rearrangements were available to allow comprehensive genotyping. Nine individuals were identified who were heterozygous at the IGHJ6 locus and for whom sufficient suitable VDJ rearrangements were available to allow comprehensive haplotyping. Each of the 18 resulting IGHV│IGHD│IGHJ haplotypes was unique. Apparent deletion polymorphisms were seen that involved as many as four contiguous, functional IGHV genes. Two deletion polymorphisms involving multiple contiguous IGHD genes were also inferred. Three previously unidentified gene duplications were detected, where two sequences recognized as allelic variants of a single gene were both inferred to be on a single chromosome. Phased genomic data brings clarity to the study of the contribution of each gene to the available repertoire of rearranged VDJ genes. Analysis of rearrangement frequencies suggests that particular genes may have substantially different yet predictable propensities for rearrangement within different haplotypes. Together with data highlighting the extent of haplotypic variation within the population, this suggests that there may be substantial variability in the available Ab repertoires of different individuals.
Affiliation(s)
- Marie J Kidd
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington, Sydney, New South Wales 2052, Australia
|
44
|
Jackson KJL, Wang Y, Gaeta BA, Pomat W, Siba P, Rimmer J, Sewell WA, Collins AM. Divergent human populations show extensive shared IGK rearrangements in peripheral blood B cells. Immunogenetics 2011; 64:3-14. [PMID: 21789596 DOI: 10.1007/s00251-011-0559-z] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 05/20/2011] [Accepted: 07/12/2011] [Indexed: 11/28/2022]
Abstract
We have analysed the transcribed immunoglobulin kappa (IGK) repertoire of peripheral blood B cells from four individuals from two genetically distinct populations, Papua New Guinean and Australian, using high-throughput DNA sequencing. The depth of sequencing data for each individual averaged 5,548 high-quality IGK reads, and permitted genotyping of the inferred IGKV and IGKJ germline gene segments for each individual. All individuals were homozygous at each IGKJ locus and had highly similar inferred IGKV genotypes. Preferential gene usage was seen at both the IGKV and IGKJ loci, but only IGKV segment usage varied significantly between individuals. Despite the differences in IGKV gene utilisation, the rearranged IGK repertoires showed extensive identity at the amino acid level. Public rearrangements (those shared by two or more individuals) made up 60.2% of the total sequenced IGK rearrangements. The total diversity of IGK rearrangements of each individual was estimated to range from just 340 to 549 unique amino acid sequences. Thus, the repertoire of unique expressed IGK rearrangements is dramatically less than previous theoretical estimates of IGK diversity, and the majority of expressed IGK rearrangements are likely to be extensively shared in individual human beings.
|