1
Abstract
The origins of the various elements in the human antibody repertoire have been, and still are, subject to considerable uncertainty, in particular over whether these elements have always served a specific defense function or whether they were co-opted from other organismal roles to form a crude naïve repertoire that then became more complex as combinatorial mechanisms were added. Estimates of the current size of the human naïve antibody repertoire are also widely debated, ranging from 10 million members, based on experimentally derived numbers, to in excess of one thousand trillion members, based on the different sequences derived from theoretical combinatorial calculations. Questions arise at both ends of this range. At the lower bound, it could be questioned whether the repertoire is large enough to counter all the potential antigen-bearing pathogens. At the upper bound the question is rather simpler: how can any individual interrogate such an astronomical number of antibody-bearing B cells in a meaningful timeframe? This review evaluates the evolutionary aspects of the adaptive immune system, the calculations that lead to the large repertoire estimates, and some of the experimental evidence pointing to a more restricted repertoire whose variation appears to derive from convergent 'structure and specificity features', and includes a theoretical model that seems to support it. Finally, a solution that may reconcile the size difference anomaly, which remains a subject of active debate, is suggested.
2
Blasco A, Endres MG, Sergeev RA, Jonchhe A, Macaluso NJM, Narayan R, Natoli T, Paik JH, Briney B, Wu C, Su AI, Subramanian A, Lakhani KR. Advancing computational biology and bioinformatics research through open innovation competitions. PLoS One 2019; 14:e0222165. [PMID: 31560691 PMCID: PMC6764653 DOI: 10.1371/journal.pone.0222165] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/08/2019] [Accepted: 08/22/2019] [Indexed: 11/19/2022]
Abstract
Open data science and algorithm development competitions offer a unique avenue for rapid discovery of better computational strategies. We highlight three examples in computational biology and bioinformatics research in which the use of competitions has yielded significant performance gains over established algorithms. These include algorithms for antibody clustering, imputing gene expression data, and querying the Connectivity Map (CMap). Performance gains are evaluated quantitatively using realistic, albeit sanitized, data sets. The solutions produced through these competitions are then examined with respect to their utility and the prospects for implementation in the field. We present the decision process and competition design considerations that lead to these successful outcomes as a model for researchers who want to use competitions and non-domain crowds as collaborators to further their research.
Affiliation(s)
- Andrea Blasco
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Institute for Quantitative Social Science, Harvard University, Cambridge, MA, United States of America
  - The Broad Institute, Cambridge, MA, United States of America
- Michael G. Endres
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Institute for Quantitative Social Science, Harvard University, Cambridge, MA, United States of America
- Rinat A. Sergeev
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Harvard Business School, Harvard University, Boston, MA, United States of America
- Anup Jonchhe
  - The Broad Institute, Cambridge, MA, United States of America
- Rajiv Narayan
  - The Broad Institute, Cambridge, MA, United States of America
- Ted Natoli
  - The Broad Institute, Cambridge, MA, United States of America
- Jin H. Paik
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Harvard Business School, Harvard University, Boston, MA, United States of America
- Bryan Briney
  - Department of Immunology and Microbial Science, The Scripps Research Institute, La Jolla, CA, United States of America
- Chunlei Wu
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States of America
- Andrew I. Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States of America
- Karim R. Lakhani
  - Laboratory for Innovation Science at Harvard, Harvard University, Cambridge, MA, United States of America
  - Harvard Business School, Harvard University, Boston, MA, United States of America
  - National Bureau of Economic Research, Cambridge, MA, United States of America
3
Vázquez Bernat N, Corcoran M, Hardt U, Kaduk M, Phad GE, Martin M, Karlsson Hedestam GB. High-Quality Library Preparation for NGS-Based Immunoglobulin Germline Gene Inference and Repertoire Expression Analysis. Front Immunol 2019; 10:660. [PMID: 31024532 PMCID: PMC6459949 DOI: 10.3389/fimmu.2019.00660] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 12/20/2018] [Accepted: 03/11/2019] [Indexed: 12/13/2022]
Abstract
Next generation sequencing (NGS) of immunoglobulin (Ig) repertoires (Rep-seq) enables examination of the adaptive immune system at an unprecedented level. Applications include studies of expressed repertoires, gene usage, somatic hypermutation levels, Ig lineage tracing and identification of genetic variation within the Ig loci through inference methods. All these applications require starting libraries that allow the generation of sequence data with low error rate and optimal representation of the expressed repertoire. Here, we provide detailed protocols for the production of libraries suitable for human Ig germline gene inference and Ig repertoire studies. Various parameters used in the process were tested in order to demonstrate factors that are critical to obtain high quality libraries. We demonstrate an improved 5'RACE technique that reduces the length constraints of Illumina MiSeq based Rep-seq analysis but allows for the acquisition of sequences upstream of Ig V genes, useful for primer design. We then describe a 5' multiplex method for library preparation, which yields full length V(D)J sequences suitable for genotype identification and novel gene inference. We provide comprehensive sets of primers targeting IGHV, IGKV, and IGLV genes. Using the optimized protocol, we produced IgM, IgG, IgK, and IgL libraries and analyzed them using the germline inference tool IgDiscover to identify expressed germline V alleles. This process additionally uncovered three IGHV, one IGKV, and six IGLV novel alleles in a single individual, which are absent from the IMGT reference database, highlighting the need for further study of Ig genetic variation. The library generation protocols presented here enable a robust means of analyzing expressed Ig repertoires, identifying novel alleles and producing individualized germline gene databases from humans.
Affiliation(s)
- Néstor Vázquez Bernat
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Martin Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Uta Hardt
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
  - Division of Rheumatology, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
- Mateusz Kaduk
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Ganesh E. Phad
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Marcel Martin
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
4
Rettig TA, Pecaut MJ, Chapes SK. A comparison of unamplified and massively multiplexed PCR amplification for murine antibody repertoire sequencing. FASEB Bioadv 2019; 1:6-17. [PMID: 32123808 PMCID: PMC6996338 DOI: 10.1096/fba.1017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/28/2018] [Revised: 08/09/2018] [Accepted: 08/17/2018] [Indexed: 11/26/2022]
Abstract
Sequencing antibody repertoires has steadily become cheaper and easier. Sequencing methods usually rely on some form of amplification, often a massively multiplexed PCR prior to sequencing. To eliminate potential biases and create a data set that could be used for other studies, our laboratory compared unamplified sequencing results from the splenic heavy-chain repertoire in the mouse to those processed through two commercial applications. We also compared the use of mRNA vs total RNA, reverse transcriptase, and primer usage for cDNA synthesis and submission. The use of mRNA for cDNA synthesis resulted in higher read counts but reverse transcriptase and primer usage had no statistical effects on read count. Although most of the amplified data sets contained more antibody reads than the unamplified data set, we detected more unique variable (V)-gene segments in the unamplified data set. Although unique CDR3 detection was much lower in the unamplified data set, RNASeq detected 98% of the high-frequency CDR3s. We have shown that unamplified profiling of the antibody repertoire is possible, detects more V-gene segments, and detects high-frequency clones in the repertoire.
Affiliation(s)
- Michael J. Pecaut
  - Division of Biomedical Engineering Sciences (BMES), Loma Linda University, Loma Linda, California