1
Hollenbach JA, Mack SJ, Gourraud PA, Single RM, Maiers M, Middleton D, Thomson G, Marsh SGE, Varney MD. A community standard for immunogenomic data reporting and analysis: proposal for a STrengthening the REporting of Immunogenomic Studies statement. Tissue Antigens 2012; 78:333-44. [PMID: 21988720 DOI: 10.1111/j.1399-0039.2011.01777.x]
Abstract
Modern high-throughput HLA and KIR typing technologies are generating a wealth of immunogenomic data with the potential to revolutionize the fields of histocompatibility and immune-related disease association and population genetic research, much as SNP-based approaches have revolutionized association research. The STrengthening the REporting of Genetic Association studies (STREGA) statement provides community-based data reporting and analysis standards for genomic disease-association studies, identifying specific areas in which adoption of reporting guidelines can improve the consistent interpretation of genetic studies. While aspects of STREGA can be applied to immunogenomic studies, HLA and KIR research requires additional consideration, as the high levels of polymorphism associated with immunogenomic data pose unique methodological and computational challenges to the synthesis of information across datasets. Here, we outline the principal challenges to consistency in immunogenomic studies, and propose that an immunogenomic-specific analog to the STREGA statement, a STrengthening the REporting of Immunogenomic Studies (STREIS) statement, be developed as part of the 16th International HLA and Immunogenetics Workshop. We propose that STREIS extends at least four of the 22 elements of the STREGA statement to specifically address issues pertinent to immunogenomic data: HLA and KIR nomenclature, data validation, ambiguity resolution, and the analysis of highly polymorphic genetic systems. As with the STREGA guidelines, the intent behind STREIS is not to dictate the design of immunogenomic studies, but to ensure consistent and transparent reporting of research, facilitating the synthesis of HLA and KIR data across studies.
Affiliation(s)
- J A Hollenbach
- Center for Genetics, Children's Hospital & Research Center Oakland, Oakland, CA, USA.
2
Kennedy LJ, Modrell A, Groves P, Wei Z, Single RM, Happ GM. Genetic diversity of the major histocompatibility complex class II in Alaskan caribou herds. Int J Immunogenet 2010; 38:109-19. [PMID: 21054806 DOI: 10.1111/j.1744-313x.2010.00973.x]
Abstract
We have sampled five different herds of caribou in Alaska to ascertain their major histocompatibility complex (MHC) class II diversity, and to assess whether the herds were significantly different in their MHC class II allele profiles. We complemented the MHC results with data from nine neutral microsatellite markers. The results indicate that while the microsatellites are diverse, there are no significant differences between the herds. However, for the MHC, we have shown that there is diversity at three of the four loci studied and that the different herds have significantly different MHC class II allele profiles. It is also clear that although some of the herds have overlapping ranges, they still differ in their MHC class II alleles.
Affiliation(s)
- L J Kennedy
- Centre for Integrated Genomic Medical Research, University of Manchester, Manchester, UK.
3
Martin MP, Single RM, Wilson MJ, Trowsdale J, Carrington M. KIR haplotypes defined by segregation analysis in 59 Centre d’Etude Polymorphisme Humain (CEPH) families. Immunogenetics 2008. [DOI: 10.1007/s00251-008-0345-8]
4
Martin MP, Single RM, Wilson MJ, Trowsdale J, Carrington M. KIR haplotypes defined by segregation analysis in 59 Centre d'Etude Polymorphisme Humain (CEPH) families. Immunogenetics 2008; 60:767-74. [PMID: 18972110 DOI: 10.1007/s00251-008-0334-y] [Received: 08/22/2008] [Accepted: 09/29/2008]
Abstract
The killer cell immunoglobulin-like receptor (KIR) gene cluster exhibits extensive allelic and haplotypic diversity. Variation at the locus is associated with an increasing number of human diseases, reminiscent of the HLA loci. Characterization of diversity at the KIR locus has progressed over the past several years, particularly since the sequences of entire KIR haplotypes have become available. To determine the extent of KIR haplotypic variability among individuals of northern European descent, we genotyped 59 CEPH families for presence/absence of all KIR genes and performed limited allelic subtyping at several KIR loci. A total of 20 unique haplotypes differing in gene content were identified, the most common of which was the previously defined A haplotype (f = 0.52). Several unusual haplotypes that probably arose as a consequence of unequal crossing over events were also identified. Linkage disequilibrium (LD) analysis indicated strong negative and positive LD between several pairs of genes, values that may be useful in determining haplotypic structure when family data are not available. These data provide a resource to aid in the interpretation of disease association data involving individuals of European descent.
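The positive and negative LD between gene pairs mentioned in this abstract can be illustrated with a minimal sketch. It assumes phased haplotypes are already available (for example from family segregation) and that each gene is coded 1 for present and 0 for absent on each haplotype. The function name `pairwise_ld` and the toy data are hypothetical and are not taken from the paper, which does not state which LD statistics or software were used.

```python
from typing import Sequence, Tuple


def pairwise_ld(gene_a: Sequence[int], gene_b: Sequence[int]) -> Tuple[float, float, float]:
    """Return (D, D', r^2) for two presence/absence genes scored on the same haplotypes.

    gene_a[i] and gene_b[i] are 1 if the gene is present on haplotype i, else 0.
    This is an illustrative sketch, not the method used in the cited study.
    """
    n = len(gene_a)
    if n == 0 or n != len(gene_b):
        raise ValueError("inputs must be non-empty and of equal length")

    p_a = sum(gene_a) / n                      # frequency of haplotypes carrying gene A
    p_b = sum(gene_b) / n                      # frequency of haplotypes carrying gene B
    p_ab = sum(1 for a, b in zip(gene_a, gene_b) if a and b) / n  # both present

    d = p_ab - p_a * p_b                       # raw disequilibrium coefficient

    # D' rescales D by its maximum attainable magnitude given the marginal frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two presence/absence indicators.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Toy example: two genes that never occur on the same haplotype (complete negative LD).
d, d_prime, r2 = pairwise_ld([1, 1, 0, 0], [0, 0, 1, 1])
print(d, d_prime, r2)
```

A negative D' indicates that the two genes tend to sit on different haplotypes, which is the kind of signal that could help infer haplotype structure when family data are unavailable.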
Affiliation(s)
- M P Martin
- Cancer and Inflammation Program, Laboratory of Experimental Immunology, SAIC-Frederick, Inc, NCI-Frederick, Frederick, MD, 21702, USA
5
Single RM, Meyer D, Mack SJ, Lancaster A, Erlich HA, Thomson G. 14th International HLA and Immunogenetics Workshop: report of progress in methodology, data collection, and analyses. Tissue Antigens 2007; 69 Suppl 1:185-7. [PMID: 17445197 DOI: 10.1111/j.1399-0039.2006.00767.x]
Abstract
The Biostatistics Component of the 13th International Histocompatibility Workshop (IHWS) developed the PyPop (Python for Population Genomics) software framework for high-throughput analysis and quality control (QC) assessments of highly polymorphic genotype data. Since its initial release, the software has had several new analysis modules added to it. These additions, combined with improved data filtering and QC modules, facilitate analyses of data at different levels (allele, haplotype, amino acid sequence, and nucleotide sequence). Since the 13th IHWS, much of the human leukocyte antigen (HLA) data from the workshop, QCed via PyPop and other methods, have been made publicly available through the Major Histocompatibility Complex database web site at the National Center for Biotechnology Information (http://ncbi.nih.gov/mhc/). The Anthropology/Human Genetic Diversity component (AHGDC) data have been used in a variety of studies. Prugnolle et al. used these data to corroborate a model of pathogen-driven selection as a factor related to high levels of diversity at HLA loci. Using a comparative genomics approach contrasting results for HLA and non-HLA markers, Meyer et al. analyzed a subset of the 13th IHWS AHGDC data and showed that HLA loci show detectable signs of both natural selection and the demographic history of populations.
Affiliation(s)
- R M Single
- Department of Mathematics and Statistics, University of Vermont, Burlington, VT, USA.
6
Gourraud PA, Cambon-Thomsen A, Dauber EM, Feolo M, Hansen J, Mickelson E, Single RM, Thomsen M, Mayr WR. Nomenclature for HLA microsatellites. Tissue Antigens 2007; 69 Suppl 1:210-3. [PMID: 17445203 DOI: 10.1111/j.1399-0039.2006.00771.x]
Abstract
A proposal for a standardized nomenclature for human leukocyte antigen (HLA) microsatellites is presented. It provides recommendations for microsatellites regarding locus names, primer names, and denominations for alleles.
Affiliation(s)
- P A Gourraud
- Inserm, Unit 558, Department of Epidemiology and Public Health, Faculty of Medicine, Toulouse, France
7
Abstract
The 14th International HLA (human leukocyte antigen) Immunogenetics Workshop (14th-IHIWS) Biostatistics and Anthropology/Human Genetic Diversity project continues the population sampling, genotype data generation, and biostatistical analyses of the 13th International Histocompatibility Workshop Anthropology/Human Genetic Diversity Component, with the overall goal of further characterizing global HLA allele and haplotype diversity and better describing the relationships between major histocompatibility complex diversity, geography, linguistics, and population history. Since the 13th Workshop, new investigators have been and continue to be recruited to the project, and new high-resolution class I and class II genotype data are being generated for 112 population samples from around the world.
Affiliation(s)
- S J Mack
- Children's Hospital Oakland Research Institute, Oakland, CA 94501-1155, USA.
8
Abstract
Population genetic statistics from multilocus genotype data inform our understanding of the patterns of genetic variation and their implications for evolutionary studies, generally, and human disease studies in particular. In any given population one can estimate haplotype frequencies, identify deviation from Hardy-Weinberg equilibrium, test for balancing or directional selection, and investigate patterns of linkage disequilibrium. Existing software packages are oriented primarily toward the computation of such statistics on a population-by-population basis, not on comparisons among populations and across different statistics. We developed PyPop (Python for Population Genomics) to facilitate the analyses of population genetic statistics across populations and the relationships among different statistics within and across populations. PyPop is an open-source framework for performing large-scale population genetic analyses on multilocus genotype data. It computes the statistics described above, among others. PyPop deploys a standard Extensible Markup Language (XML) output format and can integrate the results of multiple analyses on various populations that were performed at different times into a common output format that can be read into a spreadsheet. The XML output format allows PyPop to be embedded as part of a larger analysis pipeline. Originally developed to analyze the highly polymorphic genetic data of the human leukocyte antigen region of the human genome, PyPop has applicability to any kind of multilocus genetic data. It is the primary analysis platform for analyzing data collected for the Anthropological component of the 13th and 14th International Histocompatibility Workshops. PyPop has also been successfully used in studies by our group, with collaborators, and in publications by several independent research teams.
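One of the statistics named in this abstract, deviation from Hardy-Weinberg equilibrium, can be sketched in a few lines. The example below is a plain-Python illustration of a goodness-of-fit chi-square statistic for a multi-allelic locus. It does not use or reproduce PyPop's API, and the function name `hwe_chi_square` and the allele labels are hypothetical. For highly polymorphic loci such as HLA, many expected genotype counts are small, so this asymptotic statistic is shown only to make the idea concrete; exact tests are generally preferred for sparse tables.

```python
from collections import Counter
from itertools import combinations_with_replacement
from typing import Hashable, Sequence, Tuple


def hwe_chi_square(genotypes: Sequence[Tuple[Hashable, Hashable]]) -> Tuple[float, int]:
    """Chi-square goodness-of-fit statistic for Hardy-Weinberg proportions.

    genotypes: one (allele1, allele2) pair per individual at a single locus.
    Returns (chi-square statistic, degrees of freedom = k(k-1)/2 for k alleles).
    Illustrative sketch only; not PyPop's implementation.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")

    # Allele frequencies estimated by gene counting (2n alleles in n individuals).
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = {allele: count / (2 * n) for allele, count in allele_counts.items()}

    # Observed genotype counts, ignoring allele order within an individual.
    observed = Counter(tuple(sorted(pair)) for pair in genotypes)

    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # Expected counts under HWE: n*p^2 for homozygotes, 2*n*p*q for heterozygotes.
        expected = n * freqs[a] ** 2 if a == b else 2 * n * freqs[a] * freqs[b]
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected

    k = len(freqs)
    return chi2, k * (k - 1) // 2


# Toy example: a two-allele locus in exact Hardy-Weinberg proportions.
sample = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25
print(hwe_chi_square(sample))
```

Running the same function on each population and collecting the results in one table mirrors, in miniature, the cross-population comparison the abstract describes.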
Affiliation(s)
- A K Lancaster
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA.
9
Abstract
A common practice among researchers performing linkage studies is the use of equal allele frequencies as input when reporting p-values from computer linkage programs such as S.A.G.E. SIBPAL. Our results, using 5,000 sets from a uniform-prior distribution of allele frequencies, showed that such input may be problematic. Further, we found that the S.A.G.E. SIBPAL test for proportion of alleles shared identical by descent among concordantly affected sib pairs showed a greater percentage of significant p-values with decreasing parental genotype information (Table III), while the S.A.G.E. SIBPAL Haseman-Elston test produced significant p-values comparatively less frequently (Table IV).
Affiliation(s)
- D Gordon
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
10
Abstract
Techniques that test for linkage between a marker and a trait locus based on the regression methods proposed by Haseman and Elston [1972] involve testing a null hypothesis of no linkage by examination of the regression coefficient. Modified Haseman-Elston methods accomplish this using ordinary least squares (OLS), weighted least squares (WLS), in which weights are reciprocals of estimated variances, and generalized estimating equations (GEE). Methods implementing the WLS and GEE currently use a diagonal covariance matrix, thus incorrectly treating the squared trait differences of two sib pairs within a family as uncorrelated. Correctly specifying the correlations between sib pairs in a family yields the best linear unbiased estimator of the regression coefficient [Scheffe, 1959]. This estimator will be referred to as the generalized least squares (GLS) estimator. We determined the null variance of the GLS estimator and the null variance of the WLS/OLS estimator. The correct null variance of the WLS/OLS estimate of the Haseman-Elston (H-E) regression coefficient may be either larger or smaller than the variance of the WLS/OLS estimate calculated assuming that the squared sib-pair differences are uncorrelated. For a fully informative marker locus, the gain in efficiency using GLS rather than WLS/OLS under the null hypothesis is approximately 11% in a large multifamily study with three siblings per family and 25% for families with four siblings each.
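The contrast between the estimators in this abstract can be written out in standard linear-model notation. The symbols below are assumptions introduced for illustration and are not taken from the paper: $y$ stacks the squared sib-pair trait differences, $X$ has a column of ones and a column of estimated proportions of alleles shared identical by descent, and $\Sigma$ is the covariance matrix of $y$, block-diagonal by family, with nonzero off-diagonal entries for sib pairs from the same family.

```latex
\begin{align*}
\hat{\beta}_{\mathrm{OLS}} &= (X^{\top}X)^{-1}X^{\top}y, &
\operatorname{Var}\bigl(\hat{\beta}_{\mathrm{OLS}}\bigr) &= (X^{\top}X)^{-1}X^{\top}\Sigma X\,(X^{\top}X)^{-1},\\
\hat{\beta}_{\mathrm{GLS}} &= (X^{\top}\Sigma^{-1}X)^{-1}X^{\top}\Sigma^{-1}y, &
\operatorname{Var}\bigl(\hat{\beta}_{\mathrm{GLS}}\bigr) &= (X^{\top}\Sigma^{-1}X)^{-1}.
\end{align*}
```

Treating the squared differences as uncorrelated amounts to replacing $\Sigma$ with $\sigma^{2}I$, which gives the naive variance $\sigma^{2}(X^{\top}X)^{-1}$. That naive value can lie above or below the correct sandwich form in the first row, consistent with the abstract's statement that the correct null variance may be either larger or smaller.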
Affiliation(s)
- R M Single
- Department of Applied Mathematics and Statistics, State University of New York at Stony Brook 11794, USA