1
Simpson CL, Musolf AM, Cordero RY, Cordero JB, Portas L, Murgia F, Lewis DD, Middlebrooks CD, Ciner EB, Bailey-Wilson JE, Stambolian D. Myopia in African Americans Is Significantly Linked to Chromosome 7p15.2-14.2. Invest Ophthalmol Vis Sci 2021; 62:16. [PMID: 34241624 PMCID: PMC8287048 DOI: 10.1167/iovs.62.9.16] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/10/2020] [Accepted: 01/20/2021] [Indexed: 11/24/2022]
Abstract
Purpose: The purpose of this study was to perform genetic linkage analysis and association analysis on exome genotyping from highly aggregated African American families with nonpathogenic myopia. African Americans are a particularly understudied population with respect to myopia.
Methods: One hundred six African American families from the Philadelphia area with a family history of myopia were genotyped using an Illumina ExomePlus array, and the genotypes were merged with previous microsatellite data. Myopia was initially measured as mean spherical equivalent (MSE) and converted to a binary phenotype in which individuals were classified as affected, unaffected, or unknown. Parametric linkage analysis was performed on individual variants (single-nucleotide polymorphisms [SNPs] and microsatellites) as well as on gene-based markers. Family-based association analysis and transmission disequilibrium test (TDT) analysis modified for rare variants were also performed.
Results: Genetic linkage analysis identified 2 genomewide significant variants at 7p15.2 and 7p14.2 (in the intergenic region between MIR148A and NFE2L3 and in the noncoding RNA LOC401324) and 2 genomewide significant genes (CRHR2 and AVL9), both at 7p14.3. No genomewide significant results were found in the association analyses.
Conclusions: This study identified a significant linkage peak in African American families for myopia at 7p15.2 to 7p14.2, the first potential risk locus for myopia in African Americans. Interesting candidate genes are located in the region, including PDE1C, which is highly expressed in the eyes and known to be involved in retinal development. Further identification of the causal variants at this linkage peak will help elucidate the genetics of myopia in this understudied population.
Affiliation(s)
- Claire L. Simpson
  - Department of Genetics, Genomics and Informatics and Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, Tennessee, United States
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, United States
- Anthony M. Musolf
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, United States
- Roberto Y. Cordero
  - Department of Genetics, Genomics and Informatics and Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, Tennessee, United States
- Jennifer B. Cordero
  - Department of Genetics, Genomics and Informatics and Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, Tennessee, United States
- Laura Portas
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, United States
- Federico Murgia
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, United States
- Deyana D. Lewis
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, United States
- Candace D. Middlebrooks
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, United States
- Elise B. Ciner
  - The Pennsylvania College of Optometry at Salus University, Elkins Park, Pennsylvania, United States
- Joan E. Bailey-Wilson
  - Department of Genetics, Genomics and Informatics and Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, Tennessee, United States
- Dwight Stambolian
  - Department of Ophthalmology, University of Pennsylvania, Philadelphia, Pennsylvania, United States
2
Simpson CL, Musolf AM, Li Q, Portas L, Murgia F, Cordero RY, Cordero JB, Moiz BA, Holzinger ER, Middlebrooks CD, Lewis DD, Bailey-Wilson JE, Stambolian D. Exome genotyping and linkage analysis identifies two novel linked regions and replicates two others for myopia in Ashkenazi Jewish families. BMC Med Genet 2019; 20:27. [PMID: 30704416 PMCID: PMC6357511 DOI: 10.1186/s12881-019-0752-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/04/2018] [Accepted: 01/11/2019] [Indexed: 01/10/2023]
Abstract
Background: Myopia is one of the most common eye diseases in the world and affects 1 in 4 Americans. It is a complex disease caused by both environmental and genetic effects; the genetic effects are still not well understood. In this study, we performed genetic linkage analyses on Ashkenazi Jewish families with a strong familial history of myopia to elucidate any potential causal genes.
Methods: Sixty-four extended Ashkenazi Jewish families were previously collected from New Jersey. Genotypes from the Illumina ExomePlus array were merged with prior microsatellite linkage data from these families. Additional custom markers were added for candidate regions reported in the literature for myopia or refractive error. Myopia was defined as a mean spherical equivalent (MSE) of -1 D or worse. Parametric two-point linkage analyses (using TwoPointLods) and multipoint linkage analyses (using SimWalk2) were performed, as were collapsed haplotype pattern (CHP) analysis in SEQLinkage and association analyses with FBAT and rv-TDT.
Results: The strongest evidence of linkage was on 1p36 (two-point LOD = 4.47), a region previously linked to refractive error (MYP14) but not myopia. Another genome-wide significant locus was found on 8q24.22 with a maximum two-point LOD score of 3.75. CHP analysis also detected the signal on 1p36, localized to the LINC00339 gene with a maximum HLOD of 3.47, as well as genome-wide significant signals on 7q36.1 and 11p15, which overlaps with the MYP7 locus.
Conclusions: We identified 2 novel linkage peaks for myopia on chromosomes 7 and 8 in these Ashkenazi Jewish families and replicated 2 more loci on chromosomes 1 and 11, one previously reported in refractive error but not myopia in these families and the other previously reported in the literature. Strong candidate genes have been identified within these linkage peaks in our families. Targeted sequencing in these regions will be necessary to definitively identify causal variants under these linkage peaks.
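The phenotype definition in the Methods can be illustrated with a minimal sketch. Only the -1 D threshold comes from the abstract. The per-eye formula (sphere plus half the cylinder, averaged over both eyes) is the usual clinical convention and is assumed here, as are the function names and the rule that a missing measurement maps to "unknown"; the study's actual coding rules may differ.

```python
from typing import Optional, Tuple

# From the abstract: myopia is an MSE of -1 D or worse.
MYOPIA_THRESHOLD_D = -1.0

def spherical_equivalent(sphere: float, cylinder: float) -> float:
    """Spherical equivalent of one eye in dioptres (assumed convention)."""
    return sphere + cylinder / 2.0

def mean_spherical_equivalent(right: Tuple[float, float],
                              left: Tuple[float, float]) -> float:
    """Average the spherical equivalent of the two eyes.

    Each eye is a (sphere, cylinder) pair in dioptres.
    """
    return (spherical_equivalent(*right) + spherical_equivalent(*left)) / 2.0

def classify(mse: Optional[float]) -> str:
    """Binary phenotype with an explicit 'unknown' for missing data.

    Hypothetical rule: None -> unknown, MSE <= -1 D -> affected,
    anything else -> unaffected.
    """
    if mse is None:
        return "unknown"
    return "affected" if mse <= MYOPIA_THRESHOLD_D else "unaffected"
```

For example, `classify(mean_spherical_equivalent((-1.0, -0.5), (-1.5, -0.5)))` evaluates an MSE of -1.5 D and labels the individual as affected.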
Affiliation(s)
- Claire L Simpson
  - Department of Genetics, Genomics and Informatics and Department of Ophthalmology, University of Tennessee Health Science Center, 71 S. Manassas Room 417, Memphis, TN, 38163, USA
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Anthony M Musolf
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Qing Li
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Laura Portas
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Federico Murgia
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Roberto Y Cordero
  - Department of Genetics, Genomics and Informatics and Department of Ophthalmology, University of Tennessee Health Science Center, 71 S. Manassas Room 417, Memphis, TN, 38163, USA
- Jennifer B Cordero
  - Department of Genetics, Genomics and Informatics and Department of Ophthalmology, University of Tennessee Health Science Center, 71 S. Manassas Room 417, Memphis, TN, 38163, USA
- Bilal A Moiz
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Emily R Holzinger
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Candace D Middlebrooks
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Deyana D Lewis
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Joan E Bailey-Wilson
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Dr., Suite 1200, Baltimore, MD, 21224, USA
- Dwight Stambolian
  - Department of Ophthalmology, University of Pennsylvania, Rm. 313, Stellar Chance Labs, 422 Curie Blvd, Philadelphia, PA, 19104, USA
3
Ramstetter MD, Dyer TD, Lehman DM, Curran JE, Duggirala R, Blangero J, Mezey JG, Williams AL. Benchmarking Relatedness Inference Methods with Genome-Wide Data from Thousands of Relatives. Genetics 2017; 207:75-82. [PMID: 28739658 PMCID: PMC5586387 DOI: 10.1534/genetics.117.1122] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 02/04/2017] [Accepted: 07/08/2017] [Indexed: 01/03/2023]
Abstract
Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these approaches in real data has been lacking. Here, we report an assessment of 12 state-of-the-art pairwise relatedness inference methods using a data set with 2485 individuals contained in several large pedigrees that span up to six generations. We find that all methods have high accuracy (92-99%) when detecting first- and second-degree relationships, but their accuracy dwindles to <43% for seventh-degree relationships. However, most identical by descent (IBD) segment-based methods inferred seventh-degree relatives correct to within one relatedness degree for >76% of relative pairs. Overall, the most accurate methods are Estimation of Recent Shared Ancestry (ERSA) and approaches that compute total IBD sharing using the output from GERMLINE and Refined IBD to infer relatedness. Combining information from the most accurate methods provides little accuracy improvement, indicating that novel approaches, such as new methods that leverage relatedness signals from multiple samples, are needed to achieve a sizeable jump in performance.
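To make "relatedness degree" and "correct to within one degree" concrete, here is a minimal sketch that maps a pairwise kinship coefficient to a degree. It assumes the common power-of-two convention, in which a degree-d pair has expected kinship 2^-(d+1) and bin edges sit halfway between degrees on the log2 scale. This is an illustration only, not the procedure of any of the 12 benchmarked methods; all names and the `max_degree` cutoff are hypothetical.

```python
import math
from typing import Optional

def expected_kinship(degree: int) -> float:
    """Expected kinship coefficient for a degree-d pair (assumed convention)."""
    return 2.0 ** -(degree + 1)

def infer_degree(kinship: float, max_degree: int = 7) -> Optional[int]:
    """Assign a degree from a kinship estimate.

    A degree-d bin covers 2^-(d+1.5) < kinship <= 2^-(d+0.5).
    Degree 0 stands for duplicates or identical twins.
    Returns None when the pair looks unrelated (beyond max_degree).
    """
    if kinship <= 0:
        return None
    degree = max(0, math.floor(-math.log2(kinship) - 0.5))
    return degree if degree <= max_degree else None

def within_one_degree(inferred: Optional[int], true_degree: int) -> bool:
    """True when the inferred degree is at most one off the true degree."""
    return inferred is not None and abs(inferred - true_degree) <= 1
```

Under this convention a kinship of 0.25 maps to degree 1 and 0.125 to degree 2. Adjacent bins shrink by half with each degree, which is consistent with the abstract's observation that accuracy falls sharply for distant relatives.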
Affiliation(s)
- Monica D Ramstetter
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
- Thomas D Dyer
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- Donna M Lehman
  - Department of Medicine, University of Texas Health San Antonio, San Antonio, Texas 78229
- Joanne E Curran
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- Ravindranath Duggirala
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- John Blangero
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Brownsville, Texas 78520
- Jason G Mezey
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
  - Department of Genetic Medicine, Weill Cornell Medicine, New York, New York 10065
- Amy L Williams
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853
4
Quickly identifying identical and closely related subjects in large databases using genotype data. PLoS One 2017; 12:e0179106. [PMID: 28609482 PMCID: PMC5469481 DOI: 10.1371/journal.pone.0179106] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/03/2017] [Accepted: 05/10/2017] [Indexed: 01/05/2023]
Abstract
Genome-wide association studies (GWAS) usually rely on the assumption that different samples are not from closely related individuals. Detection of duplicates and close relatives becomes more difficult both statistically and computationally when one wants to combine datasets that may have been genotyped on different platforms. The dbGaP repository at the National Center for Biotechnology Information (NCBI) contains datasets from hundreds of studies with over one million samples. There are many duplicates and closely related individuals both within and across studies from different submitters. Relationships between studies cannot always be identified by the submitters of individual datasets. To aid in curation of dbGaP, we developed a rapid statistical method called Genetic Relationship and Fingerprinting (GRAF) to detect duplicates and closely related samples, even when the sets of genotyped markers differ and the DNA strand orientations are unknown. GRAF extracts genotypes of 10,000 informative and independent SNPs from genotype datasets obtained using different methods, and implements quick algorithms that enable it to find all of the duplicate pairs from more than 880,000 samples within and across dbGaP studies in less than two hours. In addition, GRAF uses two statistical metrics called All Genotype Mismatch Rate (AGMR) and Homozygous Genotype Mismatch Rate (HGMR) to determine subject relationships directly from the observed genotypes, without estimating probabilities of identity by descent (IBD), or kinship coefficients, and compares the predicted relationships with those reported in the pedigree files. We implemented GRAF in a freely available C++ program of the same name. In this paper, we describe the methods in GRAF and validate the usage of GRAF on samples from the dbGaP repository. Other scientists can use GRAF on their own samples and in combination with samples downloaded from dbGaP.
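The two mismatch metrics can be sketched in a few lines. This is an illustration based on the metric names in the abstract, not the GRAF implementation: it assumes AGMR is the fraction of mismatching genotypes over all SNPs called in both samples, and HGMR is the same fraction restricted to SNPs where both samples are homozygous. The 0/1/2 allele-count encoding, the use of `None` for missing calls, and the function name are hypothetical choices.

```python
from typing import Optional, Sequence, Tuple

Genotype = Optional[int]  # 0, 1 or 2 copies of one allele; None = missing

def mismatch_rates(
    g1: Sequence[Genotype], g2: Sequence[Genotype]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (AGMR, HGMR) for two samples typed on the same SNPs.

    Either value is None when no SNP qualifies for its denominator.
    """
    all_n = all_mismatch = hom_n = hom_mismatch = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs missing in either sample
        differs = a != b
        all_n += 1
        all_mismatch += differs
        if a != 1 and b != 1:  # both samples homozygous at this SNP
            hom_n += 1
            hom_mismatch += differs
    agmr = all_mismatch / all_n if all_n else None
    hgmr = hom_mismatch / hom_n if hom_n else None
    return agmr, hgmr
```

For example, `mismatch_rates([0, 1, 2, 2, None, 0], [0, 1, 0, 2, 1, 1])` compares five jointly called SNPs, two of which differ, and three jointly homozygous SNPs, one of which differs. Barring genotyping error, a parent and child cannot carry opposite homozygous genotypes, so a metric of this kind would be expected to sit near zero for parent-offspring pairs.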
5
Galván-Femenía I, Graffelman J, Barceló-I-Vidal C. Graphics for relatedness research. Mol Ecol Resour 2017; 17:1271-1282. [PMID: 28374569 PMCID: PMC5624821 DOI: 10.1111/1755-0998.12674] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 05/27/2016] [Revised: 03/15/2017] [Accepted: 03/21/2017] [Indexed: 11/27/2022]
Abstract
Studies of relatedness have been crucial in molecular ecology over the last decades. Good evidence of this is the fact that studies of population structure, evolution of social behaviours, genetic diversity and quantitative genetics all involve relatedness research. The main aim of this article was to review the most common graphical methods used in allele sharing studies for detecting and identifying family relationships. Both IBS- and IBD-based allele sharing studies are considered. Furthermore, we propose two additional graphical methods from the field of compositional data analysis: the ternary diagram and scatterplots of isometric log-ratios of IBS and IBD probabilities. We illustrate all graphical tools with genetic data from the HGDP-CEPH diversity panel, using mainly 377 microsatellites genotyped for 25 individuals from the Maya population of this panel. We enhance all graphics with convex hulls obtained by simulation and use these to confirm the documented relationships. The proposed compositional graphics are shown to be useful in relatedness research, as they also single out the most prominent related pairs. The ternary diagram is advocated for its ability to display all three allele sharing probabilities simultaneously. The log-ratio plots are advocated as an attempt to overcome the problems with the Euclidean distance interpretation in the classical graphics.
Affiliation(s)
- Iván Galván-Femenía
  - Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain
  - Disease Genomics-GCAT Group, Germans Trias Health Research Institute (IGTP)-Program of Predictive and Personalized Medicine of Cancer (PMPPC), Can Ruti Campus, Badalona, Barcelona, Spain
- Jan Graffelman
  - Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona, Spain
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Carles Barceló-I-Vidal
  - Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain
6
Soave D, Sun L. A generalized Levene's scale test for variance heterogeneity in the presence of sample correlation and group uncertainty. Biometrics 2017; 73:960-971. [DOI: 10.1111/biom.12651] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/01/2016] [Revised: 12/01/2016] [Accepted: 12/01/2016] [Indexed: 10/20/2022]
Affiliation(s)
- David Soave
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto; Toronto, Ontario M5T 3M7 Canada
  - Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children; Toronto, Ontario M5G 0A4 Canada
- Lei Sun
  - Department of Statistical Sciences, University of Toronto; Toronto, Ontario M5S 3G3 Canada
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto; Toronto, Ontario M5T 3M7 Canada
7
PREST-plus identifies pedigree errors and cryptic relatedness in the GAW18 sample using genome-wide SNP data. BMC Proc 2014; 8:S23. [PMID: 25519375 PMCID: PMC4143714 DOI: 10.1186/1753-6561-8-s1-s23] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/02/2022]
Abstract
Pedigree errors and cryptic relatedness often appear in families or population samples collected for genetic studies. If not identified, these issues can lead to either increased false negatives or false positives in both linkage and association analyses. To identify pedigree errors and cryptic relatedness among individuals from the 20 San Antonio Family Studies (SAFS) families and cryptic relatedness among the 157 putatively unrelated individuals, we apply PREST-plus to the genome-wide single-nucleotide polymorphism (SNP) data and analyze estimated identity-by-descent (IBD) distributions for all pairs of genotyped individuals. Based on the given pedigrees alone, PREST-plus identifies the following putative pairs: 1091 full-sib, 162 half-sib, 360 grandparent-grandchild, 2269 avuncular, 2717 first cousin, 402 half-avuncular, 559 half-first cousin, 2 half-sib+first cousin, 957 parent-offspring and 440,546 unrelated. Using the genotype data, PREST-plus detects 7 mis-specified relative pairs, with their IBD estimates clearly deviating from the null expectations, and it identifies 4 cryptic related pairs involving 7 individuals from 6 families.
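The idea of comparing estimated IBD sharing with "null expectations" can be sketched as follows. The table lists the standard expected probabilities of sharing 0, 1, or 2 alleles IBD for outbred pairs of each relationship type named in the abstract (the combined half-sib+first cousin type is omitted). The Euclidean-distance check and the 0.15 tolerance are hypothetical simplifications for illustration; PREST-plus itself uses formal test statistics rather than this rule.

```python
import math
from typing import Tuple

IBD = Tuple[float, float, float]  # (P(IBD=0), P(IBD=1), P(IBD=2))

# Standard expectations for outbred pairs.
EXPECTED_IBD = {
    "parent-offspring": (0.0, 1.0, 0.0),
    "full-sib": (0.25, 0.5, 0.25),
    "half-sib": (0.5, 0.5, 0.0),
    "grandparent-grandchild": (0.5, 0.5, 0.0),
    "avuncular": (0.5, 0.5, 0.0),
    "first cousin": (0.75, 0.25, 0.0),
    "half-avuncular": (0.75, 0.25, 0.0),
    "half-first cousin": (0.875, 0.125, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}

def ibd_deviation(reported: str, estimated: IBD) -> float:
    """Euclidean distance between the estimate and the pedigree expectation."""
    return math.dist(EXPECTED_IBD[reported], estimated)

def is_suspect(reported: str, estimated: IBD, tolerance: float = 0.15) -> bool:
    """Flag a pair whose estimate is far from what the pedigree implies.

    The tolerance is an arbitrary illustrative value, not a calibrated cutoff.
    """
    return ibd_deviation(reported, estimated) > tolerance
```

A pair recorded as full sibs but with an estimate near (0.5, 0.5, 0) would be flagged, since that pattern fits half sibs better. Note that half-sib, grandparent-grandchild, and avuncular pairs share the same genome-wide expectation, so marginal IBD proportions alone cannot tell them apart.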