1
Zhang Q, Yi GY. Genetic association studies with bivariate mixed responses subject to measurement error and misclassification. Stat Med 2020; 39:3700-3719. [PMID: 32914420 DOI: 10.1002/sim.8688] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/18/2019] [Revised: 04/12/2020] [Accepted: 06/13/2020] [Indexed: 01/01/2023]
Abstract
In genetic association studies, mixed effects models have been widely used in detecting the pleiotropy effects which occur when one gene affects multiple phenotype traits. In particular, bivariate mixed effects models are useful for describing the association of a gene with a continuous trait and a binary trait. However, such models are inadequate to feature the data with response mismeasurement, a characteristic that is often overlooked. It has been well studied that in univariate settings, ignorance of mismeasurement in variables usually results in biased estimation. In this paper, we consider the setting with a bivariate outcome vector which contains a continuous component and a binary component both subject to mismeasurement. We propose an induced likelihood approach and an EM algorithm method to handle measurement error in continuous response and misclassification in binary response simultaneously. Simulation studies confirm that the proposed methods successfully remove the bias induced from the response mismeasurement.
Affiliation(s)
- Qihuang Zhang
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, N2L3G1, Canada
- Grace Y Yi
- Department of Statistical and Actuarial Sciences, Department of Computer Science, University of Western Ontario, London, Ontario, Canada, N6A 5B7
2
Berg N, Rodríguez‐Girondo M, Mandemakers K, Janssens AAPO, Beekman M, Slagboom PE. Longevity Relatives Count score identifies heritable longevity carriers and suggests case improvement in genetic studies. Aging Cell 2020; 19:e13139. [PMID: 32352215 PMCID: PMC7294789 DOI: 10.1111/acel.13139] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/24/2019] [Revised: 01/24/2020] [Accepted: 02/23/2020] [Indexed: 12/23/2022] Open
Abstract
Loci associated with longevity are likely to harbor genes coding for key players of molecular pathways involved in a lifelong decreased mortality and decreased/compressed morbidity. However, identifying such loci is challenging. One of the most plausible reasons is the uncertainty in defining long‐lived cases with the heritable longevity trait among long‐living phenocopies. To avoid phenocopies, family selection scores have been constructed, but these have not yet been adopted as state of the art in longevity research. Here, we aim to identify individuals with the heritable longevity trait by using current insights and a novel family score based on these insights. We use a unique dataset connecting living study participants to their deceased ancestors covering 37,825 persons from 1,326 five‐generational families, living between 1788 and 2019. Our main finding suggests that longevity is transmitted for at least two subsequent generations only when at least 20% of all relatives are long‐lived. This proves the importance of family data to avoid phenocopies in genetic studies.
Affiliation(s)
- Niels Berg
- Section of Molecular Epidemiology Department of Biomedical Data Sciences Leiden University Medical Center Leiden The Netherlands
- Radboud Group for Historical Demography and Family History Radboud University Nijmegen The Netherlands
- Mar Rodríguez‐Girondo
- Section of Medical Statistics Department of Biomedical Data Sciences Leiden University Medical Center Leiden The Netherlands
- Kees Mandemakers
- International Institute of Social History Amsterdam The Netherlands
- Marian Beekman
- Section of Molecular Epidemiology Department of Biomedical Data Sciences Leiden University Medical Center Leiden The Netherlands
- P. Eline Slagboom
- Section of Molecular Epidemiology Department of Biomedical Data Sciences Leiden University Medical Center Leiden The Netherlands
- Max Planck Institute for Biology of Ageing Cologne Germany
3
Shafquat A, Crystal RG, Mezey JG. Identifying novel associations in GWAS by hierarchical Bayesian latent variable detection of differentially misclassified phenotypes. BMC Bioinformatics 2020; 21:178. [PMID: 32381021 PMCID: PMC7204256 DOI: 10.1186/s12859-020-3387-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/06/2019] [Accepted: 01/24/2020] [Indexed: 12/22/2022] Open
Abstract
Background Heterogeneity in the definition and measurement of complex diseases in Genome-Wide Association Studies (GWAS) may lead to misdiagnoses and misclassification errors that can significantly impact discovery of disease loci. While well appreciated, almost all analyses of GWAS data consider reported disease phenotype values as is without accounting for potential misclassification. Results Here, we introduce Phenotype Latent variable Extraction of disease misdiagnosis (PheLEx), a GWAS analysis framework that learns and corrects misclassified phenotypes using structured genotype associations within a dataset. PheLEx consists of a hierarchical Bayesian latent variable model, where inference of differential misclassification is accomplished using filtered genotypes while implementing a full mixed model to account for population structure and genetic relatedness in study populations. Through simulations, we show that the PheLEx framework dramatically improves recovery of the correct disease state when considering realistic allele effect sizes compared to existing methodologies designed for Bayesian recovery of disease phenotypes. We also demonstrate the potential of PheLEx for extracting new potential loci from existing GWAS data by analyzing bipolar disorder and epilepsy phenotypes available from the UK Biobank. From the PheLEx analysis of these data, we identified new candidate disease loci not previously reported for these datasets that have value for supplemental hypothesis generation. Conclusion PheLEx shows promise in reanalyzing GWAS datasets to provide supplemental candidate loci that are ignored by traditional GWAS analysis methodologies.
Affiliation(s)
- Afrah Shafquat
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Ronald G Crystal
- Department of Genetic Medicine, Weill Cornell Medicine, New York, NY, USA; Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Jason G Mezey
- Department of Computational Biology, Cornell University, Ithaca, NY, USA; Department of Genetic Medicine, Weill Cornell Medicine, New York, NY, USA
4
Zheng Q, Zhang Y, Jiang J, Jia J, Fan F, Gong Y, Wang Z, Shi Q, Chen D, Huo Y. Exome-Wide Association Study Reveals Several Susceptibility Genes and Pathways Associated With Acute Coronary Syndromes in Han Chinese. Front Genet 2020; 11:336. [PMID: 32328087 PMCID: PMC7160370 DOI: 10.3389/fgene.2020.00336] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/22/2019] [Accepted: 03/20/2020] [Indexed: 11/13/2022] Open
Abstract
Genome-wide association studies have identified more than 150 susceptibility loci for coronary artery disease (CAD); however, there is still a large proportion of missing heritability remaining to be investigated. This study sought to identify population-based genetic variation associated with acute coronary syndromes (ACS) in individuals of Chinese Han descent. We proposed a novel strategy integrating a well-developed risk prediction model into control selection in order to lower the potential misclassification bias and increase the statistical power. An exome-wide association analysis was performed for 1,669 ACS patients and 1,935 healthy controls. Promising variants were further replicated using the existing in silico dataset. Additionally, we performed gene- and pathway-based analyses to investigate the aggregate effect of multiple variants within the same genes or pathways. Although none of the association signals were consistent across studies after Bonferroni correction, one promising variant, rs10409124 at STRN4, showed potential impact on ACS in both European and East Asian populations. Gene-based analysis explored four genes (ANXA7, ZNF655, ZNF347, and ZNF750) that showed evidence for association with ACS after multiple test correction, and identification of ZNF655 was successfully replicated by another dataset. Pathway-based analysis revealed that 32 potential pathways might be involved in the pathogenesis of ACS. Our study identified several candidate genes and pathways associated with ACS. Future studies are needed to further validate these findings and explore these genes and pathways as potential therapeutic targets in ACS.
Affiliation(s)
- Qiwen Zheng
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Yan Zhang
- Department of Cardiology, Peking University First Hospital, Beijing, China
- Jie Jiang
- Department of Cardiology, Peking University First Hospital, Beijing, China
- Jia Jia
- Department of Cardiology, Peking University First Hospital, Beijing, China
- Fangfang Fan
- Department of Cardiology, Peking University First Hospital, Beijing, China
- Yanjun Gong
- Department of Cardiology, Peking University First Hospital, Beijing, China
- Zhi Wang
- Department of Cardiology, Peking University First Hospital, Beijing, China
- Qiuping Shi
- Department of Cardiology, Peking University First Hospital, Beijing, China
- Dafang Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Yong Huo
- Department of Cardiology, Peking University First Hospital, Beijing, China
5
Gemenet DC, Kitavi MN, David M, Ndege D, Ssali RT, Swanckaert J, Makunde G, Yencho GC, Gruneberg W, Carey E, Mwanga RO, Andrade MI, Heck S, Campos H. Development of diagnostic SNP markers for quality assurance and control in sweetpotato [Ipomoea batatas (L.) Lam.] breeding programs. PLoS One 2020; 15:e0232173. [PMID: 32330201 PMCID: PMC7182229 DOI: 10.1371/journal.pone.0232173] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/11/2019] [Accepted: 04/08/2020] [Indexed: 11/19/2022] Open
Abstract
Quality assurance and control (QA/QC) is an essential element of a breeding program's optimization efforts towards increased genetic gains. Due to auto-hexaploid genome complexity, a low-cost marker platform for routine QA/QC in sweetpotato breeding programs is still unavailable. We used 662 parents of the International Potato Center (CIP)'s global breeding program spanning Peru, Uganda, Mozambique and Ghana, to develop a low-density highly informative single nucleotide polymorphism (SNP) marker set to be deployed for routine QA/QC. Segregation of the selected 30 SNPs (two SNPs per base chromosome) in a recombined breeding population was evaluated using 282 progeny from some of the parents above. The progeny were replicated from in-vitro, screenhouse and field, and the selected SNP-set was confirmed to identify relatively similar mislabeling error rates as a high density SNP-set of 10,159 markers. Six additional trait-specific markers were added to the selected SNP set from previous quantitative trait loci mapping studies. The 36-SNP set will be deployed for QA/QC in breeding pipelines and in fingerprinting of advanced clones or released varieties to monitor genetic gains in farmers' fields. The study also enabled evaluation of CIP's global breeding population structure and the effect of some of the most devastating stresses like sweetpotato virus disease on genetic variation management. These results will inform future deployment of genomic selection in sweetpotato.
Affiliation(s)
- Mercy N. Kitavi
- International Potato Center (CIP), ILRI Campus, Nairobi, Kenya
- Maria David
- International Potato Center (CIP), Apartado, Lima, Peru
- Dorcah Ndege
- International Potato Center (CIP), ILRI Campus, Nairobi, Kenya
- G. Craig Yencho
- North Carolina State University, Raleigh, North Carolina, United States of America
- Edward Carey
- International Potato Center (CIP), Kumasi, Ghana
- Simon Heck
- International Potato Center (CIP), ILRI Campus, Nairobi, Kenya
- Hugo Campos
- International Potato Center (CIP), Apartado, Lima, Peru
6
Longitudinal Phenotypes Improve Genotype Association for Hyperketonemia in Dairy Cattle. Animals (Basel) 2019; 9:ani9121059. [PMID: 31805754 PMCID: PMC6941043 DOI: 10.3390/ani9121059] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/23/2019] [Revised: 11/16/2019] [Accepted: 11/20/2019] [Indexed: 11/17/2022] Open
Abstract
Simple Summary: Dairy cows have differing success in supporting their physiological functions while in energy deficit right after calving. Identification of genomic regions associated with different concentrations of non-esterified fatty acids and β-hydroxybutyrate in early postpartum Holstein cows provides insight into an animal's genetic susceptibility to these conditions. Longitudinal phenotypes may provide a different perspective than cross-sectional phenotype variation and their association with genotypes in the study of complex metabolic diseases in dairy cows. This might allow us to reinforce preventative measures that decrease the incidence of hyperketonemia and improve genetic selection criteria.
Abstract: The objective of our study was to identify genomic regions associated with varying concentrations of non-esterified fatty acid (NEFA), β-hydroxybutyrate (BHB), and the development of hyperketonemia (HYK) in longitudinally sampled Holstein dairy cows. Our study population consisted of 147 multiparous cows intensively characterized by serial NEFA and BHB concentrations. To identify individuals with contrasting combinations in longitudinal BHB and NEFA concentrations, phenotypes were established using incremental area under the curve (AUC) and categorized as follows: group (1) high NEFA and high BHB, group (2) low NEFA and high BHB, group (3) low NEFA and low BHB, and group (4) high NEFA and low BHB. Cows were genotyped on the Illumina Bovine High-density (777 K) beadchip. Genome-wide association studies using mixed linear models with the least-related animals were performed to establish a genetic association with HYK, BHB-AUC, NEFA-AUC, and the comparisons of the 4 AUC phenotypic groups using Golden Helix software. Nine single-nucleotide polymorphisms were associated with high longitudinal concentrations of BHB and further investigated. Five candidate genes related to energy metabolism and homeostasis were identified. These results provide biological insight and help identify susceptible animals, thus improving genetic selection criteria and thereby decreasing the incidence of HYK.
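As a rough illustration of the phenotype construction described in this abstract, the sketch below computes a net incremental area under the curve with the trapezoidal rule and assigns the four NEFA/BHB groups. The function names, the use of the first sample as the baseline, and the cutoff arguments are assumptions made for illustration; the abstract does not state how the baseline or the high/low thresholds were chosen.

```python
def incremental_auc(times, values, baseline=None):
    """Net trapezoidal area between a concentration curve and a baseline.

    Assumption: the baseline defaults to the first measurement, and areas
    below the baseline count as negative (net incremental AUC).
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matched time/value pairs")
    if baseline is None:
        baseline = values[0]
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        left = values[i - 1] - baseline
        right = values[i] - baseline
        area += 0.5 * (left + right) * width
    return area


def auc_group(nefa_auc, bhb_auc, nefa_cutoff, bhb_cutoff):
    """Map a cow's NEFA and BHB AUC values to groups 1-4 as listed above.

    The cutoffs are hypothetical inputs; the source does not report them.
    """
    high_nefa = nefa_auc >= nefa_cutoff
    high_bhb = bhb_auc >= bhb_cutoff
    if high_nefa and high_bhb:
        return 1  # high NEFA, high BHB
    if high_bhb:
        return 2  # low NEFA, high BHB
    if not high_nefa:
        return 3  # low NEFA, low BHB
    return 4      # high NEFA, low BHB
```

Other definitions of incremental AUC truncate negative increments at zero; which variant the study used is not stated in the abstract.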
7
Beaulieu-Jones BK, Kohane IS, Beam AL. Learning Contextual Hierarchical Structure of Medical Concepts with Poincairé Embeddings to Clarify Phenotypes. Pacific Symposium on Biocomputing 2019; 24:8-17. [PMID: 30864306 PMCID: PMC6417814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Biomedical association studies are increasingly done using clinical concepts, and in particular diagnostic codes from clinical data repositories as phenotypes. Clinical concepts can be represented in a meaningful, vector space using word embedding models. These embeddings allow for comparison between clinical concepts or for straightforward input to machine learning models. Using traditional approaches, good representations require high dimensionality, making downstream tasks such as visualization more difficult. We applied Poincaré embeddings in a 2-dimensional hyperbolic space to a large-scale administrative claims database and show performance comparable to 100-dimensional embeddings in a euclidean space. We then examine disease relationships under different disease contexts to better understand potential phenotypes.
Affiliation(s)
- Isaac S. Kohane
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Andrew L. Beam
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
8
Ling A, Hay EH, Aggrey SE, Rekaya R. A Bayesian approach for analysis of ordered categorical responses subject to misclassification. PLoS One 2018; 13:e0208433. [PMID: 30543662 PMCID: PMC6292639 DOI: 10.1371/journal.pone.0208433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2018] [Accepted: 11/10/2018] [Indexed: 11/18/2022] Open
Abstract
Ordinal categorical responses are frequently collected in survey studies, human medicine, and animal and plant improvement programs, to mention a few. Errors in this type of data are neither rare nor easy to detect. These errors tend to bias the inference and reduce the statistical power and ultimately the efficiency of the decision-making process. In contrast to the binary situation, where misclassification occurs between two response classes, noise in ordinal categorical data is more complex due to the increased number of categories and the diversity and asymmetry of errors. Although several approaches have been presented for dealing with misclassification in binary data, only limited practical methods have been proposed to analyze noisy categorical responses. A latent variable model implemented within a Bayesian framework was proposed to analyze ordinal categorical data subject to misclassification using simulated and real datasets. The simulated scenario consisted of a discrete response with three categories and a symmetric error rate of 5% between any two classes. The real data consisted of calving ease records of beef cows. Using real and simulated data, ignoring misclassification resulted in substantial bias in the estimation of genetic parameters and reduction of the accuracy of predicted breeding values. Using our proposed approach, a significant reduction in bias and increase in accuracy ranging from 11% to 17% was observed. Furthermore, most of the misclassified observations (in the simulated data) were identified with a substantially higher probability. Similar results were observed for a scenario with asymmetric misclassification. While the extension to traits with more categories between adjacent classes is straightforward, it could be computationally costly. For traits with high heritability, the performance of the methodology would be expected to improve.
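The simulated scenario in this abstract (three categories, a symmetric 5% error rate between any two classes) can be pictured as a misclassification matrix applied to the true class distribution. The sketch below is only an illustration of that noise model, not the authors' Bayesian latent variable model; the function names and the example class probabilities are assumptions, and the 5% rate is read here as the probability of moving from one class to each other class.

```python
def symmetric_misclassification_matrix(n_classes, error_rate):
    """Row i, column j holds P(observed = j | true = i).

    Assumption: every off-diagonal entry equals error_rate, so the
    diagonal is 1 - (n_classes - 1) * error_rate.
    """
    stay = 1.0 - (n_classes - 1) * error_rate
    if stay < 0.0:
        raise ValueError("error_rate too large for this number of classes")
    return [
        [stay if i == j else error_rate for j in range(n_classes)]
        for i in range(n_classes)
    ]


def observed_distribution(true_probs, matrix):
    """Class probabilities seen in the noisy data: true_probs times matrix."""
    n = len(true_probs)
    return [
        sum(true_probs[i] * matrix[i][j] for i in range(n))
        for j in range(n)
    ]


# Hypothetical true class frequencies, chosen only for the example.
noise = symmetric_misclassification_matrix(3, 0.05)
observed = observed_distribution([0.7, 0.2, 0.1], noise)
```

Under this reading, rare classes gain observations from common ones, which is one way the noisy data can distort downstream estimates.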
Affiliation(s)
- Ashley Ling
- Department of Animal and Dairy Science, University of Georgia, Athens, Georgia, United States of America
- El Hamidi Hay
- USDA Agricultural Research Service, Fort Keogh Livestock and Range Research Laboratory, Miles City, Montana, United States of America
- Samuel E. Aggrey
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, United States of America
- Department of Poultry Science, University of Georgia, Athens, Georgia, United States of America
- Romdhane Rekaya
- Department of Animal and Dairy Science, University of Georgia, Athens, Georgia, United States of America
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, United States of America
- Department of Statistics, University of Georgia, Athens, Georgia, United States of America
9
Zilhão NR, Olthof MC, Smit DJA, Cath DC, Ligthart L, Mathews CA, Delucchi K, Boomsma DI, Dolan CV. Heritability of tic disorders: a twin-family study. Psychol Med 2017; 47:1085-1096. [PMID: 27974054 PMCID: PMC5410124 DOI: 10.1017/s0033291716002981] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 12/29/2022]
Abstract
BACKGROUND Genetic-epidemiological studies that estimate the contributions of genetic factors to variation in tic symptoms are scarce. We estimated the extent to which genetic and environmental influences contribute to tics, employing various phenotypic definitions ranging between mild and severe symptomatology, in a large population-based adult twin-family sample. METHOD In an extended twin-family design, we analysed lifetime tic data reported by adult mono- and dizygotic twins (n = 8323) and their family members (n = 7164; parents and siblings) from 7311 families in the Netherlands Twin Register. We measured tics by the abbreviated version of the Schedule for Tourette and Other Behavioral Syndromes. Heritability was estimated by genetic structural equation modeling for four tic disorder definitions: three dichotomous and one trichotomous phenotype, characterized by increasingly strictly defined criteria. RESULTS Prevalence rates of the different tic disorders in our sample varied between 0.3 and 4.5% depending on tic disorder definition. Tic frequencies decreased with increasing age. Heritability estimates varied between 0.25 and 0.37, depending on phenotypic definitions. None of the phenotypes showed evidence of assortative mating, effects of shared environment or non-additive genetic effects. CONCLUSIONS Heritabilities of mild and severe tic phenotypes were estimated to be moderate. Overlapping confidence intervals of the heritability estimates suggest overlapping genetic liabilities between the various tic phenotypes. The most lenient phenotype (defined only by tic characteristics, excluding criteria B, C and D of DSM-IV) rendered sufficiently reliable heritability estimates. These findings have implications in phenotypic definitions for future genetic studies.
Affiliation(s)
- N R Zilhão
- Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands
- M C Olthof
- Department of Psychology, University of Amsterdam, The Netherlands
- D J A Smit
- Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands
- D C Cath
- Department of Clinical Psychology, Utrecht University, The Netherlands
- L Ligthart
- Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands
- C A Mathews
- Department of Psychiatry, University of Florida, Gainesville, FL, USA
- K Delucchi
- Department of Psychiatry, University of California, San Francisco, CA, USA
- D I Boomsma
- Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands
- C V Dolan
- Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands
10
Zawistowski M, Sussman JB, Hofer TP, Bentley D, Hayward RA, Wiitala WL. Corrected ROC analysis for misclassified binary outcomes. Stat Med 2017; 36:2148-2160. [PMID: 28245528 DOI: 10.1002/sim.7260] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/06/2016] [Revised: 01/25/2017] [Accepted: 01/26/2017] [Indexed: 11/06/2022]
Abstract
Creating accurate risk prediction models from Big Data resources such as Electronic Health Records (EHRs) is a critical step toward achieving precision medicine. A major challenge in developing these tools is accounting for imperfect aspects of EHR data, particularly the potential for misclassified outcomes. Misclassification, the swapping of case and control outcome labels, is well known to bias effect size estimates for regression prediction models. In this paper, we study the effect of misclassification on accuracy assessment for risk prediction models and find that it leads to bias in the area under the curve (AUC) metric from standard ROC analysis. The extent of the bias is determined by the false positive and false negative misclassification rates as well as disease prevalence. Notably, we show that simply correcting for misclassification while building the prediction model is not sufficient to remove the bias in AUC. We therefore introduce an intuitive misclassification-adjusted ROC procedure that accounts for uncertainty in observed outcomes and produces bias-corrected estimates of the true AUC. The method requires that misclassification rates are either known or can be estimated, quantities typically required for the modeling step. The computational simplicity of our method is a key advantage, making it ideal for efficiently comparing multiple prediction models on very large datasets. Finally, we apply the correction method to a hospitalization prediction model from a cohort of over 1 million patients from the Veterans Health Administration's EHR. Implementations of the ROC correction are provided for Stata and R. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
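To make the dependence of the AUC bias on the misclassification rates and prevalence concrete, here is a minimal sketch under two simplifying assumptions that are not taken from the paper: label errors are independent of the risk score given the true status, and scores are continuous (no ties). Under those assumptions the observed case and control groups are mixtures of true cases and controls, and the observed AUC works out to 0.5 + (PPV + NPV - 1) × (true AUC - 0.5). This is a textbook-style derivation, not the published adjustment procedure, and all function names are hypothetical.

```python
def ppv_npv(prevalence, sensitivity, specificity):
    """Predictive values of the observed label for the true status."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    tn = specificity * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


def observed_auc(true_auc, prevalence, sensitivity, specificity):
    """AUC expected when scoring against misclassified labels.

    Assumes label errors do not depend on the score given true status.
    """
    ppv, npv = ppv_npv(prevalence, sensitivity, specificity)
    return 0.5 + (ppv + npv - 1.0) * (true_auc - 0.5)


def corrected_auc(obs_auc, prevalence, sensitivity, specificity):
    """Invert observed_auc; requires PPV + NPV > 1."""
    ppv, npv = ppv_npv(prevalence, sensitivity, specificity)
    shrink = ppv + npv - 1.0
    if shrink <= 0.0:
        raise ValueError("labels carry no information about true status")
    return 0.5 + (obs_auc - 0.5) / shrink


# Example values are illustrative only.
biased = observed_auc(0.80, prevalence=0.10, sensitivity=0.80, specificity=0.95)
recovered = corrected_auc(biased, 0.10, 0.80, 0.95)
```

The shrinkage factor pulls the observed AUC toward 0.5, so a model evaluated on noisy labels looks worse than it is even when the model itself was fitted correctly.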
Affiliation(s)
- Matthew Zawistowski
- Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.; Department of Biostatistics, University of Michigan, Ann Arbor, 48109, MI, U.S.A.
- Jeremy B Sussman
- Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.; Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, U.S.A.
- Timothy P Hofer
- Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.; Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, U.S.A.
- Douglas Bentley
- Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
- Rodney A Hayward
- Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.; Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, U.S.A.
- Wyndy L Wiitala
- Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
11
Rekaya R, Smith S, Hay EH, Farhat N, Aggrey SE. Analysis of binary responses with outcome-specific misclassification probability in genome-wide association studies. Appl Clin Genet 2016; 9:169-177. [PMID: 27942229 PMCID: PMC5138056 DOI: 10.2147/tacg.s122250] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 01/08/2023] Open
Abstract
Errors in the binary status of some response traits are frequent in human, animal, and plant applications. These error rates tend to differ between cases and controls because diagnostic and screening tests have different sensitivity and specificity. This increases the inaccuracies of classifying individuals into correct groups, giving rise to both false-positive and false-negative cases. The analysis of these noisy binary responses due to misclassification will undoubtedly reduce the statistical power of genome-wide association studies (GWAS). A threshold model that accommodates varying diagnostic errors between cases and controls was investigated. A simulation study was carried out where several binary data sets (case-control) were generated with varying effects for the most influential single nucleotide polymorphisms (SNPs) and different diagnostic error rate for cases and controls. Each simulated data set consisted of 2000 individuals. Ignoring misclassification resulted in biased estimates of true influential SNP effects and inflated estimates for true noninfluential markers. A substantial reduction in bias and increase in accuracy ranging from 12% to 32% was observed when the misclassification procedure was invoked. In fact, the majority of influential SNPs that were not identified using the noisy data were captured using the proposed method. Additionally, truly misclassified binary records were identified with high probability using the proposed method. The superiority of the proposed method was maintained across different simulation parameters (misclassification rates and odds ratios) attesting to its robustness.
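The link between diagnostic sensitivity and specificity and the noisy case-control labels described in this abstract can be sketched with the standard relation P(observed case) = Se × p + (1 - Sp) × (1 - p), where p is the true case probability. The code below illustrates that relation and its inversion (often called the Rogan-Gladen correction); it is not the threshold model investigated in the paper, and the function names and example values are assumptions.

```python
def observed_case_probability(p_true, sensitivity, specificity):
    """Probability of being recorded as a case when labels are noisy.

    True cases are detected with probability `sensitivity`; true
    controls are falsely flagged with probability 1 - specificity.
    """
    return sensitivity * p_true + (1.0 - specificity) * (1.0 - p_true)


def corrected_case_probability(p_observed, sensitivity, specificity):
    """Invert the relation above, clipping the result to [0, 1]."""
    informativeness = sensitivity + specificity - 1.0
    if informativeness <= 0.0:
        raise ValueError("test is no better than chance; cannot correct")
    p = (p_observed + specificity - 1.0) / informativeness
    return min(1.0, max(0.0, p))


# Illustrative numbers: 20% true cases, Se = 0.90, Sp = 0.95.
noisy = observed_case_probability(0.20, 0.90, 0.95)
clean = corrected_case_probability(noisy, 0.90, 0.95)
```

Because sensitivity and specificity differ, the error rate differs between true cases and true controls, which is the outcome-specific misclassification the abstract refers to.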
Affiliation(s)
- Romdhane Rekaya
- Department of Animal and Dairy Science, College of Agricultural and Environmental Sciences
- Department of Statistics, Franklin College of Arts and Sciences
- Institute of Bioinformatics, The University of Georgia, Athens, GA
- El Hamidi Hay
- United States Department of Agriculture, Agricultural Research Service, Beltsville, MD
- Samuel E Aggrey
- Institute of Bioinformatics, The University of Georgia, Athens, GA
- Department of Poultry Science, College of Agricultural and Environmental Sciences, University of Georgia, Athens, GA, USA
12
Perlman G, Kotov R, Fu J, Bromet EJ, Fochtmann LJ, Medeiros H, Pato MT, Pato CN. Symptoms of psychosis in schizophrenia, schizoaffective disorder, and bipolar disorder: A comparison of African Americans and Caucasians in the Genomic Psychiatry Cohort. Am J Med Genet B Neuropsychiatr Genet 2016; 171:546-55. [PMID: 26663585 DOI: 10.1002/ajmg.b.32409] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 05/06/2015] [Accepted: 11/25/2015] [Indexed: 11/10/2022]
Abstract
Several studies have reported differences between African Americans and Caucasians in relative proportion of psychotic symptoms and disorders, but whether this reflects racial bias in the assessment of psychosis is unclear. The purpose of this study was to examine the distribution of psychotic symptoms and potential bias in symptoms assessed via semi-structured interview using a cohort of 3,389 African American and 5,692 Caucasian participants who were diagnosed with schizophrenia, schizoaffective disorder, or bipolar disorder. In this cohort, the diagnosis of schizophrenia was relatively more common, and the diagnosis of bipolar disorder and schizoaffective disorder-bipolar type was relatively less common, among African Americans than Caucasians. With regard to symptoms, relatively more African Americans than Caucasians endorsed hallucinations and delusions symptoms, and this pattern was striking among cases diagnosed with bipolar disorder and schizoaffective-bipolar disorder. In contrast, the relative endorsement of psychotic symptoms was more similar among cases diagnosed with schizophrenia and schizoaffective disorder-depressed type. Differential item function analysis revealed that African Americans with mild psychosis over-endorsed "hallucinations in any modality" and under-endorsed "widespread delusions" relative to Caucasians. Other symptoms did not show evidence of racial bias. Thus, racial bias in assessment of psychotic symptoms does not appear to explain differences in the proportion of symptoms between Caucasians and African Americans. Rather, this may reflect ascertainment bias, perhaps indicative of a disparity in access to services, or differential exposure to risk factors for psychosis by race. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Greg Perlman
- Department of Psychiatry, Stony Brook University, Stony Brook, New York
- Roman Kotov
- Department of Psychiatry, Stony Brook University, Stony Brook, New York
- Jinmiao Fu
- Department of Psychiatry, Stony Brook University, Stony Brook, New York
- Evelyn J Bromet
- Department of Psychiatry, Stony Brook University, Stony Brook, New York
- Laura J Fochtmann
- Department of Psychiatry, Stony Brook University, Stony Brook, New York
- Helena Medeiros
- Department of Psychiatry and the Behavioral Sciences, Keck School of Medicine of the University of Southern California, Los Angeles, California
- Michele T Pato
- Department of Psychiatry and the Behavioral Sciences, Keck School of Medicine of the University of Southern California, Los Angeles, California
- Carlos N Pato
- Department of Psychiatry and the Behavioral Sciences, Keck School of Medicine of the University of Southern California, Los Angeles, California
|
13
|
Bianco A, Chiefari E, Nobile CGA, Foti D, Pavia M, Brunetti A. The Association between HMGA1 rs146052672 Variant and Type 2 Diabetes: A Transethnic Meta-Analysis. PLoS One 2015; 10:e0136077. [PMID: 26296198 PMCID: PMC4546600 DOI: 10.1371/journal.pone.0136077] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 03/25/2015] [Accepted: 07/29/2015] [Indexed: 12/16/2022] Open
Abstract
The high-mobility group A1 (HMGA1) gene has been previously identified as a potential novel candidate gene for susceptibility to insulin resistance and type 2 diabetes (T2D) mellitus. For this reason, several studies have been conducted in recent years examining the association of the HMGA1 gene variant rs146052672 (also designated IVS5-13insC) with T2D. Because of non-univocal data and non-overlapping results among laboratories, we conducted the current meta-analysis with the aim of yielding a more precise and reliable conclusion for this association. Using predetermined inclusion criteria, MEDLINE, PubMed, Web of Science, Scopus, Google Scholar and Embase were searched for all relevant available literature published until November 2014. Two of the authors independently evaluated the quality of the included studies and extracted the data. Values from the single studies were combined to determine the meta-analysis pooled estimates. Heterogeneity and publication bias were also examined. Among the articles reviewed, five studies (for a total of 13,789 cases and 13,460 controls) met the predetermined criteria for inclusion in this meta-analysis. The combined adjusted odds ratio estimates revealed that the rs146052672 variant genotype had an overall statistically significant effect on increasing the risk of development of T2D. As most of the study subjects were Caucasian, further studies are needed to establish whether the association of this variant with an increased risk of T2D is generalizable to other populations. Also, in light of this result, further in-depth investigations are highly desirable to elucidate the biological significance of the HMGA1 rs146052672 variant.
Affiliation(s)
- Aida Bianco
- Department of Health Sciences, University “Magna Græcia” of Catanzaro, Catanzaro, Italy
- Eusebio Chiefari
- Department of Health Sciences, University “Magna Græcia” of Catanzaro, Catanzaro, Italy
- Carmelo G. A. Nobile
- Department of Health Sciences, University “Magna Græcia” of Catanzaro, Catanzaro, Italy
- Daniela Foti
- Department of Health Sciences, University “Magna Græcia” of Catanzaro, Catanzaro, Italy
- Maria Pavia
- Department of Health Sciences, University “Magna Græcia” of Catanzaro, Catanzaro, Italy
- Antonio Brunetti
- Department of Health Sciences, University “Magna Græcia” of Catanzaro, Catanzaro, Italy
|