1
Cartwright JHE, Čejková J, Fimmel E, Giannerini S, Gonzalez DL, Goracci G, Grácio C, Houwing-Duistermaat J, Matić D, Mišić N, Mulder FAA, Piro O. Information, Coding, and Biological Function: The Dynamics of Life. Artif Life 2024:1-12. [PMID: 38358121 DOI: 10.1162/artl_a_00432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2024]
Abstract
In the mid-20th century, two new scientific disciplines emerged forcefully: molecular biology and information-communication theory. At the beginning, cross-fertilization was so deep that the term genetic code was universally accepted for describing the meaning of triplets of mRNA (codons) as amino acids. However, today, such synergy has not taken advantage of the vertiginous advances in the two disciplines and presents more challenges than answers. These challenges not only are of great theoretical relevance but also represent unavoidable milestones for next-generation biology: from personalized genetic therapy and diagnosis to Artificial Life to the production of biologically active proteins. Moreover, the matter is intimately connected to a paradigm shift needed in theoretical biology, pioneered a long time ago, that requires combined contributions from disciplines well beyond the biological realm. The use of information as a conceptual metaphor needs to be turned into quantitative and predictive models that can be tested empirically and integrated in a unified view. Successfully achieving these tasks requires a wide multidisciplinary approach, including Artificial Life researchers, to address such an endeavour.
Affiliation(s)
- Julyan H E Cartwright
  - CSIC-Universidad de Granada, Instituto Andaluz de Ciencias de la Tierra, Department of Chemical Engineering, Instituto Carlos I de Física Teórica y Computacional
- Elena Fimmel
  - Mannheim University of Applied Sciences, Institute of Mathematical Biology
- Diego Luis Gonzalez
  - University of Bologna, Department of Statistical Science CNR, Area della Ricerca di Bologna
- Greta Goracci
  - Free University of Bozen-Bolzano, Institute of Mathematical Biology
- Clara Grácio
  - Universidade de Évora, CIMA Faculty of Economics and Management
- Dragan Matić
  - University of Banja Luka, Faculty of Natural Science and Mathematics
- Oreste Piro
  - Universitat de les Illes Balears, Department of Physics, Mediterranean Institute for Advanced Studies

2
el Bouhaddani S, Höllerhage M, Uh HW, Moebius C, Bickle M, Höglinger G, Houwing-Duistermaat J. Statistical integration of multi-omics and drug screening data from cell lines. PLoS Comput Biol 2024; 20:e1011809. [PMID: 38295113 PMCID: PMC10878536 DOI: 10.1371/journal.pcbi.1011809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 02/20/2024] [Accepted: 01/08/2024] [Indexed: 02/02/2024] Open
Abstract
Data integration methods are used to obtain a unified summary of multiple datasets. For multi-modal data, we propose a computational workflow to jointly analyze datasets from cell lines. The workflow comprises a novel probabilistic data integration method, named POPLS-DA, for multi-omics data. The workflow is motivated by a study on synucleinopathies where transcriptomics, proteomics, and drug screening data are measured in affected LUHMES cell lines and controls. The aim is to highlight potentially druggable pathways and genes involved in synucleinopathies. First, POPLS-DA is used to prioritize genes and proteins that best distinguish cases and controls. For these genes, an integrated interaction network is constructed where the drug screen data is incorporated to highlight druggable genes and pathways in the network. Finally, functional enrichment analyses are performed to identify clusters of synaptic and lysosome-related genes and proteins targeted by the protective drugs. POPLS-DA is compared to other single- and multi-omics approaches. We found that HSPA5, a member of the heat shock protein 70 family, was one of the most targeted genes by the validated drugs, in particular by AT1-blockers. HSPA5 and AT1-blockers have been previously linked to α-synuclein pathology and Parkinson's disease, showing the relevance of our findings. Our computational workflow identified new directions for therapeutic targets for synucleinopathies. POPLS-DA provided a larger interpretable gene set than other single- and multi-omic approaches. An implementation based on R and markdown is freely available online.
Affiliation(s)
- Hae-Won Uh
  - Dept. Data science & Biostatistics, UMC Utrecht, Utrecht, Netherlands
- Claudia Moebius
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Marc Bickle
  - Roche Institute for Translational Bioengineering, Basel, Switzerland
- Günter Höglinger
  - Department of Neurology, Hannover Medical School, Hannover, Germany
  - Department of Neurology, Ludwig-Maximilians-Universität, Munich, Germany
  - German Center for Neurodegenerative Diseases, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Jeanine Houwing-Duistermaat
  - Dept. Data science & Biostatistics, UMC Utrecht, Utrecht, Netherlands
  - Dept. of Mathematics, Radboud University, Nijmegen, Netherlands

3
Stofella M, Skinner SP, Sobott F, Houwing-Duistermaat J, Paci E. High-Resolution Hydrogen-Deuterium Protection Factors from Sparse Mass Spectrometry Data Validated by Nuclear Magnetic Resonance Measurements. J Am Soc Mass Spectrom 2022; 33:813-822. [PMID: 35385652 PMCID: PMC9074100 DOI: 10.1021/jasms.2c00005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/09/2023]
Abstract
Experimental measurement of time-dependent spontaneous exchange of amide protons with deuterium of the solvent provides information on the structure and dynamical structural variation in proteins. Two experimental techniques are used to probe the exchange: NMR, which relies on different magnetic properties of hydrogen and deuterium, and MS, which exploits the change in mass due to deuteration. NMR provides residue-specific information, that is, the rate of exchange or, analogously, the protection factor (i.e., the unitless ratio between the rate of exchange for a completely unstructured state and the observed rate). MS provides information that is specific to peptides obtained by proteolytic digestion. The spatial resolution of HDX-MS measurements depends on the proteolytic pattern of the protein, the fragmentation method used, and the overlap between peptides. Different computational approaches have been proposed to extract residue-specific information from peptide-level HDX-MS measurements. Here, we demonstrate the advantages of a method recently proposed that exploits self-consistency and classifies the possible sets of protection factors into a finite number of alternative solutions compatible with experimental data. The degeneracy of the solutions can be reduced (or completely removed) by exploiting the additional information encoded in the shape of the isotopic envelopes. We show how sparse and noisy MS data can provide high-resolution protection factors that correlate with NMR measurements probing the same protein under the same conditions.
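As a worked illustration of the protection factor defined above, the relations below write the definition symbolically and connect it to peptide-level deuterium uptake. The symbols $k_{\mathrm{int},i}$ (exchange rate of residue $i$ in a completely unstructured state), $k_{\mathrm{obs},i}$ (observed rate), $P_i$ and $D(t)$ are notation assumed here, not taken from the abstract. The uptake expression is the common textbook form, assuming independent single-exponential exchange at each residue and no back-exchange; it is not necessarily the exact model used in the paper.

```latex
P_i = \frac{k_{\mathrm{int},i}}{k_{\mathrm{obs},i}},
\qquad
D(t) = \sum_{i \,\in\, \text{peptide}}
       \left[ 1 - \exp\!\left( -\frac{k_{\mathrm{int},i}}{P_i}\, t \right) \right]
```

Because $D(t)$ is a sum over the residues of a peptide, different sets $\{P_i\}$ can reproduce the same uptake curve, which is one way to read the degeneracy of solutions mentioned in the abstract.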
Affiliation(s)
- Michele Stofella
  - School of Molecular and Cellular Biology, University of Leeds, LS2 9JT Leeds, United Kingdom
  - Dipartimento di Fisica e Astronomia, Università di Bologna, 40127 Bologna, Italy
- Simon P. Skinner
  - School of Molecular and Cellular Biology, University of Leeds, LS2 9JT Leeds, United Kingdom
- Frank Sobott
  - School of Molecular and Cellular Biology, University of Leeds, LS2 9JT Leeds, United Kingdom
- Emanuele Paci
  - School of Molecular and Cellular Biology, University of Leeds, LS2 9JT Leeds, United Kingdom
  - Dipartimento di Fisica e Astronomia, Università di Bologna, 40127 Bologna, Italy

4
Gu Z, El Bouhaddani S, Pei J, Houwing-Duistermaat J, Uh HW. Statistical integration of two omics datasets using GO2PLS. BMC Bioinformatics 2021; 22:131. [PMID: 33736604 PMCID: PMC7977326 DOI: 10.1186/s12859-021-03958-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/29/2020] [Accepted: 01/06/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Nowadays, multiple omics data are measured on the same samples in the belief that these different omics datasets represent various aspects of the underlying biological systems. Integrating these omics datasets will facilitate the understanding of the systems. For this purpose, various methods have been proposed, such as Partial Least Squares (PLS), decomposing two datasets into joint and residual subspaces. Since omics data are heterogeneous, the joint components in PLS will contain variation specific to each dataset. To account for this, Two-way Orthogonal Partial Least Squares (O2PLS) captures the heterogeneity by introducing orthogonal subspaces and better estimates the joint subspaces. However, the latent components spanning the joint subspaces in O2PLS are linear combinations of all variables, while it might be of interest to identify a small subset relevant to the research question. To obtain sparsity, we extend O2PLS to Group Sparse O2PLS (GO2PLS) that utilizes biological information on group structures among variables and performs group selection in the joint subspace. RESULTS The simulation study showed that introducing sparsity improved the feature selection performance. Furthermore, incorporating group structures increased robustness of the feature selection procedure. GO2PLS performed optimally in terms of accuracy of joint score estimation, joint loading estimation, and feature selection. We applied GO2PLS to datasets from two studies: TwinsUK (a population study) and CVON-DOSIS (a small case-control study). In the first, we incorporated biological information on the group structures of the methylation CpG sites when integrating the methylation dataset with the IgG glycomics data. The targeted genes of the selected methylation groups turned out to be relevant to the immune system, in which the IgG glycans play important roles. 
In the second, we selected regulatory regions and transcripts that explained the covariance between regulomics and transcriptomics data. The corresponding genes of the selected features appeared to be relevant to heart muscle disease. CONCLUSIONS GO2PLS integrates two omics datasets to help understand the underlying system that involves both omics levels. It incorporates external group information and performs group selection, resulting in a small subset of features that best explain the relationship between two omics datasets for better interpretability.
Affiliation(s)
- Zhujie Gu
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Huispost Str. 6.131, 3508 GA, Utrecht, The Netherlands
- Said El Bouhaddani
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Huispost Str. 6.131, 3508 GA, Utrecht, The Netherlands
- Jiayi Pei
  - Department of Cardiology, UMC Utrecht, Huispost Str. 6.131, 3508 GA, Utrecht, The Netherlands
- Jeanine Houwing-Duistermaat
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Huispost Str. 6.131, 3508 GA, Utrecht, The Netherlands
  - Department of Statistics, University of Leeds, LS2 9JT, Leeds, UK
  - Department of Statistical Sciences, University of Bologna, Bologna, Italy
- Hae-Won Uh
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Huispost Str. 6.131, 3508 GA, Utrecht, The Netherlands

5
Mwikali Muli A, Gusnanto A, Houwing-Duistermaat J. Use of shared gamma frailty model in analysis of survival data in twins. Theor Biol Forum 2021; 114:45-58. [PMID: 35502730 DOI: 10.19272/202111402005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In survival analysis, the effect of a covariate on the outcome is reported in terms of a hazard rate. However, hazard rates are hard to interpret. Here we consider differences in survival probabilities instead. Twin data are attractive because many observed and unobserved factors are controlled or matched. To model the correlation between twins, some authors have proposed survival models with frailties or random effects. However, there is a potential danger of bias in the estimation if the frailty distribution is misspecified. Frailties are often assumed to follow a gamma distribution. To guard against the impact of misspecifying this distribution, we consider a flexible non-parametric baseline hazard in addition to a parametric one. We apply this methodology to the TwinsUK cohort to predict the probability of experiencing a fracture in the next five or ten years, given the twins' bone mineral densities (BMD) and their frailty index. The models with parametric and non-parametric baseline hazards yield very close results in estimating survival probabilities, and thus a parametric baseline hazard is generally preferred. We find that bone mineral density is a significant predictor in the model whereas frailty index is not. Low BMD leads to a larger probability of fracture; e.g., in 10 years, the probability of fracture is 21% for the low BMD group, 16% for the medium BMD group and 8% for the high BMD group.
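To make the step from a shared gamma frailty model to a fracture probability concrete, here is a minimal sketch. It assumes a frailty with mean 1 and variance `theta` and a Weibull baseline cumulative hazard; the abstract does not say which parametric baseline was used, so the Weibull form, the function names and every parameter value below are illustrative assumptions rather than the authors' model or estimates.

```python
import math


def marginal_survival(t, linear_predictor, theta, scale, shape):
    """Population-averaged survival S(t | x) under a shared gamma frailty.

    Assumptions (not from the abstract): the frailty Z has mean 1 and
    variance theta, and the baseline cumulative hazard is Weibull,
    H0(t) = (t / scale) ** shape.
    """
    if t < 0 or theta < 0:
        raise ValueError("t and theta must be non-negative")
    cumulative_hazard = (t / scale) ** shape * math.exp(linear_predictor)
    if theta == 0:
        # No frailty variance: ordinary proportional-hazards survival.
        return math.exp(-cumulative_hazard)
    # Integrating the gamma frailty out of exp(-Z * H) gives this closed form.
    return (1.0 + theta * cumulative_hazard) ** (-1.0 / theta)


def event_probability(t, linear_predictor, theta, scale, shape):
    """Probability of experiencing the event (e.g. a fracture) by time t."""
    return 1.0 - marginal_survival(t, linear_predictor, theta, scale, shape)


if __name__ == "__main__":
    # Hypothetical linear predictors for three BMD groups; not fitted values.
    for label, lp in [("low BMD", 0.5), ("medium BMD", 0.0), ("high BMD", -0.5)]:
        p10 = event_probability(10.0, lp, theta=0.8, scale=40.0, shape=1.2)
        print(f"{label}: 10-year event probability = {p10:.3f}")
```

Reporting `event_probability` at five or ten years for each covariate group gives the kind of survival-probability difference the abstract argues is easier to interpret than a hazard-based effect.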
Affiliation(s)
- Arief Gusnanto
  - Department of Statistics, University of Leeds, United Kingdom
- Jeanine Houwing-Duistermaat
  - Department of Statistics, University of Leeds, United Kingdom
  - Alan Turing Institute
  - Department of Biostatistics and research support, Utrecht University Medical Center, The Netherlands

6
Fuady AM, El Bouhaddani S, Uh HW, Houwing-Duistermaat J. Estimation of the effect of surrogate multi-omic biomarkers. Theor Biol Forum 2021; 114:59-73. [PMID: 35502731 DOI: 10.19272/202111402006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Multiple technologies exist that measure the same omics data set but are based on different aspects of the molecules. In practice, studies use different technologies and therefore have different biomarkers. An example is the glycan age index, which is constructed from three different ultra-performance liquid chromatography (UPLC) IgG glycans and is a biomarker for biological age. A second technology is liquid chromatography-mass spectrometry (LCMS). To estimate the effect of a biomarker on an outcome variable, two issues need to be addressed. Firstly, a measurement error model is needed to map one technology to the other using a calibration study. Here, we consider two approaches, namely one based on the chemical properties of the two technologies and one based on the estimation of this relationship using O2PLS. Secondly, the use of an approximation of the biomarker in the main study needs to be taken into account by use of a regression calibration method. The performance of the two approaches is studied via simulations. The methods are used to estimate the relationship between glycan age and menopause. We have data from two cohorts, namely Korcula and Vis. In conclusion, (1) both measurement error models give similar results and suggest that there is an association between the glycan age index and menopause status, (2) the chemical mapping approach outperforms O2PLS when the measurement error variance is low, whereas O2PLS works better when the measurement error variance is larger, and (3) statistical efficiency is lost due to the increased noise level caused by adding irrelevant information.
Affiliation(s)
- Angga M Fuady (corresponding author)
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Said El Bouhaddani
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Utrecht, The Netherlands
- Hae-Won Uh
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Utrecht, The Netherlands
- Jeanine Houwing-Duistermaat
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Utrecht, The Netherlands
  - Department of Statistics and Alan Turing Institute, University of Leeds, Leeds, United Kingdom

7
Xie M, Liu H, Houwing-Duistermaat J. Nonparametric clustering for longitudinal functional data with the application to H-NMR spectra of kidney transplant patients. Longitudinal functional data clustering. Theor Biol Forum 2021; 114:15-28. [PMID: 35502728 DOI: 10.19272/202111401003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Longitudinal functional data are increasingly common in the health domain. The motivating dataset for this paper comprises H-NMR spectra of kidney transplant patients [8]. Our aim is to cluster patients into different clinical outcome subgroups to reveal the success of the transplantation. The NMR spectra of each patient at each time point are functional data and the data are longitudinally collected at up to nine different time points. Existing methods are available for functional data collected at one time point, but not for longitudinal functional data collected at a grid of time points subject to missingness. We therefore first apply a method to extract the same number of functional features for each subject. Next we propose a novel nonparametric clustering method for multivariate functional data. We applied our proposed clustering method to the kidney transplant dataset, both to a subset of the raw data with only two time points and to the extracted functional features. It appeared that the proposed method achieves better clustering performance on the extracted functional features than on the subset of raw data. A simulation study was performed to further evaluate the method. The design mimicked the kidney transplant dataset but with a larger sample size. Scenarios with different levels of noise were considered. The simulation study showed the accuracy of our proposed method.
Affiliation(s)
- Minzhen Xie
  - Department of Statistics, University of Leeds, UK
- Haiyan Liu
  - Department of Statistics, University of Leeds, UK

8
Dembowska S, Frangi A, Houwing-Duistermaat J, Liu H. Multivariate functional partial least squares for classification using longitudinal data. Theor Biol Forum 2021; 114:75-88. [PMID: 35502732 DOI: 10.19272/202111402007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The use of statistical methods to predict outcomes using high dimensional datasets in medicine is becoming increasingly popular for forecasting and monitoring patient health. Our work is motivated by a longitudinal dataset containing 1H NMR spectra of metabolites of 18 patients undergoing a kidney transplant alongside their graft outcomes that fall into one of three categories: acute rejection, delayed graft function and primary function. We proposed a functional partial least squares (FPLS) model that extends existing PLS methods for the analysis of longitudinally measured scalar omics datasets to the case of longitudinally measured functional datasets. We designed an iterative algorithm to link multiple time points, and then applied our proposed method to analyse the data from kidney transplant patients. Finally, we compared the AUC of our method to the AUC of the univariate methods, which use information from only one time point. It appeared that our method outperforms the existing methods. A simulation study was performed to mimic the kidney transplant dataset but with a larger sample size and different scenarios, to evaluate the performance of the new method in larger datasets. We consider scenarios which vary in the difficulty of distinguishing the two groups. It appeared that the three-time-point model performs better than any of the individual models, with average AUCs of 0.909 and 0.811, respectively.
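The model comparison above is expressed in terms of AUC. As a reminder of what that quantity measures, the sketch below computes a two-group AUC as the fraction of case-control pairs that the classifier score ranks correctly (the Mann-Whitney form). The function name and the example scores are hypothetical and are not taken from the paper.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve for two groups via pairwise comparison.

    Returns the probability that a randomly chosen case receives a higher
    score than a randomly chosen control, counting ties as one half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5  # a tie counts as half a concordant pair
    return concordant / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical classifier scores for two outcome groups.
    cases = [0.9, 0.4]
    controls = [0.5, 0.1]
    print(auc(cases, controls))
```

An AUC of 0.5 corresponds to scores that carry no information about group membership, and 1.0 to perfect separation, which gives a scale for reading the reported averages.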
Affiliation(s)
- Sonia Dembowska
  - Department of Statistics, University of Leeds, Leeds, UK
  - Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
- Alex Frangi
  - Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
  - Leeds Institute for Cardiovascular and Metabolic Medicine (LICAMM), School of Medicine, University of Leeds, Leeds, UK
  - Medical Imaging Research Center (MIRC), Cardiovascular Science and Electronic Engineering Departments, KU Leuven, Leuven, Belgium
- Jeanine Houwing-Duistermaat
  - Department of Statistics, University of Leeds, Leeds, UK
  - Department of Data Science and Biostatistics, UMC Utrecht, div. Julius Centre, Utrecht, The Netherlands
  - Department of Statistical Sciences, University of Bologna, Bologna, Italy
- Haiyan Liu
  - Department of Statistics, University of Leeds, Leeds, UK

9
Bouhaddani SE, Uh HW, Jongbloed G, Hayward C, Klarić L, Kiełbasa SM, Houwing-Duistermaat J. Integrating omics datasets with the OmicsPLS package. BMC Bioinformatics 2018; 19:371. [PMID: 30309317 PMCID: PMC6182835 DOI: 10.1186/s12859-018-2371-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 04/09/2018] [Accepted: 09/11/2018] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, this is not the case with O2PLS. RESULTS We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. CONCLUSIONS We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").
Affiliation(s)
- Said el Bouhaddani
  - Dept. of Biomedical Data Sciences, LUMC, Albinusdreef 2, Leiden, 2300 RC The Netherlands
  - Delft Institute of Applied Mathematics, EEMCS, TU Delft, Van Mourik Broekmanweg 6, Delft, 2628 XE The Netherlands
- Hae-Won Uh
  - Department of Biostatistics and Research Support, UMC Utrecht, div. Julius Centre, Huispost Str. 6.131, Utrecht, 3508 GA The Netherlands
- Geurt Jongbloed
  - Delft Institute of Applied Mathematics, EEMCS, TU Delft, Van Mourik Broekmanweg 6, Delft, 2628 XE The Netherlands
- Caroline Hayward
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU Scotland
- Lucija Klarić
  - Genos Glycobiology Laboratory, Zagreb, 10000 Croatia
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU Scotland
  - Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, EH8 9DX Scotland
- Szymon M. Kiełbasa
  - Dept. of Biomedical Data Sciences, LUMC, Albinusdreef 2, Leiden, 2300 RC The Netherlands

10
Raymond B, Bouwman AC, Wientjes YCJ, Schrooten C, Houwing-Duistermaat J, Veerkamp RF. Genomic prediction for numerically small breeds, using models with pre-selected and differentially weighted markers. Genet Sel Evol 2018; 50:49. [PMID: 30314431 PMCID: PMC6186145 DOI: 10.1186/s12711-018-0419-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/20/2018] [Accepted: 10/01/2018] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Genomic prediction (GP) accuracy in numerically small breeds is limited by the small size of the reference population. Our objective was to test a multi-breed multiple genomic relationship matrices (GRM) GP model (MBMG) that weighs pre-selected markers separately, uses the remaining markers to explain the remaining genetic variance that can be explained by markers, and weighs information of breeds in the reference population by their genetic correlation with the validation breed. METHODS Genotype and phenotype data were used on 595 Jersey bulls from New Zealand and 5503 Holstein bulls from the Netherlands, all with deregressed proofs for stature. Different sets of markers were used, containing either pre-selected markers from a meta-genome-wide association analysis on stature, remaining markers or both. We implemented a multi-breed bivariate GREML model in which we fitted either a single multi-breed GRM (MBSG), or two distinct multi-breed GRM (MBMG), one made with pre-selected markers and the other with remaining markers. Accuracies of predicting stature for Jersey individuals using the multi-breed models (Holstein and Jersey combined reference population) were compared to those obtained using either the Jersey (within-breed) or Holstein (across-breed) reference population. All the models were subsequently fitted in the analysis of simulated phenotypes, with a simulated genetic correlation between breeds of 1, 0.5, and 0.25. RESULTS The MBMG model always gave better prediction accuracies for stature compared to MBSG, within-, and across-breed GP models. For example, with MBSG, accuracies obtained by fitting 48,912 unselected markers (0.43), 357 pre-selected markers (0.38) or a combination of both (0.43), were lower than accuracies obtained by fitting pre-selected and unselected markers in separate GRM in MBMG (0.49). 
This improvement was further confirmed by results from a simulation study, with MBMG performing on average 23% better than MBSG with all markers fitted. CONCLUSIONS With the MBMG model, it is possible to use information from numerically large breeds to improve prediction accuracy of numerically small breeds. The superiority of MBMG is mainly due to its ability to use information on pre-selected markers, explain the remaining genetic variance and weigh information from a different breed by the genetic correlation between breeds.
Affiliation(s)
- Biaty Raymond
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands
  - Biometris, Wageningen University and Research, 6700 AA Wageningen, The Netherlands
- Aniek C. Bouwman
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands
- Yvonne C. J. Wientjes
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands
- Jeanine Houwing-Duistermaat
  - Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, 2333 ZC Leiden, The Netherlands
  - School of Mathematics, Faculty of Mathematics and Physical Sciences, University of Leeds, Leeds, LS2 9JT UK
- Roel F. Veerkamp
  - Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH Wageningen, The Netherlands

11
Rodríguez-Girondo M, Salo P, Burzykowski T, Perola M, Houwing-Duistermaat J, Mertens B. Sequential double cross-validation for assessment of added predictive ability in high-dimensional omic applications. Ann Appl Stat 2018. [DOI: 10.1214/17-aoas1125] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]

12
el Bouhaddani S, Uh HW, Hayward C, Jongbloed G, Houwing-Duistermaat J. Probabilistic partial least squares model: Identifiability, estimation and application. J MULTIVARIATE ANAL 2018. [DOI: 10.1016/j.jmva.2018.05.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]

13
Raymond B, Bouwman AC, Schrooten C, Houwing-Duistermaat J, Veerkamp RF. Utility of whole-genome sequence data for across-breed genomic prediction. Genet Sel Evol 2018; 50:27. [PMID: 29776327 PMCID: PMC5960108 DOI: 10.1186/s12711-018-0396-8] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 11/15/2017] [Accepted: 05/04/2018] [Indexed: 11/24/2022] Open
Abstract
Background Genomic prediction (GP) across breeds has so far resulted in low accuracies of the predicted genomic breeding values. Our objective was to evaluate whether using whole-genome sequence (WGS) instead of low-density markers can improve GP across breeds, especially when markers are pre-selected from a genome-wide association study (GWAS), and to test our hypothesis that many non-causal markers in WGS data have a diluting effect on accuracy of across-breed prediction. Methods Estimated breeding values for stature and bovine high-density (HD) genotypes were available for 595 Jersey bulls from New Zealand, 957 Holstein bulls from New Zealand and 5553 Holstein bulls from the Netherlands. BovineHD genotypes for all bulls were imputed to WGS using Beagle4 and Minimac2. Genomic prediction across the three populations was performed with ASReml4, with each population used as single reference and as single validation sets. In addition to the 50k, HD and WGS, markers that were significantly associated with stature in a large meta-GWAS analysis were selected and used for prediction, resulting in 10 prediction scenarios. Furthermore, we estimated the proportion of genetic variance captured by markers in each scenario. Results Across breeds, 50k, HD and WGS markers resulted in very low accuracies of prediction ranging from −0.04 to 0.13. Accuracies were higher in scenarios with pre-selected markers from a meta-GWAS. For example, using only the 133 most significant markers in 133 QTL regions from the meta-GWAS yielded accuracies ranging from 0.08 to 0.23, while 23,125 markers with a −log10(p) higher than 7 resulted in accuracies of up to 0.35. Using WGS data did not significantly improve the proportion of genetic variance captured across breeds compared to scenarios with few but pre-selected markers. 
Conclusions Our results demonstrated that the accuracy of across-breed GP can be improved by using markers that are pre-selected from WGS based on their potential causal effect. We also showed that simply increasing the number of markers up to the WGS level does not increase the accuracy of across-breed prediction, even when markers that are expected to have a causal effect are included.
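The pre-selection rule mentioned in the results (keeping markers whose −log10(p) exceeds 7, i.e. p < 10^-7) can be illustrated with a minimal sketch. The marker names, p-values and the function `preselect_markers` are hypothetical and are not taken from the study; only the threshold of 7 comes from the abstract.

```python
import math


def preselect_markers(p_values, threshold=7.0):
    """Return marker IDs whose -log10(p) is strictly above `threshold`.

    `p_values` maps a marker ID to its GWAS p-value. A p-value of exactly 0
    is treated as passing, since -log10(0) is unbounded.
    """
    selected = []
    for marker, p in p_values.items():
        if p == 0 or -math.log10(p) > threshold:
            selected.append(marker)
    return selected


# Hypothetical meta-GWAS summary statistics (illustrative values only).
gwas = {"snp_a": 3e-9, "snp_b": 5e-4, "snp_c": 1e-12}
selected = preselect_markers(gwas)
print(selected)
```

Only the selected markers would then enter the prediction model, which is the sense in which a small pre-selected set can outperform the full WGS marker set.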
Affiliation(s)
- Biaty Raymond
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
- Biometris, Wageningen University and Research, 6700 AA, Wageningen, The Netherlands
| | - Aniek C Bouwman
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
| | | | - Jeanine Houwing-Duistermaat
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, 2333 ZC, Leiden, The Netherlands
- School of Mathematics, University of Leeds, Leeds, LS2 9JT, UK
| | - Roel F Veerkamp
- Animal Breeding and Genomics, Wageningen University and Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
|
14
|
Tissier R, Houwing-Duistermaat J, Rodríguez-Girondo M. Improving stability of prediction models based on correlated omics data by using network approaches. PLoS One 2018; 13:e0192853. [PMID: 29462177 PMCID: PMC5819809 DOI: 10.1371/journal.pone.0192853] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/30/2017] [Accepted: 01/31/2018] [Indexed: 12/13/2022] Open
Abstract
Building prediction models based on complex omics datasets such as transcriptomics, proteomics and metabolomics remains a challenge in bioinformatics and biostatistics. Regularized regression techniques are typically used to deal with the high dimensionality of these datasets. However, due to the presence of correlation in the datasets, it is difficult to select the best model, and application of these methods yields unstable results. We propose a novel strategy for model selection where the obtained models also perform well in terms of overall predictability. Several three-step approaches are considered, where the steps are 1) network construction, 2) clustering to empirically derive modules or pathways, and 3) building a prediction model incorporating the information on the modules. For the first step, we use weighted correlation networks and Gaussian graphical modelling. Identification of groups of features is performed by hierarchical clustering. The grouping information is included in the prediction model by using group-based variable selection or group-specific penalization. We compare the performance of our new approaches with standard regularized regression via simulations. Based on these results, we provide recommendations for selecting a strategy for building a prediction model given the specific goal of the analysis and the sizes of the datasets. Finally, we illustrate the advantages of our approach by applying the methodology to two problems, namely prediction of body mass index in the DIetary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome study (DILGOM) and prediction of the response of each breast cancer cell line to treatment with specific drugs using a breast cancer cell lines pharmacogenomics dataset.
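The three steps named in the abstract can be sketched on toy data as follows. This is an illustration under stated assumptions, not the authors' implementation: step 1 uses a plain unsigned Pearson correlation network, step 2 uses average-linkage agglomerative clustering with an arbitrary cut height of 0.5, and step 3 replaces the group-based selection or group-specific penalization of the paper with simple module means as group-level features. All function names and the simulated data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_network(columns):
    """Step 1: unsigned correlation network as a dense adjacency matrix."""
    p = len(columns)
    return [[abs(pearson(columns[i], columns[j])) for j in range(p)]
            for i in range(p)]


def hierarchical_modules(adjacency, cut=0.5):
    """Step 2: average-linkage clustering on the distance 1 - |r|.

    Merging stops once the closest pair of clusters is farther apart
    than `cut` (an assumed, arbitrary cut height).
    """
    clusters = [[i] for i in range(len(adjacency))]

    def dist(a, b):
        total = sum(1 - adjacency[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, ia, ib = min(
            (dist(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > cut:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return [sorted(c) for c in clusters]


def module_means(columns, modules):
    """Step 3 (stand-in): one feature per module, the mean of its members."""
    n = len(columns[0])
    return [[sum(columns[j][i] for j in m) / len(m) for i in range(n)]
            for m in modules]


# Toy data: features 0-2 share one latent signal, features 3-4 another.
rng = random.Random(0)
n_samples = 200
z1 = [rng.gauss(0, 1) for _ in range(n_samples)]
z2 = [rng.gauss(0, 1) for _ in range(n_samples)]
columns = [[z + rng.gauss(0, 0.3) for z in z1] for _ in range(3)]
columns += [[z + rng.gauss(0, 0.3) for z in z2] for _ in range(2)]

modules = hierarchical_modules(correlation_network(columns), cut=0.5)
summaries = module_means(columns, modules)
print(modules)
```

The module summaries (or the module memberships themselves) would then be passed to a regularized regression; that final fitting step is omitted here because the abstract does not specify it in enough detail to reproduce.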
Affiliation(s)
- Renaud Tissier
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, The Netherlands
- Developmental and Educational Psychology, Universiteit Leiden Faculteit Sociale Wetenschappen, Leiden, The Netherlands
| | | | - Mar Rodríguez-Girondo
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, The Netherlands
|
15
|
Houwing-Duistermaat J, van Houwelingen H, Eikenboom J, Bertina R, Rosendaal F, Kamphuisen P. Familial Clustering of Factor VIII and von Willebrand Factor Levels. Thromb Haemost 2017. [DOI: 10.1055/s-0037-1614985] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Indexed: 10/17/2022]
Abstract
Summary: Recently, we found that high levels of clotting factor VIII (>150 IU/dl) are common and make an important contribution to thrombotic risk. The determinants of high factor VIII:C are unclear and might be partly genetic. Therefore, we tested the influence of age, blood group and von Willebrand factor (VWF) levels on factor VIII:C levels, and investigated whether factor VIII:C levels are genetically determined. We performed an analysis of 564 female relatives of hemophilia A patients, who had visited our center for genetic counseling. In univariate analysis, AB0 blood group, age and VWF antigen (VWF:Ag) levels all influenced factor VIII:C levels. After adjustment for the effect of VWF:Ag levels, both blood group and age still had an effect on factor VIII:C levels. In sister pairs, the Pearson correlation coefficient between factor VIII:C levels was 0.17 (p = 0.024), and this correlation remained positive (0.15, p = 0.046) after correction for the influence of VWF:Ag. In mother-daughter pairs, no correlation of factor VIII:C levels was found. The correlation of VWF:Ag levels was 0.41 (p < 0.001) in sister pairs and 0.44 (p < 0.001) in mother-daughter pairs, in line with the assumption that VWF:Ag levels are under control of autosomal genes. Familial influence on plasma factor VIII:C and VWF:Ag levels was investigated with a recently developed familial aggregation test. This test verifies whether familial aggregation of a particular parameter exists in a set of pedigrees. In 435 women from 168 families, factor VIII:C as well as VWF:Ag levels correlated significantly within families, which suggests a familial influence. The familial aggregation was more prominent for VWF:Ag levels than for factor VIII:C levels, possibly because the genetic effect on VWF:Ag levels is larger than on factor VIII:C levels.
Our results support the presence of a familial influence on factor VIII:C as well as on VWF:Ag levels.
|
16
|
Xu MK, Gaysina D, Tsonaka R, Morin AJS, Croudace TJ, Barnett JH, Houwing-Duistermaat J, Richards M, Jones PB. Monoamine Oxidase A (MAOA) Gene and Personality Traits from Late Adolescence through Early Adulthood: A Latent Variable Investigation. Front Psychol 2017; 8:1736. [PMID: 29075213 PMCID: PMC5641687 DOI: 10.3389/fpsyg.2017.01736] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/29/2017] [Accepted: 09/20/2017] [Indexed: 11/13/2022] Open
Abstract
Very few molecular genetic studies of personality traits have used longitudinal phenotypic data; therefore, the molecular basis for developmental change and stability of personality remains to be explored. We examined the role of the monoamine oxidase A gene (MAOA) on extraversion and neuroticism from adolescence to adulthood, using modern latent variable methods. A sample of 1,160 male and 1,180 female participants with complete genotyping data was drawn from a British national birth cohort, the MRC National Survey of Health and Development (NSHD). The predictor variable was based on a latent variable representing genetic variations of the MAOA gene measured by three SNPs (rs3788862, rs5906957, and rs979606). Latent phenotype variables were constructed using psychometric methods to represent cross-sectional and longitudinal phenotypes of extraversion and neuroticism measured at ages 16 and 26. In males, the MAOA genetic latent variable (AAG) was associated with a lower extraversion score at age 16 (β = −0.167; CI: −0.289, −0.045; p = 0.007, FDRp = 0.042), as well as a greater increase in extraversion score from 16 to 26 years (β = 0.197; CI: 0.067, 0.328; p = 0.003, FDRp = 0.036). No genetic association was found for neuroticism after adjustment for multiple testing. Although we did not find statistically significant associations after multiple testing correction in females, this result needs to be interpreted with caution due to issues related to X-inactivation in females. The latent variable method is an effective way of modeling phenotype- and genetic-based variances and may therefore improve the methodology of molecular genetic studies of complex psychological traits.
Affiliation(s)
- Man K Xu
- Faculty of Psychology and Educational Sciences, Welten Institute, Open University of the Netherlands, Heerlen, Netherlands
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, Netherlands
- Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
- Department of Psychology, Education, and Child Studies, Erasmus University Rotterdam, Rotterdam, Netherlands
| | - Darya Gaysina
- EDGE Lab, School of Psychology, University of Sussex, Brighton, United Kingdom
| | - Roula Tsonaka
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, Netherlands
| | - Alexandre J S Morin
- Substantive-Methodological Synergy Research Laboratory, Department of Psychology, Concordia University, Montreal, QC, Canada
| | - Tim J Croudace
- School of Nursing and Health Sciences, University of Dundee, Dundee, United Kingdom
| | | | | | - Marcus Richards
- MRC Unit for Lifelong Health and Ageing at UCL, London, United Kingdom
| | - Peter B Jones
- Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
|
17
|
Auffray C, Balling R, Barroso I, Bencze L, Benson M, Bergeron J, Bernal-Delgado E, Blomberg N, Bock C, Conesa A, Del Signore S, Delogne C, Devilee P, Di Meglio A, Eijkemans M, Flicek P, Graf N, Grimm V, Guchelaar HJ, Guo YK, Gut IG, Hanbury A, Hanif S, Hilgers RD, Honrado Á, Hose DR, Houwing-Duistermaat J, Hubbard T, Janacek SH, Karanikas H, Kievits T, Kohler M, Kremer A, Lanfear J, Lengauer T, Maes E, Meert T, Müller W, Nickel D, Oledzki P, Pedersen B, Petkovic M, Pliakos K, Rattray M, I Màs JR, Schneider R, Sengstag T, Serra-Picamal X, Spek W, Vaas LAI, van Batenburg O, Vandelaer M, Varnai P, Villoslada P, Vizcaíno JA, Wubbe JPM, Zanetti G. Erratum to: Making sense of big data in health research: towards an EU action plan. Genome Med 2016; 8:118. [PMID: 27821178 PMCID: PMC5100330 DOI: 10.1186/s13073-016-0376-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 10/26/2016] [Accepted: 10/26/2016] [Indexed: 11/10/2022] Open
Affiliation(s)
- Charles Auffray
- European Institute for Systems Biology and Medicine, 1 avenue Claude Vellefaux, 75010, Paris, France
- CIRI-UMR5308, CNRS-ENS-INSERM-UCBL, Université de Lyon, 50 avenue Tony Garnier, 69007, Lyon, France
| | - Rudi Balling
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts Fourneaux, 4362, Esch-sur-Alzette, Luxembourg.
| | - Inês Barroso
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - László Bencze
- Health Services Management Training Centre, Faculty of Health and Public Services, Semmelweis University, Kútvölgyi út 2, 1125, Budapest, Hungary
| | - Mikael Benson
- Centre for Personalised Medicine, Linköping University, 581 85, Linköping, Sweden
| | - Jay Bergeron
- Translational & Bioinformatics, Pfizer Inc., 300 Technology Square, Cambridge, MA, 02139, USA
| | - Enrique Bernal-Delgado
- Institute for Health Sciences, IACS - IIS Aragon, San Juan Bosco 13, 50009, Zaragoza, Spain
| | - Niklas Blomberg
- ELIXIR, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Christoph Bock
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT25.2, 1090, Vienna, Austria
- Department of Laboratory Medicine, Medical University of Vienna, Lazarettgasse 14, AKH BT25.2, 1090, Vienna, Austria
- Max Planck Institute for Informatics, Campus E1 4, 66123, Saarbrücken, Germany
| | - Ana Conesa
- Príncipe Felipe Research Center, C/ Eduardo Primo Yúfera 3, 46012, Valencia, Spain
- University of Florida, Institute of Food and Agricultural Sciences (IFAS), 2033 Mowry Road, Gainesville, FL, 32610, USA
| | | | - Christophe Delogne
- Technology, Data & Analytics, KPMG Luxembourg, Société Coopérative, 39 Avenue John F. Kennedy, 1855, Luxembourg, Luxembourg
| | - Peter Devilee
- Department of Human Genetics, Department of Pathology, Leiden University Medical Centre, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands
| | - Alberto Di Meglio
- Information Technology Department, European Organization for Nuclear Research (CERN), 385 Route de Meyrin, 1211, Geneva 23, Switzerland
| | - Marinus Eijkemans
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA, Utrecht, The Netherlands
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Norbert Graf
- Department of Pediatric Oncology/Hematology, Saarland University, Campus Homburg, Building 9, 66421, Homburg, Germany
| | - Vera Grimm
- Project Management Jülich, Forschungszentrum Jülich GmbH, Wilhelm-Johnen-Straße, 52428, Jülich, Germany
| | - Henk-Jan Guchelaar
- Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
| | - Yi-Ke Guo
- Data Science Institute, Imperial College London, South Kensington, London, SW7 2AZ, UK
| | - Ivo Glynne Gut
- CNAG-CRG, Center for Genomic Regulation, Barcelona Institute for Science and Technology (BIST), C/Baldiri Reixac 4, 08029, Barcelona, Spain
| | - Allan Hanbury
- Institute of Software Technology and Interactive Systems, TU Wien, Favoritenstrasse 9-11/188, 1040, Vienna, Austria
| | - Shahid Hanif
- The Association of the British Pharmaceutical Industry, 7th Floor, Southside, 105 Victoria Street, London, SW1E 6QT, UK
| | - Ralf-Dieter Hilgers
- Department of Medical Statistics, RWTH-Aachen University, Universitätsklinikum Aachen, Pauwelsstraße 30, 52074, Aachen, Germany
| | - Ángel Honrado
- SYNAPSE Research Management Partners, Diputació 237, Àtic 3ª, 08007, Barcelona, Spain
| | - D Rod Hose
- Department of Infection, Immunity and Cardiovascular Disease and Insigneo Institute for In-Silico Medicine, Medical School, University of Sheffield, Beech Hill Road, Sheffield, S10 2RX, UK
| | | | - Tim Hubbard
- Department of Medical & Molecular Genetics, King's College London, London, SE1 9RT, UK
- Genomics England, London, EC1M 6BQ, UK
| | - Sophie Helen Janacek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Haralampos Karanikas
- National and Kapodistrian University of Athens, Medical School, Xristou Lada 6, 10561, Athens, Greece
| | - Tim Kievits
- Vitromics Healthcare Holding B.V., Onderwijsboulevard 225, 5223 DE, 's-Hertogenbosch, The Netherlands
| | - Manfred Kohler
- Fraunhofer Institute for Molecular Biology and Applied Ecology ScreeningPort, Schnackenburgallee 114, 22525, Hamburg, Germany
| | - Andreas Kremer
- ITTM S.A., 9 avenue des Hauts Fourneaux, 4362, Esch-sur-Alzette, Luxembourg
| | - Jerry Lanfear
- Research Business Technology, Pfizer Ltd, GP4 Building, Granta Park, Cambridge, CB21 6GP, UK
| | - Thomas Lengauer
- Max Planck Institute for Informatics, Campus E1 4, 66123, Saarbrücken, Germany
| | - Edith Maes
- Health Economics & Outcomes Research, Deloitte Belgium, Berkenlaan 8A, 1831, Diegem, Belgium
| | - Theo Meert
- Janssen Pharmaceutica N.V., R&D G3O, Turnhoutseweg 30, 2340, Beerse, Belgium
| | - Werner Müller
- Faculty of Life Sciences, University of Manchester, AV Hill Building, Oxford Road, Manchester, M13 9PT, UK
| | - Dörthe Nickel
- UMR3664 IC/CNRS, Institut Curie, Section Recherche, Pavillon Pasteur, 26 rue d'Ulm, 75248, Paris cedex 05, France
| | - Peter Oledzki
- Linguamatics Ltd, 324 Cambridge Science Park Milton Rd, Cambridge, CB4 0WG, UK
| | - Bertrand Pedersen
- PwC Luxembourg, 2 rue Gerhard Mercator, 2182, Luxembourg, Luxembourg
| | - Milan Petkovic
- Philips, HighTechCampus 36, 5656AE, Eindhoven, The Netherlands
| | - Konstantinos Pliakos
- Department of Public Health and Primary Care, KU Leuven Kulak, Etienne Sabbelaan 53, 8500, Kortrijk, Belgium
| | - Magnus Rattray
- Faculty of Life Sciences, University of Manchester, AV Hill Building, Oxford Road, Manchester, M13 9PT, UK
| | - Josep Redón I Màs
- INCLIVA Health Research Institute, University of Valencia, CIBERobn ISCIII, Avenida Menéndez Pelayo 4 accesorio, 46010, Valencia, Spain
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts Fourneaux, 4362, Esch-sur-Alzette, Luxembourg
| | - Thierry Sengstag
- Swiss Institute of Bioinformatics (SIB) and University of Basel, Klingelbergstrasse 50/ 70, 4056, Basel, Switzerland
| | - Xavier Serra-Picamal
- Agency for Health Quality and Assessment of Catalonia (AQuAS), Carrer de Roc Boronat 81-95, 08005, Barcelona, Spain
| | - Wouter Spek
- EuroBioForum Foundation, Chrysantstraat 10, 3135 HG, Vlaardingen, The Netherlands
| | - Lea A I Vaas
- Fraunhofer Institute for Molecular Biology and Applied Ecology ScreeningPort, Schnackenburgallee 114, 22525, Hamburg, Germany
| | - Okker van Batenburg
- EuroBioForum Foundation, Chrysantstraat 10, 3135 HG, Vlaardingen, The Netherlands
| | - Marc Vandelaer
- Integrated BioBank of Luxembourg, 6 rue Nicolas-Ernest Barblé, 1210, Luxembourg, Luxembourg
| | - Peter Varnai
- Technopolis Group, 3 Pavilion Buildings, Brighton, BN1 1EE, UK
| | - Pablo Villoslada
- Hospital Clinic of Barcelona, Institute d'Investigacions Biomediques August Pi Sunyer (IDIBAPS), Rosello 149, 08036, Barcelona, Spain
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - John Peter Mary Wubbe
- European Platform for Patients' Organisations, Science and Industry (Epposi), De Meeûs Square 38-40, 1000, Brussels, Belgium
| | - Gianluigi Zanetti
- CRS4, Ed.1 POLARIS, 09129, Pula, Italy
- BBMRI-ERIC, Neue Stiftingtalstrasse 2/B/6, 8010, Graz, Austria
|
18
|
Tissier R, Uh HW, van den Akker E, Balliu B, Tsonaka S, Houwing-Duistermaat J. Gene coexpression network analysis for family studies based on a meta-analytic approach. BMC Proc 2016; 10:119-123. [PMID: 27980622 PMCID: PMC5133496 DOI: 10.1186/s12919-016-0016-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
For a better understanding of the biological mechanisms involved in complex traits or diseases, networks are often useful tools in genetic studies: coexpression networks based on pairwise correlations between genes are commonly used. In a family-based design, a large between-family variation in expression levels can be problematic. We propose here a gene coexpression network analysis for family studies. We build a coexpression network for each family and then combine the results. We applied our approach to data provided for analysis in the Genetic Analysis Workshop 19 and compared it to two naïve approaches (ignoring correlations among the expressions, and decorrelating the gene expression by using the residuals of a mixed model) and to a single-probe analysis. Our approach seemed to deal better with heterogeneity than the naïve approaches. The naïve approaches did not provide any significant results, while our approach detected genes via indirect effects. It also detected more genes than the single-probe analysis.
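The idea of estimating a correlation within each family and then combining the results can be sketched as below. The abstract does not state which combination rule was used; this sketch assumes a standard fixed-effect combination of Fisher z-transformed correlations with weights n − 3, and the function names and toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def combined_correlation(families):
    """Combine per-family correlations for one gene pair.

    `families` is a list of (x, y) tuples, one per family, holding the
    expression values of the two genes. Each family's r is transformed
    with Fisher's z (atanh), weighted by n - 3, averaged, and transformed
    back. Families with fewer than 4 members are skipped.
    """
    num = den = 0.0
    for x, y in families:
        n = len(x)
        if n < 4:
            continue
        # Clamp to keep atanh finite when r is exactly +/-1.
        r = max(min(pearson(x, y), 0.999999), -0.999999)
        num += (n - 3) * math.atanh(r)
        den += n - 3
    if den == 0:
        raise ValueError("no family is large enough to contribute")
    return math.tanh(num / den)


# Two toy families with the same within-family pattern but very
# different baseline expression levels (between-family variation).
fam_a = ([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
fam_b = ([11, 12, 13, 14, 15], [-8, -9, -6, -7, -5])

combined = combined_correlation([fam_a, fam_b])
pooled = pearson(fam_a[0] + fam_b[0], fam_a[1] + fam_b[1])
print(combined, pooled)
```

In this toy case the pooled correlation has the opposite sign to the within-family correlation, which illustrates why ignoring between-family variation can be misleading.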
Affiliation(s)
- Renaud Tissier
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Hae-Won Uh
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Erik van den Akker
- Molecular epidemiology, Leiden University Medical Centre, Leiden, The Netherlands
- Pattern Recognition & Bioinformatics, Delft University of Technology, Leiden, The Netherlands
| | - Brunilda Balliu
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Spyridoula Tsonaka
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Jeanine Houwing-Duistermaat
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
- Department of Statistics, University of Leeds, Leeds, UK
|
19
|
Auffray C, Balling R, Barroso I, Bencze L, Benson M, Bergeron J, Bernal-Delgado E, Blomberg N, Bock C, Conesa A, Del Signore S, Delogne C, Devilee P, Di Meglio A, Eijkemans M, Flicek P, Graf N, Grimm V, Guchelaar HJ, Guo YK, Gut IG, Hanbury A, Hanif S, Hilgers RD, Honrado Á, Hose DR, Houwing-Duistermaat J, Hubbard T, Janacek SH, Karanikas H, Kievits T, Kohler M, Kremer A, Lanfear J, Lengauer T, Maes E, Meert T, Müller W, Nickel D, Oledzki P, Pedersen B, Petkovic M, Pliakos K, Rattray M, I Màs JR, Schneider R, Sengstag T, Serra-Picamal X, Spek W, Vaas LAI, van Batenburg O, Vandelaer M, Varnai P, Villoslada P, Vizcaíno JA, Wubbe JPM, Zanetti G. Making sense of big data in health research: Towards an EU action plan. Genome Med 2016; 8:71. [PMID: 27338147 PMCID: PMC4919856 DOI: 10.1186/s13073-016-0323-y] [Citation(s) in RCA: 124] [Impact Index Per Article: 15.5] [Indexed: 01/09/2023] Open
Abstract
Medicine and healthcare are undergoing profound changes. Whole-genome sequencing and high-resolution imaging technologies are key drivers of this rapid and crucial transformation. Technological innovation combined with automation and miniaturization has triggered an explosion in data production that will soon reach exabyte proportions. How are we going to deal with this exponential increase in data production? The potential of "big data" for improving health is enormous but, at the same time, we face a wide range of challenges to overcome urgently. Europe is very proud of its cultural diversity; however, exploitation of the data made available through advances in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal, and political barriers. European health systems and databases are diverse and fragmented. There is a lack of harmonization of data formats, processing, analysis, and data transfer, which leads to incompatibilities and lost opportunities. Legal frameworks for data sharing are evolving. Clinicians, researchers, and citizens need improved methods, tools, and training to generate, analyze, and query data effectively. Addressing these barriers will contribute to creating the European Single Market for health, which will improve health and healthcare for all Europeans.
Affiliation(s)
- Charles Auffray
- European Institute for Systems Biology and Medicine, 1 avenue Claude Vellefaux, 75010, Paris, France.
- CIRI-UMR5308, CNRS-ENS-INSERM-UCBL, Université de Lyon, 50 avenue Tony Garnier, 69007, Lyon, France.
| | - Rudi Balling
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts Fourneaux, 4362, Esch-sur-Alzette, Luxembourg.
| | - Inês Barroso
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - László Bencze
- Health Services Management Training Centre, Faculty of Health and Public Services, Semmelweis University, Kútvölgyi út 2, 1125, Budapest, Hungary
| | - Mikael Benson
- Centre for Personalised Medicine, Linköping University, 581 85, Linköping, Sweden
| | - Jay Bergeron
- Translational & Bioinformatics, Pfizer Inc., 300 Technology Square, Cambridge, MA, 02139, USA
| | - Enrique Bernal-Delgado
- Institute for Health Sciences, IACS - IIS Aragon, San Juan Bosco 13, 50009, Zaragoza, Spain
| | - Niklas Blomberg
- ELIXIR, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Christoph Bock
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT25.2, 1090, Vienna, Austria
- Department of Laboratory Medicine, Medical University of Vienna, Lazarettgasse 14, AKH BT25.2, 1090, Vienna, Austria
- Max Planck Institute for Informatics, Campus E1 4, 66123, Saarbrücken, Germany
| | - Ana Conesa
- Príncipe Felipe Research Center, C/ Eduardo Primo Yúfera 3, 46012, Valencia, Spain
- University of Florida, Institute of Food and Agricultural Sciences (IFAS), 2033 Mowry Road, Gainesville, FL, 32610, USA
| | | | - Christophe Delogne
- Technology, Data & Analytics, KPMG Luxembourg, Société Coopérative, 39 Avenue John F. Kennedy, 1855, Luxembourg, Luxembourg
| | - Peter Devilee
- Department of Human Genetics, Department of Pathology, Leiden University Medical Centre, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands
| | - Alberto Di Meglio
- Information Technology Department, European Organization for Nuclear Research (CERN), 385 Route de Meyrin, 1211, Geneva 23, Switzerland
| | - Marinus Eijkemans
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA, Utrecht, The Netherlands
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Norbert Graf
- Department of Pediatric Oncology/Hematology, Saarland University, Campus Homburg, Building 9, 66421, Homburg, Germany
| | - Vera Grimm
- Project Management Jülich, Forschungszentrum Jülich GmbH, Wilhelm-Johnen-Straße, 52428, Jülich, Germany
| | - Henk-Jan Guchelaar
- Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
| | - Yi-Ke Guo
- Data Science Institute, Imperial College London, South Kensington, London, SW7 2AZ, UK
| | - Ivo Glynne Gut
- CNAG-CRG, Center for Genomic Regulation, Barcelona Institute for Science and Technology (BIST), C/Baldiri Reixac 4, 08029, Barcelona, Spain
| | - Allan Hanbury
- Institute of Software Technology and Interactive Systems, TU Wien, Favoritenstrasse 9-11/188, 1040, Vienna, Austria
| | - Shahid Hanif
- The Association of the British Pharmaceutical Industry, 7th Floor, Southside, 105 Victoria Street, London, SW1E 6QT, UK
| | - Ralf-Dieter Hilgers
- Department of Medical Statistics, RWTH-Aachen University, Universitätsklinikum Aachen, Pauwelsstraße 30, 52074, Aachen, Germany
| | - Ángel Honrado
- SYNAPSE Research Management Partners, Diputació 237, Àtic 3ª, 08007, Barcelona, Spain
| | - D Rod Hose
- Department of Infection, Immunity and Cardiovascular Disease and Insigneo Institute for In-Silico Medicine, Medical School, University of Sheffield, Beech Hill Road, Sheffield, S10 2RX, UK
| | | | - Tim Hubbard
- Department of Medical & Molecular Genetics, King's College London, London, SE1 9RT, UK
- Genomics England, London, EC1M 6BQ, UK
| | - Sophie Helen Janacek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Haralampos Karanikas
- National and Kapodistrian University of Athens, Medical School, Xristou Lada 6, 10561, Athens, Greece
| | - Tim Kievits
- Vitromics Healthcare Holding B.V., Onderwijsboulevard 225, 5223 DE, 's-Hertogenbosch, The Netherlands
| | - Manfred Kohler
- Fraunhofer Institute for Molecular Biology and Applied Ecology ScreeningPort, Schnackenburgallee 114, 22525, Hamburg, Germany
| | - Andreas Kremer
- ITTM S.A., 9 avenue des Hauts Fourneaux, 4362, Esch-sur-Alzette, Luxembourg
| | - Jerry Lanfear
- Research Business Technology, Pfizer Ltd, GP4 Building, Granta Park, Cambridge, CB21 6GP, UK
| | - Thomas Lengauer
- Max Planck Institute for Informatics, Campus E1 4, 66123, Saarbrücken, Germany
| | - Edith Maes
- Health Economics & Outcomes Research, Deloitte Belgium, Berkenlaan 8A, 1831, Diegem, Belgium
| | - Theo Meert
- Janssen Pharmaceutica N.V., R&D G3O, Turnhoutseweg 30, 2340, Beerse, Belgium
| | - Werner Müller
- Faculty of Life Sciences, University of Manchester, AV Hill Building, Oxford Road, Manchester, M13 9PT, UK
| | - Dörthe Nickel
- UMR3664 IC/CNRS, Institut Curie, Section Recherche, Pavillon Pasteur, 26 rue d'Ulm, 75248, Paris cedex 05, France
| | - Peter Oledzki
- Linguamatics Ltd, 324 Cambridge Science Park Milton Rd, Cambridge, CB4 0WG, UK
| | - Bertrand Pedersen
- PwC Luxembourg, 2 rue Gerhard Mercator, 2182, Luxembourg, Luxembourg
| | - Milan Petkovic
- Philips, HighTechCampus 36, 5656AE, Eindhoven, The Netherlands
| | - Konstantinos Pliakos
- Department of Public Health and Primary Care, KU Leuven Kulak, Etienne Sabbelaan 53, 8500, Kortrijk, Belgium
| | - Magnus Rattray
- Faculty of Life Sciences, University of Manchester, AV Hill Building, Oxford Road, Manchester, M13 9PT, UK
| | - Josep Redón I Màs
- INCLIVA Health Research Institute, University of Valencia, CIBERobn ISCIII, Avenida Menéndez Pelayo 4 accesorio, 46010, Valencia, Spain
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts Fourneaux, 4362, Esch-sur-Alzette, Luxembourg
| | - Thierry Sengstag
- Swiss Institute of Bioinformatics (SIB) and University of Basel, Klingelbergstrasse 50/70, 4056, Basel, Switzerland
| | - Xavier Serra-Picamal
- Agency for Health Quality and Assessment of Catalonia (AQuAS), Carrer de Roc Boronat 81-95, 08005, Barcelona, Spain
| | - Wouter Spek
- EuroBioForum Foundation, Chrysantstraat 10, 3135 HG, Vlaardingen, The Netherlands
| | - Lea A I Vaas
- Fraunhofer Institute for Molecular Biology and Applied Ecology ScreeningPort, Schnackenburgallee 114, 22525, Hamburg, Germany
| | - Okker van Batenburg
- EuroBioForum Foundation, Chrysantstraat 10, 3135 HG, Vlaardingen, The Netherlands
| | - Marc Vandelaer
- Integrated BioBank of Luxembourg, 6 rue Nicolas-Ernest Barblé, 1210, Luxembourg, Luxembourg
| | - Peter Varnai
- Technopolis Group, 3 Pavilion Buildings, Brighton, BN1 1EE, UK
| | - Pablo Villoslada
- Hospital Clinic of Barcelona, Institute d'Investigacions Biomediques August Pi Sunyer (IDIBAPS), Rosello 149, 08036, Barcelona, Spain
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - John Peter Mary Wubbe
- European Platform for Patients' Organisations, Science and Industry (Epposi), De Meeûs Square 38-40, 1000, Brussels, Belgium
| | - Gianluigi Zanetti
- CRS4, Ed.1 POLARIS, 09129, Pula, Italy
- BBMRI-ERIC, Neue Stiftingtalstrasse 2/B/6, 8010, Graz, Austria
| |
|
20
|
Abstract
Background: Rapid computational and technological developments have made large amounts of omics data available at different biological levels. It is becoming clear that simultaneous data analysis methods are needed for better interpretation and understanding of the underlying systems biology. Different methods have been proposed for this task, among them Partial Least Squares (PLS) related methods. To also deal with orthogonal variation (systematic variation in each dataset that is unrelated to the other), we consider the Two-way Orthogonal PLS (O2PLS): an integrative data analysis method which is capable of modeling systematic variation, while providing more parsimonious models aiding interpretation.
Results: A simulation study to assess the performance of O2PLS showed positive results in both low and higher dimensions. More noise (50 % of the data) only affected the systematic part estimates. A data analysis was conducted using data on metabolomics and transcriptomics from a large Finnish cohort (DILGOM). A previous sequential study, using the same data, showed significant correlations between the Lipo-Leukocyte (LL) module and lipoprotein metabolites. The O2PLS results were in agreement with these findings, identifying almost the same set of co-varying variables. Moreover, our integrative approach identified other associated genes and metabolites, while taking into account systematic variation in the data. Including orthogonal components enhanced overall fit, but the orthogonal variation was difficult to interpret.
Conclusions: Simulations showed that the O2PLS estimates were close to the true parameters in both low and higher dimensions. In the presence of more noise (50 %), the orthogonal part estimates could not distinguish well between joint and unique variation. The joint estimates were not systematically affected. Simultaneous analysis with O2PLS on metabolome and transcriptome data showed that the LL module, together with VLDL and HDL metabolites, was important for the relation between the metabolome and the transcriptome. This is in agreement with an earlier study. In addition, more genes and metabolites were identified as important for the joint covariation.
Affiliation(s)
- Said El Bouhaddani
- Department of Medical Statistics and Bioinformatics, LUMC, Albinusdreef 2, Leiden, 2300 RC, The Netherlands.
| | - Jeanine Houwing-Duistermaat
- Department of Medical Statistics and Bioinformatics, LUMC, Albinusdreef 2, Leiden, 2300 RC, The Netherlands.
| | - Perttu Salo
- National Institute for Health and Welfare (THL), Mannerheimintie 166, Helsinki, FI-00271, Finland.
| | - Markus Perola
- National Institute for Health and Welfare (THL), Mannerheimintie 166, Helsinki, FI-00271, Finland.
| | - Geurt Jongbloed
- Department of Statistics, EEMCS, TU Delft, Mekelweg 4, Delft, 2628 CD, The Netherlands.
| | - Hae-Won Uh
- Department of Medical Statistics and Bioinformatics, LUMC, Albinusdreef 2, Leiden, 2300 RC, The Netherlands.
| |
|
21
|
Peters MJ, Joehanes R, Pilling LC, Schurmann C, Conneely KN, Powell J, Reinmaa E, Sutphin GL, Zhernakova A, Schramm K, Wilson YA, Kobes S, Tukiainen T, Ramos YF, Göring HHH, Fornage M, Liu Y, Gharib SA, Stranger BE, De Jager PL, Aviv A, Levy D, Murabito JM, Munson PJ, Huan T, Hofman A, Uitterlinden AG, Rivadeneira F, van Rooij J, Stolk L, Broer L, Verbiest MMPJ, Jhamai M, Arp P, Metspalu A, Tserel L, Milani L, Samani NJ, Peterson P, Kasela S, Codd V, Peters A, Ward-Caviness CK, Herder C, Waldenberger M, Roden M, Singmann P, Zeilinger S, Illig T, Homuth G, Grabe HJ, Völzke H, Steil L, Kocher T, Murray A, Melzer D, Yaghootkar H, Bandinelli S, Moses EK, Kent JW, Curran JE, Johnson MP, Williams-Blangero S, Westra HJ, McRae AF, Smith JA, Kardia SLR, Hovatta I, Perola M, Ripatti S, Salomaa V, Henders AK, Martin NG, Smith AK, Mehta D, Binder EB, Nylocks KM, Kennedy EM, Klengel T, Ding J, Suchy-Dicey AM, Enquobahrie DA, Brody J, Rotter JI, Chen YDI, Houwing-Duistermaat J, Kloppenburg M, Slagboom PE, Helmer Q, den Hollander W, Bean S, Raj T, Bakhshi N, Wang QP, Oyston LJ, Psaty BM, Tracy RP, Montgomery GW, Turner ST, Blangero J, Meulenbelt I, Ressler KJ, Yang J, Franke L, Kettunen J, Visscher PM, Neely GG, Korstanje R, Hanson RL, Prokisch H, Ferrucci L, Esko T, Teumer A, van Meurs JBJ, Johnson AD. The transcriptional landscape of age in human peripheral blood. Nat Commun 2015; 6:8570. [PMID: 26490707 PMCID: PMC4639797 DOI: 10.1038/ncomms9570] [Citation(s) in RCA: 407] [Impact Index Per Article: 45.2] [Received: 01/16/2015] [Accepted: 09/07/2015] [Indexed: 02/08/2023] Open
Abstract
Disease incidences increase with age, but the molecular characteristics of ageing that lead to increased disease susceptibility remain inadequately understood. Here we perform a whole-blood gene expression meta-analysis in 14,983 individuals of European ancestry (including replication) and identify 1,497 genes that are differentially expressed with chronological age. The age-associated genes do not harbor more age-associated CpG-methylation sites than other genes, but are instead enriched for the presence of potentially functional CpG-methylation sites in enhancer and insulator regions that associate with both chronological age and gene expression levels. We further used the gene expression profiles to calculate the ‘transcriptomic age’ of an individual, and show that differences between transcriptomic age and chronological age are associated with biological features linked to ageing, such as blood pressure, cholesterol levels, fasting glucose, and body mass index. The transcriptomic prediction model adds biological relevance and complements existing epigenetic prediction models, and can be used by others to calculate transcriptomic age in external cohorts.
Ageing increases the risk of many diseases. Here the authors compare blood cell transcriptomes of over 14,000 individuals and identify a set of about 1,500 genes that are differently expressed with age, shedding light on transcriptional programs linked to the ageing process and age-associated diseases.
Affiliation(s)
- Marjolein J Peters
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Roby Joehanes
- The National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, Massachusetts 01702, USA.,Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland 20817, USA
| | - Luke C Pilling
- Epidemiology and Public Health, University of Exeter Medical School, Exeter EX4 1DB, UK
| | - Claudia Schurmann
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald 17493, Germany.,The Charles Bronfman Institute for Personalized Medicine, Genetics of Obesity & Related Metabolic Traits Program, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York 10029, USA
| | - Karen N Conneely
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, Georgia 30301, USA
| | - Joseph Powell
- Centre for Neurogenetics and Statistical Genomics, Queensland Brain Institute, University of Queensland, St Lucia, Brisbane, Queensland 4000, Australia.,The Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4000, Australia
| | - Eva Reinmaa
- Estonian Genome Center, University of Tartu, Tartu 0794, Estonia
| | - George L Sutphin
- Nathan Shock Center of Excellence in the Basic Biology of Aging, The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | - Alexandra Zhernakova
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen 9700RB, The Netherlands
| | - Katharina Schramm
- Institute of Human Genetics, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg 85764, Germany.,Institute of Human Genetics, Technical University Munich, Munich 85540, Germany
| | - Yana A Wilson
- Neuroscience Division, Garvan Institute of Medical Research, Australia and Charles Perkins Centre and School of Molecular Bioscience, The University of Sydney, Sydney, New South Wales 2006, Australia
| | - Sayuko Kobes
- Phoenix Epidemiology and Clinical Research Branch, National Institute of Diabetes and Digestive and Kidney Disease, National Institutes of Health, Phoenix, Arizona 85001, USA
| | - Taru Tukiainen
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki 00131, Finland.,Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki 00131, Finland
| | | | - Yolande F Ramos
- Department of Molecular Epidemiology, Leiden University Medical Center, Leiden 2300RC, The Netherlands
| | - Harald H H Göring
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas 78201, USA
| | - Myriam Fornage
- Division of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Sciences Center at Houston, Houston, Texas 77001, USA.,Institute of Molecular Medicine, University of Texas Health Sciences Center at Houston, Houston, Texas 77001, USA
| | - Yongmei Liu
- Department of Epidemiology and Prevention, Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, North Carolina 27101, USA
| | - Sina A Gharib
- Computational Medicine Core, Center for Lung Biology, University of Washington, Seattle, Washington 98101, USA
| | - Barbara E Stranger
- Section of Genetic Medicine, Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois 60290, USA
| | - Philip L De Jager
- Program in Translational NeuroPsychiatric Genomics, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02108, USA
| | - Abraham Aviv
- Center of Human Development and Aging, New Jersey Medical School, Newark 07101, USA
| | - Daniel Levy
- The National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, Massachusetts 01702, USA.,Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland 20817, USA
| | - Joanne M Murabito
- The National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, Massachusetts 01702, USA.,General Internal Medicine Section, Boston University, Boston, Massachusetts 02108, USA
| | - Peter J Munson
- The Mathematical and Statistical Computing Laboratory, Center for Information Technology, National Institutes of Health, Bethesda, Maryland 20817, USA
| | - Tianxiao Huan
- The National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, Massachusetts 01702, USA.,Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland 20817, USA
| | - Albert Hofman
- Department of Epidemiology, Erasmus Medical Center, Rotterdam 3000CA, The Netherlands
| | - André G Uitterlinden
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands.,Department of Epidemiology, Erasmus Medical Center, Rotterdam 3000CA, The Netherlands
| | - Fernando Rivadeneira
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands.,Department of Epidemiology, Erasmus Medical Center, Rotterdam 3000CA, The Netherlands
| | - Jeroen van Rooij
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Lisette Stolk
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Linda Broer
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Michael M P J Verbiest
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Mila Jhamai
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Pascal Arp
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Andres Metspalu
- Estonian Genome Center, University of Tartu, Tartu 0794, Estonia
| | - Liina Tserel
- Molecular Pathology, Institute of Biomedicine, University of Tartu, Tartu 0794, Estonia
| | - Lili Milani
- Estonian Genome Center, University of Tartu, Tartu 0794, Estonia
| | - Nilesh J Samani
- Department of Cardiovascular Sciences, University of Leicester, Leicester LE1, UK.,National Institute for Health Research Leicester Cardiovascular Biomedical Research Unit, Glenfield Hospital, Leicester LE1, UK
| | - Pärt Peterson
- Molecular Pathology, Institute of Biomedicine, University of Tartu, Tartu 0794, Estonia
| | - Silva Kasela
- Institute of Molecular and Cell Biology, Estonian Genome Center, University of Tartu, Tartu 0794, Estonia
| | - Veryan Codd
- Department of Cardiovascular Sciences, University of Leicester, Leicester LE1, UK.,National Institute for Health Research Leicester Cardiovascular Biomedical Research Unit, Glenfield Hospital, Leicester LE1, UK
| | - Annette Peters
- Institute of Epidemiologie II, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany.,Research Unit of Molecular Epidemiology, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany
| | - Cavin K Ward-Caviness
- Institute of Epidemiologie II, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany
| | - Christian Herder
- Institute of Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf 40593, Germany
| | - Melanie Waldenberger
- Institute of Epidemiologie II, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany.,Research Unit of Molecular Epidemiology, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany
| | - Michael Roden
- Institute of Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf 40593, Germany.,Division of Endocrinology and Diabetology, University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf 40593, Germany
| | - Paula Singmann
- Institute of Epidemiologie II, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany.,Research Unit of Molecular Epidemiology, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany
| | - Sonja Zeilinger
- Institute of Epidemiologie II, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany.,Research Unit of Molecular Epidemiology, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Neuherberg 85764, Germany
| | - Thomas Illig
- Hannover Unified Biobank, Hannover Medical School, Hannover 30519, Germany
| | - Georg Homuth
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald 17493, Germany
| | - Hans-Jörgen Grabe
- Department of Psychiatry and Psychotherapy, Helios Hospital Stralsund, University Medicine Greifswald, Greifswald 17489, Germany
| | - Henry Völzke
- Institute for Community Medicine, University Medicine Greifswald, Greifswald 17489, Germany
| | - Leif Steil
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald 17493, Germany
| | - Thomas Kocher
- Unit of Periodontology, Department of Restorative Dentistry, Periodontology and Endodontology, University Medicine Greifswald, Greifswald 17489, Germany
| | - Anna Murray
- Epidemiology and Public Health, University of Exeter Medical School, Exeter EX4 1DB, UK
| | - David Melzer
- Epidemiology and Public Health, University of Exeter Medical School, Exeter EX4 1DB, UK
| | - Hanieh Yaghootkar
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter EX2 5DW, UK
| | | | - Eric K Moses
- Centre for Genetic Origins of Health and Disease, The University of Western Australia, and Faculty of Health Sciences, Curtin University, Perth, Western Australia 9011, Australia
| | - Jack W Kent
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas 78201, USA
| | - Joanne E Curran
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas 78201, USA
| | - Matthew P Johnson
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas 78201, USA
| | | | - Harm-Jan Westra
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen 9700RB, The Netherlands.,Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge 02138, USA.,Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02108, USA.,Partners Center for Personalized Genetic Medicine, Boston, Massachusetts 02108, USA
| | - Allan F McRae
- The Queensland Brain Institute, University of Queensland, Brisbane, Queensland 4000, Australia.,University of Queensland Diamantina Institute, University of Queensland, Princess Alexandra Hospital, Brisbane, Queensland 4000, Australia
| | - Jennifer A Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, Michigan 48103, USA
| | - Sharon L R Kardia
- Department of Epidemiology, University of Michigan, Ann Arbor, Michigan 48103, USA
| | - Iiris Hovatta
- Department of Biosciences, University of Helsinki, Helsinki 00100, Finland.,Department of Mental Health and Substance Abuse Services, National Institute for Health and Welfare, Helsinki 00100, Finland
| | - Markus Perola
- Estonian Genome Center, University of Tartu, Tartu 0794, Estonia.,Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki 00131, Finland.,Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki 00131, Finland
| | - Samuli Ripatti
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki 00131, Finland.,Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki 00131, Finland.,Wellcome Trust Sanger Institute, Hinxton, Cambridge CB4, UK.,Department of Public Health, Hjelt Institute, University of Helsinki, Helsinki 00100, Finland
| | - Veikko Salomaa
- Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki 00131, Finland
| | - Anjali K Henders
- The Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4000, Australia
| | - Nicholas G Martin
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland 4000, Australia
| | - Alicia K Smith
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, Georgia 30301, USA
| | - Divya Mehta
- Max-Planck Institute of Psychiatry, Munich 80331, Germany
| | | | - K Maria Nylocks
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, Georgia 30301, USA
| | - Elizabeth M Kennedy
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, Georgia 30301, USA
| | | | - Jingzhong Ding
- Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina 27101, USA
| | - Astrid M Suchy-Dicey
- Department of Epidemiology, University of Washington, Seattle, Washington 98101, USA
| | - Daniel A Enquobahrie
- Department of Epidemiology, University of Washington, Seattle, Washington 98101, USA
| | - Jennifer Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, Washington 98101, USA
| | - Jerome I Rotter
- Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California 90501, USA
| | - Yii-Der I Chen
- Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California 90501, USA
| | | | - Margreet Kloppenburg
- Department of Rheumatology, Leiden University Medical Center, Leiden 2300RC, The Netherlands.,Department of Clinical Epidemiology, Leiden University Medical Center, Leiden 2300RC, The Netherlands
| | - P Eline Slagboom
- Department of Molecular Epidemiology, Leiden University Medical Center, Leiden 2300RC, The Netherlands
| | - Quinta Helmer
- Department of Medical Statistics, Leiden University Medical Center, Leiden 2300RC, The Netherlands
| | - Wouter den Hollander
- Department of Molecular Epidemiology, Leiden University Medical Center, Leiden 2300RC, The Netherlands
| | - Shannon Bean
- Nathan Shock Center of Excellence in the Basic Biology of Aging, The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | - Towfique Raj
- Division of Immunology, Department of Microbiology and Immunobiology, Harvard Medical School, Boston, Massachusetts 02138, USA
| | - Noman Bakhshi
- Neuroscience Division, Garvan Institute of Medical Research, Australia and Charles Perkins Centre and School of Molecular Bioscience, The University of Sydney, Sydney, New South Wales 2006, Australia
| | - Qiao Ping Wang
- Neuroscience Division, Garvan Institute of Medical Research, Australia and Charles Perkins Centre and School of Molecular Bioscience, The University of Sydney, Sydney, New South Wales 2006, Australia
| | - Lisa J Oyston
- Neuroscience Division, Garvan Institute of Medical Research, Australia and Charles Perkins Centre and School of Molecular Bioscience, The University of Sydney, Sydney, New South Wales 2006, Australia
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, Washington 98195, USA.,Cardiovascular Health Research Unit, Department of Epidemiology, University of Washington, Seattle, Washington 98195, USA.,Cardiovascular Health Research Unit, Department of Health Services, University of Washington, Seattle, Washington 98195, USA.,Group Health Research Institute, Group Health Cooperative, Seattle, Washington 98195, USA
| | - Russell P Tracy
- Department of Pathology, University of Vermont College of Medicine, Colchester, Vermont 98195, USA
| | - Grant W Montgomery
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland 4000, Australia
| | - Stephen T Turner
- Division of Nephrology and Hypertension, Department of Medicine, Mayo Clinic, Rochester, Minnesota 55901, USA
| | - John Blangero
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas 78201, USA
| | - Ingrid Meulenbelt
- Department of Molecular Epidemiology, Leiden University Medical Center, Leiden 2300RC, The Netherlands
| | - Kerry J Ressler
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, Georgia 30301, USA
| | - Jian Yang
- The Queensland Brain Institute, University of Queensland, Brisbane, Queensland 4000, Australia.,University of Queensland Diamantina Institute, University of Queensland, Princess Alexandra Hospital, Brisbane, Queensland 4000, Australia
| | - Lude Franke
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen 9700RB, The Netherlands
| | - Johannes Kettunen
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki 00131, Finland.,Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki 00131, Finland.,Computational Medicine, Institute of Health Sciences, Faculty of Medicine, University of Oulu, Oulu 90570, Finland
| | - Peter M Visscher
- The Queensland Brain Institute, University of Queensland, Brisbane, Queensland 4000, Australia.,University of Queensland Diamantina Institute, University of Queensland, Princess Alexandra Hospital, Brisbane, Queensland 4000, Australia
| | - G Gregory Neely
- Neuroscience Division, Garvan Institute of Medical Research, Australia and Charles Perkins Centre and School of Molecular Bioscience, The University of Sydney, Sydney, New South Wales 2006, Australia
| | - Ron Korstanje
- Nathan Shock Center of Excellence in the Basic Biology of Aging, The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | - Robert L Hanson
- Phoenix Epidemiology and Clinical Research Branch, National Institute of Diabetes and Digestive and Kidney Disease, National Institutes of Health, Phoenix, Arizona 85001, USA
| | - Holger Prokisch
- Institute of Human Genetics, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg 85764, Germany.,Institute of Human Genetics, Technical University Munich, Munich 85540, Germany
| | - Luigi Ferrucci
- Clinical Research Branch, National Institute on Aging, Baltimore, Maryland 21218, USA
| | - Tonu Esko
- Estonian Genome Center, University of Tartu, Tartu 0794, Estonia.,Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge 02138, USA.,Division of Endocrinology, Children's Hospital Boston, Boston, Massachusetts 02108, USA.,Department of Genetics, Harvard Medical School, Boston, Massachusetts 02108, USA
| | - Alexander Teumer
- Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald 17493, Germany
| | - Joyce B J van Meurs
- Department of Internal Medicine, Erasmus Medical Centre Rotterdam, Rotterdam 3000CA, The Netherlands
| | - Andrew D Johnson
- The National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, Massachusetts 01702, USA.,Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, Bethesda, Maryland 20817, USA
| |
|
22
|
Tsonaka R, van der Woude D, Houwing-Duistermaat J. Marginal genetic effects estimation in family and twin studies using random-effects models. Biometrics 2015; 71:1130-8. [PMID: 26148843 DOI: 10.1111/biom.12350] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/01/2014] [Revised: 04/01/2015] [Accepted: 05/01/2015] [Indexed: 11/30/2022]
Abstract
Random-effects models are often used in family-based genetic association studies to properly capture within-family relationships. In such models, the regression parameters have an interpretation that is conditional on the random effects, and they measure, e.g., genetic effects for each family. Estimating parameters that can be used to make inferences at the population level is often more relevant than the family-specific effects, but not straightforward. This is mainly for two reasons. First, the analysis of family data often requires high-dimensional random-effects vectors to properly model the familial relationships, for instance when members with a different degree of relationship are considered, such as trios or a mix of monozygotic and dizygotic twins. Second, a biased sampling design, such as the multiple-case families design, is often employed to enrich the sample with genetic information. For these reasons, deriving parameters with the desired marginal interpretation can be challenging. In this work we consider marginalized mixed-effects models, discuss the challenges of applying them to ascertained family data, and propose a penalized maximum likelihood methodology that stabilizes the parameter estimation by using external information on the disease prevalence or heritability. The performance of our methodology is evaluated via simulation and is illustrated on data from Rheumatoid Arthritis patients, where we estimate the marginal effect of HLA-DRB1*13 and shared epitope alleles across three different study designs and combine them using meta-analysis.
Affiliation(s)
- Roula Tsonaka
- Department of Medical Statistics and BioInformatics, Leiden University Medical Center, Post Zone S5-P, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Diane van der Woude
- Department of Rheumatology, Leiden University Medical Center, Leiden, The Netherlands
| | - Jeanine Houwing-Duistermaat
- Department of Medical Statistics and BioInformatics, Leiden University Medical Center, Post Zone S5-P, PO Box 9600, 2300 RC Leiden, The Netherlands
| |
|
23
|
Balliu B, Tsonaka R, Boehringer S, Houwing-Duistermaat J. A retrospective likelihood approach for efficient integration of multiple omics factors in case-control association studies. Genet Epidemiol 2015; 39:156-65. [PMID: 25620726 DOI: 10.1002/gepi.21884] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/26/2014] [Revised: 10/08/2014] [Accepted: 12/02/2014] [Indexed: 11/09/2022]
Abstract
Integrative omics, the joint analysis of outcome and multiple types of omics data, such as genomics, epigenomics, and transcriptomics data, constitutes a promising approach for powerful and biologically relevant association studies. These studies often employ a case-control design, and often include nonomics covariates, such as age and gender, that may modify the underlying omics risk factors. An open question is how to best integrate multiple omics and nonomics information to maximize statistical power in case-control studies that ascertain individuals based on the phenotype. Recent work on integrative omics has used prospective approaches, modeling case-control status conditional on omics and nonomics risk factors. Compared to univariate approaches, jointly analyzing multiple risk factors with a prospective approach increases power in nonascertained cohorts. However, these prospective approaches often lose power in case-control studies. In this article, we propose a novel statistical method for integrating multiple omics and nonomics factors in case-control association studies. Our method is based on a retrospective likelihood function that models the joint distribution of omics and nonomics factors conditional on case-control status. The new method provides accurate control of Type I error rate and has increased efficiency over prospective approaches in both simulated and real data.
Affiliation(s)
- Brunilda Balliu
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, The Netherlands
| | | | | | | |
|
24
|
Boomsma DI, Wijmenga C, Slagboom EP, Swertz MA, Karssen LC, Abdellaoui A, Ye K, Guryev V, Vermaat M, van Dijk F, Francioli LC, Hottenga JJ, Laros JFJ, Li Q, Li Y, Cao H, Chen R, Du Y, Li N, Cao S, van Setten J, Menelaou A, Pulit SL, Hehir-Kwa JY, Beekman M, Elbers CC, Byelas H, de Craen AJM, Deelen P, Dijkstra M, den Dunnen JT, de Knijff P, Houwing-Duistermaat J, Koval V, Estrada K, Hofman A, Kanterakis A, Enckevort DV, Mai H, Kattenberg M, van Leeuwen EM, Neerincx PBT, Oostra B, Rivadeneira F, Suchiman EHD, Uitterlinden AG, Willemsen G, Wolffenbuttel BH, Wang J, de Bakker PIW, van Ommen GJ, van Duijn CM. The Genome of the Netherlands: design, and project goals. Eur J Hum Genet 2014; 22:221-7. [PMID: 23714750 PMCID: PMC3895638 DOI: 10.1038/ejhg.2013.118] [Citation(s) in RCA: 178] [Impact Index Per Article: 17.8] [Received: 12/07/2012] [Revised: 02/28/2013] [Accepted: 03/24/2013] [Indexed: 11/09/2022] Open
Abstract
Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
Affiliation(s)
- Dorret I Boomsma
- Department of Biological Psychology, VU University Amsterdam, Netherlands Twin Register, Amsterdam, The Netherlands
| | - Cisca Wijmenga
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | - Eline P Slagboom
- Molecular Epidemiology Section, Leiden University Medical Center, Netherlands Consortium for Healthy Ageing, Leiden, The Netherlands
| | - Morris A Swertz
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | - Lennart C Karssen
- Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands
| | - Abdel Abdellaoui
- Department of Biological Psychology, VU University Amsterdam, Netherlands Twin Register, Amsterdam, The Netherlands
| | - Kai Ye
- Molecular Epidemiology Section, Leiden University Medical Center, Netherlands Consortium for Healthy Ageing, Leiden, The Netherlands
| | - Victor Guryev
- European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Martijn Vermaat
- Netherlands Bioinformatics Centre, Nijmegen, The Netherlands
- Department of Human Genetics, Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
| | - Freerk van Dijk
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | - Laurent C Francioli
- Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Jouke Jan Hottenga
- Department of Biological Psychology, VU University Amsterdam, Netherlands Twin Register, Amsterdam, The Netherlands
| | - Jeroen F J Laros
- Netherlands Bioinformatics Centre, Nijmegen, The Netherlands
- Department of Human Genetics, Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
| | | | | | | | | | | | - Ning Li
- BGI-Europe, Copenhagen, Denmark
| | | | - Jessica van Setten
- Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Androniki Menelaou
- Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Sara L Pulit
- Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Jayne Y Hehir-Kwa
- Department of Human Genetics, Radboud University Medical Centre, Nijmegen, The Netherlands
| | - Marian Beekman
- Department of Gerontology and Geriatrics, Leiden University Medical Centre, Leiden, The Netherlands
| | - Clara C Elbers
- Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Heorhiy Byelas
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | - Anton J M de Craen
- Department of Gerontology and Geriatrics, Leiden University Medical Centre, Leiden, The Netherlands
| | - Patrick Deelen
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | - Martijn Dijkstra
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | - Johan T den Dunnen
- Department of Human Genetics, Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
| | - Peter de Knijff
- Department of Human Genetics, Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
| | - Jeanine Houwing-Duistermaat
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, The Netherlands
| | - Vyacheslav Koval
- Erasmus Medical Centre, Genetic Laboratory Internal Medicine, Rotterdam, The Netherlands
| | - Karol Estrada
- Erasmus Medical Centre, Genetic Laboratory Internal Medicine, Rotterdam, The Netherlands
| | - Albert Hofman
- Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands
| | - Alexandros Kanterakis
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | | | - Hailiang Mai
- Netherlands Bioinformatics Centre, Nijmegen, The Netherlands
| | - Mathijs Kattenberg
- Department of Biological Psychology, VU University Amsterdam, Netherlands Twin Register, Amsterdam, The Netherlands
| | | | - Pieter B T Neerincx
- University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands
| | - Ben Oostra
- Department of Clinical Genetics, Erasmus University Medical School, Rotterdam, The Netherlands
| | - Fernando Rivadeneira
- Erasmus Medical Centre, Genetic Laboratory Internal Medicine, Rotterdam, The Netherlands
| | - Eka H D Suchiman
- Molecular Epidemiology Section, Leiden University Medical Center, Netherlands Consortium for Healthy Ageing, Leiden, The Netherlands
| | - Andre G Uitterlinden
- Erasmus Medical Centre, Genetic Laboratory Internal Medicine, Rotterdam, The Netherlands
| | - Gonneke Willemsen
- Department of Biological Psychology, VU University Amsterdam, Netherlands Twin Register, Amsterdam, The Netherlands
| | - Bruce H Wolffenbuttel
- LifeLines Cohort Study & Department of Endocrinology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Jun Wang
- BGI-Shenzhen, Shenzhen, China
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
- The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
| | - Paul I W de Bakker
- Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Gert-Jan van Ommen
- Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands
|
25
|
Reinards TH, Albers H, Brinkman D, Kamphuis S, Van Rossum M, Hoppenreijs E, Girschick H, Wouters C, Saurenmann R, Toes R, Huizinga T, Houwing-Duistermaat J, Schilham M, Ten Cate R. PReS-FINAL-2109: Genetic variations in patients with juvenile idiopathic arthritis and uveitis. Pediatr Rheumatol Online J 2013. [PMCID: PMC4044289 DOI: 10.1186/1546-0096-11-s2-p121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
26
|
De Rooy D, Zhernakova A, Tsonaka R, Willemze A, Kurreeman F, Trynka G, van Toorn L, Toes R, Huizinga T, Houwing-Duistermaat J, Gregersen P, van der Helm-van Mil A. OP0049 A Genetic Variant in the Region of MMP-9 is Associated with Serum Levels and Progression of Joint Damage in Rheumatoid Arthritis. Ann Rheum Dis 2013. [DOI: 10.1136/annrheumdis-2013-eular.254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
27
|
De Rooy D, Tsonaka R, Andersson M, Forslind K, Zhernakova A, de Kovel C, Koeleman B, van der Heijde D, Huizinga T, Toes R, Houwing-Duistermaat J, Svensson B, van der Helm-van Mil A. OP0021 Genetic Factors for the Severity of ACPA-Negative Rheumatoid Arthritis in Two Cohorts of Early Disease: A Genome-Wide Study. Ann Rheum Dis 2013. [DOI: 10.1136/annrheumdis-2013-eular.226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
|
28
|
Knevel R, Krabben A, Wilson G, Brouwer E, Lindqvist E, de Rooy D, Daha N, van der Linden M, Tsonaka R, Zhernakova S, Westra HJ, Franke L, Houwing-Duistermaat J, Toes R, Huizinga T, Saxne T, van der Helm-van Mil A. THU0001 Variation in granzyme-b is associated with progression of joint destruction in rheumatoid arthritis. Ann Rheum Dis 2013. [DOI: 10.1136/annrheumdis-2012-eular.1966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
|
29
|
de Rooy D, Yeremenko N, Knevel R, Brouwer E, Wilson G, Lindqvist E, Saxne T, Krabben A, Leijsma M, Tsonaka R, Daha N, Zhernakova S, Houwing-Duistermaat J, Huizinga T, Baeten D, Toes R, van der Helm-van Mil A. OP0089 Genetic studies on DICKKOPF-1, sclerostin and the severity of joint destruction in rheumatoid arthritis. Ann Rheum Dis 2013. [DOI: 10.1136/annrheumdis-2012-eular.1772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
30
|
Tsonaka R, De Visser MCH, Houwing-Duistermaat J. Estimation of genetic effects in multiple cases family studies using penalized maximum likelihood methodology. Biostatistics 2012; 14:220-31. [PMID: 22989557 DOI: 10.1093/biostatistics/kxs032] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
Abstract
Family studies are often used in genetic research to explore associations between genetic markers and various phenotypes. A commonly used design oversamples families enriched with the disease under study for efficient data collection and estimation. For instance, in a multiple cases family study, families are selected based on the number of affected relatives. In such cases, valid inference for the model parameters relies on the proper modeling of both the within family correlations and the outcome-dependent sampling, also known as ascertainment. A flexible modeling approach is the ascertainment-corrected mixed-effects model, but it is known to only be asymptotically identifiable, because in small samples the available data do not provide sufficient information to estimate both the intercept and the genetic variance. To deal with this issue, we propose a penalized maximum likelihood estimation procedure which reliably estimates the model parameters in small family studies by using external population-based information.
Affiliation(s)
- Roula Tsonaka
- Department of Medical Statistics and BioInformatics, Leiden University Medical Center, Post Zone S5-P, PO Box 9600, 2300 RC Leiden, The Netherlands.
|
31
|
Ghosh S, Bickeböller H, Bailey J, Bailey-Wilson JE, Cantor R, Culverhouse R, Daw W, Destefano AL, Engelman CD, Hinrichs A, Houwing-Duistermaat J, König IR, Kent J, Laird N, Pankratz N, Paterson A, Pugh E, Suarez B, Sun Y, Thomas A, Tintle N, Zhu X, Ziegler A, Maccluer JW, Almasy L. Identifying rare variants from exome scans: the GAW17 experience. BMC Proc 2011; 5 Suppl 9:S1. [PMID: 22373325 PMCID: PMC3287821 DOI: 10.1186/1753-6561-5-s9-s1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Abstract
Genetic Analysis Workshop 17 (GAW17) provided a platform for evaluating existing statistical genetic methods and for developing novel methods to analyze rare variants that modulate complex traits. In this article, we present an overview of the 1000 Genomes Project exome data and simulated phenotype data that were distributed to GAW17 participants for analyses, the different issues addressed by the participants, and the process of preparation of manuscripts resulting from the discussions during the workshop.
Affiliation(s)
- Saurabh Ghosh
- Human Genetics Unit, Indian Statistical Institute, Kolkata 700018, India.
|
32
|
Guchelaar H, Pander J, Bohringer S, van der Straaten T, Ariyurek Y, Houwing-Duistermaat J, Gelderblom H, Punt CJA. Genome-wide association study (GWAS) of the efficacy of capecitabine, oxaliplatin, and bevacizumab in metastatic colorectal cancer in the CAIRO2 trial of the Dutch Colorectal Cancer Group (DCCG). J Clin Oncol 2011. [DOI: 10.1200/jco.2011.29.15_suppl.3609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
|
33
|
Ruhaak LR, Uh HW, Beekman M, Hokke CH, Westendorp RGJ, Houwing-Duistermaat J, Wuhrer M, Deelder AM, Slagboom PE. Plasma protein N-glycan profiles are associated with calendar age, familial longevity and health. J Proteome Res 2011; 10:1667-74. [PMID: 21184610 DOI: 10.1021/pr1009959] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Indexed: 11/28/2022]
Abstract
The development of medical interventions for the preservation of disease-free longevity would be facilitated by markers that predict healthy aging. Altered protein N-glycosylation patterns have been found with increasing age and several disease states. Here we investigate whether glycans derived from the total glycoprotein pool in plasma mark familial longevity and distinguish healthy from unhealthy aging. Total plasma N-glycan profiles of 2396 middle aged participants in the Leiden Longevity Study (LLS) were obtained by glycan release, labeling, and subsequent HPLC analysis with fluorescence detection. After normalization and batch correction, several regression strategies were applied to evaluate associations between glycan patterns, familial longevity, and healthy aging. Two N-glycan features (LC-7 and LC-8) were identified to be more abundant in plasma of the offspring of long-lived individuals as compared to controls. These results were not confounded by the altered lipid status or glucose homeostasis of the offspring. Furthermore, a decrease in levels of LC-8 was associated with the occurrence of myocardial infarction (p = 0.049, coefficient = -0.065), indicating that plasma glycosylation patterns do not only mark familial longevity but may also reflect healthy aging. In conclusion, we describe two glycan features, of which increased levels mark familial longevity and decreased levels of one of these features mark the presence of cardiovascular disease.
Affiliation(s)
- L Renee Ruhaak
- Department of Parasitology, Biomolecular Mass Spectrometry Unit, Leiden University Medical Center, Leiden, The Netherlands.
|
34
|
Kurreeman FAS, Rocha D, Houwing-Duistermaat J, Vrijmoet S, Teixeira VH, Migliorini P, Balsa A, Westhovens R, Barrera P, Alves H, Vaz C, Fernandes M, Pascual-Salcedo D, Michou L, Bombardieri S, Radstake T, van Riel P, van de Putte L, Lopes-Vaz A, Prum B, Bardin T, Gut I, Cornelis F, Huizinga TWJ, Petit-Teixeira E, Toes REM. Replication of the tumor necrosis factor receptor-associated factor 1/complement component 5 region as a susceptibility locus for rheumatoid arthritis in a European family-based study. Arthritis Rheum 2010; 58:2670-4. [PMID: 18759306 DOI: 10.1002/art.23793] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/08/2022]
Abstract
OBJECTIVE We recently showed, using a candidate gene approach in a case-control association study, that a 65-kb block encompassing tumor necrosis factor receptor-associated factor 1 (TRAF1) and C5 is strongly associated with rheumatoid arthritis (RA). Compared with case-control association studies, family-based studies have the added advantage of controlling potential differences in population structure and are not likely to be hampered by variation in population allele frequencies, as is seen for many genetic polymorphisms, including the TRAF1/C5 locus. The aim of this study was to confirm this association in populations of European origin by using a family-based approach. METHODS A total of 1,356 western European white individuals from 452 "trio" families were genotyped for the rs10818488 polymorphism, using the TaqMan allelic discrimination assay. RESULTS We observed evidence for association, demonstrating departure from Mendel's law, with an overtransmission of the rs10818488 A allele (A = 55%; P = 0.036). By taking into consideration parental phenotypes, we also observed an increased A allele frequency in affected versus unaffected parents (A = 64%; combined P = 0.015). Individuals carrying the A allele had a 1.2-fold increased risk of developing RA (allelic odds ratio 1.24, 95% confidence interval 1.04-1.50). CONCLUSION Using a family-based study that is robust against population stratification, we provide evidence for the association of the TRAF1/C5 rs10818488 A allele and RA in populations of European descent, further substantiating our previous findings. Future functional studies should yield insight into the biologic relevance of this locus to the pathways involved in RA.
|
35
|
Kurreeman FAS, Goulielmos GN, Alizadeh BZ, Rueda B, Houwing-Duistermaat J, Sanchez E, Bevova M, Radstake TR, Vonk MC, Galanakis E, Ortego N, Verduyn W, Zervou MI, Roep BO, Dema B, Espino L, Urcelay E, Boumpas DT, van den Berg LH, Wijmenga C, Koeleman BPC, Huizinga TWJ, Toes REM, Martin J. The TRAF1-C5 region on chromosome 9q33 is associated with multiple autoimmune diseases. Ann Rheum Dis 2009; 69:696-9. [PMID: 19433411 DOI: 10.1136/ard.2008.106567] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 11/04/2022]
Abstract
OBJECTIVES The TRAF1-C5 locus has recently been identified as a genetic risk factor for rheumatoid arthritis (RA). Since genetic risk factors tend to overlap with several autoimmune diseases, a study was undertaken to investigate whether this region is associated with type 1 diabetes (T1D), celiac disease (CD), systemic sclerosis (SSc) and systemic lupus erythematosus (SLE). METHODS The most consistently associated SNP, rs10818488, was genotyped in a total of 735 patients with T1D, 1049 with CD, 367 with SSc, 746 with SLE and 3494 ethnically- and geographically-matched healthy individuals. The replication sample set consisted of 99 patients with T1D, 272 with SLE and 482 healthy individuals from Crete. RESULTS A significant association was detected between the rs10818488 A allele and T1D (OR 1.14, p=0.027) and SLE (OR 1.16, p=0.016), which was replicated in 99 patients with T1D, 272 with SLE and 482 controls from Crete (OR 1.64, p=0.002; OR 1.43, p=0.002, respectively). Joint analysis of all patients with T1D (N=961) and all patients with SLE (N=1018) compared with 3976 healthy individuals yielded an allelic common OR of 1.19 (p=0.002) and 1.22 (p=2.6 x 10^-4), respectively. However, combining our dataset with the T1D sample set from the WTCCC resulted in a non-significant association (OR 1.06, p=0.087). In contrast, previously unpublished results from the SLEGEN study showed a significant association of the same allele (OR 1.19, p=0.0038) with an overall effect of 1.22 (p=1.02 x 10^-6) in a total of 1577 patients with SLE and 4215 healthy individuals. CONCLUSION A significant association was found for the TRAF1-C5 locus in SLE, implying that this region lies in a pathway relevant to multiple autoimmune diseases.
Affiliation(s)
- Fina A S Kurreeman
- Department of Rheumatology, Leiden University Medical Center, Albinusdreef 2, Leiden, The Netherlands
|
36
|
Kurreeman FAS, Padyukov L, Marques RB, Schrodi SJ, Seddighzadeh M, Stoeken-Rijsbergen G, van der Helm-van Mil AHM, Allaart CF, Verduyn W, Houwing-Duistermaat J, Alfredsson L, Begovich AB, Klareskog L, Huizinga TWJ, Toes REM. A candidate gene approach identifies the TRAF1/C5 region as a risk factor for rheumatoid arthritis. PLoS Med 2007; 4:e278. [PMID: 17880261 PMCID: PMC1976626 DOI: 10.1371/journal.pmed.0040278] [Citation(s) in RCA: 200] [Impact Index Per Article: 11.8] [Received: 06/22/2007] [Accepted: 08/13/2007] [Indexed: 11/25/2022]
Abstract
BACKGROUND Rheumatoid arthritis (RA) is a chronic autoimmune disorder affecting approximately 1% of the population. The disease results from the interplay between an individual's genetic background and unknown environmental triggers. Although human leukocyte antigens (HLAs) account for approximately 30% of the heritable risk, the identities of non-HLA genes explaining the remainder of the genetic component are largely unknown. Based on functional data in mice, we hypothesized that the immune-related genes complement component 5 (C5) and/or TNF receptor-associated factor 1 (TRAF1), located on Chromosome 9q33-34, would represent relevant candidate genes for RA. We therefore aimed to investigate whether this locus would play a role in RA. METHODS AND FINDINGS We performed a multitiered case-control study using 40 single-nucleotide polymorphisms (SNPs) from the TRAF1 and C5 (TRAF1/C5) region in a set of 290 RA patients and 254 unaffected participants (controls) of Dutch origin. Stepwise replication of significant SNPs was performed in three independent sample sets from the Netherlands (n cases/controls = 454/270), Sweden (n cases/controls = 1,500/1,000) and US (n cases/controls = 475/475). We observed a significant association (p < 0.05) of SNPs located in a haplotype block that encompasses a 65 kb region including the 3' end of C5 as well as TRAF1. A sliding window analysis revealed an association peak at an intergenic region located approximately 10 kb from both C5 and TRAF1. This peak, defined by SNP14/rs10818488, was confirmed in a total of 2,719 RA patients and 1,999 controls (common odds ratio = 1.28, 95% confidence interval 1.17-1.39, combined p = 1.40 x 10^-8) with a population-attributable risk of 6.1%. The A (minor susceptibility) allele of this SNP also significantly correlates with increased disease progression as determined by radiographic damage over time in RA patients (p = 0.008).
CONCLUSIONS Using a candidate-gene approach we have identified a novel genetic risk factor for RA. Our findings indicate that a polymorphism in the TRAF1/C5 region increases the susceptibility to and severity of RA, possibly by influencing the structure, function, and/or expression levels of TRAF1 and/or C5.
Affiliation(s)
- Fina A. S. Kurreeman
- Department of Rheumatology, Leiden University Medical Centre, Leiden, The Netherlands
| | - Leonid Padyukov
- Rheumatology Unit, Department of Medicine, Karolinska Institute at Karolinska Hospital, Stockholm, Sweden
| | - Rute B Marques
- Department of Rheumatology, Leiden University Medical Centre, Leiden, The Netherlands
| | | | - Maria Seddighzadeh
- Rheumatology Unit, Department of Medicine, Karolinska Institute at Karolinska Hospital, Stockholm, Sweden
| | | | | | - Cornelia F Allaart
- Department of Rheumatology, Leiden University Medical Centre, Leiden, The Netherlands
| | - Willem Verduyn
- Department of Immunohaematology and Bloodbank, Leiden University Medical Center, Leiden, The Netherlands
| | | | - Lars Alfredsson
- Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
| | | | - Lars Klareskog
- Rheumatology Unit, Department of Medicine, Karolinska Institute at Karolinska Hospital, Stockholm, Sweden
| | - Tom W. J. Huizinga
- Department of Rheumatology, Leiden University Medical Centre, Leiden, The Netherlands
| | - Rene E. M. Toes
- Department of Rheumatology, Leiden University Medical Centre, Leiden, The Netherlands
- * To whom correspondence should be addressed. E-mail:
|