1
Sandoval-Castillo J, Beheregaray LB, Wellenreuther M. Genomic prediction of growth in a commercially, recreationally, and culturally important marine resource, the Australian snapper (Chrysophrys auratus). G3 (BETHESDA, MD.) 2022; 12:jkac015. [PMID: 35100370 PMCID: PMC8896003 DOI: 10.1093/g3journal/jkac015] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/05/2021] [Accepted: 01/07/2022] [Indexed: 06/14/2023]
Abstract
Growth is one of the most important traits of an organism. For exploited species, this trait has ecological and evolutionary consequences as well as economic and conservation significance. Rapid changes in growth rate associated with anthropogenic stressors have been reported for several marine fishes, but little is known about the genetic basis of growth traits in teleosts. We used reduced genome representation data and genome-wide association approaches to identify growth-related genetic variation in the commercially, recreationally, and culturally important Australian snapper (Chrysophrys auratus, Sparidae). Based on 17,490 high-quality single-nucleotide polymorphisms and 363 individuals representing extreme growth phenotypes from 15,000 fish of the same age and reared under identical conditions in a sea pen, we identified 100 unique candidates that were annotated to 51 proteins. We documented a complex polygenic nature of growth in the species that included several loci with small effects and a few loci with larger effects. Overall heritability was high (75.7%), reflected in the high accuracy of the genomic prediction for the phenotype (small vs large). Although the single-nucleotide polymorphisms were distributed across the genome, most candidates (60%) clustered on chromosome 16, which also explains the largest proportion of heritability (16.4%). This study demonstrates that reduced genome representation single-nucleotide polymorphisms and the right bioinformatic tools provide a cost-efficient approach to identify growth-related loci and to describe genomic architectures of complex quantitative traits. Our results help to inform captive aquaculture breeding programs and are of relevance to monitor growth-related evolutionary shifts in wild populations in response to anthropogenic pressures.
Affiliation(s)
- Jonathan Sandoval-Castillo
  - Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Bedford Park, SA 5042, Australia
- Luciano B Beheregaray
  - Molecular Ecology Laboratory, College of Science and Engineering, Flinders University, Bedford Park, SA 5042, Australia
- Maren Wellenreuther
  - School of Biological Sciences, The New Zealand Institute for Plant and Food Research Limited, Nelson 7010, New Zealand
  - Seafood Production Group, The School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand
2
Casto-Rebollo C, Argente MJ, García ML, Pena R, Ibáñez-Escriche N. Identification of functional mutations associated with environmental variance of litter size in rabbits. Genet Sel Evol 2020; 52:22. [PMID: 32375645 PMCID: PMC7203823 DOI: 10.1186/s12711-020-00542-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/18/2019] [Accepted: 04/27/2020] [Indexed: 12/18/2022] Open
Abstract
Background Environmental variance (VE) is partly under genetic control and has recently been proposed as a measure of resilience. Unravelling the genetic background of the VE of complex traits could help to improve resilience of livestock and stabilize their production across farming systems. The objective of this study was to identify genes and functional mutations associated with variation in VE of litter size (LS) in rabbits. To achieve this, we combined the results of a genome-wide association study (GWAS) and a whole-genome sequencing (WGS) analysis using data from two divergently selected rabbit lines for high and low VE of LS. These lines differ in terms of biomarkers of immune response and mortality. Moreover, rabbits with a lower VE of LS were found to be more resilient to infections than animals with a higher VE of LS. Results By using two GWAS approaches (single-marker regression and Bayesian multiple-marker regression), we identified four genomic regions associated with VE of LS, on chromosomes 3, 7, 10, and 14. We detected 38 genes in the associated genomic regions and, using WGS, we identified 129 variants in the splicing, UTR, and coding (missense and frameshift effects) regions of 16 of these 38 genes. These genes were related to the immune system, the development of sensory structures, and stress responses. All of these variants (except one) segregated in one of the rabbit lines and were absent (n = 91) or fixed in the other one (n = 37). The fixed variants were in the HDAC9, ITGB8, MIS18A, ENSOCUG00000021276 and URB1 genes. We also identified a 1-bp deletion in the 3′UTR region of the HUNK gene that was fixed in the low VE line and absent in the high VE line.
Conclusions This is the first study that combines GWAS and WGS analyses to study the genetic basis of VE. The new candidate genes and functional mutations identified in this study suggest that the VE of LS is under the control of functions related to the immune system, stress response, and the nervous system. These findings could also explain differences in resilience between rabbits with homogeneous and heterogeneous VE of litter size.
Affiliation(s)
- Cristina Casto-Rebollo
  - Institute for Animal Science and Technology, Universitat Politècnica de València, Valencia, Spain
- María José Argente
  - Departamento de Tecnología Agroalimentaria, Universidad Miguel Hernández de Elche, Orihuela, Spain
- María Luz García
  - Departamento de Tecnología Agroalimentaria, Universidad Miguel Hernández de Elche, Orihuela, Spain
- Romi Pena
  - Departament de Ciència Animal, Universitat de Lleida-AGROTECNIO Center, Lleida, Catalonia, Spain
- Noelia Ibáñez-Escriche
  - Institute for Animal Science and Technology, Universitat Politècnica de València, Valencia, Spain
3
Sosa‐Madrid BS, Hernández P, Blasco A, Haley CS, Fontanesi L, Santacreu MA, Pena RN, Navarro P, Ibáñez‐Escriche N. Genomic regions influencing intramuscular fat in divergently selected rabbit lines. Anim Genet 2020; 51:58-69. [PMID: 31696970 PMCID: PMC7004202 DOI: 10.1111/age.12873] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Accepted: 10/04/2019] [Indexed: 12/12/2022]
Abstract
Intramuscular fat (IMF) is one of the main meat quality traits for breeding programmes in livestock species. The main objective of this study was to identify genomic regions associated with IMF content comparing two rabbit populations divergently selected for this trait, and to generate a list of putative candidate genes. Animals were genotyped using the Affymetrix Axiom OrcunSNP Array (200k). After quality control, the data involved 477 animals and 93 540 SNPs. Two methods were used in this research: single marker regressions with the data adjusted by genomic relatedness, and a Bayesian multiple marker regression. Associated genomic regions were located on the rabbit chromosomes (OCU) OCU1, OCU8 and OCU13. The highest value for the percentage of the genomic variance explained by a genomic region was found in two consecutive genomic windows on OCU8 (7.34%). Genes in the associated regions of OCU1 and OCU8 presented biological functions related to the control of adipose cell function, lipid binding, transportation and localisation (APOLD1, PLBD1, PDE6H, GPRC5D and GPRC5A) and lipid metabolic processes (MTMR2). The EWSR1 gene, underlying the OCU13 region, is linked to the development of brown adipocytes. The findings suggest that there is a large component of polygenic effect behind the differences in IMF content in these two lines, as the variance explained by most of the windows was low. The genomic regions of OCU1, OCU8 and OCU13 revealed novel candidate genes. Further studies would be needed to validate the associations and explore their possible application in selection programmes.
Affiliation(s)
- Bolívar S. Sosa‐Madrid
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
- Pilar Hernández
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
- Agustín Blasco
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
- Chris S. Haley
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, United Kingdom
  - Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, United Kingdom
- Luca Fontanesi
  - Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- María A. Santacreu
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
- Romi N. Pena
  - Departament de Ciència Animal, Universitat de Lleida–Agrotecnio Centre, E-25198 Lleida, Catalonia, Spain
- Pau Navarro
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Crewe Road, Edinburgh EH4 2XU, United Kingdom
- Noelia Ibáñez‐Escriche
  - Institute for Animal Science and Technology, Universitat Politècnica de València, 46022 Valencia, Spain
4
Sosa-Madrid BS, Santacreu MA, Blasco A, Fontanesi L, Pena RN, Ibáñez-Escriche N. A genomewide association study in divergently selected lines in rabbits reveals novel genomic regions associated with litter size traits. J Anim Breed Genet 2019; 137:123-138. [PMID: 31657065 DOI: 10.1111/jbg.12451] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/11/2019] [Revised: 10/02/2019] [Accepted: 10/03/2019] [Indexed: 12/28/2022]
Abstract
Uterine capacity (UC), defined as the total number of kits from unilaterally ovariectomized does at birth, has a high genetic correlation with litter size. The aim of our research was to identify genomic regions associated with litter size traits through a genomewide association study using rabbits from a divergent selection experiment for UC. A high-density SNP array (200K) was used to genotype 181 does from a control population, high and low UC lines. Traits included total number born (TNB), number born alive (NBA), number born dead, ovulation rate (OR), implanted embryos (IE) and embryo, foetal and prenatal survivals at second parity. We implemented the Bayes B method and the associations were tested by Bayes factors and the percentage of genomic variance (GV) explained by windows. Different genomic regions associated with TNB, NBA, IE and OR were found. These regions explained 7.36%, 1.27%, 15.87% and 3.95% of GV, respectively. Two consecutive windows on chromosome 17 were associated with TNB, NBA and IE. This genomic region accounted for 6.32% of GV of TNB. In this region, we found the BMP4, PTDGR, PTGER2, STYX and CDKN3 candidate genes which presented functional annotations linked to some reproductive processes. Our findings suggest that a genomic region on chromosome 17 has an important effect on litter size traits. However, further analyses are needed to validate this region in other maternal rabbit lines.
Affiliation(s)
- María Antonia Santacreu
  - Institute for Animal Science and Technology, Universitat Politècnica de València, Valencia, Spain
- Agustín Blasco
  - Institute for Animal Science and Technology, Universitat Politècnica de València, Valencia, Spain
- Luca Fontanesi
  - Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Bologna, Italy
- Romi Natacha Pena
  - Departament de Ciència Animal, Universitat de Lleida-Agrotecnio Center, Lleida, Spain
- Noelia Ibáñez-Escriche
  - Institute for Animal Science and Technology, Universitat Politècnica de València, Valencia, Spain
5
Liu Y, Lusk CM, Cho MH, Silverman EK, Qiao D, Zhang R, Scheurer ME, Kheradmand F, Wheeler DA, Tsavachidis S, Armstrong G, Zhu D, Wistuba II, Chow CWB, Behrens C, Pikielny CW, Neslund-Dudas C, Pinney SM, Anderson M, Kupert E, Bailey-Wilson J, Gaba C, Mandal D, You M, de Andrade M, Yang P, Field JK, Liloglou T, Davies M, Lissowska J, Swiatkowska B, Zaridze D, Mukeriya A, Janout V, Holcatova I, Mates D, Milosavljevic S, Scelo G, Brennan P, McKay J, Liu G, Hung RJ, Christiani DC, Schwartz AG, Amos CI, Spitz MR. Rare Variants in Known Susceptibility Loci and Their Contribution to Risk of Lung Cancer. J Thorac Oncol 2018; 13:1483-1495. [PMID: 29981437 PMCID: PMC6366341 DOI: 10.1016/j.jtho.2018.06.016] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/12/2018] [Revised: 06/06/2018] [Accepted: 06/17/2018] [Indexed: 10/28/2022]
Abstract
BACKGROUND Genome-wide association studies are widely used to map genomic regions contributing to lung cancer (LC) susceptibility, but they typically do not identify the precise disease-causing genes/variants. To unveil the inherited genetic variants that cause LC, we performed focused exome-sequencing analyses on genes located in 121 genome-wide association study-identified loci previously implicated in the risk of LC, chronic obstructive pulmonary disease, pulmonary function level, and smoking behavior. METHODS Germline DNA from 260 case patients with LC and 318 controls was sequenced by utilizing VCRome 2.1 exome capture. Filtering was based on enrichment of rare and potentially deleterious variants in cases (risk alleles) or controls (protective alleles). Allelic association analyses of single-variant and gene-based burden tests of multiple variants were performed. Promising candidates were tested in two independent validation studies with a total of 1773 case patients and 1123 controls. RESULTS We identified 48 rare variants with deleterious effects in the discovery analysis and validated 12 of the 43 candidates that were covered in the validation platforms. The top validated candidates included one well-established truncating variant, namely, BRCA2, DNA repair associated gene (BRCA2) K3326X (OR = 2.36, 95% confidence interval [CI]: 1.38-3.99), and three newly identified variations, namely, lymphotoxin beta gene (LTB) p.Leu87Phe (OR = 7.52, 95% CI: 1.01-16.56), prolyl 3-hydroxylase 2 gene (P3H2) p.Gln185His (OR = 5.39, 95% CI: 0.75-15.43), and dishevelled associated activator of morphogenesis 2 gene (DAAM2) p.Asp762Gly (OR = 0.25, 95% CI: 0.10-0.79). Burden tests revealed strong associations between zinc finger protein 93 gene (ZNF93), DAAM2, bromodomain containing 9 gene (BRD9), and the gene LTB and LC susceptibility.
CONCLUSION Our results extend the catalogue of regions associated with LC and highlight the importance of germline rare coding variants in LC susceptibility.
Affiliation(s)
- Yanhong Liu
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
- Christine M. Lusk
  - Karmanos Cancer Institute, Wayne State University, Detroit, MI 48201, USA
- Michael H. Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Edwin K. Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Dandi Qiao
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Ruyang Zhang
  - Harvard University School of Public Health, Boston, MA 02115, USA
- Michael E. Scheurer
  - Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
- Farrah Kheradmand
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
  - Michael E. DeBakey Veterans Affairs Medical Center, Houston, TX 77030, USA
- David A. Wheeler
  - Department of Molecular and Human Genetics, Human Genome Sequence Center, Baylor College of Medicine, Houston, TX 77030, USA
- Spiridon Tsavachidis
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
- Georgina Armstrong
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
- Dakai Zhu
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX 77030, USA
- Ignacio I. Wistuba
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Chi-Wan B. Chow
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Carmen Behrens
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Claudio W. Pikielny
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH 03755, USA
- Susan M. Pinney
  - University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Marshall Anderson
  - University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Elena Kupert
  - University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Colette Gaba
  - The University of Toledo College of Medicine, Toledo, OH 43614, USA
- Diptasri Mandal
  - Louisiana State University Health Sciences Center, New Orleans, LA 70112, USA
- Ming You
  - Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Ping Yang
  - Mayo Clinic College of Medicine, Rochester, MN 55905, USA
- John K. Field
  - Roy Castle Lung Cancer Research Programme, The University of Liverpool, Department of Molecular and Clinical Cancer Medicine, Liverpool, UK
- Triantafillos Liloglou
  - Roy Castle Lung Cancer Research Programme, The University of Liverpool, Department of Molecular and Clinical Cancer Medicine, Liverpool, UK
- Michael Davies
  - Roy Castle Lung Cancer Research Programme, The University of Liverpool, Department of Molecular and Clinical Cancer Medicine, Liverpool, UK
- Jolanta Lissowska
  - The M. Sklodowska-Curie Institute of Oncology Center, Warsaw 02781, Poland
- Beata Swiatkowska
  - Nofer Institute of Occupational Medicine, Department of Environmental Epidemiology, Lodz 91348, Poland
- David Zaridze
  - Russian N.N. Blokhin Cancer Research Centre, Moscow 115478, Russian Federation
- Anush Mukeriya
  - Russian N.N. Blokhin Cancer Research Centre, Moscow 115478, Russian Federation
- Vladimir Janout
  - Faculty of Health Sciences, Palacky University, Olomouc 77515, Czech Republic
- Ivana Holcatova
  - Institute of Public Health and Preventive Medicine, Charles University, 2nd Faculty of Medicine, Prague 12800, Czech Republic
- Dana Mates
  - National Institute of Public Health, Bucharest 050463, Romania
- Sasa Milosavljevic
  - International Organization for Cancer Prevention and Research (IOCPR), Belgrade, Serbia
- Paul Brennan
  - International Agency for Research on Cancer, Lyon, France
- James McKay
  - International Agency for Research on Cancer, Lyon, France
- Geoffrey Liu
  - Princess Margaret Cancer Center, Toronto, ON, M5G 2M9, Canada
- Rayjean J. Hung
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, M5G 1X5, Canada
- Ann G. Schwartz
  - Karmanos Cancer Institute, Wayne State University, Detroit, MI 48201, USA
- Christopher I Amos
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX 77030, USA
- Margaret R. Spitz
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
6
Kirpich A, Ainsworth EA, Wedow JM, Newman JRB, Michailidis G, McIntyre LM. Variable selection in omics data: A practical evaluation of small sample sizes. PLoS One 2018; 13:e0197910. [PMID: 29927942 PMCID: PMC6013185 DOI: 10.1371/journal.pone.0197910] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 09/30/2017] [Accepted: 05/10/2018] [Indexed: 01/04/2023] Open
Abstract
In omics experiments, variable selection involves a large number of metabolites/genes and a small number of samples (the n < p problem). The ultimate goal is often the identification of one or a few features that differ among conditions: a biomarker. Complicating biomarker identification, the p variables often contain a correlation structure due to the biology of the experiment, making it difficult to distinguish causal compounds from correlated compounds. Additionally, there may be elements in the experimental design (blocks, batches) that introduce structure in the data. While this problem has been discussed in the literature and various strategies proposed, the overfitting problems concomitant with such approaches are rarely acknowledged. Instead of viewing a single omics experiment as a definitive test for a biomarker, an unrealistic analytical goal, we propose to view such studies as screening studies where the goal of the study is to reduce the number of features present in the second round of testing, and to limit the Type II error. Using this perspective, the performance of LASSO, ridge regression and Elastic Net was compared with the performance of an ANOVA via a simulation study and two real data comparisons. Interestingly, a dramatic increase in the number of features had no effect on Type I error for the ANOVA approach. ANOVA, even without multiple test correction, has a low false positive rate in the scenarios tested. The Elastic Net has an inflated Type I error (from 10 to 50%) for small numbers of features, which increases with sample size. The Type II error rate for the ANOVA is comparable to or lower than that for the Elastic Net, leading us to conclude that an ANOVA is an effective analytical tool for the initial screening of features in omics experiments.
Affiliation(s)
- Alexander Kirpich
  - Department of Biology, University of Florida, Gainesville, FL, United States of America
  - Informatics Institute, University of Florida, Gainesville, FL, United States of America
- Elizabeth A. Ainsworth
  - Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
  - USDA ARS Global Change and Photosynthesis Research Unit, Urbana, IL, United States of America
- Jessica M. Wedow
  - Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
- Jeremy R. B. Newman
  - Department of Biology, University of Florida, Gainesville, FL, United States of America
- George Michailidis
  - Informatics Institute, University of Florida, Gainesville, FL, United States of America
  - Department of Statistics, University of Florida, Gainesville, FL, United States of America
- Lauren M. McIntyre
  - Department of Biology, University of Florida, Gainesville, FL, United States of America
  - Informatics Institute, University of Florida, Gainesville, FL, United States of America
  - Genetics Institute, University of Florida, Gainesville, FL, United States of America
7
Genetics of body fat mass and related traits in a pig population selected for leanness. Sci Rep 2017; 7:9118. [PMID: 28831160 PMCID: PMC5567295 DOI: 10.1038/s41598-017-08961-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 01/09/2017] [Accepted: 07/17/2017] [Indexed: 12/21/2022] Open
Abstract
Obesity is characterized as the excessive accumulation of body fat and has a complex genetic foundation in humans including monogenic high-risk mutations and polygenic contributions. Domestic pigs that are kept on an obesity-promoting high-caloric diet and constantly evaluated for body characteristics represent a valuable model. As such, we investigated the genetics of obesity-related traits, comprising subcutaneous fat thickness, lean mass percentage, and growth rate, in a pig population. We conducted genome-wide association analyses using an integrative approach of single-marker regression models and multi-marker Bayesian analyses. Thus, we identified 30 genomic regions distributed over 14 different chromosomes contributing to the variation in obesity-related traits. In these regions, we validated the association of four candidate genes that are functionally connected to the regulation of appetite, processes of adipogenesis, and extracellular matrix formation. Our findings revealed fundamental genetic factors which deserve closer attention regarding their roles in the etiology of obesity.
8
López de Maturana E, Pineda S, Brand A, Van Steen K, Malats N. Toward the integration of Omics data in epidemiological studies: still a "long and winding road". Genet Epidemiol 2016; 40:558-569. [PMID: 27432111 DOI: 10.1002/gepi.21992] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 08/15/2015] [Revised: 05/22/2016] [Accepted: 06/05/2016] [Indexed: 12/23/2022]
Abstract
Primary and secondary prevention can benefit greatly from a personalized medicine approach through the accurate discrimination of individuals at high risk of developing a specific disease from those at moderate and low risk. To this end, precise risk prediction models need to be built. This endeavor requires a precise characterization of the individual exposome, genome, and phenome. Massive molecular omics data representing the different layers of the biological processes of the host and the nonhost will make it possible to build more accurate risk prediction models. Epidemiologists aim to integrate omics data along with important information coming from other sources (questionnaires, candidate markers) that has been proved to be relevant in the discrimination risk assessment of complex diseases. However, the integrative models in large-scale epidemiologic research are still in their infancy and they face numerous challenges, some of them at the analytical stage. So far, only a small number of studies have integrated more than two omics data sets, and the inclusion of non-omics data in the same models is still missing in most studies. In this contribution, we approach omics and non-omics data integration from an epidemiological perspective by considering the "massive" inclusion of variables in the risk assessment and predictive models. We also provide already available examples of integrative contributions in the field, propose analytical strategies that allow considering both omics and non-omics data in the models, and finally review the challenges facing this type of research.
Affiliation(s)
- Sílvia Pineda
  - Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Angela Brand
  - Institute for Public Health Genomics, Maastricht University, Maastricht, Netherlands
- Kristel Van Steen
  - Laboratory of Biostatistics, Biomedicine and Bioinformatics, GIGA, University of Liège, Belgium
- Núria Malats
  - Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
9
Masson-Lecomte A, López de Maturana E, Goddard ME, Picornell A, Rava M, González-Neira A, Márquez M, Carrato A, Tardon A, Lloreta J, Garcia-Closas M, Silverman D, Rothman N, Kogevinas M, Allory Y, Chanock SJ, Real FX, Malats N. Inflammatory-Related Genetic Variants in Non-Muscle-Invasive Bladder Cancer Prognosis: A Multimarker Bayesian Assessment. Cancer Epidemiol Biomarkers Prev 2016; 25:1144-50. [PMID: 27197286 DOI: 10.1158/1055-9965.epi-15-0894] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/21/2015] [Accepted: 04/22/2016] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Increasing evidence points to the role of tumor immunologic environment on urothelial bladder cancer prognosis. This effect might be partly dependent on the host genetic context. We evaluated the association of SNPs in inflammation-related genes with non-muscle-invasive bladder cancer (NMIBC) risk-of-recurrence and risk-of-progression. METHODS We considered 822 patients with NMIBC included in the SBC/EPICURO Study and followed up for >10 years. We selected 1,679 SNPs belonging to 251 inflammatory genes. The association of SNPs with risk-of-recurrence and risk-of-progression was assessed using Cox regression single-marker (SMM) and multimarker methods (MMM) Bayes A and Bayesian LASSO. Discriminative abilities of the models were calculated using the c index and validated with bootstrap cross-validation procedures. RESULTS While no SNP was found to be associated with risk-of-recurrence using SMM, three SNPs in TNIP1, CD5, and JAK3 showed very strong association with posterior probabilities >90% using MMM. Regarding risk-of-progression, one SNP in CD3G was significantly associated using SMM (HR, 2.69; P = 1.55 × 10⁻⁵) and two SNPs in MASP1 and AIRE showed a posterior probability ≥80% with MMM. Validated discriminative abilities of the models without and with the SNPs were 58.4% versus 60.5% and 72.1% versus 72.8% for risk-of-recurrence and risk-of-progression, respectively. CONCLUSIONS Using innovative analytic approaches, we demonstrated that SNPs in inflammatory-related genes were associated with NMIBC prognosis and that they improve the discriminative ability of prognostic clinical models for NMIBC. IMPACT This study provides proof of concept for the joint effect of genetic variants in improving the discriminative ability of clinical prognostic models. The approach may be extended to other diseases. Cancer Epidemiol Biomarkers Prev; 25(7); 1144-50. ©2016 AACR.
Collapse
Affiliation(s)
- Alexandra Masson-Lecomte
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain. Urology Department, Henri Mondor Academic Hospital, Paris Est Créteil University, Créteil, France
- Michael E Goddard
- Biosciences Research Division, Department of Environment and Primary Industries, Agribio, Bundoora, Victoria, Australia. Department of Food and Agricultural Systems, University of Melbourne, Melbourne, Australia
- Antoni Picornell
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Marta Rava
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Anna González-Neira
- Human Genotyping-CEGEN Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Mirari Márquez
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Alfredo Carrato
- Servicio de Oncología, Hospital Universitario Ramon y Cajal, Madrid, and Servicio de Oncología, Hospital Universitario de Elche, Elche, Spain
- Adonina Tardon
- Department of Preventive Medicine, Universidad de Oviedo, Oviedo, Spain
- Josep Lloreta
- Institut Municipal d'Investigació Mèdica - Hospital del Mar and Departament de Patologia, Hospital del Mar - IMAS, Barcelona, Spain
- Debra Silverman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland
- Nathaniel Rothman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland
- Manolis Kogevinas
- Centre for Research in Environmental Epidemiology (CREAL) and Institut Municipal d'Investigació Mèdica - Hospital del Mar, Barcelona, Spain
- Yves Allory
- Pathology Department, Henri Mondor Academic Hospital, Paris Est Créteil University, INSERM, Créteil, France
- Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland
- Francisco X Real
- Epithelial Carcinogenesis Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain. Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- Núria Malats
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
|
10
|
Rare Variants in Transcript and Potential Regulatory Regions Explain a Small Percentage of the Missing Heritability of Complex Traits in Cattle. PLoS One 2015; 10:e0143945. [PMID: 26642058 PMCID: PMC4671594 DOI: 10.1371/journal.pone.0143945] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 08/26/2015] [Accepted: 11/11/2015] [Indexed: 11/19/2022]
Abstract
The proportion of genetic variation in complex traits explained by rare variants is a key question for genomic prediction, and for identifying the basis of “missing heritability”, the proportion of additive genetic variation not captured by common variants on SNP arrays. Sequence variants in transcript and regulatory regions from 429 sequenced animals were used to impute high density SNP genotypes of 3311 Holstein sires to sequence. There were 675,062 common variants (MAF>0.05), 102,549 uncommon variants (0.01<MAF<0.05), and 83,856 rare variants (MAF<0.01). We describe a novel method for estimating the proportion of the rare variants that are sequencing errors using parent-progeny duos. We then used mixed model methodology to estimate the proportion of variance captured by these different classes of variants for fat, milk and protein yields, as well as for fertility. Common sequence variants captured 83%, 77%, 76% and 84% of the total genetic variance for fat, milk, and protein yields and fertility, respectively. This was between 2 and 5% more variance than that captured from 600k SNPs on a high density chip, although the difference was not significant. Rare variants captured 3%, 0%, 1% and 14% of the genetic variance for fat, milk and protein yields, and fertility, respectively, whereas pedigree explained the remaining amount of genetic variance (none for fertility). The proportion of variation explained by rare variants is likely to be under-estimated due to reduced accuracies of imputation for this class of variants. Using common sequence variants slightly improved accuracy of genomic predictions for fat and milk yield, compared to high density SNP array genotypes. However, including rare variants from transcript regions did not increase the accuracy of genomic predictions.
These results suggest that rare variants recover a small percentage of the missing heritability for complex traits; however, very large reference sets will be required to exploit this to improve the accuracy of genomic predictions. Our results do suggest the contribution of rare variants to genetic variation may be greater for fitness traits.
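The three variant classes in this abstract are defined purely by minor allele frequency (MAF) thresholds. The sketch below shows how such a partition could be computed from diploid genotype calls coded as alternate-allele counts. The function names, the 0/1/2 coding with `None` for missing calls, and the treatment of values exactly on a threshold (assigned to the lower class) are assumptions; the abstract does not specify them.

```python
def minor_allele_frequency(genotypes):
    """MAF of one diploid variant.

    genotypes -- alternate-allele counts per animal (0, 1 or 2); None = missing call
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    alt_freq = sum(called) / (2 * len(called))   # two alleles per diploid animal
    return min(alt_freq, 1.0 - alt_freq)         # the minor allele is the rarer one


def maf_class(maf):
    """Assign a variant to the classes named in the abstract.

    Assumption: a MAF exactly equal to 0.05 or 0.01 falls into the lower class.
    """
    if maf > 0.05:
        return "common"
    if maf > 0.01:
        return "uncommon"
    return "rare"


# Hypothetical example: 3 alternate alleles among 50 animals gives MAF = 0.03.
example = [1, 1, 1] + [0] * 47
example_class = maf_class(minor_allele_frequency(example))
```

Partitioning variants this way is what allows a mixed model to fit a separate variance component per class, as the abstract describes.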
|
11
|
The genetics of feed conversion efficiency traits in a commercial broiler line. Sci Rep 2015; 5:16387. [PMID: 26552583 PMCID: PMC4639841 DOI: 10.1038/srep16387] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 06/19/2015] [Accepted: 10/14/2015] [Indexed: 11/26/2022]
Abstract
Individual feed conversion efficiency (FCE) is a major trait that influences the usage of energy resources and the ecological footprint of livestock production. The underlying biological processes of FCE are complex and are influenced by factors as diverse as climate, feed properties, gut microbiota, and individual genetic predisposition. To gain insight into the genetic relationships with FCE traits and to contribute to the improvement of FCE in commercial chicken lines, a genome-wide association study was conducted using a commercial broiler population (n = 859) tested for FCE and weight traits during the finisher period from 39 to 46 days of age. Both single-marker (generalized linear model) and multi-marker (Bayesian approach) analyses were applied to the dataset to detect genes associated with the variability in FCE. The separate analyses revealed 22 quantitative trait loci (QTL) regions on 13 different chromosomes; the integration of both approaches resulted in 7 overlapping QTL regions. The analyses pointed to acylglycerol kinase (AGK) and general transcription factor 2-I (GTF2I) as positional and functional candidate genes. Non-synonymous polymorphisms in both candidate genes provided evidence for the functional importance of these genes, which influence different biological aspects of FCE.
|
12
|
Dehman A, Ambroise C, Neuvial P. Performance of a blockwise approach in variable selection using linkage disequilibrium information. BMC Bioinformatics 2015; 16:148. [PMID: 25951947 PMCID: PMC4430909 DOI: 10.1186/s12859-015-0556-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 07/30/2014] [Accepted: 03/30/2015] [Indexed: 12/03/2022]
Abstract
Background: Genome-wide association studies (GWAS) aim at finding genetic markers that are significantly associated with a phenotype of interest. Single nucleotide polymorphism (SNP) data from the entire genome are collected for many thousands of SNP markers, leading to high-dimensional regression problems where the number of predictors greatly exceeds the number of observations. Moreover, these predictors are statistically dependent, in particular due to linkage disequilibrium (LD). We propose a three-step approach that explicitly takes advantage of the grouping structure induced by LD in order to identify common variants which may have been missed by single marker analyses (SMA). In the first step, we perform a hierarchical clustering of SNPs with an adjacency constraint using LD as a similarity measure. In the second step, we apply a model selection approach to the obtained hierarchy in order to define LD blocks. Finally, we perform Group Lasso regression on the inferred LD blocks. We investigate the efficiency of this approach compared to state-of-the-art regression methods: haplotype association tests, SMA, and Lasso and Elastic-Net regressions. Results: Our results on simulated data show that the proposed method performs better than state-of-the-art approaches as soon as the number of causal SNPs within an LD block exceeds 2. Our results on semi-simulated data and a previously published HIV data set illustrate the relevance of the proposed method and its robustness to a real LD structure. The method is implemented in the R package BALD (Blockwise Approach using Linkage Disequilibrium), available from http://www.math-evry.cnrs.fr/publications/logiciels. Conclusions: Our results show that the proposed method is efficient not only at the level of LD blocks, by accurately inferring the underlying block structure, but also at the level of individual SNPs.
Thus, this study demonstrates the importance of tailored integration of biological knowledge in high-dimensional genomic studies such as GWAS.
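The first step of the approach described above, hierarchical clustering of SNPs under an adjacency constraint with LD as the similarity, can be sketched as follows. This is an illustration only and not the BALD package. It assumes genotypes coded as 0/1/2 counts, uses squared Pearson correlation as the LD measure and average linkage between neighbouring blocks, and takes the number of blocks as a fixed input `n_blocks`, whereas the paper selects the blocks with a model selection step. All function names are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return sxy * sxy / (sxx * syy)


def adjacent_ld_blocks(snps, n_blocks):
    """Greedy agglomerative clustering in which only neighbouring blocks may merge.

    snps     -- genotype vectors, one per SNP, listed in genomic order
    n_blocks -- number of blocks to stop at (assumption: chosen by the caller)
    Returns a list of blocks, each a list of consecutive SNP indices.
    """
    if not 1 <= n_blocks <= len(snps):
        raise ValueError("n_blocks must be between 1 and the number of SNPs")
    blocks = [[i] for i in range(len(snps))]

    def linkage(left, right):
        # Average pairwise r^2 between two neighbouring blocks.
        pairs = [(i, j) for i in left for j in right]
        return sum(r_squared(snps[i], snps[j]) for i, j in pairs) / len(pairs)

    while len(blocks) > n_blocks:
        # Adjacency constraint: candidate merges are (block k, block k + 1) only.
        best = max(range(len(blocks) - 1),
                   key=lambda k: linkage(blocks[k], blocks[k + 1]))
        blocks[best:best + 2] = [blocks[best] + blocks[best + 1]]
    return blocks
```

Because merges are restricted to neighbours, every block is a contiguous run of SNPs, which is what makes the resulting groups usable as LD blocks for a subsequent Group Lasso fit.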
Affiliation(s)
- Alia Dehman
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), Université d'Evry-Val-d'Essonne/UMR CNRS 8071/ENSIIE/USC INRA, Evry, France.
- Christophe Ambroise
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), Université d'Evry-Val-d'Essonne/UMR CNRS 8071/ENSIIE/USC INRA, Evry, France.
- Pierre Neuvial
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), Université d'Evry-Val-d'Essonne/UMR CNRS 8071/ENSIIE/USC INRA, Evry, France.
|