1
|
Ruperao P, Rangan P, Shah T, Thakur V, Kalia S, Mayes S, Rathore A. The Progression in Developing Genomic Resources for Crop Improvement. Life (Basel) 2023; 13:1668. [PMID: 37629524 PMCID: PMC10455509 DOI: 10.3390/life13081668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 07/21/2023] [Accepted: 07/25/2023] [Indexed: 08/27/2023]
Abstract
Sequencing technologies have rapidly evolved over the past two decades, and new technologies are being continually developed and commercialized. The emerging sequencing technologies target generating more data with fewer inputs and at lower costs. This has also translated to an increase in the number and type of corresponding applications in genomics besides enhanced computational capacities (both hardware and software). Alongside the evolving DNA sequencing landscape, bioinformatics research teams have also evolved to accommodate the increasingly demanding techniques used to combine and interpret data, leading to many researchers moving from the lab to the computer. The rich history of DNA sequencing has paved the way for new insights and the development of new analysis methods. Understanding and learning from past technologies can help with the progress of future applications. This review focuses on the evolution of sequencing technologies, their significant enabling role in generating plant genome assemblies and downstream applications, and the parallel development of bioinformatics tools and skills, filling the gap in data analysis techniques.
Affiliation(s)
- Pradeep Ruperao
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502324, India
| | - Parimalan Rangan
- ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India;
| | - Trushar Shah
- International Institute of Tropical Agriculture (IITA), Nairobi 30709-00100, Kenya;
| | - Vivek Thakur
- Department of Systems & Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad 500046, India;
| | - Sanjay Kalia
- Department of Biotechnology, Ministry of Science and Technology, Government of India, New Delhi 110003, India;
| | - Sean Mayes
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502324, India
| | - Abhishek Rathore
- Excellence in Breeding, International Maize and Wheat Improvement Center (CIMMYT), Hyderabad 502324, India
|
2
|
Santonja Á, Moya-García AA, Ribelles N, Jiménez-Rodríguez B, Pajares B, Fernández-De Sousa CE, Pérez-Ruiz E, Del Monte-Millán M, Ruiz-Borrego M, de la Haba J, Sánchez-Rovira P, Romero A, González-Neira A, Lluch A, Alba E. Role of germline variants in the metastasis of breast carcinomas. Oncotarget 2022; 13:843-862. [PMID: 35782051 PMCID: PMC9245581 DOI: 10.18632/oncotarget.28250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Accepted: 06/20/2022] [Indexed: 11/25/2022]
Abstract
Most cancer-related deaths in breast cancer patients are associated with metastasis, a multistep, intricate process that requires the cooperation of tumour cells, tumour microenvironment and metastasis target tissues. It is accepted that metastasis depends not on the tumour characteristics but on the host’s genetic makeup. However, there has been limited success in determining the germline genetic variants that influence metastasis development, mainly because traditional genome-wide association studies are limited in their ability to detect the relevant genetic polymorphisms underlying complex phenotypes. In this work, we leveraged the extreme discordant phenotypes approach and epistasis networks to analyse the genotypes of 97 breast cancer patients. We found that the host’s genetic makeup facilitates metastases through the dysregulation of gene expression, which can promote the dispersion of metastatic seeds and help establish the metastatic niche, providing a congenial soil for the metastatic seeds.
Affiliation(s)
- Ángela Santonja
- Instituto de Investigación Biomédica de Málaga (IBIMA), Hospitales Universitarios Regional y Virgen de la Victoria de Málaga, Spain.,Laboratorio de Biología Molecular del Cáncer, Centro de Investigaciones Médico-Sanitarias (CIMES), Universidad de Málaga, Málaga, Spain.,These authors contributed equally to this work
| | - Aurelio A Moya-García
- Laboratorio de Biología Molecular del Cáncer, Centro de Investigaciones Médico-Sanitarias (CIMES), Universidad de Málaga, Málaga, Spain.,Departmento de Biología Molecular y Bioquímica, Universidad de Málaga, Málaga, Spain.,These authors contributed equally to this work
| | - Nuria Ribelles
- Unidad de Gestión Clínica Intercentro de Oncología, Instituto de Investigación Biomédica de Málaga (IBIMA), Hospitales Universitarios Regional y Virgen de la Victoria de Málaga, Málaga, Spain.,Centro de Investigación Biomédica en Red de Oncología, CIBERONC-ISCIII, Madrid, Spain
| | - Begoña Jiménez-Rodríguez
- Unidad de Gestión Clínica Intercentro de Oncología, Instituto de Investigación Biomédica de Málaga (IBIMA), Hospitales Universitarios Regional y Virgen de la Victoria de Málaga, Málaga, Spain
| | - Bella Pajares
- Unidad de Gestión Clínica Intercentro de Oncología, Instituto de Investigación Biomédica de Málaga (IBIMA), Hospitales Universitarios Regional y Virgen de la Victoria de Málaga, Málaga, Spain
| | - Cristina E Fernández-De Sousa
- Instituto de Investigación Biomédica de Málaga (IBIMA), Hospitales Universitarios Regional y Virgen de la Victoria de Málaga, Spain.,Laboratorio de Biología Molecular del Cáncer, Centro de Investigaciones Médico-Sanitarias (CIMES), Universidad de Málaga, Málaga, Spain
| | | | - María Del Monte-Millán
- Centro de Investigación Biomédica en Red de Oncología, CIBERONC-ISCIII, Madrid, Spain.,Instituto de Investigación Sanitaria Gregorio Marañón, Universidad Complutense, Madrid, Spain
| | | | - Juan de la Haba
- Centro de Investigación Biomédica en Red de Oncología, CIBERONC-ISCIII, Madrid, Spain.,Biomedical Research Institute, Complejo Hospitalario Reina Sofía, Córdoba, Spain
| | | | - Atocha Romero
- Molecular Oncology Laboratory, Hospital Clínico San Carlos, IdISSC, Madrid, Spain
| | - Anna González-Neira
- Human Genotyping-CEGEN Unit, Human Cancer Genetics Program, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Ana Lluch
- Centro de Investigación Biomédica en Red de Oncología, CIBERONC-ISCIII, Madrid, Spain.,Department of Oncology and Hematology, Hospital Clínico Universitario, Valencia, Spain.,INCLIVA Biomedical Research Institute, Universidad de Valencia, Valencia, Spain
| | - Emilio Alba
- Laboratorio de Biología Molecular del Cáncer, Centro de Investigaciones Médico-Sanitarias (CIMES), Universidad de Málaga, Málaga, Spain.,Unidad de Gestión Clínica Intercentro de Oncología, Instituto de Investigación Biomédica de Málaga (IBIMA), Hospitales Universitarios Regional y Virgen de la Victoria de Málaga, Málaga, Spain.,Centro de Investigación Biomédica en Red de Oncología, CIBERONC-ISCIII, Madrid, Spain
|
3
|
Zepeda-Batista JL, Núñez-Domínguez R, Ramírez-Valverde R, Jahuey-Martínez FJ, Herrera-Ojeda JB, Parra-Bracamonte GM. Discovering of Genomic Variations Associated to Growth Traits by GWAS in Braunvieh Cattle. Genes (Basel) 2021; 12:1666. [PMID: 34828272 PMCID: PMC8618990 DOI: 10.3390/genes12111666] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/15/2021] [Revised: 10/07/2021] [Accepted: 10/20/2021] [Indexed: 01/01/2023]
Abstract
A genome-wide association study (GWAS) was performed to elucidate the genetic architecture of growth traits in Braunvieh cattle. The study included 300 animals genotyped with the GeneSeek® Genomic Profiler Bovine LDv.4 panel; after quality control, 22,734 SNPs and 276 animals were retained in the analysis. The phenotypic data examined were birth (BW), weaning (WW), and yearling weights. The association analysis was performed using the principal components method via the egscore function of the GenABEL version 1.8-0 package in the R environment. The marker rs133262280, located on BTA 22, was associated with BW, and two SNPs were associated with WW: rs43668789 (BTA 11) and rs136155567 (BTA 27). New QTL associated with these liveweight traits and four positional and functional candidate genes potentially involved in variations of the analyzed traits were identified. The most important genes in these genomic regions were MCM2 (minichromosome maintenance complex component 2), TPRA1 (transmembrane protein adipocyte associated 1), GALM (galactose mutarotase), and NRG1 (neuregulin 1), related to embryonic cleavage, bone and tissue growth, cell adhesion, and organic development. This study is the first to present a GWAS conducted in Braunvieh cattle in Mexico, providing evidence for the genetic architecture of the assessed growth traits. Further specific analysis of the associated genes and regions will clarify their contribution to the genetic basis of growth-related traits.
Affiliation(s)
- José Luis Zepeda-Batista
- Facultad de Medicina Veterinaria y Zootecnia, Universidad de Colima, Kilometro 40 Autopista Colima-Manzanillo, Tecomán 28100, Colima, Mexico;
- Departamento de Zootecnia, Posgrado en Producción Animal, Universidad Autónoma Chapingo, Km. 38.5 Carretera México-Texcoco, Chapingo 56230, Texcoco, Mexico; (R.N.-D.); (R.R.-V.)
| | - Rafael Núñez-Domínguez
- Departamento de Zootecnia, Posgrado en Producción Animal, Universidad Autónoma Chapingo, Km. 38.5 Carretera México-Texcoco, Chapingo 56230, Texcoco, Mexico; (R.N.-D.); (R.R.-V.)
| | - Rodolfo Ramírez-Valverde
- Departamento de Zootecnia, Posgrado en Producción Animal, Universidad Autónoma Chapingo, Km. 38.5 Carretera México-Texcoco, Chapingo 56230, Texcoco, Mexico; (R.N.-D.); (R.R.-V.)
| | - Francisco Joel Jahuey-Martínez
- Facultad de Zootecnia y Ecología, Universidad Autónoma de Chihuahua, Periférico Francisco R. Almada, Km 1, Chihuahua 33820, Chihuahua, Mexico;
| | - Jessica Beatriz Herrera-Ojeda
- Departamento de Ciencias Básicas, Instituto Tecnológico del Valle de Morelia, Instituto Tecnológico Nacional, Morelia 58100, Michoacán, Mexico;
| | - Gaspar Manuel Parra-Bracamonte
- Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Boulevard del Maestro S/N esq. Elías Piña, Col. Narciso Mendoza, Ciudad Reynosa 88710, Tamaulipas, Mexico
- Correspondence: Tel.: +52-899-924-3627 (ext. 87709)
|
4
|
Abstract
We recently reported a family-based genome-wide association study (GWAS) for pediatric stroke that drew our attention to two significantly associated genes of the ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) gene family: ADAMTS2 (rs469568, p = 8 × 10⁻⁶) and ADAMTS12 (rs1364044, p = 2.9 × 10⁻⁶). To further investigate these candidate genes, we applied a targeted resequencing approach on 48 discordant sib-pairs for pediatric stroke followed by genotyping of the detected non-synonymous variants in the full cohort of 270 offspring trios and subsequent fine mapping analysis. We identified eight non-synonymous SNPs in ADAMTS2 and six in ADAMTS12 potentially influencing the respective protein function. These variants were genotyped within a cohort of 270 affected offspring trios; association analysis revealed the ADAMTS12 variant rs77581578 to be significantly under-transmitted (p = 6.26 × 10⁻³) to pediatric stroke patients. The finding was validated in a pediatric venous thromboembolism (VTE) cohort of 189 affected trios. Subsequent haplotype analysis of ADAMTS12 detected a significantly associated haplotype comprising the originally identified GWAS variant. Several ADAMTS genes, such as ADAMTS13, are involved in the thromboembolic disease process. Here, we provide further evidence that ADAMTS12 likely plays a role in pediatric stroke. Further functional studies are warranted to assess the functional role of ADAMTS12 in the pathogenesis of stroke.
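Under-transmission of an allele in affected-offspring trios is often assessed with a transmission disequilibrium test (TDT). The abstract does not name the exact family-based test used, so the following is only a minimal sketch that assumes a McNemar-type TDT; the function name and the counts are hypothetical and are not taken from the study.

```python
import math

def tdt(transmitted, untransmitted):
    """McNemar-type TDT (assumed test, not confirmed by the abstract).

    transmitted / untransmitted: how often heterozygous parents passed on
    or withheld the allele of interest (hypothetical inputs).
    Returns the chi-square statistic (1 df) and its p-value.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative counts only: the allele is passed on 10 times and withheld 30 times.
chi2, p = tdt(transmitted=10, untransmitted=30)
```

A statistic driven by fewer transmissions than non-transmissions, as in this made-up example, is what "under-transmitted" refers to.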
|
5
|
Reyer H, Oster M, Wittenburg D, Murani E, Ponsuksili S, Wimmers K. Genetic Contribution to Variation in Blood Calcium, Phosphorus, and Alkaline Phosphatase Activity in Pigs. Front Genet 2019; 10:590. [PMID: 31316547 PMCID: PMC6610066 DOI: 10.3389/fgene.2019.00590] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/16/2018] [Accepted: 06/04/2019] [Indexed: 12/18/2022]
Abstract
Blood values of calcium (Ca), inorganic phosphorus (IP), and alkaline phosphatase activity (ALP) are valuable indicators for mineral status and bone mineralization. The mineral homeostasis is maintained by absorption, retention, and excretion processes employing a number of known and unknown sensing and regulating factors with implications on immunity. Due to the high inter-individual variation of Ca and P levels in the blood of pigs and to clarify molecular contributions to this variation, the genetics of hematological traits related to the Ca and P balance were investigated in a German Landrace population, integrating both single-locus and multi-locus genome-wide association study (GWAS) approaches. Genomic heritability estimates suggest a moderate genetic contribution to the variation of hematological Ca (N = 456), IP (N = 1049), ALP (N = 439), and the Ca/P ratio (N = 455), with values ranging from 0.27 to 0.54. The genome-wide analysis of markers adds a number of genomic regions to the list of quantitative trait loci, some of which overlap with previous results. Despite the gaps in knowledge of genes involved in Ca and P metabolism, genes like THBS2, SHH, PTPRT, PTGS1, and FRAS1 with reported connections to bone metabolism were derived from the significantly associated genomic regions. Additionally, genomic regions included TRAFD1 and genes coding for phosphate transporters (SLC17A1-SLC17A4), which are linked to Ca and P homeostasis. The study calls for improved functional annotation of the proposed candidate genes to derive features involved in maintaining Ca and P balance. This gene information can be exploited to diagnose and predict characteristics of micronutrient utilization, bone development, and a well-functioning musculoskeletal system in pig husbandry and breeding.
Affiliation(s)
- Henry Reyer
- Genomics Unit, Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany
| | - Michael Oster
- Genomics Unit, Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany
| | - Dörte Wittenburg
- Biomathematics and Bioinformatics Unit, Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany
| | - Eduard Murani
- Genomics Unit, Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany
| | - Siriluck Ponsuksili
- Functional Genome Analysis Unit, Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany
| | - Klaus Wimmers
- Genomics Unit, Institute for Genome Biology, Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany.,Department of Animal Breeding and Genetics, Faculty of Agricultural and Environmental Sciences, University of Rostock, Rostock, Germany
|
6
|
|
7
|
Witten A, Bolbrinker J, Barysenka A, Huber M, Rühle F, Nowak-Göttl U, Garbe E, Kreutz R, Stoll M. Targeted resequencing of a locus for heparin-induced thrombocytopenia on chromosome 5 identified in a genome-wide association study. J Mol Med (Berl) 2018; 96:765-775. [PMID: 29934777 DOI: 10.1007/s00109-018-1661-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/15/2017] [Revised: 06/05/2018] [Accepted: 06/12/2018] [Indexed: 12/18/2022]
Abstract
Immune-mediated heparin-induced thrombocytopenia (HIT) is the clinically most important adverse drug reaction (ADR) in response to heparin therapy characterized by a prothrombotic state despite a decrease in platelet count. We conducted a genome-wide association study in 96 suspected HIT cases and 96 controls to explore the genetic predisposition for HIT within a case-control pharmacovigilance study followed by replication in additional 86 cases and 86 controls from the same study. One single nucleotide polymorphism (SNP, rs1433265, P = 6.5 × 10⁻⁵, odds ratio (OR) 2.79) from 16 identified SNPs was successfully replicated (P = 1.5 × 10⁻⁴, OR 2.77; combined data set P = 2.7 × 10⁻⁸, OR 2.77) and remained the most strongly associated SNP after imputing locus genotypes. Fine mapping revealed a significantly associated risk-conferring haplotype (P = 4.9 × 10⁻⁶, OR 2.41). In order to find rare variants contributing to the association signals, we applied a targeted resequencing approach in a subgroup of 73 HIT patients and 23 controls for the regions with the 16 most strongly HIT-associated SNPs. C-alpha testing was applied to test for the impact of rare variants and we detected two candidate genes, the discoidin domain receptor tyrosine kinase 1 (DDR1, P = 3.6 × 10⁻²) and the multiple C2 and transmembrane domain containing 2 (MCTP2, P = 4.5 × 10⁻²). For the genes interactor of little elongation complex ELL subunit 1 (ICE1) and a disintegrin-like and metalloproteinase with thrombospondin type 1 motif, 16 (ADAMTS16) nearby rs1433265, we identified several missense variants. Although replication in an independent population is warranted, these findings provide a basis for future studies aiming to identify and characterize genetic susceptibility factors for HIT. KEY MESSAGES: We identified and validated a HIT-associated locus on chromosome 5. Targeted NGS analysis for rare variants identifies DDR1 and MCTP2 as novel candidates. In addition, missense variants for ADAMTS16 and ICE1 were identified in the locus.
Affiliation(s)
- Anika Witten
- Department of Genetic Epidemiology, Institute of Human Genetics, University Hospital Münster, Münster, Germany
| | - Juliane Bolbrinker
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Clinical Pharmacology and Toxicology, Berlin, Germany
| | - Andrei Barysenka
- Department of Genetic Epidemiology, Institute of Human Genetics, University Hospital Münster, Münster, Germany
| | - Matthias Huber
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Clinical Pharmacology and Toxicology, Berlin, Germany
| | - Frank Rühle
- Department of Genetic Epidemiology, Institute of Human Genetics, University Hospital Münster, Münster, Germany
| | - Ulrike Nowak-Göttl
- Thrombosis and Hemostasis Unit, Department of Clinical Chemistry, University Hospital of Kiel and Lübeck, Kiel, Germany
| | - Edeltraut Garbe
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Clinical Pharmacology and Toxicology, Berlin, Germany.,Department of Clinical Epidemiology, Leibniz Institute for Prevention Research and Epidemiology - BIPS, Bremen, Germany
| | - Reinhold Kreutz
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Clinical Pharmacology and Toxicology, Berlin, Germany
| | - Monika Stoll
- Department of Genetic Epidemiology, Institute of Human Genetics, University Hospital Münster, Münster, Germany.,Department of Biochemistry, Cardiovascular Research Institute Maastricht, Maastricht University, Maastricht, The Netherlands.
|
8
|
Chatelain C, Durand G, Thuillier V, Augé F. Performance of epistasis detection methods in semi-simulated GWAS. BMC Bioinformatics 2018; 19:231. [PMID: 29914375 PMCID: PMC6006572 DOI: 10.1186/s12859-018-2229-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/26/2017] [Accepted: 06/04/2018] [Indexed: 01/23/2023]
Abstract
BACKGROUND Part of the missing heritability in Genome Wide Association Studies (GWAS) is expected to be explained by interactions between genetic variants, also called epistasis. Various statistical methods have been developed to detect epistasis in case-control GWAS. These methods face major statistical challenges due to the number of tests required, the complexity of the Linkage Disequilibrium (LD) structure, and the lack of consensus regarding the definition of epistasis. Their limited impact in terms of uncovering new biological knowledge might be explained in part by the limited amount of experimental data available to validate their statistical performances in a realistic GWAS context. In this paper, we introduce a simulation pipeline for generating real scale GWAS data, including epistasis and realistic LD structure. We evaluate five exhaustive bivariate interaction methods, fastepi, GBOOST, SHEsisEpi, DSS, and IndOR. Two hundred thirty-four different disease scenarios are considered in extensive simulations. We report the performance of each method in terms of false positive rate control, power, area under the ROC curve (AUC), and computation time using a GPU. Finally, we compare the results of each method on a real GWAS of type 2 diabetes from the Wellcome Trust Case Control Consortium. RESULTS GBOOST, SHEsisEpi and DSS allow a satisfactory control of the false positive rate. fastepi and IndOR present an increase in false positive rate in the presence of LD between causal SNPs, with our definition of epistasis. DSS performs best in terms of power and AUC in most scenarios with no or weak LD between causal SNPs. All methods can exhaustively analyze a GWAS with 6 × 10⁵ SNPs and 15,000 samples in a couple of hours using a GPU. CONCLUSION This study confirms that computation time is no longer a limiting factor for performing an exhaustive search of epistasis in large GWAS. For this task, using DSS on SNP pairs with limited LD seems to be a good strategy to achieve the best statistical performance. A combination approach using both DSS and GBOOST is supported by the simulation results, and the analysis of the WTCCC dataset demonstrated that this approach can detect distinct genes in epistasis. Finally, weak epistasis between common variants will be detectable with existing methods when GWAS of a few tens of thousands of cases and controls are available.
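The "number of tests required" by an exhaustive bivariate search can be made concrete with a little arithmetic. The sketch below assumes a panel of 600,000 SNPs and, purely for illustration, a Bonferroni correction at a family-wise level of 0.05; the abstract does not state which multiple-testing correction the authors used, and the function names are hypothetical.

```python
def pair_count(n_snps):
    """Number of unordered SNP pairs tested by an exhaustive pairwise scan."""
    return n_snps * (n_snps - 1) // 2

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction
    (an assumed correction, chosen only for illustration)."""
    return alpha / n_tests

n_pairs = pair_count(600_000)              # about 1.8e11 pairwise tests
threshold = bonferroni_threshold(n_pairs)  # about 2.8e-13 per test
```

A per-pair threshold this small is one reason weak interaction effects need very large samples to be detected.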
Affiliation(s)
| | - Guillermo Durand
- Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie, 4, place Jussieu, Paris Cedex 05, 75252 France
| | - Vincent Thuillier
- SANOFI R&D, Biostatistics & Programming, Chilly Mazarin, 91385 France
| | - Franck Augé
- SANOFI R&D, Translational Sciences, Chilly Mazarin, 91385 France
|
9
|
Han Z, Zhang J, Cai S, Chen X, Quan X, Zhang G. Association mapping for total polyphenol content, total flavonoid content and antioxidant activity in barley. BMC Genomics 2018; 19:81. [PMID: 29370751 PMCID: PMC5784657 DOI: 10.1186/s12864-018-4483-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 06/15/2017] [Accepted: 01/16/2018] [Indexed: 11/30/2022]
Abstract
BACKGROUND Interest in the phenolic compounds of plants has been increasing because of their nutritive function in food and their roles in regulating plant growth. However, their underlying genetic mechanism in barley is still not clear. RESULTS A genome-wide association study (GWAS) was conducted for total phenolic content (TPC), total flavonoid content (FLC) and antioxidant activity (AOA) in 67 cultivated and 156 Tibetan wild barley genotypes. Most markers associated with phenolic content were different in cultivated and wild barleys. The markers bPb-0572 and bPb-4531 were identified as the major QTLs controlling phenolic compounds in Tibetan wild barley. Moreover, the marker bPb-4531 was co-located with the UDP-glycosyltransferase gene (HvUGT), which is a homolog of Arabidopsis UGTs and is involved in the biosynthesis of flavonoid glycosides. CONCLUSIONS GWAS is an efficient tool for exploring the genetic architecture of phenolic compounds in the cultivated and Tibetan wild barleys. The DArT markers applied in this study can be used in barley breeding for developing new barley cultivars with higher phenolics content. The candidate gene (HvUGT) provides a potential route to a deeper understanding of the molecular mechanism of flavonoid synthesis.
Affiliation(s)
- Zhigang Han
- Department of Agronomy, Zhejiang Key Laboratory of Crop Germplasm, Zhejiang University, Hangzhou, 310058 China
| | - Jingjie Zhang
- Department of Agronomy, Zhejiang Key Laboratory of Crop Germplasm, Zhejiang University, Hangzhou, 310058 China
| | - Shengguan Cai
- Department of Agronomy, Zhejiang Key Laboratory of Crop Germplasm, Zhejiang University, Hangzhou, 310058 China
| | - Xiaohui Chen
- Department of Agronomy, Zhejiang Key Laboratory of Crop Germplasm, Zhejiang University, Hangzhou, 310058 China
| | - Xiaoyan Quan
- School of Biological Science and Technology, University of Jinan, Jinan, 250022 China
| | - Guoping Zhang
- Department of Agronomy, Zhejiang Key Laboratory of Crop Germplasm, Zhejiang University, Hangzhou, 310058 China
|
10
|
Abstract
Although the term quantitative trait locus (QTL) strictly refers merely to a genetic variant that causes changes in a quantitative phenotype such as height, QTL analysis more usually describes techniques used to study oligogenic or polygenic traits where each identified locus contributes a relatively small amount to the genetic determination of the trait, which may be categorical in nature. Originally, too, it would be clear that it covered segregation and genetic linkage analysis, but now genetic association analysis in a genome-wide SNP or sequencing experiment would be the commonest application. The same biometrical genetic statistical apparatus used in this setting (analysis of variance, linear or generalized linear mixed models) can actually be applied to categorical phenotypes, as well as to multiple traits simultaneously, dealing with and taking advantage of genetic pleiotropy. Most recently, they are being used to make inferences about population and evolutionary genetics, with applications ranging from human disease to control of disease-causing organisms. Several computer software packages make it relatively straightforward to fit these statistically complex models to the large amounts of genotype and phenotype data routinely collected today.
Affiliation(s)
- David L Duffy
- Genetic Epidemiology Laboratory, QIMR Berghofer Medical Research Institute, 300 Herston Rd., Brisbane, QLD, 4006, Australia.
|
11
|
Shu L, Zhao Y, Kurt Z, Byars SG, Tukiainen T, Kettunen J, Orozco LD, Pellegrini M, Lusis AJ, Ripatti S, Zhang B, Inouye M, Mäkinen VP, Yang X. Mergeomics: multidimensional data integration to identify pathogenic perturbations to biological systems. BMC Genomics 2016; 17:874. [PMID: 27814671 PMCID: PMC5097440 DOI: 10.1186/s12864-016-3198-9] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Received: 08/25/2016] [Accepted: 10/25/2016] [Indexed: 12/17/2022]
Abstract
BACKGROUND Complex diseases are characterized by multiple subtle perturbations to biological processes. New omics platforms can detect these perturbations, but translating the diverse molecular and statistical information into testable mechanistic hypotheses is challenging. Therefore, we set out to create a public tool that integrates these data across multiple datasets, platforms, study designs and species in order to detect the most promising targets for further mechanistic studies. RESULTS We developed Mergeomics, a computational pipeline consisting of independent modules that 1) leverage multi-omics association data to identify biological processes that are perturbed in disease, and 2) overlay the disease-associated processes onto molecular interaction networks to pinpoint hubs as potential key regulators. Unlike existing tools that are mostly dedicated to specific data type or settings, the Mergeomics pipeline accepts and integrates datasets across platforms, data types and species. We optimized and evaluated the performance of Mergeomics using simulation and multiple independent datasets, and benchmarked the results against alternative methods. We also demonstrate the versatility of Mergeomics in two case studies that include genome-wide, epigenome-wide and transcriptome-wide datasets from human and mouse studies of total cholesterol and fasting glucose. In both cases, the Mergeomics pipeline provided statistical and contextual evidence to prioritize further investigations in the wet lab. The software implementation of Mergeomics is freely available as a Bioconductor R package. CONCLUSION Mergeomics is a flexible and robust computational pipeline for multidimensional data integration. It outperforms existing tools, and is easily applicable to datasets from different studies, species and omics data types for the study of complex traits.
Affiliation(s)
- Le Shu
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Yuqi Zhao
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Zeyneb Kurt
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Sean Geoffrey Byars
- Center for Systems Genomics, University of Melbourne, Melbourne, Australia.,School of BioSciences, University of Melbourne, Melbourne, Australia
| | | | | | - Luz D Orozco
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Matteo Pellegrini
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Aldons J Lusis
- Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | | | - Bin Zhang
- Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael Inouye
- Center for Systems Genomics, University of Melbourne, Melbourne, Australia.,School of BioSciences, University of Melbourne, Melbourne, Australia.,Department of Pathology, University of Melbourne, Melbourne, Australia
| | - Ville-Petteri Mäkinen
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA.,South Australian Health and Medical Research Institute, Adelaide, Australia.,School of Biological Sciences, University of Adelaide, Adelaide, Australia.,Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland.
| | - Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA; Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, Los Angeles, CA, USA.
|
12
|
Jahuey-Martínez FJ, Parra-Bracamonte GM, Sifuentes-Rincón AM, Martínez-González JC, Gondro C, García-Pérez CA, López-Bustamante LA. Genomewide association analysis of growth traits in Charolais beef cattle. J Anim Sci 2016; 94:4570-4582. [DOI: 10.2527/jas.2016-0359] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 12/18/2022]
Affiliation(s)
- F. J. Jahuey-Martínez
- Centro de Biotecnología Genómica-Instituto Politécnico Nacional, Reynosa, Tamaulipas, México, 88710
| | - G. M. Parra-Bracamonte
- Centro de Biotecnología Genómica-Instituto Politécnico Nacional, Reynosa, Tamaulipas, México, 88710
| | - A. M. Sifuentes-Rincón
- Centro de Biotecnología Genómica-Instituto Politécnico Nacional, Reynosa, Tamaulipas, México, 88710
| | - J. C. Martínez-González
- Universidad Autónoma de Tamaulipas-Facultad de Ingeniería y Ciencias, Victoria, Tamaulipas, México, 87749
| | - C. Gondro
- The Centre for Genetic Analyses and Applications, University of New England, Armidale, NSW, Australia, 2351
| | - C. A. García-Pérez
- Centro de Biotecnología Genómica-Instituto Politécnico Nacional, Reynosa, Tamaulipas, México, 88710
|
13
|
Genome-Wide Association Studies of the Human Gut Microbiota. PLoS One 2015; 10:e0140301. [PMID: 26528553 PMCID: PMC4631601 DOI: 10.1371/journal.pone.0140301] [Citation(s) in RCA: 176] [Impact Index Per Article: 19.6] [Received: 07/08/2015] [Accepted: 09/05/2015] [Indexed: 12/17/2022]
Abstract
The bacterial composition of the human fecal microbiome is influenced by many lifestyle factors, notably diet. It is less clear, however, what role host genetics plays in dictating the composition of bacteria living in the gut. In this study, we examined the association of ~200K host genotypes with the relative abundance of fecal bacterial taxa in a founder population, the Hutterites, during two seasons (n = 91 summer, n = 93 winter, n = 57 individuals collected in both). These individuals live and eat communally, minimizing variation due to environmental exposures, including diet, which could potentially mask small genetic effects. Using a GWAS approach that takes into account the relatedness between subjects, we identified at least 8 bacterial taxa whose abundances were associated with single nucleotide polymorphisms in the host genome in each season (at genome-wide FDR of 20%). For example, we identified an association between a taxon known to affect obesity (genus Akkermansia) and a variant near PLD1, a gene previously associated with body mass index. Moreover, we replicate a previously reported association from a quantitative trait locus (QTL) mapping study of fecal microbiome abundance in mice (genus Lactococcus, rs3747113, P = 3.13 × 10⁻⁷). Finally, based on the significance distribution of the associated microbiome QTLs in our study with respect to chromatin accessibility profiles, we identified tissues in which host genetic variation may be acting to influence bacterial abundance in the gut.
|
14
|
Reed E, Nunez S, Kulp D, Qian J, Reilly MP, Foulkes AS. A guide to genome-wide association analysis and post-analytic interrogation. Stat Med 2015; 34:3769-92. [PMID: 26343929 PMCID: PMC5019244 DOI: 10.1002/sim.6605] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 02/28/2015] [Revised: 06/09/2015] [Accepted: 07/06/2015] [Indexed: 01/14/2023]
Abstract
This tutorial is a learning resource that outlines the basic process and provides specific software tools for implementing a complete genome‐wide association analysis. Approaches to post‐analytic visualization and interrogation of potentially novel findings are also presented. Applications are illustrated using the free and open‐source R statistical computing and graphics software environment, Bioconductor software for bioinformatics and the UCSC Genome Browser. Complete genome‐wide association data on 1401 individuals across 861,473 typed single nucleotide polymorphisms from the PennCATH study of coronary artery disease are used for illustration. All data and code, as well as additional instructional resources, are publicly available through the Open Resources in Statistical Genomics project: http://www.stat-gen.org. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Affiliation(s)
- Eric Reed
- Department of Mathematics and Statistics, Mount Holyoke College, South Hadley, MA, U.S.A
| | - Sara Nunez
- Department of Mathematics and Statistics, Mount Holyoke College, South Hadley, MA, U.S.A
| | - David Kulp
- Department of Computer Science, University of Massachusetts, Amherst, MA, U.S.A
| | - Jing Qian
- Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, U.S.A
| | - Muredach P Reilly
- Department of Medicine, University of Pennsylvania, Philadelphia, PA, U.S.A
| | - Andrea S Foulkes
- Department of Mathematics and Statistics, Mount Holyoke College, South Hadley, MA, U.S.A
|
15
|
Upton A, Trelles O, Cornejo-García JA, Perkins JR. Review: High-performance computing to detect epistasis in genome scale data sets. Brief Bioinform 2015; 17:368-79. [PMID: 26272945 DOI: 10.1093/bib/bbv058] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 03/31/2015] [Indexed: 11/14/2022]
Abstract
It is becoming clear that most human diseases have a complex etiology that cannot be explained by single nucleotide polymorphisms (SNPs) or simple additive combinations; the general consensus is that they are caused by combinations of multiple genetic variations. The limited success of some genome-wide association studies is partly a result of this focus on single genetic markers. A more promising approach is to take into account epistasis, by considering the association of multiple SNP interactions with disease. However, as genomic data continues to grow in resolution, and genome and exome sequencing become more established, the number of combinations of variants to consider increases rapidly. Two potential solutions should be considered: the use of high-performance computing, which allows us to consider a larger number of variables, and heuristics to make the solution more tractable, essential in the case of genome sequencing. In this review, we look at different computational methods to analyse epistatic interactions within disease-related genetic data sets created by microarray technology. We also review efforts to use epistatic analysis results to produce biomarkers for diagnostic tests and give our views on future directions in this field in light of advances in sequencing technology and variants in non-coding regions.
|
16
|
Use of genome-wide association studies for cancer research and drug repositioning. PLoS One 2015; 10:e0116477. [PMID: 25803826 PMCID: PMC4372357 DOI: 10.1371/journal.pone.0116477] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/27/2014] [Accepted: 12/08/2014] [Indexed: 01/13/2023]
Abstract
Although genome-wide association studies have identified many risk loci associated with colorectal cancer, the molecular basis of these associations is still unclear. We aimed to infer biological insights and highlight candidate genes of interest within GWAS risk loci. We used an in silico pipeline based on functional annotation, quantitative trait loci mapping of cis-acting genes, PubMed text-mining, protein-protein interaction studies, genetic overlaps with cancer somatic mutations and knockout mouse phenotypes, and functional enrichment analysis to prioritize the candidate genes at the colorectal cancer risk loci. Based on these analyses, we observed that these genes were the targets of approved therapies for colorectal cancer, and suggested that drugs approved for other indications may be repurposed for the treatment of colorectal cancer. This study highlights the use of publicly available data as a cost-effective solution to derive biological insights, and provides empirical evidence that the molecular basis of colorectal cancer can provide important leads for the discovery of new drugs.
|
17
|
|
18
|
Kadarmideen HN. Genomics to systems biology in animal and veterinary sciences: Progress, lessons and opportunities. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.04.028] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Indexed: 11/30/2022]
|