1
Dorji J, Reverter A, Alexandre PA, Chamberlain AJ, Vander-Jagt CJ, Kijas J, Porto-Neto LR. Ancestral alleles defined for 70 million cattle variants using a population-based likelihood ratio test. Genet Sel Evol 2024; 56:11. [PMID: 38321371 PMCID: PMC10848479 DOI: 10.1186/s12711-024-00879-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND The study of ancestral alleles provides insights into the evolutionary history, selection, and genetic structures of a population. In cattle, ancestral alleles are widely used in genetic analyses, including the detection of signatures of selection, determination of breed ancestry, and identification of admixture. Having a comprehensive list of ancestral alleles is expected to improve the accuracy of these genetic analyses. However, the list of ancestral alleles in cattle, especially at the whole genome sequence level, is far from complete. In fact, the current largest list of ancestral alleles (~42 million) represents less than 28% of the total number of detected variants in cattle. To address this issue and develop a genomic resource for evolutionary studies, we determined ancestral alleles in cattle by comparing prior derived whole-genome sequence variants to an out-species group using a population-based likelihood ratio test.
RESULTS Our study determined and makes available the largest list of ancestral alleles in cattle to date (70.1 million) and includes 2.3 million on the X chromosome. There was high concordance (97.6%) of the determined ancestral alleles with those from previous studies when only high-probability ancestral alleles were considered (29.8 million positions) and another 23.5 million high-confidence ancestral alleles were novel, expanding the available reference list to improve the accuracies of genetic analyses involving ancestral alleles. The high concordance of the results with previous studies implies that our approach using genomic sequence variants and a likelihood ratio test to determine ancestral alleles is appropriate.
CONCLUSIONS Considering the high concordance of ancestral alleles across studies, the ancestral alleles determined in this study including those not previously listed, particularly those with high-probability estimates, may be used for further genetic analyses with reasonable accuracy. Our approach that used predetermined variants in species and the likelihood ratio test to determine ancestral alleles is applicable to other species for which sequence level genotypes are available.
Affiliation(s)
- Jigme Dorji
  - CSIRO, Agriculture & Food, St. Lucia, QLD, 4067, Australia
- Amanda J Chamberlain
  - AgriBio, Centre for AgriBioscience, Agriculture Victoria, Bundoora, VIC, 3083, Australia
- Christy J Vander-Jagt
  - AgriBio, Centre for AgriBioscience, Agriculture Victoria, Bundoora, VIC, 3083, Australia
- James Kijas
  - CSIRO, Agriculture & Food, St. Lucia, QLD, 4067, Australia
2
Naji MM, Utsunomiya YT, Sölkner J, Rosen BD, Mészáros G. Investigation of ancestral alleles in the Bovinae subfamily. BMC Genomics 2021; 22:108. [PMID: 33557747 PMCID: PMC7871596 DOI: 10.1186/s12864-021-07412-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/25/2020] [Accepted: 01/27/2021] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND In evolutionary theory, divergence and speciation can arise from long periods of reproductive isolation, genetic mutation, selection and environmental adaptation. After divergence, alleles can either persist in their initial state (ancestral allele - AA), co-exist with, or be replaced by a mutated state (derived allele - DA). In this study, we aligned whole genome sequences of individuals from the Bovinae subfamily to the cattle reference genome (ARS.UCD-1.2) to define the ancestral alleles needed for selection signature studies.
RESULTS To accommodate the independent divergence of each lineage from the initial ancestral state, AA were defined based on alleles fixed in at least two of the yak, bison and gayal-gaur-banteng groups, resulting in ~32.4 million variants. Using non-overlapping scanning windows of 10 Kb, we counted the AA observed within taurine and zebu cattle. We focused on the extreme points: regions in the top 0.1% (high count) and regions without any occurrence of AA (null count). High count regions preserved gene functions from ancestral states that are still beneficial in the current condition, while null count regions were linked to mutated ones. For both cattle, high count regions were associated with basal lipid metabolism, essential for survival under various environmental pressures. Mutated regions were associated with productive traits in taurine, i.e. higher metabolism, cell development and behaviors, and with the immune response domain in zebu.
CONCLUSIONS Our findings suggest that the retention and loss of AA in some regions vary and are species-specific, with possible overlap, depending on the selective pressure each species had to experience.
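The two steps in the RESULTS above (calling an ancestral allele from outgroup groups, then counting AA in non-overlapping 10 Kb windows) can be illustrated with a minimal sketch. It assumes that "fixed in at least two groups" means the same allele is fixed in two or more of the three outgroup groups, and that positions are 1-based; the function names and the dictionary layout are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ancestral_allele(fixed_by_group):
    """Return the allele fixed in at least two outgroup groups, else None.

    fixed_by_group maps a group name (e.g. "yak") to its fixed allele,
    or to None when the site is not fixed in that group (assumed layout).
    """
    counts = Counter(a for a in fixed_by_group.values() if a is not None)
    if not counts:
        return None
    allele, support = counts.most_common(1)[0]
    # With three groups, support >= 2 can only hold for a single allele.
    return allele if support >= 2 else None


def count_aa_per_window(positions, window=10_000):
    """Count AA observations per non-overlapping window (1-based positions)."""
    return Counter((pos - 1) // window for pos in positions)


if __name__ == "__main__":
    site = {"yak": "A", "bison": "A", "gayal-gaur-banteng": None}
    print(ancestral_allele(site))
    print(count_aa_per_window([1, 10_000, 10_001, 25_000]))
```

Windows with the highest counts, and windows absent from the counter, would correspond to the "high count" and "null count" regions described in the abstract; how the top 0.1% threshold was applied is not specified here and is left out of the sketch.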
Affiliation(s)
- Maulana M. Naji
  - University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
- Yuri T. Utsunomiya
  - São Paulo State University (Unesp), School of Veterinary Medicine, Department of Production and Animal Health, Araçatuba, São Paulo, Brazil
  - International Atomic Energy Agency (IAEA) Collaborating Centre on Animal Genomics and Bioinformatics, Araçatuba, São Paulo, Brazil
  - AgroPartners Consulting. R. Floriano Peixoto, 120-Sala 43A-Centro, Araçatuba, SP 16010-220, Brazil
  - Personal-PEC. R. Sebastiao Lima, 1336-Centro, Campo Grande, MS 79004-600, Brazil
- Johann Sölkner
  - University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
- Gábor Mészáros
  - University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
3
García Rabaneda C, Perea F, Bellido Díaz ML, Morales García AI, Martínez Atienza M, Sousa Silva L, González MÁG, Cabello FR, Esteban de la Rosa RJ. Founding mutations explains hotspots of polycystic kidney disease in Southern Spain. Clin Kidney J 2020; 14:1845-1847. [PMID: 34221391 PMCID: PMC8243269 DOI: 10.1093/ckj/sfaa261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2020] [Accepted: 11/23/2020] [Indexed: 11/14/2022] Open
Abstract
Our group identified two pathogenic variants on the PKD1 gene, c.10527_10528delGA and c.7292T>A, from unrelated families. They came from two small counties in Granada, with 61 and 26 autosomal dominant polycystic kidney disease (ADPKD) individuals affected. To determine a common ancestor, healthy and ADPKD individuals from these families were genotyped by analysing four microsatellites located on chromosome 16. Our study identified a common haplotype in all ADPKD individuals. These findings underpin our hypothesis of the founder effect and explain why there is a high frequency of ADPKD in small regions. Determining hotspots of ADPKD will help to better plan healthcare in the future.
Affiliation(s)
- Carmen García Rabaneda
  - Servicio de Análisis Clínicos, Hospital Universitario San Cecilio, Granada, Spain
  - Grupo de Estudio de la Enfermedad Poliquística Autosómica Dominante (GEEPAD), Granada, Spain
- Francisco Perea
  - Servicio de Análisis Clínicos e Inmunología, UGC Laboratorio Clínico, Granada, Spain
- María Luz Bellido Díaz
  - Grupo de Estudio de la Enfermedad Poliquística Autosómica Dominante (GEEPAD), Granada, Spain
  - Servicio de Análisis Clínicos e Inmunología, UGC Laboratorio Clínico, Granada, Spain
- Ana I Morales García
  - Grupo de Estudio de la Enfermedad Poliquística Autosómica Dominante (GEEPAD), Granada, Spain
  - Servicio de Nefrología, Hospital Universitario San Cecilio, Granada, Spain
  - Instituto de Investigación Biosanitario de Granada (IBS.GRANADA), Granada, Spain
- Margarita Martínez Atienza
  - Grupo de Estudio de la Enfermedad Poliquística Autosómica Dominante (GEEPAD), Granada, Spain
  - Servicio de Análisis Clínicos e Inmunología, UGC Laboratorio Clínico, Granada, Spain
- Lisbeth Sousa Silva
  - Laboratorio de Genética del Complejo Hospitalario Universitario de Santiago de Compostela (NEFROCHUS), Spain
- Rafael J Esteban de la Rosa
  - Grupo de Estudio de la Enfermedad Poliquística Autosómica Dominante (GEEPAD), Granada, Spain
  - Instituto de Investigación Biosanitario de Granada (IBS.GRANADA), Granada, Spain
  - Servicio de Nefrología, Hospital Universitario Virgen de las Nieves, Granada, Spain
  - Asociación Amigos del Riñón, Granada, Spain
4
Park L. Population mutation properties of tumor evolution. Med Oncol 2020; 37:94. [PMID: 32975667 DOI: 10.1007/s12032-020-01421-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2020] [Accepted: 09/15/2020] [Indexed: 11/25/2022]
Abstract
Tumor growth patterns differ depending on the individual tumor, leading to various patterns of genetic heterogeneity across tumors. Despite their importance, the population mutation properties of tumor evolution have not been well studied, especially in terms of overall genetic heterogeneity. The current study aims to examine factors in tumor evolution influencing overall genetic heterogeneity. Extensive simulations of the representative evolutionary patterns of various tumors were conducted in the current study to determine the overall genetic characteristics of tumor evolution. The variations in cell birth/death rates and angiogenesis duration were examined based on the simplest growth pattern of tumors with avascular growth, angiogenesis, and vascular growth. To examine the impact of evolutionary tree structure, three-step linear evolution and branching evolution were investigated based on various subpopulation initiation times. The population size during the initial growth phase and the duration of angiogenesis are important factors affecting overall genetic heterogeneity. Furthermore, the shape of the evolutionary tree is crucial for defining the genetic heterogeneity of the extant tumor cell population, indicating the importance of the initiation timing of subpopulations. Branching evolution results in slightly greater genetic heterogeneity in terms of allelic distribution than in terms of the number of variants. The results of the current study reveal that the pattern of tumor evolution is critical for defining tumor genetic heterogeneity. These population genetic approaches are important for understanding the properties of tumor cell populations.
Affiliation(s)
- LeeYoung Park
  - Natural Science Research Institute, Yonsei University, Yonsei-ro 50, Seodaemun-Gu, Seoul, 03722, Korea
5
Park L. Evidence of Recent Intricate Adaptation in Human Populations. PLoS One 2016; 11:e0165870. [PMID: 27992444 PMCID: PMC5167553 DOI: 10.1371/journal.pone.0165870] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/10/2016] [Accepted: 10/19/2016] [Indexed: 11/18/2022] Open
Abstract
Recent human adaptations have shaped population differentiation in genomic regions containing putative functional variants, mostly located in predicted regulatory elements. However, their actual functionalities and the underlying mechanism of recent adaptation remain poorly understood. In the current study, regions of genes and repeats were investigated for functionality depending on the degree of population differentiation, FST or ΔDAF (a difference in derived allele frequency). The high FST in the 5´ or 3´ untranslated regions (UTRs), in particular, confirmed that population differences arose mainly from differences in regulation. Expression quantitative trait loci (eQTL) analyses using lymphoblastoid cell lines indicated that the majority of the highly population-specific regions represented cis- and/or trans-eQTL. However, groups having the highest ΔDAFs did not necessarily have higher proportions of eQTL variants; in these groups, the patterns were complex, indicating recent intricate adaptations. The results indicated that East Asian (EAS) and European populations (EUR) experienced mutual selection pressures. The mean derived allele frequency of the high ΔDAF groups suggested that EAS and EUR underwent strong adaptation; however, the African population in Africa (AFR) experienced slight, yet broad, adaptation. The DAF distributions of variants in the gene regions showed clear selective pressure in each population, which implies the existence of more recent regulatory adaptations in cells other than lymphoblastoid cell lines. In-depth analysis of population-differentiated regions indicated that the coding gene, RNF135, represented a trans-regulation hotspot via cis-regulation by the population-specific variants in the region of selective sweep. Together, the results provide strong evidence of actual intricate adaptation of human populations via regulatory manipulation.
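The abstract defines ΔDAF only as a difference in derived allele frequency. A minimal sketch of that quantity for one variant and two populations follows; the pairwise form (focal population minus a comparison population), the function names, and the example counts are assumptions for illustration and may differ from the definition used in the paper.

```python
def derived_allele_frequency(derived_count, total_alleles):
    """Fraction of sampled alleles that carry the derived state."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= derived_count <= total_alleles:
        raise ValueError("derived_count must lie between 0 and total_alleles")
    return derived_count / total_alleles


def delta_daf(daf_focal, daf_other):
    """Assumed pairwise form: DAF in the focal population minus DAF in the other."""
    return daf_focal - daf_other


if __name__ == "__main__":
    # Hypothetical counts: 90 of 100 derived alleles in one population, 10 of 100 in another.
    focal = derived_allele_frequency(90, 100)
    other = derived_allele_frequency(10, 100)
    print(round(delta_daf(focal, other), 3))
```

A value near +1 or -1 indicates a variant whose derived allele is common in one population and rare in the other, which is the kind of highly population-specific site the abstract groups by ΔDAF.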
Affiliation(s)
- Leeyoung Park
  - Natural Science Research Institute, Yonsei University, Seoul, Korea
6
Tuğrul M, Paixão T, Barton NH, Tkačik G. Dynamics of Transcription Factor Binding Site Evolution. PLoS Genet 2015; 11:e1005639. [PMID: 26545200 PMCID: PMC4636380 DOI: 10.1371/journal.pgen.1005639] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 06/17/2015] [Accepted: 10/09/2015] [Indexed: 11/19/2022] Open
Abstract
Evolution of gene regulation is crucial for our understanding of the phenotypic differences between species, populations and individuals. Sequence-specific binding of transcription factors to the regulatory regions on the DNA is a key regulatory mechanism that determines gene expression and hence heritable phenotypic variation. We use a biophysical model for directional selection on gene expression to estimate the rates of gain and loss of transcription factor binding sites (TFBS) in finite populations under both point and insertion/deletion mutations. Our results show that these rates are typically slow for a single TFBS in an isolated DNA region, unless the selection is extremely strong. These rates decrease drastically with increasing TFBS length or increasingly specific protein-DNA interactions, making the evolution of sites longer than ~10 bp unlikely on typical eukaryotic speciation timescales. Similarly, evolution converges to the stationary distribution of binding sequences very slowly, making the equilibrium assumption questionable. The availability of longer regulatory sequences in which multiple binding sites can evolve simultaneously, the presence of “pre-sites” or partially decayed old sites in the initial sequence, and biophysical cooperativity between transcription factors, can all facilitate gain of TFBS and reconcile theoretical calculations with timescales inferred from comparative genomics.
Evolution has produced a remarkable diversity of living forms that manifests in qualitative differences as well as quantitative traits. An essential factor that underlies this variability is transcription factor binding sites, short pieces of DNA that control gene expression levels. Nevertheless, we lack a thorough theoretical understanding of the evolutionary times required for the appearance and disappearance of these sites. By combining a biophysically realistic model for how cells read out information in transcription factor binding sites with a model for DNA sequence evolution, we explore these timescales and ask what factors crucially affect them. We find that the emergence of binding sites from a random sequence is generically slow under point and insertion/deletion mutational mechanisms. Strong selection, sufficient genomic sequence in which the sites can evolve, the existence of partially decayed old binding sites in the sequence, as well as certain biophysical mechanisms such as cooperativity, can accelerate the binding site gain times and make them consistent with the timescales suggested by comparative analyses of genomic data.
Affiliation(s)
- Murat Tuğrul
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Tiago Paixão
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Gašper Tkačik
  - Institute of Science and Technology Austria, Klosterneuburg, Austria