1
Yang Y, Braga MV, Dean MD. Insertion-Deletion Events Are Depleted in Protein Regions with Predicted Secondary Structure. Genome Biol Evol 2024; 16:evae093. [PMID: 38735759 PMCID: PMC11102076 DOI: 10.1093/gbe/evae093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 04/16/2024] [Accepted: 04/21/2024] [Indexed: 05/14/2024]
Abstract
A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by the deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.
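The "54% as much as expected" figure is an observed-to-expected ratio. A minimal sketch of that kind of calculation follows; it assumes each indel is reduced to a single residue index and that the null expectation is uniform placement along the protein, which is a simplification and not necessarily the procedure used in the study. The function name `overlap_ratio` and the toy inputs are hypothetical.

```python
from typing import Sequence


def overlap_ratio(structured: Sequence[bool], indel_sites: Sequence[int]) -> float:
    """Return observed / expected fraction of indels on structured residues.

    structured[i] is True when residue i carries a predicted secondary
    structure. The expected fraction is simply the share of structured
    residues (uniform-placement null, an assumption of this sketch).
    """
    if not structured or not indel_sites:
        raise ValueError("need at least one residue and one indel")
    expected = sum(structured) / len(structured)
    if expected == 0:
        raise ValueError("no structured residues, ratio is undefined")
    observed = sum(structured[i] for i in indel_sites) / len(indel_sites)
    return observed / expected


# Toy protein: 6 structured residues followed by 4 unstructured ones.
toy_structure = [True] * 6 + [False] * 4
toy_indels = [0, 7, 8, 9]  # one indel in structure, three outside
ratio = overlap_ratio(toy_structure, toy_indels)
```

In this toy case a quarter of the indels fall on structured residues while 60% of residues are structured, so the ratio is roughly 0.42; values below 1 indicate depletion.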
Affiliation(s)
- Yi Yang: Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew V Braga: Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew D Dean: Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
2
Hassan NE, Al-Janabi AA. Investigation of Interferon Gamma Activity Using Bioinformatics Methods. Archives of Razi Institute 2021; 76:1245-1253. [PMID: 35355749 PMCID: PMC8934094 DOI: 10.22092/ari.2021.356106.1780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2021] [Accepted: 10/02/2021] [Indexed: 05/25/2023]
Abstract
Breast cancer arises in breast tissue and is a severe health problem worldwide. Genetics is believed to be the primary cause of all cases of breast cancer via gene mutation. Bioinformatics methodology has been used to determine the sequences and structures of bioactive substances. This study aimed to analyze the function and structure of Interferon Gamma (IFNγ) in healthy controls and patients with breast cancer using bioinformatics methods. Blood samples were collected from 75 patients with breast cancer and 25 healthy subjects as control samples. The results showed transition mutations (30%) and transversion mutations (70%) in patients with breast cancer. Moreover, missense mutations (84%) and silent mutations (16%) were detected by BLAST. In addition, the amino acid sequence of the IFNγ protein, with its alpha-helix, β-sheet, and coil secondary structure, was determined using BioEdit. The physicochemical properties of the IFNγ protein computed with ProtParam reflect its function, stability, molecular weight, isoelectric point, and instability index. Moreover, the mutations altered the percentages of alpha-helix, β-turns, and coil in breast cancer patients compared to the healthy group and the NCBI reference, as predicted with the PSIPRED program. Additionally, the PHYRE2 server and the RasMol program showed the tertiary structure of the IFNγ protein in breast cancer patients. Furthermore, the STRING program revealed that the poly IFNγ protein interacts with other proteins to perform its functions normally. From the data recorded in the current study, it was concluded that IFNγ can be considered a marker for patients with breast cancer.
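The transition/transversion split reported above follows the standard definition: a transition swaps a purine for a purine or a pyrimidine for a pyrimidine, and a transversion swaps between the two classes. A minimal sketch of that classification is given below; the function name `classify_substitution` is hypothetical and is not taken from any tool named in the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


labels = [classify_substitution(r, a) for r, a in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]]
```

Tallying these labels over a set of called variants would give percentages of the kind quoted in the abstract.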
Affiliation(s)
- N E Hassan: Department of Applied Science, University of Technology, Baghdad, Iraq
- A A Al-Janabi: Department of Applied Science, University of Technology, Baghdad, Iraq
3
Huang J, Bennett J, Flouri T, Leaché AD, Yang Z. Phase Resolution of Heterozygous Sites in Diploid Genomes is Important to Phylogenomic Analysis under the Multispecies Coalescent Model. Syst Biol 2021; 71:334-352. [PMID: 34143216 PMCID: PMC8977997 DOI: 10.1093/sysbio/syab047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/28/2020] [Revised: 06/03/2021] [Accepted: 06/21/2021] [Indexed: 01/01/2023]
Abstract
Genome sequencing projects routinely generate haploid consensus sequences from diploid genomes, which are effectively chimeric sequences with the phase at heterozygous sites resolved at random. The impact of phasing errors on phylogenomic analyses under the multispecies coalescent (MSC) model is largely unknown. Here, we conduct a computer simulation to evaluate the performance of four phase-resolution strategies (the true phase resolution, the diploid analytical integration algorithm which averages over all phase resolutions, computational phase resolution using the program PHASE, and random resolution) on estimation of the species tree and evolutionary parameters in analysis of multilocus genomic data under the MSC model. We found that species tree estimation is robust to phasing errors when species divergences were much older than average coalescent times but may be affected by phasing errors when the species tree is shallow. Estimation of parameters under the MSC model with and without introgression is affected by phasing errors. In particular, random phase resolution causes serious overestimation of population sizes for modern species and biased estimation of cross-species introgression probability. In general, the impact of phasing errors is greater when the mutation rate is higher, the data include more samples per species, and the species tree is shallower with recent divergences. Use of phased sequences inferred by the PHASE program produced small biases in parameter estimates. We analyze two real data sets, one of East Asian brown frogs and another of Rocky Mountains chipmunks, to demonstrate that heterozygote phase-resolution strategies have similar impacts on practical data analyses. We suggest that genome sequencing projects should produce unphased diploid genotype sequences if fully phased data are too challenging to generate, and avoid haploid consensus sequences, which have heterozygous sites phased at random. In case the analytical integration algorithm is computationally unfeasible, computational phasing prior to population genomic analyses is an acceptable alternative. [BPP; introgression; multispecies coalescent; phase; species tree.]
Affiliation(s)
- Jun Huang: Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK; Department of Mathematics, Beijing Jiaotong University, Beijing, 100044, P.R. China
- Jeremy Bennett: Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK; Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269-3043, USA
- Tomáš Flouri: Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Adam D Leaché: Department of Biology & Burke Museum of Natural History and Culture, University of Washington, Seattle, WA 98195-1800, USA
- Ziheng Yang: Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
4
Werren EA, Garcia O, Bigham AW. Identifying adaptive alleles in the human genome: from selection mapping to functional validation. Hum Genet 2020; 140:241-276. [PMID: 32728809 DOI: 10.1007/s00439-020-02206-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/19/2020] [Accepted: 07/07/2020] [Indexed: 12/19/2022]
Abstract
The suite of phenotypic diversity across geographically distributed human populations is the outcome of genetic drift, gene flow, and natural selection throughout human evolution. Human genetic variation underlying local biological adaptations to selective pressures is incompletely characterized. With the emergence of population genetics modeling of large-scale genomic data derived from diverse populations, scientists are able to map signatures of natural selection in the genome in a process known as selection mapping. Inferred selection signals further can be used to identify candidate functional alleles that underlie putative adaptive phenotypes. Phenotypic association, fine mapping, and functional experiments facilitate the identification of candidate adaptive alleles. Functional investigation of candidate adaptive variation using novel techniques in molecular biology is slowly beginning to unravel how selection signals translate to changes in biology that underlie the phenotypic spectrum of our species. In addition to informing evolutionary hypotheses of adaptation, the discovery and functional annotation of adaptive alleles also may be of clinical significance. While selection mapping efforts in non-European populations are growing, there remains a stark under-representation of diverse human populations in current public genomic databases, of both clinical and non-clinical cohorts. This lack of inclusion limits the study of human biological variation. Identifying and functionally validating candidate adaptive alleles in more global populations is necessary for understanding basic human biology and human disease.
Affiliation(s)
- Elizabeth A Werren: Department of Human Genetics, The University of Michigan, Ann Arbor, MI, USA; Department of Anthropology, The University of Michigan, Ann Arbor, MI, USA
- Obed Garcia: Department of Anthropology, The University of Michigan, Ann Arbor, MI, USA
- Abigail W Bigham: Department of Anthropology, University of California Los Angeles, 341 Haines Hall, Los Angeles, CA, 90095, USA
5
Zhu C, Miller M, Zeng Z, Wang Y, Mahlich Y, Aptekmann A, Bromberg Y. Computational Approaches for Unraveling the Effects of Variation in the Human Genome and Microbiome. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-030320-041014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 02/01/2023]
Abstract
The past two decades of analytical efforts have highlighted how much more remains to be learned about the human genome and, particularly, its complex involvement in promoting disease development and progression. While numerous computational tools exist for the assessment of the functional and pathogenic effects of genome variants, their precision is far from satisfactory, particularly for clinical use. Accumulating evidence also suggests that the human microbiome's interaction with the human genome plays a critical role in determining health and disease states. While numerous microbial taxonomic groups and molecular functions of the human microbiome have been associated with disease, the reproducibility of these findings is lacking. The human microbiome–genome interaction in healthy individuals is even less well understood. This review summarizes the available computational methods built to analyze the effect of variation in the human genome and microbiome. We address the applicability and precision of these methods across their possible uses. We also briefly discuss the exciting, necessary, and now possible integration of the two types of data to improve the understanding of pathogenicity mechanisms.
Affiliation(s)
- Chengsheng Zhu: Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Maximilian Miller: Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Zishuo Zeng: Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Yanran Wang: Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Yannick Mahlich: Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Ariel Aptekmann: Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA
- Yana Bromberg: Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA; Department of Genetics, Rutgers University, Piscataway, New Jersey 08854, USA
6
Joseph TA, Pe'er I. Inference of Population Structure from Time-Series Genotype Data. Am J Hum Genet 2019; 105:317-333. [PMID: 31256878 PMCID: PMC6698887 DOI: 10.1016/j.ajhg.2019.06.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/22/2019] [Accepted: 06/04/2019] [Indexed: 10/26/2022]
Abstract
Sequencing ancient DNA can offer direct probing of population history. Yet, such data are commonly analyzed with standard tools that assume DNA samples are all contemporary. We present DyStruct, a model and inference algorithm for inferring shared ancestry from temporally sampled genotype data. DyStruct explicitly incorporates temporal dynamics by modeling individuals as mixtures of unobserved populations whose allele frequencies drift over time. We develop an efficient inference algorithm for our model using stochastic variational inference. On simulated data, we show that DyStruct outperforms the current state of the art when individuals are sampled over time. Using a dataset of 296 modern and 80 ancient samples, we demonstrate DyStruct is able to capture a well-supported admixture event of steppe ancestry into modern Europe. We further apply DyStruct to a genome-wide dataset of 2,067 modern and 262 ancient samples used to study the origin of farming in the Near East. We show that DyStruct provides new insight into population history when compared with alternate approaches, within feasible run time.
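The phrase "allele frequencies drift over time" can be illustrated with a textbook Wright-Fisher simulation, in which each generation resamples 2N gene copies from the previous frequency. This is a sketch of generic drift only, under the assumption of a constant diploid population size; it is not the DyStruct model or its inference algorithm, and the function name and parameters are hypothetical.

```python
import random
from typing import List


def drift_trajectory(p0: float, pop_size: int, generations: int, seed: int = 0) -> List[float]:
    """Simulate an allele frequency under pure Wright-Fisher drift.

    Each generation draws 2 * pop_size gene copies, each carrying the
    allele with probability equal to the current frequency.
    """
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must lie in [0, 1]")
    rng = random.Random(seed)  # seeded for reproducibility
    copies_total = 2 * pop_size
    freqs = [p0]
    p = p0
    for _ in range(generations):
        copies = sum(rng.random() < p for _ in range(copies_total))
        p = copies / copies_total
        freqs.append(p)
    return freqs


trajectory = drift_trajectory(p0=0.5, pop_size=50, generations=20, seed=1)
```

Samples taken at different time points along such a trajectory see different frequencies for the same population, which is why methods that treat all samples as contemporary can be misled.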
Affiliation(s)
- Tyler A Joseph: Department of Computer Science, Columbia University, New York, NY 10027, USA
- Itsik Pe'er: Department of Computer Science, Columbia University, New York, NY 10027, USA; Department of Systems Biology, Columbia University, New York, NY 10027, USA; Data Science Institute, Columbia University, New York, NY 10027, USA
7
Mongue AJ, Hansen ME, Gu L, Sorenson CE, Walters JR. Nonfertilizing sperm in Lepidoptera show little evidence for recurrent positive selection. Mol Ecol 2019; 28:2517-2530. [PMID: 30972892 PMCID: PMC6584056 DOI: 10.1111/mec.15096] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/30/2018] [Revised: 03/29/2019] [Accepted: 03/29/2019] [Indexed: 11/30/2022]
Abstract
Sperm are among the most variable cells in nature. Some of this variation results from nonadaptive errors in spermatogenesis, but many species consistently produce multiple sperm morphs, the adaptive significance of which remains unknown. Here, we investigate the evolution of dimorphic sperm in Lepidoptera, the butterflies and moths. Males of this order produce both fertilizing sperm and a secondary, nonfertilizing type that lacks DNA. Previous organismal studies suggested a role for nonfertilizing sperm in sperm competition, but this hypothesis has never been evaluated from a molecular framework. We combined published data sets with new sequencing in two species, the monandrous Carolina sphinx moth and the highly polyandrous monarch butterfly. Based on population genetic analyses, we see evidence for increased adaptive evolution in fertilizing sperm, but only in the polyandrous species. This signal comes primarily from a decrease in nonsynonymous polymorphism in sperm proteins compared to the rest of the genome, suggesting stronger purifying selection, consistent with selection via sperm competition. Nonfertilizing sperm proteins, in contrast, do not show an effect of mating system and do not appear to evolve differently from the background genome in either species, arguing against the involvement of nonfertilizing sperm in direct sperm competition. Based on our results and previous work, we suggest that nonfertilizing sperm may be used to delay female remating in these insects and decrease the risk of sperm competition rather than directly affect its outcome.
Affiliation(s)
- Andrew J Mongue: Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas
- Megan E Hansen: Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas
- Liuqi Gu: Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas
- Clyde E Sorenson: Department of Entomology, North Carolina State University, Raleigh, North Carolina
- James R Walters: Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas
8
Zeng Q, Wu S, Sukumaran J, Rodrigo A. Models of microbiome evolution incorporating host and microbial selection. Microbiome 2017; 5:127. [PMID: 28946894 PMCID: PMC5613328 DOI: 10.1186/s40168-017-0343-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 03/08/2017] [Accepted: 09/15/2017] [Indexed: 05/25/2023]
Abstract
BACKGROUND: Numerous empirical studies suggest that hosts and microbes exert reciprocal selective effects on their ecological partners. Nonetheless, we still lack an explicit framework to model the dynamics of both hosts and microbes under selection. In a previous study, we developed an agent-based forward-time computational framework to simulate the neutral evolution of host-associated microbial communities in a constant-sized, unstructured population of hosts. These neutral models allowed offspring to sample microbes randomly from parents and/or from the environment. Additionally, the environmental pool of available microbes was constituted by fixed and persistent microbial OTUs and by contributions from host individuals in the preceding generation. METHODS: In this paper, we extend our neutral models to allow selection to operate on both hosts and microbes. We do this by constructing a phenome for each microbial OTU consisting of a sample of traits that influence host and microbial fitnesses independently. Microbial traits can influence the fitness of hosts ("host selection") and the fitness of microbes ("trait-mediated microbial selection"). Additionally, the fitness effects of traits on microbes can be modified by their hosts ("host-mediated microbial selection"). We simulate the effects of these three types of selection, individually or in combination, on microbiome diversities and the fitnesses of hosts and microbes over several thousand generations of hosts. RESULTS: We show that microbiome diversity is strongly influenced by selection acting on microbes. Selection acting on hosts only influences microbiome diversity when there is near-complete direct or indirect parental contribution to the microbiomes of offspring. Unsurprisingly, microbial fitness increases under microbial selection. Interestingly, when host selection operates, host fitness only increases under two conditions: (1) when there is a strong parental contribution to microbial communities or (2) in the absence of a strong parental contribution, when host-mediated selection acts on microbes concomitantly. CONCLUSIONS: We present a computational framework that integrates different selective processes acting on the evolution of microbiomes. Our framework demonstrates that selection acting on microbes can have a strong effect on microbial diversities and fitnesses, whereas selection on hosts can have weaker outcomes.
Affiliation(s)
- Qinglong Zeng: Department of Biology, Duke University, Durham, NC USA; Research School of Biology, The Australian National University, Canberra, Australian Capital Territories Australia
- Steven Wu: Biodesign Institute, Arizona State University, Tempe, AZ USA
- Jeet Sukumaran: Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI USA
- Allen Rodrigo: Research School of Biology, The Australian National University, Canberra, Australian Capital Territories Australia
9
Abstract
Major histocompatibility complex (MHC) class I genes are critically involved in the defense against intracellular pathogens. MHC diversity comparisons among samples of closely related taxa may reveal traces of past or ongoing selective processes. The bonobo and chimpanzee are the closest living evolutionary relatives of humans and last shared a common ancestor some 1 mya. However, little is known concerning MHC class I diversity in bonobos or in central chimpanzees, the most numerous and genetically diverse chimpanzee subspecies. Here, we used a long-read sequencing technology (PacBio) to sequence the classical MHC class I genes A, B, C, and A-like in 20 and 30 wild-born bonobos and chimpanzees, respectively, with a main focus on central chimpanzees to assess and compare diversity in those two species. We describe in total 21 and 42 novel coding region sequences for the two species, respectively. In addition, we found evidence for a reduced MHC class I diversity in bonobos as compared to central chimpanzees as well as to western chimpanzees and humans. The reduced bonobo MHC class I diversity may be the result of a selective process in their evolutionary past since their split from chimpanzees.
Affiliation(s)
- Vincent Maibach: Department of Primatology, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103, Leipzig, Germany
- Jörg B Hans: Department of Primatology, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103, Leipzig, Germany
- Tomas Marques-Bonet: Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain; Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010, Barcelona, Spain; CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028, Barcelona, Spain
- Linda Vigilant: Department of Primatology, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103, Leipzig, Germany
10
Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data. PLoS One 2015; 10:e0129023. [PMID: 26053627 PMCID: PMC4460046 DOI: 10.1371/journal.pone.0129023] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 07/04/2014] [Accepted: 05/04/2015] [Indexed: 11/19/2022]
Abstract
Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potentials. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection, which typically indicates the functional potentials of certain genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) from the Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variations of the noncoding genome within and between populations, and identified the natural selection footprints in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and there is substantial constraint of variations in NCS, while positive selection is more likely to be specific to some particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motifs, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants reported as risk alleles by genome-wide association studies showed signatures of negative selection. Our analyses provided compelling evidence of natural selection forces on noncoding sequences in the human genome and advanced our understanding of their functional potentials that play important roles in disease etiology and human evolution.
11
Kessler MD, Dean MD. Effective population size does not predict codon usage bias in mammals. Ecol Evol 2014; 4:3887-900. [PMID: 25505518 PMCID: PMC4242573 DOI: 10.1002/ece3.1249] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 07/22/2014] [Revised: 08/04/2014] [Accepted: 08/07/2014] [Indexed: 12/20/2022]
Abstract
Synonymous codons are not used at equal frequency throughout the genome, a phenomenon termed codon usage bias (CUB). It is often assumed that interspecific variation in the intensity of CUB is related to species differences in effective population sizes (Ne), with selection on CUB operating less efficiently in species with small Ne. Here, we specifically ask whether variation in Ne predicts differences in CUB in mammals and report two main findings. First, across 41 mammalian genomes, CUB was not correlated with two indirect proxies of Ne (body mass and generation time), even though there was statistically significant evidence of selection shaping CUB across all species. Interestingly, autosomal genes showed higher codon usage bias compared to X-linked genes, and high-recombination genes showed higher codon usage bias compared to low recombination genes, suggesting intraspecific variation in Ne predicts variation in CUB. Second, across six mammalian species with genetic estimates of Ne (human, chimpanzee, rabbit, and three mouse species: Mus musculus, M. domesticus, and M. castaneus), Ne and CUB were weakly and inconsistently correlated. At least in mammals, interspecific divergence in Ne does not strongly predict variation in CUB. One hypothesis is that each species responds to a unique distribution of selection coefficients, confounding any straightforward link between Ne and CUB.
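Unequal use of synonymous codons can be quantified in several ways. The sketch below computes relative synonymous codon usage (RSCU), one common measure, under the standard genetic code; the abstract does not say which CUB statistic the study used, so this choice is an assumption, and the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> Dict[str, float]:
    """Relative synonymous codon usage for each codon of a coding sequence.

    RSCU = observed count * family size / total count of the family.
    A value of 1.0 means the codon is used as often as expected if all
    synonymous codons were used equally. Stop codons, single-codon amino
    acids, and amino acids absent from the sequence are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    families: Dict[str, list] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    result: Dict[str, float] = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or len(family) < 2 or total == 0:
            continue
        for codon in family:
            result[codon] = counts[codon] * len(family) / total
    return result


# Toy sequence: alanine encoded three times by GCT and once by GCC.
toy_rscu = rscu("GCTGCTGCTGCC")
```

In the toy sequence GCT is used three times as often as equal usage would predict, which is the kind of skew the abstract calls codon usage bias.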
Affiliation(s)
- Michael D Kessler: Molecular and Computational Biology, University of Southern California 1050 Childs Way, Los Angeles, California, 90089
- Matthew D Dean: Molecular and Computational Biology, University of Southern California 1050 Childs Way, Los Angeles, California, 90089
12
Soto-Calderón ID, Lee EJ, Jensen-Seaman MI, Anthony NM. Factors affecting the relative abundance of nuclear copies of mitochondrial DNA (numts) in hominoids. J Mol Evol 2012; 75:102-11. [PMID: 23053193 DOI: 10.1007/s00239-012-9519-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 10/15/2011] [Accepted: 09/24/2012] [Indexed: 10/27/2022]
Abstract
Although nuclear copies of mitochondrial DNA (numts) can originate from any portion of the mitochondrial genome, evidence from humans suggests that more variable parts of the mitochondrial genome, such as the mitochondrial control region (MCR), are under-represented in the nucleus. This apparent deficit might arise from the erosion of sequence identity in numts originating from rapidly evolving mitochondrial sequences. However, the extent to which mitochondrial sequence properties affect the number of numts detected in genomic surveys has not been evaluated. In order to address this question, we: (1) conducted exhaustive BLAST searches of MCR numts in three hominoid genomes; (2) assessed numt prevalence across the four MCR sub-domains (HV1, CCD, HV2, and MCR(F)); (3) estimated their insertion rates in great apes (Hominoidea); and (4) examined the relationship between mitochondrial DNA variability and numt prevalence in sequences originating from MCR and coding regions of the mitochondrial genome. Results indicate a marked deficit of numts from the HV2 and MCR(F) sub-domains in all three species. These MCR sub-domains exhibited the highest proportion of variable sites and the lowest number of detected numts per mitochondrial site. Variation in MCR insertion rate between lineages was also observed, with a pronounced burst in recent integrations within chimpanzees and orangutans. A deficit of numts from HV2/MCR(F) was observed regardless of age, whereas HV1 is under-represented only in older numts (>25 million years). Finally, more variable mitochondrial genes also exhibit a lower identity with nuclear copies and, because of this, appear to be under-represented in human numt databases.
Affiliation(s)
- I D Soto-Calderón: Department of Biological Sciences, University of New Orleans, 2000 Lakeshore Drive, New Orleans, LA 70148, USA
13
Sargsyan O. Analytical framework for identifying and differentiating recent hitchhiking and severe bottleneck effects from multi-locus DNA sequence data. PLoS One 2012; 7:e37588. [PMID: 22662176 PMCID: PMC3360760 DOI: 10.1371/journal.pone.0037588] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/06/2012] [Accepted: 04/21/2012] [Indexed: 11/19/2022]
Abstract
Hitchhiking and severe bottleneck effects affect the dynamics of genetic diversity of a population by inducing homogenization at a single locus and at the genome-wide scale, respectively. As a result, identification and differentiation of the signatures of such events from DNA sequence data at a single locus is challenging. This paper develops an analytical framework for identifying and differentiating recent homogenization events at multiple neutral loci in low recombination regions. The dynamics of genetic diversity at a locus after a recent homogenization event are modeled according to the infinite-sites mutation model and the Wright-Fisher model of reproduction with constant population size. In this setting, I derive analytical expressions for the distribution, mean, and variance of the number of polymorphic sites in a random sample of DNA sequences from a locus affected by a recent homogenization event. Based on this framework, three likelihood-ratio based tests are presented for identifying and differentiating recent homogenization events at multiple loci. Lastly, I apply the framework to two data sets. First, I consider human DNA sequences from four non-coding loci on different chromosomes for inferring the evolutionary history of modern human populations. The results suggest, in particular, that recent homogenization events at the loci are identifiable when the effective human population size is 50,000 or greater, in contrast to 10,000, and the estimates of the recent homogenization events agree with the "Out of Africa" hypothesis. Second, I use HIV DNA sequences from HIV-1-infected patients to infer the times of HIV seroconversions. The estimates are contrasted with other estimates derived as the mid-time point between the last HIV-negative and first HIV-positive screening tests. The results show that significant discrepancies can exist between the estimates.
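For orientation, the classical equilibrium baseline for the number of polymorphic sites under the same two modeling assumptions (infinite-sites mutation, constant-size Wright-Fisher reproduction, no recombination) is the standard result below. It is not the paper's post-homogenization expression, which the abstract does not reproduce; the symbols $S_n$, $\theta$, $N_e$, and $\mu$ are introduced here for illustration, with $n$ the sample size and $\mu$ the per-locus mutation rate per generation in a diploid population.

```latex
\mathbb{E}[S_n] = \theta \sum_{i=1}^{n-1} \frac{1}{i},
\qquad
\operatorname{Var}[S_n] = \theta \sum_{i=1}^{n-1} \frac{1}{i}
  + \theta^{2} \sum_{i=1}^{n-1} \frac{1}{i^{2}},
\qquad
\theta = 4 N_e \mu
```

A locus sampled shortly after a homogenization event would be expected to carry fewer polymorphic sites than this equilibrium mean, which is the kind of departure the likelihood-ratio tests described above are designed to detect.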
Affiliation(s)
- Ori Sargsyan
- Theoretical Biology and Biophysics and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America.
|
14
|
Can –omics inform a food safety assessment? Regul Toxicol Pharmacol 2010; 58:S62-70. [DOI: 10.1016/j.yrtph.2010.05.009] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 05/17/2010] [Accepted: 05/20/2010] [Indexed: 02/06/2023]
|
15
|
Features of recent codon evolution: a comparative polymorphism-fixation study. J Biomed Biotechnol 2010; 2010:202918. [PMID: 20622912 PMCID: PMC2896653 DOI: 10.1155/2010/202918] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/14/2010] [Accepted: 03/31/2010] [Indexed: 11/17/2022] Open
Abstract
Features of amino-acid and codon changes can provide us important insights on protein evolution. So far, investigators have often examined mutation patterns at either interspecies fixed substitution or intraspecies nucleotide polymorphism level, but not both. Here, we performed a unique analysis of a combined set of intra-species polymorphisms and inter-species substitutions in human codons. Strong difference in mutational pattern was found at codon positions 1, 2, and 3 between the polymorphism and fixation data. Fixation had strong bias towards increasing the rarest codons but decreasing the most frequently used codons, suggesting that codon equilibrium has not been reached yet. We detected strong CpG effect on CG-containing codons and subsequent suppression by fixation. Finally, we detected the signature of purifying selection against Amid R:U dinucleotides at synonymous dicodon boundaries. Overall, fixation process could effectively and quickly correct the volatile changes introduced by polymorphisms so that codon changes could be gradual and directional and that codon composition could be kept relatively stable during evolution.
|
16
|
Lambert CA, Tishkoff SA. Genetic structure in African populations: implications for human demographic history. COLD SPRING HARBOR SYMPOSIA ON QUANTITATIVE BIOLOGY 2010; 74:395-402. [PMID: 20453204 DOI: 10.1101/sqb.2009.74.053] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 01/19/2023]
Abstract
The continent of Africa is the source of all anatomically modern humans that dispersed across the planet during the past 100,000 years. As such, African populations are characterized by high genetic diversity and low levels of linkage disequilibrium (LD) among loci, as compared to populations from other continents. African populations also possess a number of genetic adaptations that have evolved in response to the diverse climates, diets, geographic environments, and infectious agents that characterize the African continent. Recently, Tishkoff et al. (2009) performed a genome-wide analysis of substructure based on DNA from 2432 Africans from 121 geographically diverse populations. The authors analyzed patterns of variation at 1327 nuclear microsatellite and insertion/deletion markers and identified 14 ancestral population clusters that correlate well with self-described ethnicity and shared cultural or linguistic properties. The results suggest that African populations may have maintained a large and subdivided population structure throughout much of their evolutionary history. In this chapter, we synthesize recent work documenting evidence of African population structure and discuss the implications for inferences about evolutionary history in both African populations and anatomically modern humans as a whole.
Affiliation(s)
- C A Lambert
- Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
|
17
|
Clarke RA, Zhao Z, Guo AY, Roper K, Teng L, Fang ZM, Samaratunga H, Lavin MF, Gardiner RA. New genomic structure for prostate cancer specific gene PCA3 within BMCC1: implications for prostate cancer detection and progression. PLoS One 2009; 4:e4995. [PMID: 19319183 PMCID: PMC2655648 DOI: 10.1371/journal.pone.0004995] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.3] [Received: 11/11/2008] [Accepted: 02/05/2009] [Indexed: 11/20/2022] Open
Abstract
Background The prostate cancer antigen 3 (PCA3/DD3) gene is a highly specific biomarker upregulated in prostate cancer (PCa). In order to understand the importance of PCA3 in PCa we investigated the organization and evolution of the PCA3 gene locus. Methods/Principal Findings We have employed cDNA synthesis, RT-PCR and DNA sequencing to identify 4 new transcription start sites, 4 polyadenylation sites and 2 new differentially spliced exons in an extended form of PCA3. Primers designed from these novel PCA3 exons greatly improve RT-PCR based discrimination between PCa, PCa metastases and BPH specimens. Comparative genomic analyses demonstrated that PCA3 has only recently evolved in an anti-sense orientation within a second gene, BMCC1/PRUNE2. BMCC1 has been shown previously to interact with RhoA and RhoC, determinants of cellular transformation and metastasis, respectively. Using RT-PCR we demonstrated that the longer BMCC1-1 isoform, like PCA3, is upregulated in PCa tissues and metastases and in PCa cell lines. Furthermore, PCA3 and BMCC1-1 levels are responsive to dihydrotestosterone treatment. Conclusions/Significance Upregulation of two new PCA3 isoforms in PCa tissues improves discrimination between PCa and BPH. The functional relevance of this specificity is now of particular interest given PCA3's overlapping association with a second gene BMCC1, a regulator of Rho signalling. Upregulation of PCA3 and BMCC1 in PCa has potential for improved diagnosis.
Affiliation(s)
- Raymond A. Clarke
- Prostate Cancer Institute, Cancer Care Centre, St George Hospital Clinical School of Medicine, University of New South Wales, Kogarah, New South Wales, Australia
- Division of Cancer and Cell Biology, Queensland Institute of Medical Research, Brisbane, Queensland, Australia
- Zhongming Zhao
- Department of Psychiatry and Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, Virginia, United States of America
- An-Yuan Guo
- Department of Psychiatry and Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, Virginia, United States of America
- Kathrein Roper
- Hopkins Marine Station, Stanford University, Stanford, California, United States of America
- Linda Teng
- Division of Cancer and Cell Biology, Queensland Institute of Medical Research, Brisbane, Queensland, Australia
- Zhi-Ming Fang
- Prostate Cancer Institute, Cancer Care Centre, St George Hospital Clinical School of Medicine, University of New South Wales, Kogarah, New South Wales, Australia
- Martin F. Lavin
- Division of Cancer and Cell Biology, Queensland Institute of Medical Research, Brisbane, Queensland, Australia
- University of Queensland Centre for Clinical Research, Brisbane, Australia
- Robert A. Gardiner
- University of Queensland Centre for Clinical Research, Brisbane, Australia
|
18
|
Liu X, Maxwell TJ, Boerwinkle E, Fu YX. Inferring population mutation rate and sequencing error rate using the SNP frequency spectrum in a sample of DNA sequences. Mol Biol Evol 2009; 26:1479-90. [PMID: 19318520 DOI: 10.1093/molbev/msp059] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/25/2022] Open
Abstract
One challenge of analyzing samples of DNA sequences is to account for the nonnegligible polymorphisms produced by error when the sequencing error rate is high or the sample size is large. Specifically, those artificial sequence variations will bias the observed single nucleotide polymorphism (SNP) frequency spectrum, which in turn may further bias the estimators of the population mutation rate theta = 4N mu for diploids. In this paper, we propose a new approach based on the generalized least squares (GLS) method to estimate theta, given a SNP frequency spectrum in a random sample of DNA sequences from a population. With this approach, error rate epsilon can be either known or unknown. In the latter case, epsilon can be estimated given an estimation of theta. Using coalescent simulation, we compared our estimators with other estimators of theta. The results showed that the GLS estimators are more efficient than other theta estimators with error, and the estimation of epsilon is usable in practice when the theta per bp is small. We demonstrate the application of the estimators with 10-kb noncoding region sequence sampled from a human population and provide suggestions for choosing theta estimators with error.
Affiliation(s)
- Xiaoming Liu
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, TX, USA
|
19
|
Lee JY, Edwards SV. Divergence across Australia's Carpentarian Barrier: statistical phylogeography of the red-backed fairy wren (Malurus melanocephalus). Evolution 2008; 62:3117-34. [DOI: 10.1111/j.1558-5646.2008.00543.x] [Citation(s) in RCA: 140] [Impact Index Per Article: 8.8] [Indexed: 02/02/2023]
|
20
|
de Groot NG, Heijmans CMC, de Groot N, Otting N, de Vos-Rouweller AJM, Remarque EJ, Bonhomme M, Doxiadis GGM, Crouau-Roy B, Bontrop RE. Pinpointing a selective sweep to the chimpanzee MHC class I region by comparative genomics. Mol Ecol 2008; 17:2074-88. [PMID: 18346126 DOI: 10.1111/j.1365-294x.2008.03716.x] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]
Abstract
Chimpanzees experienced a reduction of the allelic repertoire at the major histocompatibility complex (MHC) class I A and B loci, which may have been caused by a retrovirus belonging to the simian immunodeficiency virus (SIV) family. Extended MHC haplotypes were defined in a pedigreed chimpanzee colony. Comparison of genetic variation at microsatellite markers mapping inside and outside the Mhc region was carried out in humans and chimpanzees to investigate the genomic extent of the repertoire reduction. Multilocus demographic analyses underscored that chimpanzees indeed experienced a selective sweep that mainly targeted the chromosomal segment carrying the Mhc class I region. Probably due to genetic linkage, the sweep also affected other polymorphic loci, mapping in the close vicinity of the Mhc class I region genes. Nevertheless, although the allelic repertoire at particular Mhc class I and II loci appears to be limited, naturally occurring recombination events allowed the establishment of haplotype diversity after the sweep. However, recombination did not have sufficient time to erase the signal of the selective sweep.
Affiliation(s)
- Natasja G de Groot
- Biomedical Primate Research Centre, Department of Comparative Genetics and Refinement, Lange Kleiweg 139, 2288 GJ Rijswijk, The Netherlands.
|
21
|
Royer B, Soares DC, Barlow PN, Bontrop RE, Roll P, Robaglia-Schlupp A, Blancher A, Levasseur A, Cau P, Pontarotti P, Szepetowski P. Molecular evolution of the human SRPX2 gene that causes brain disorders of the Rolandic and Sylvian speech areas. BMC Genet 2007; 8:72. [PMID: 17942002 PMCID: PMC2151080 DOI: 10.1186/1471-2156-8-72] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 03/29/2007] [Accepted: 10/18/2007] [Indexed: 12/11/2022] Open
Abstract
Background The X-linked SRPX2 gene encodes a Sushi Repeat-containing Protein of unknown function and is mutated in two disorders of the Rolandic/Sylvian speech areas. Since it is linked to defects in the functioning and the development of brain areas for speech production, SRPX2 may thus have participated in the adaptive organization of such brain regions. To address this issue, we have examined the recent molecular evolution of the SRPX2 gene. Results The complete coding region was sequenced in 24 human X chromosomes from worldwide populations and in six representative nonhuman primate species. One single, fixed amino acid change (R75K) has been specifically incorporated in human SRPX2 since the human-chimpanzee split. The R75K substitution occurred in the first sushi domain of SRPX2, only three amino acid residues away from a previously reported disease-causing mutation (Y72S). Three-dimensional structural modeling of the first sushi domain revealed that Y72 and K75 are both situated in the hypervariable loop that is usually implicated in protein-protein interactions. The side-chain of residue 75 is exposed, and is located within an unusual and SRPX-specific protruding extension to the hypervariable loop. The analysis of non-synonymous/synonymous substitution rate (Ka/Ks) ratio in primates was performed in order to test for positive selection during recent evolution. Using the branch models, the Ka/Ks ratio for the human branch was significantly different (p = 0.027) from that of the other branches. In contrast, the branch-site tests did not reach significance. Genetic analysis was also performed by sequencing 9,908 kilobases (kb) of intronic SRPX2 sequences. Despite low nucleotide diversity, neither the HKA (Hudson-Kreitman-Aguadé) test nor the Tajima's D test reached significance. 
Conclusion The R75K human-specific variation occurred in an important functional loop of the first sushi domain of SRPX2, indicating that this evolutionary mutation may have functional importance; however, positive selection for R75K could not be demonstrated. Nevertheless, our data contribute to the first understanding of molecular evolution of the human SRPX2 gene. Further experiments are now required in order to evaluate the possible consequences of R75K on SRPX2 interactions and functioning.
Affiliation(s)
- Barbara Royer
- INSERM UMR 491, Université de la Méditerranée, 13385 Marseille, Cedex 5, France.
|
22
|
Labuda D, Labbé C, Langlois S, Lefebvre JF, Freytag V, Moreau C, Sawicki J, Beaulieu P, Pastinen T, Hudson TJ, Sinnett D. Patterns of variation in DNA segments upstream of transcription start sites. Hum Mutat 2007; 28:441-50. [PMID: 17274005 PMCID: PMC2683062 DOI: 10.1002/humu.20463] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 12/22/2022]
Abstract
It is likely that evolutionary differences among species are driven by sequence changes in regulatory regions. Likewise, polymorphisms in the promoter regions may be responsible for interindividual differences at the level of populations. We present an unbiased survey of genetic variation in 2-kb segments upstream of the transcription start sites of 28 protein-coding genes, characterized in five population groups of different geographic origin. On average, we found 9.1 polymorphisms and 8.8 haplotypes per segment with corresponding nucleotide and haplotype diversities of 0.082% and 58%, respectively. We characterized these segments through different summary statistics, Hardy-Weinberg equilibria fixation index (Fst) estimates, and neutrality tests, as well as by analyzing the distributions of haplotype allelic classes, introduced here to assess the departure from neutrality and examined by coalescent simulations under a simple population model, assuming recombinations or different demography. Our results suggest that genetic diversity in some of these regions could have been shaped by purifying selection and driven by adaptive changes in the other, thus explaining the relatively large variance in the corresponding genetic diversity indices loci. However, some of these effects could be also due to linkage with surrounding sequences, and the neutralists' explanations cannot be ruled out given uncertainty in the underlying demographic histories and the possibility of random effects due to the small size of the studied segments.
Affiliation(s)
- Damian Labuda
- Centre de Recherche, Hôpital Sainte-Justine, Montréal, Quebec, Canada.
|
23
|
Curnoe D. Modern human origins in Australasia: Testing the predictions of competing models. HOMO-JOURNAL OF COMPARATIVE HUMAN BIOLOGY 2007; 58:117-57. [PMID: 17433327 DOI: 10.1016/j.jchb.2006.08.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 04/06/2006] [Accepted: 08/23/2006] [Indexed: 11/28/2022]
Abstract
The evolutionary background to the emergence of modern humans remains controversial. Four models have been proposed to explain this process and each has clearly definable and testable predictions about the geographical origins of early Australians and their possible biological interaction with other Pleistocene populations. The present study considers the phenetic affinities of early Australians from Kow Swamp (KS 1 and KS 5) and Keilor to Pleistocene Africans and Asians from calvarial dimensions. The study includes analyses employing log-transformed and size-corrected (Mosimann variables) data. The strongest signals to emerge are as follows: (1) a phenetic pattern in which Australians are most like each other, (2) all three crania possess a mosaic of archaic and modern features, (3) Kow Swamp crania also show strong affinities to archaic remains, (4) Keilor is more modern than KS 1 and KS 5 and (5) Keilor shows affinities to Pleistocene East Asian modern crania (Liujiang and Upper Cave 101) providing evidence for a broad regional morphology. The results refute the predictions of multi-species replacement models for early Australians but are consistent with single-species models. Combined with published evidence from DNA, the present study indicates that the Assimilation model presently offers the best explanation for the origins of Pleistocene Australians.
Affiliation(s)
- D Curnoe
- Human Origins Group, School of Medical Sciences, Faculty of Medicine, University of New South Wales, Sydney, NSW 2052, Australia.
|
24
|
Baysal BE, Lawrence EC, Ferrell RE. Sequence variation in human succinate dehydrogenase genes: evidence for long-term balancing selection on SDHA. BMC Biol 2007; 5:12. [PMID: 17376234 PMCID: PMC1852088 DOI: 10.1186/1741-7007-5-12] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 09/09/2006] [Accepted: 03/21/2007] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Balancing selection operating for long evolutionary periods at a locus is characterized by the maintenance of distinct alleles because of a heterozygote or rare-allele advantage. The loci under balancing selection are distinguished by their unusually high polymorphism levels. In this report, we provide statistical and comparative genetic evidence suggesting that the SDHA gene is under long-term balancing selection. SDHA encodes the major catalytical subunit (flavoprotein, Fp) of the succinate dehydrogenase enzyme complex (SDH; mitochondrial complex II). The inhibition of Fp by homozygous SDHA mutations or by 3-nitropropionic acid poisoning causes central nervous system pathologies. In contrast, heterozygous mutations in SDHB, SDHC, and SDHD, the other SDH subunit genes, cause hereditary paraganglioma (PGL) tumors, which show constitutive activation of pathways induced by oxygen deprivation (hypoxia). RESULTS We sequenced the four SDH subunit genes (10.8 kb) in 24 African American and 24 European American samples. We also sequenced the SDHA gene (2.8 kb) in 18 chimpanzees. Increased nucleotide diversity distinguished the human SDHA gene from its chimpanzee ortholog and from the PGL genes. Sequence analysis uncovered two common SDHA missense variants and refuted the previous suggestions that these variants originate from different genetic loci. Two highly dissimilar SDHA haplotype clusters were present in intermediate frequencies in both racial groups. The SDHA variation pattern showed statistically significant deviations from neutrality by the Tajima, Fu and Li, Hudson-Kreitman-Aguadé, and Depaulis haplotype number tests. Empirically, the elevated values of the nucleotide diversity (% pi = 0.231) and the Tajima statistics (D = 1.954) in the SDHA gene were comparable with the most outstanding cases for balancing selection in the African American population. CONCLUSION The SDHA gene has a strong signature of balancing selection. 
The SDHA variants that have increased in frequency during human evolution might, by influencing the regulation of cellular oxygen homeostasis, confer protection against certain environmental toxins or pathogens that are prevalent in Africa.
Affiliation(s)
- Bora E Baysal
- Department of Obstetrics, Gynecology and Reproductive Sciences, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Elizabeth C Lawrence
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Robert E Ferrell
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
|
25
|
Schmegner C, Hoegel J, Vogel W, Assum G. A comparison of the variability spectra of two genomic loci in a European group of individuals reveals fundamental differences pointing to selection or a population bottleneck. Ann Hum Genet 2007; 71:370-8. [PMID: 17222291 DOI: 10.1111/j.1469-1809.2006.00342.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022]
Abstract
Knowledge about the variability spectra of neutrally evolving sequences in a population is a prerequisite for the identification of genes, which may have been under positive selection during recent human evolution. Here, we report the results of a re-sequencing project of a presumably neutrally evolving chromosome 22 locus with a severely reduced recombination frequency in a group of 24 individuals of German origin. The comparison of these data with the results of a similar analysis of a chromosome 17 locus revealed striking differences, although the same group of individuals was used. For the chromosome 17 locus two well-separated groups of sequences, a positive value of Tajima's D and a TMRCA of 700,000 years were observed. In contrast, the sequences from the chromosome 22 locus were found to be relatively homogeneous, with no deep splits between subgroups; the obtained value for Tajima's D was negative and the TMRCA was only 260,000 years. These discrepancies may be explained by selection or demographic processes. Regarding demography, the most plausible explanation is the assumption of a severe bottleneck in the history of the European population: in the case of the chromosome 17 locus two ancient lineages passed this bottleneck; for the chromosome 22 locus it was only one ancient lineage.
Affiliation(s)
- C Schmegner
- Institut für Humangenetik, Universität Ulm, Albert-Einstein-Allee 11, D-89081 Ulm, Germany
|
26
|
Marchani EE, Rogers AR, O'Rourke DH. Brief communication: The Thule migration: Rejecting population histories using computer simulation. AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 2007; 134:281-4. [PMID: 17568448 DOI: 10.1002/ajpa.20650] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Abstract
Locked within our genetic code are the histories of our genes and the genes of our ancestors. Deciphering a population's history from genetic data often involves lengthy investigations of many loci for many individuals. We test hypothetical population histories of the Thule expansion using a new coalescent simulation method that uses little more than mitochondrial haplogroup data. This new methodology rejects a severe bottleneck at expansion and reveals the range of probable population histories on which to focus future research.
Affiliation(s)
- E E Marchani
- Department of Anthropology, University of Utah, Salt Lake City, UT 84112, USA.
|
27
|
Seo D, Jiang C, Zhao Z. A novel statistical method to estimate the effective SNP size in vertebrate genomes and categorized genomic regions. BMC Genomics 2006; 7:329. [PMID: 17196097 PMCID: PMC1769377 DOI: 10.1186/1471-2164-7-329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2006] [Accepted: 12/29/2006] [Indexed: 11/29/2022] Open
Abstract
Background The local environment of single nucleotide polymorphisms (SNPs) contains abundant genetic information for the study of mechanisms of mutation, genome evolution, and causes of diseases. Recent studies revealed that neighboring-nucleotide biases on SNPs were strong and the genome-wide bias patterns could be represented by a small subset of the total SNPs. It remains unsolved for the estimation of the effective SNP size, the number of SNPs that are sufficient to represent the bias patterns observed from the whole SNP data. Results To estimate the effective SNP size, we developed a novel statistical method, SNPKS, which considers both the statistical and biological significances. SNPKS consists of two major steps: to obtain an initial effective size by the Kolmogorov-Smirnov test (KS test) and to find an intermediate effective size by interval evaluation. The SNPKS algorithm was implemented in computer programs and applied to the real SNP data. The effective SNP size was estimated to be 38,200, 39,300, 38,000, and 38,700 in the human, chimpanzee, dog, and mouse genomes, respectively, and 39,100, 39,600, 39,200, and 42,200 in human intergenic, genic, intronic, and CpG island regions, respectively. Conclusion SNPKS is the first statistical method to estimate the effective SNP size. It runs efficiently and greatly outperforms the algorithm implemented in SNPNB. The application of SNPKS to the real SNP data revealed the similar small effective SNP size (38,000 – 42,200) in the human, chimpanzee, dog, and mouse genomes as well as in human genomic regions. The findings suggest strong influence of genetic factors across vertebrate genomes.
|
28
|
Directionality of point mutation and 5-methylcytosine deamination rates in the chimpanzee genome. BMC Genomics 2006; 7:316. [PMID: 17166280 PMCID: PMC1764022 DOI: 10.1186/1471-2164-7-316] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 09/04/2006] [Accepted: 12/13/2006] [Indexed: 12/02/2022] Open
Abstract
Background The pattern of point mutation is important for studying mutational mechanisms, genome evolution, and diseases. Previous studies of mutation direction were largely based on substitution data from a limited number of loci. To date, there is no genome-wide analysis of mutation direction or methylation-dependent transition rates in the chimpanzee or its categorized genomic regions. Results In this study, we performed a detailed examination of mutation direction in the chimpanzee genome and its categorized genomic regions using 588,918 SNPs whose ancestral alleles could be inferred by mapping them to human genome sequences. The C→T (G→A) changes occurred most frequently in the chimpanzee genome. Each type of transition occurred approximately four times more frequently than each type of transversion. Notably, the frequency of C→T (G→A) was the highest in exons among the genomic categories regardless of whether we calculated directly, normalized with the nucleotide content, or removed the SNPs involved in the CpG effect. Moreover, the directionality of the point mutation in exons and CpG islands were opposite relative to their corresponding intergenic regions, indicating that different forces govern the nucleotide changes. Our analysis suggests that the GC content is not in equilibrium in the chimpanzee genome. Further quantitative analysis revealed that the 5-methylcytosine deamination rates at CpG sites were highly dependent on the local GC content and the lengths of SNP flanking sequences and varied among categorized genomic regions. Conclusion We present the first mutational spectrum, estimated by three different approaches, in the chimpanzee genome. Our results provide detailed information on recent nucleotide changes and methylation-dependent transition rates in the chimpanzee genome after its split from the human. 
These results have important implications for understanding genome composition evolution, mechanisms of point mutation, and other genetic factors such as selection, biased codon usage, biased gene conversion, and recombination.
|
29
|
Newman RM, Hall L, Connole M, Chen GL, Sato S, Yuste E, Diehl W, Hunter E, Kaur A, Miller GM, Johnson WE. Balancing selection and the evolution of functional polymorphism in Old World monkey TRIM5alpha. Proc Natl Acad Sci U S A 2006; 103:19134-9. [PMID: 17142324 PMCID: PMC1679755 DOI: 10.1073/pnas.0605838103] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.3] [Indexed: 01/07/2023] Open
Abstract
Retroviral restriction factor TRIM5alpha exhibits a high degree of sequence variation among primate species. It has been proposed that this diversity is the cumulative result of ancient, lineage-specific episodes of positive selection. Here, we describe the contribution of within-species variation to the evolution of TRIM5alpha. Sampling within two geographically distinct Old World monkey species revealed extensive polymorphism, including individual polymorphisms that predate speciation (shared polymorphism). In some instances, alleles were more closely related to orthologues of other species than to one another. Both silent and nonsynonymous changes clustered in two domains. Functional assays revealed consequences of polymorphism, including differential restriction of a small panel of retroviruses by very similar alleles. Together, these features indicate that the primate TRIM5alpha locus has evolved under balancing selection. Except for the MHC there are few, if any, examples of long-term balancing selection in primates. Our results suggest a complex evolutionary scenario, in which fixation of lineage-specific adaptations is superimposed on a subset of critical polymorphisms that predate speciation events and have been maintained by balancing selection for millions of years.
Affiliation(s)
- Ruchi M. Newman
- Department of Microbiology and Molecular Genetics, Harvard Medical School, Southborough, MA 01772
- Laura Hall
- Department of Microbiology and Molecular Genetics, Harvard Medical School, Southborough, MA 01772
- Guo-Lin Chen
- Neurochemistry, New England Primate Research Center, Harvard Medical School, Southborough, MA 01772
- Shuji Sato
- Department of Microbiology and Molecular Genetics, Harvard Medical School, Southborough, MA 01772
- Eloisa Yuste
- Department of Microbiology and Molecular Genetics, Harvard Medical School, Southborough, MA 01772
- William Diehl
- Department of Microbiology and Molecular Genetics, Harvard Medical School, Southborough, MA 01772
- Emory Vaccine Research Center, Emory University, Atlanta, GA 30329
- Eric Hunter
- Department of Microbiology and Molecular Genetics, Harvard Medical School, Southborough, MA 01772
- Emory Vaccine Research Center, Emory University, Atlanta, GA 30329
- Gregory M. Miller
- Neurochemistry, New England Primate Research Center, Harvard Medical School, Southborough, MA 01772
- Welkin E. Johnson
- Department of Microbiology and Molecular Genetics, Harvard Medical School, Southborough, MA 01772
- To whom correspondence should be addressed at: New England Primate Research Center, One Pine Hill Drive, Box 9102, Southborough, MA 01772-9102.
|
30
|
Abstract
Analyses of recently acquired genomic sequence data are leading to important insights into the early evolution of anatomically modern humans, as well as into the more recent demographic processes that accompanied the global radiation of Homo sapiens. Some of the new results contradict early, but still influential, conclusions that were based on analyses of gene trees from mitochondrial DNA and Y-chromosome sequences. In this review, we discuss the different genetic and statistical methods that are available for studying human population history, and identify the most plausible models of human evolution that can accommodate the contrasting patterns observed at different loci throughout the genome.
Affiliation(s)
- Daniel Garrigan
- Division of Biotechnology, University of Arizona, Tucson, AZ 85721, USA
|
31
|
Foll M, Gaggiotti O. Identifying the environmental factors that determine the genetic structure of populations. Genetics 2006; 174:875-91. [PMID: 16951078 PMCID: PMC1602080 DOI: 10.1534/genetics.106.059451] [Citation(s) in RCA: 278] [Impact Index Per Article: 15.4] [Indexed: 11/18/2022] Open
Abstract
The study of population genetic structure is a fundamental problem in population biology because it helps us obtain a deeper understanding of the evolutionary process. One of the issues most assiduously studied in this context is the assessment of the relative importance of environmental factors (geographic distance, language, temperature, altitude, etc.) on the genetic structure of populations. The most widely used method to address this question is the multivariate Mantel test, a nonparametric method that calculates a correlation coefficient between a dependent matrix of pairwise population genetic distances and one or more independent matrices of environmental differences. Here we present a hierarchical Bayesian method that estimates F(ST) values for each local population and relates them to environmental factors using a generalized linear model. The method is demonstrated by applying it to two data sets, a data set for a population of the argan tree and a human data set comprising 51 populations distributed worldwide. We also carry out a simulation study to investigate the performance of the method and find that it can correctly identify the factors that play a role in the structuring of genetic diversity under a wide range of scenarios.
Affiliation(s)
- Matthieu Foll
- Laboratoire d'Ecologie Alpine (LECA), UMR CNRS 5553, 38 041 Grenoble Cedex 09, France
|
32
|
|
33
|
Mountain JL, Ramakrishnan U. Impact of human population history on distributions of individual-level genetic distance. Hum Genomics 2006; 2:4-19. [PMID: 15814064 PMCID: PMC3525116 DOI: 10.1186/1479-7364-2-1-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022] Open
Abstract
Summaries of human genomic variation shed light on human evolution and provide a framework for biomedical research. Variation is often summarised in terms of one or a few statistics (eg F(ST) and gene diversity). Now that multilocus genotypes for hundreds of autosomal loci are available for thousands of individuals, new approaches are applicable. Recently, trees of individuals and other clustering approaches have demonstrated the power of an individual-focused analysis. We propose analysing the distributions of genetic distances between individuals. Each distribution, or common ancestry profile (CAP), is unique to an individual, and does not require a priori assignment of individuals to populations. Here, we consider a range of models of population history and, using coalescent simulation, reveal the potential insights gained from a set of CAPs. Information lies in the shapes of individual profiles--sometimes captured by variance of individual CAPs--and the variation across profiles. Analysis of short tandem repeat genotype data for over 1,000 individuals from 52 populations is consistent with dramatic differences in population histories across human groups.
Affiliation(s)
- Joanna L Mountain
- Department of Anthropological Sciences, Stanford University, Stanford, CA 94305-2117, USA.
|
34
|
Jiang C, Zhao Z. Mutational spectrum in the recent human genome inferred by single nucleotide polymorphisms. Genomics 2006; 88:527-34. [PMID: 16860534 DOI: 10.1016/j.ygeno.2006.06.003] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 04/03/2006] [Revised: 06/01/2006] [Accepted: 06/06/2006] [Indexed: 01/09/2023]
Abstract
So far, there is no genome-wide estimation of the mutational spectrum in humans. In this study, we systematically examined the directionality of the point mutations and maintenance of GC content in the human genome using approximately 1.8 million high-quality human single nucleotide polymorphisms and their ancestral sequences in chimpanzees. The frequency of C-->T (G-->A) changes was the highest among all mutation types and the frequency of each type of transition was approximately fourfold that of each type of transversion. In intergenic regions, when the GC content increased, the frequency of changes from G or C increased. In exons, the frequency of G:C-->A:T was the highest among the genomic categories and was contributed mainly by the frequent mutations at the CpG sites. In contrast, mutations at the CpG sites, or CpG-->TpG/CpA mutations, occurred less frequently in the CpG islands relative to intergenic regions with similar GC content. Our results suggest that the GC content is overall not in equilibrium in the human genome, with a trend toward shifting the human genome to be AT rich and shifting the GC content of a region to approach the genome average. Our results, which differ from previous estimates based on limited loci or on the rodent lineage, provide the first representative and reliable mutational spectrum in the recent human genome and categorized genomic regions.
Affiliation(s)
- Cizhong Jiang
- Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298-0126, USA
|
35
|
Zhao Z, Yu N, Fu YX, Li WH. Nucleotide variation and haplotype diversity in a 10-kb noncoding region in three continental human populations. Genetics 2006; 174:399-409. [PMID: 16783003 PMCID: PMC1569808 DOI: 10.1534/genetics.106.060301] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open
Abstract
Noncoding regions are usually less subject to natural selection than coding regions and so may be more useful for studying human evolution. The recent surveys of worldwide DNA variation in four 10-kb noncoding regions revealed many interesting but also some incongruent patterns. Here we studied another 10-kb noncoding region, which is in 6p22. Sixty-six single-nucleotide polymorphisms were found among the 122 worldwide human sequences, resulting in 46 genotypes, from which 48 haplotypes were inferred. The distribution patterns of DNA variation, genotypes, and haplotypes suggest rapid population expansion in relatively recent times. The levels of polymorphism within human populations and divergence between humans and chimpanzees at this locus were generally similar to those for the other four noncoding regions. Fu and Li's tests rejected the neutrality assumption in the total sample and in the African sample but Tajima's test did not reject neutrality. A detailed examination of the contributions of various types of mutations to the parameters used in the neutrality tests clarified the discrepancy between these test results. The age estimates suggest a relatively young history in this region. Combining three autosomal noncoding regions, we estimated the long-term effective population size of humans to be 11,000 +/- 2800 using Tajima's estimator and 17,600 +/- 4700 using Watterson's estimator and the age of the most recent common ancestor to be 860,000 +/- 258,000 years ago.
Affiliation(s)
- Zhongming Zhao
- Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia 23298, USA
|
36
|
Eswaran V, Harpending H, Rogers AR. Genomics refutes an exclusively African origin of humans. J Hum Evol 2006; 49:1-18. [PMID: 15878780 DOI: 10.1016/j.jhevol.2005.02.006] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.1] [Received: 06/30/2004] [Revised: 12/14/2004] [Accepted: 02/08/2005] [Indexed: 11/16/2022]
Abstract
Ten years ago, evidence from genetics gave strong support to the "recent African origin" view of the evolution of modern humans, which posits that Homo sapiens arose as a new species in Africa and subsequently spread, leading to the extinction of other archaic human species. Subsequent data from the nuclear genome not only fail to support this model, they do not support any simple model of human demographic history. In this paper, we study a process in which the modern human phenotype originates in Africa and then advances across the world by local demic diffusion, hybridization, and natural selection. While the multiregional model of human origins posits a number of independent single locus selective sweeps, and the "out of Africa" model posits a sweep of a new species, we study the intermediate case of a phenotypic sweep. Numerical simulations of this process replicate many of the seemingly contradictory features of the genetic data, and suggest that as much as 80% of nuclear loci have assimilated genetic material from non-African archaic humans.
Affiliation(s)
- Vinayak Eswaran
- Department of Mechanical Engineering, Indian Institute of Technology, Kanpur, India 208016.
|
37
|
Khelifi A, Meunier J, Duret L, Mouchiroud D. GC content evolution of the human and mouse genomes: insights from the study of processed pseudogenes in regions of different recombination rates. J Mol Evol 2006; 62:745-52. [PMID: 16752212 DOI: 10.1007/s00239-005-0186-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 07/26/2005] [Accepted: 02/02/2006] [Indexed: 01/27/2023]
Abstract
Processed pseudogenes are generated by reverse transcription of a functional gene. They are generally nonfunctional after their insertion and, as a consequence, are no longer subjected to the selective constraints associated with functional genes. Because of this property they can be used as neutral markers in molecular evolution. In this work, we investigated the relationship between the evolution of GC content in recently inserted processed pseudogenes and the local recombination pattern in two mammalian genomes (human and mouse). We confirmed, using original markers, that recombination drives GC content in the human genome and we demonstrated that this is also true for the mouse genome despite lower recombination rates. Finally, we discussed the consequences on isochores evolution and the contrast between the human and the mouse pattern.
Affiliation(s)
- Adel Khelifi
- Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558, Université Claude Bernard-Lyon 1, 16 rue Raphael Dubois, 69622 Villeurbanne Cedex, France.
|
38
|
Carson AR, Cheung J, Scherer SW. Duplication and relocation of the functional DPY19L2 gene within low copy repeats. BMC Genomics 2006; 7:45. [PMID: 16526957 PMCID: PMC1475853 DOI: 10.1186/1471-2164-7-45] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Received: 01/13/2006] [Accepted: 03/09/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Low copy repeats (LCRs) are thought to play an important role in recent gene evolution, especially when they facilitate gene duplications. Duplicate genes are fundamental to adaptive evolution, providing substrates for the development of new or shared gene functions. Moreover, silencing of duplicate genes can have an indirect effect on adaptive evolution by causing genomic relocation of functional genes. These changes are theorized to have been a major factor in speciation. RESULTS Here we present a novel example showing functional gene relocation within a LCR. We characterize the genomic structure and gene content of eight related LCRs on human Chromosomes 7 and 12. Two members of a novel transmembrane gene family, DPY19L, were identified in these regions, along with six transcribed pseudogenes. One of these genes, DPY19L2, is found on Chromosome 12 and is not syntenic with its mouse orthologue. Instead, the human locus syntenic to mouse Dpy19l2 contains a pseudogene, DPY19L2P1. This indicates that the ancestral copy of this gene has been silenced, while the descendant copy has remained active. Thus, the functional copy of this gene has been relocated to a new genomic locus. We then describe the expansion and evolution of the DPY19L gene family from a single gene found in invertebrate animals. Ancient duplications have led to multiple homologues in different lineages, with three in fish, frogs and birds and four in mammals. CONCLUSION Our results show that the DPY19L family has expanded throughout the vertebrate lineage and has undergone recent primate-specific evolution within LCRs.
Affiliation(s)
- Andrew R Carson
- Department of Genetics and Genomic Biology, Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Medical and Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Joseph Cheung
- Department of Genetics and Genomic Biology, Hospital for Sick Children, Toronto, Ontario, Canada
- Stephen W Scherer
- Department of Genetics and Genomic Biology, Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Medical and Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
|
39
|
Abstract
Mitochondrial DNA and microsatellite sequences are powerful genetic markers for inferring the genealogy and the population genetic structure of animals but they have only limited resolution for organisms that display low genetic variability due to recent strong bottlenecks. An alternative source of data for deciphering migrations and origins in genetically uniform hosts can be provided by some of their microbes, if their evolutionary history correlates closely with that of the host. In this review, we first discuss how a variety of viruses, and the bacterium Helicobacter pylori, can be used as genetic tracers for one of the most intensively studied species, Homo sapiens. Then, we review statistical problems and limitations that affect the calculation of particular population genetic parameters for these microbes, such as mutation rates, with particular emphasis on the effects of recombination, selection and mode of transmission. Finally, we extend the discussion to other host-parasite systems and advocate the adoption of an integrative approach to both sampling and analysis.
Affiliation(s)
- Thierry Wirth
- Department of Biology, Lehrstuhl für Zoologie und Evolutionsbiologie, University Konstanz, 78457 Konstanz, Germany.
|
40
|
Soejima M, Tachida H, Tsuneoka M, Takenaka O, Kimura H, Koda Y. Nucleotide Sequence Analyses of Human Complement 6 (C6) Gene Suggest Balancing Selection. Ann Hum Genet 2005. [DOI: 10.1046/j.1469-1809.2005.00165.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
|
41
|
Schmegner C, Hoegel J, Vogel W, Assum G. Genetic variability in a genomic region with long-range linkage disequilibrium reveals traces of a bottleneck in the history of the European population. Hum Genet 2005; 118:276-86. [PMID: 16184404 DOI: 10.1007/s00439-005-0056-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 07/07/2005] [Accepted: 08/10/2005] [Indexed: 11/28/2022]
Abstract
The inference of the demographic history of populations from genetic variability data is not only of academic interest. It also provides background information for the identification of genes which may have played a role in human evolution or in the aetiology of human disease. To obtain a clear picture of this background, it is necessary to compare data obtained from a number of genomic loci. Due to its very low recombination rate, the NF1 gene region can be regarded as a further suitable locus. A combined resequencing and SNP typing project in a European population disclosed the presence of only two well separated subgroups of NF1 sequences. Statistical analysis revealed a bimodal distribution of the pairwise differences, a positive value of Tajima's D and a TMRCA of 700,000 years for the whole sample, and pairwise differences indicative for a growing population and TMRCAs of 130,000 to 150,000 years for the subgroups. Together, the data lead to a model that the recent European population went through a bottleneck during the last 150,000 years of its history. Regarding the given timeframe, this bottleneck could either reflect a speciation event which led to the anatomically modern human (AMH), or a severe reduction of the population size during the emigration of AMHs out of Africa or the immigration into Europe.
Affiliation(s)
- Claudia Schmegner
- Abteilung Humangenetik, Universität Ulm, Albert-Einstein-Allee 11, 89081, Ulm, Germany
|
42
|
Cutter AD. Nucleotide polymorphism and linkage disequilibrium in wild populations of the partial selfer Caenorhabditis elegans. Genetics 2005; 172:171-84. [PMID: 16272415 PMCID: PMC1456145 DOI: 10.1534/genetics.105.048207] [Citation(s) in RCA: 131] [Impact Index Per Article: 6.9] [Indexed: 11/18/2022] Open
Abstract
An understanding of the relative contributions of different evolutionary forces on an organism's genome requires an accurate description of the patterns of genetic variation within and between natural populations. To this end, I report a survey of nucleotide polymorphism in six loci from 118 strains of the nematode Caenorhabditis elegans. These strains derive from wild populations of several regions within France, Germany, and new localities in Scotland, in addition to stock center isolates. Overall levels of silent-site diversity are low within and between populations of this self-fertile species, averaging 0.2% in European samples and 0.3% worldwide. Population structure is present despite a lack of association of sequences with geography, and migration appears to occur at all geographic scales. Linkage disequilibrium is extensive in the C. elegans genome, extending even between chromosomes. Nevertheless, recombination is clearly present in the pattern of polymorphisms, indicating that outcrossing is an infrequent, but important, feature in this species' ancestry. The range of outcrossing rates consistent with the data is inferred from linkage disequilibrium, using "scattered" samples representing the collecting phase of the coalescent process in a subdivided population. I propose that genetic variation in this species is shaped largely by population subdivision due to self-fertilization coupled with long- and short-range migration between subpopulations.
Affiliation(s)
- Asher D Cutter
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
|
43
|
Barreiro LB, Patin E, Neyrolles O, Cann HM, Gicquel B, Quintana-Murci L. The heritage of pathogen pressures and ancient demography in the human innate-immunity CD209/CD209L region. Am J Hum Genet 2005; 77:869-86. [PMID: 16252244 PMCID: PMC1271393 DOI: 10.1086/497613] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.6] [Received: 07/19/2005] [Accepted: 08/26/2005] [Indexed: 10/26/2022] Open
Abstract
The innate immunity system constitutes the first line of host defense against pathogens. Two closely related innate immunity genes, CD209 and CD209L, are particularly interesting because they directly recognize a plethora of pathogens, including bacteria, viruses, and parasites. Both genes, which result from an ancient duplication, possess a neck region, made up of seven repeats of 23 amino acids each, known to play a major role in the pathogen-binding properties of these proteins. To explore the extent to which pathogens have exerted selective pressures on these innate immunity genes, we resequenced them in a group of samples from sub-Saharan Africa, Europe, and East Asia. Moreover, variation in the number of repeats of the neck region was defined in the entire Human Genome Diversity Panel for both genes. Our results, which are based on diversity levels, neutrality tests, population genetic distances, and neck-region length variation, provide genetic evidence that CD209 has been under a strong selective constraint that prevents accumulation of any amino acid changes, whereas CD209L variability has most likely been shaped by the action of balancing selection in non-African populations. In addition, our data point to the neck region as the functional target of such selective pressures: CD209 presents a constant size in the neck region populationwide, whereas CD209L presents an excess of length variation, particularly in non-African populations. An additional interesting observation came from the coalescent-based CD209 gene tree, whose binary topology and time depth (approximately 2.8 million years ago) are compatible with an ancestral population structure in Africa. Altogether, our study has revealed that even a short segment of the human genome can uncover an extraordinarily complex evolutionary history, including different pathogen pressures on host genes as well as traces of admixture among archaic hominid populations.
Affiliation(s)
- Luis B Barreiro
- Centre National de la Recherche Scientifique FRE 2849, Unit of Molecular Prevention and Therapy of Human Diseases, Institut Pasteur, 25, 75724 Paris Cedex 15, France
|
44
|
|
45
|
Tarazona-Santos E, Tishkoff SA. Divergent patterns of linkage disequilibrium and haplotype structure across global populations at the interleukin-13 (IL13) locus. Genes Immun 2005; 6:53-65. [PMID: 15602587 DOI: 10.1038/sj.gene.6364149] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 11/08/2022]
Abstract
Interleukin-13 (IL-13) is a cytokine involved in Th2 immune response, which plays a role in susceptibility to infection by extracellular parasites as well as complex diseases of the immune system such as asthma and allergies. To determine the pattern of genetic diversity at the IL13 gene, we sequenced 3950 bp encompassing the IL13 gene and its promoter in 264 chromosomes from individuals originating from East and West Africa, Europe, China and South America. Thirty-one single-nucleotide polymorphisms (SNPs) arranged in 88 haplotypes were identified, including the nonsynonymous substitution Arg130Gln in exon 4, which differs in frequency across ethnic groups. We show that genetic diversity and linkage disequilibrium (LD) are not evenly distributed across the gene and that sites in the 5' and 3' regions of the gene show strong differentiation among continental groups. We observe a divergent pattern of haplotype variation and LD across geographic regions and we identify a set of htSNPs that will be useful for functional genetic association studies of complex disease. We use several statistical tests to distinguish the effects of natural selection and demographic history on patterns of genetic diversity at the IL13 locus.
Affiliation(s)
- E Tarazona-Santos
- Department of Biology, University of Maryland, College Park, MD 80742, USA
|
46
|
Garrigan D, Mobasher Z, Kingan SB, Wilder JA, Hammer MF. Deep haplotype divergence and long-range linkage disequilibrium at Xp21.1 provide evidence that humans descend from a structured ancestral population. Genetics 2005; 170:1849-56. [PMID: 15937130 PMCID: PMC1449746 DOI: 10.1534/genetics.105.041095] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.2] [Indexed: 11/18/2022] Open
Abstract
Fossil evidence links human ancestry with populations that evolved modern gracile morphology in Africa 130,000-160,000 years ago. Yet fossils alone do not provide clear answers to the question of whether the ancestors of all modern Homo sapiens comprised a single African population or an amalgamation of distinct archaic populations. DNA sequence data have consistently supported a single-origin model in which anatomically modern Africans expanded and completely replaced all other archaic hominin populations. Aided by a novel experimental design, we present the first genetic evidence that statistically rejects the null hypothesis that our species descends from a single, historically panmictic population. In a global sample of 42 X chromosomes, two African individuals carry a lineage of noncoding 17.5-kb sequence that has survived for >1 million years without any clear traces of ongoing recombination with other lineages at this locus. These patterns of deep haplotype divergence and long-range linkage disequilibrium are best explained by a prolonged period of ancestral population subdivision followed by relatively recent interbreeding. This inference supports human evolution models that incorporate admixture between divergent African branches of the genus Homo.
Affiliation(s)
- Daniel Garrigan
- Genomic Analysis and Technology Core, University of Arizona, Tucson, Arizona 85721, USA
|
47
|
Zhang F, Zhao Z. The influence of neighboring-nucleotide composition on single nucleotide polymorphisms (SNPs) in the mouse genome and its comparison with human SNPs. Genomics 2005; 84:785-95. [PMID: 15475257 DOI: 10.1016/j.ygeno.2004.06.015] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 05/20/2004] [Accepted: 06/28/2004] [Indexed: 11/23/2022]
Abstract
We analyzed the neighboring-nucleotide composition of 433,192 biallelic substitutions, representing the largest public collection of SNPs across the mouse genome. Large neighboring-nucleotide biases relative to the genome- or chromosome-specific average were observed at the immediate adjacent sites and small biases extended farther from the substitution site. For all substitutions, the biases for A, C, G, and T were 0.21, 2.63, 0.71, and -3.55%, respectively, on the immediate adjacent 5' site and -3.67, 0.75, 2.69, and 0.23%, respectively, on the immediate adjacent 3' side. Further examination of the six categories of substitution revealed that the neighboring-nucleotide patterns for transitions were strongly influenced by the hypermutability of dinucleotide CpG and the neighboring effects on transversions were complex. Probability of a transversion increased with increasing A + T content of the two immediate adjacent sites, which was similarly observed in the human and Arabidopsis genomes. Overall, the bias patterns for the neighboring nucleotides in the mouse and human genomes were essentially the same; however, the extent of the biases was notably less in mice. Our results provide the first comprehensive view of the neighboring-nucleotide effects in the mouse genome and are important for understanding the mutational mechanisms and sequence evolution in the mammalian genomes.
Affiliation(s)
- Fengkai Zhang
- Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, PO Box 980126, Richmond, VA 23298-0126, USA
|
48
|
Harding RM, McVean G. A structured ancestral population for the evolution of modern humans. Curr Opin Genet Dev 2005; 14:667-74. [PMID: 15531162 DOI: 10.1016/j.gde.2004.08.010] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.9] [Indexed: 11/29/2022]
Abstract
The view that modern humans evolved through a bottleneck from a single founding group of archaic Homo is being challenged by new analyses of contemporary genetic variation. A wide range of middle to late Pleistocene ages for gene genealogies and evidence for early population structures point to a diverse and scattered ancestry associated with a metapopulation history of local extinctions, re-colonization and admixture. A different balance of the same processes has shaped chimpanzee diversity.
Affiliation(s)
- Rosalind M Harding
- Biological Anthropology Unit and Statistics Department, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK.
|
49
|
de Groot NG, Garcia CA, Verschoor EJ, Doxiadis GGM, Marsh SGE, Otting N, Bontrop RE. Reduced MIC gene repertoire variation in West African chimpanzees as compared to humans. Mol Biol Evol 2005; 22:1375-85. [PMID: 15758205 DOI: 10.1093/molbev/msi127] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Indexed: 01/04/2023] Open
Abstract
The human major histocompatibility complex class I chain-related (MIC) genes are members of a multicopy family showing similarity to the classical HLA-A, HLA-B, and HLA-C genes. Only the MICA and MICB genes produce functional transcripts. In chimpanzees, however, only one MIC gene is expressed; it shows an intermediate character, resulting from a deletion that fused the MICA and MICB gene segments together. The present population study illustrates that all chimpanzee haplotypes sampled possess the hybrid MICA/B gene. In contrast to the human situation, this gene displays reduced allelic variation. The observed repertoire reduction of the chimpanzee MICA/B gene is in conformity with the severe repertoire condensation documented for Patr-B locus lineages, probably due to the close proximity of both genes.
Affiliation(s)
- Natasja G de Groot
- Department of Comparative Genetics and Refinement, Biomedical Primate Research Centre, Rijswijk, The Netherlands.
|
50
|
Nachman MW, D'Agostino SL, Tillquist CR, Mobasher Z, Hammer MF. Nucleotide variation at Msn and Alas2, two genes flanking the centromere of the X chromosome in humans. Genetics 2005; 167:423-37. [PMID: 15166166 PMCID: PMC1470878 DOI: 10.1534/genetics.167.1.423] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022] Open
Abstract
The centromeric region of the X chromosome in humans experiences low rates of recombination over a considerable physical distance. In such a region, the effects of selection may extend to linked sites that are far away. To investigate the effects of this recombinational environment on patterns of nucleotide variability, we sequenced 4581 bp at Msn and 4697 bp at Alas2, two genes situated on either side of the X chromosome centromere, in a worldwide sample of 41 men, as well as in one common chimpanzee and one orangutan. To investigate patterns of linkage disequilibrium (LD) across the centromere, we also genotyped several informative sites from each gene in 120 men from sub-Saharan Africa. By studying X-linked loci in males, we were able to recover haplotypes and study long-range patterns of LD directly. Overall patterns of variability were remarkably similar at these two loci. Both loci exhibited (i) very low levels of nucleotide diversity (among the lowest seen in the human genome); (ii) a strong skew in the distribution of allele frequencies, with an excess of both very-low and very-high-frequency derived alleles in non-African populations; (iii) much less variation in the non-African than in the African samples; (iv) very high levels of population differentiation; and (v) complete LD among all sites within loci. We also observed significant LD between Msn and Alas2 in Africa, despite the fact that they are separated by approximately 10 Mb. These observations are difficult to reconcile with a simple demographic model but may be consistent with positive and/or purifying selection acting on loci within this large region of low recombination.
Affiliation(s)
- Michael W Nachman
- Department of Ecology and Evolutionary Biology, Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA.
|