1
|
Jordan IK, Rogozin IB, Glazko GV, Koonin EV. Origin of a substantial fraction of human regulatory sequences from transposable elements. Trends Genet 2003; 19:68-72. [PMID: 12547512 DOI: 10.1016/s0168-9525(02)00006-9] [Citation(s) in RCA: 414] [Impact Index Per Article: 18.8] [Indexed: 10/27/2022]
Abstract
Transposable elements (TEs) are abundant in mammalian genomes and have potentially contributed to their hosts' evolution by providing novel regulatory or coding sequences. We surveyed different classes of regulatory region in the human genome to assess systematically the potential contribution of TEs to gene regulation. Almost 25% of the analyzed promoter regions contain TE-derived sequences, including many experimentally characterized cis-regulatory elements. Scaffold/matrix attachment regions (S/MARs) and locus control regions (LCRs) that are involved in the simultaneous regulation of multiple genes also contain numerous TE-derived sequences. Thus, TEs have probably contributed substantially to the evolution of both gene-specific and global patterns of human gene regulation.
|
Review |
22 |
414 |
2
|
Jjingo D, Conley AB, Yi SV, Lunyak VV, Jordan IK. On the presence and role of human gene-body DNA methylation. Oncotarget 2012; 3:462-74. [PMID: 22577155 PMCID: PMC3380580 DOI: 10.18632/oncotarget.497] [Citation(s) in RCA: 366] [Impact Index Per Article: 28.2] [Received: 03/29/2012] [Accepted: 04/28/2012] [Indexed: 01/23/2023] Open
Abstract
DNA methylation of promoter sequences is a repressive epigenetic mark that down-regulates gene expression. However, DNA methylation is more prevalent within gene-bodies than seen for promoters, and gene-body methylation has been observed to be positively correlated with gene expression levels. This paradox remains unexplained, and accordingly the role of DNA methylation in gene-bodies is poorly understood. We addressed the presence and role of human gene-body DNA methylation using a meta-analysis of human genome-wide methylation, expression and chromatin data sets. Methylation is associated with transcribed regions as genic sequences have higher levels of methylation than intergenic or promoter sequences. We also find that the relationship between gene-body DNA methylation and expression levels is non-monotonic and bell-shaped. Mid-level expressed genes have the highest levels of gene-body methylation, whereas the most lowly and highly expressed sets of genes both have low levels of methylation. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters. In fact, highly expressed genes initiate the most intragenic transcription, which is inconsistent with the previously held notion that gene-body methylation serves to repress spurious intragenic transcription to allow for efficient transcriptional elongation. These observations lead us to propose a model to explain the presence of human gene-body methylation. This model holds that the repression of intragenic transcription by gene-body methylation is largely epiphenomenal, and suggests that gene-body methylation levels are predominantly shaped via the accessibility of the DNA to methylating enzyme complexes.
|
Meta-Analysis |
13 |
366 |
3
|
Jordan IK, Rogozin IB, Wolf YI, Koonin EV. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res 2002; 12:962-8. [PMID: 12045149 PMCID: PMC1383730 DOI: 10.1101/gr.87702] [Citation(s) in RCA: 345] [Impact Index Per Article: 15.0] [Indexed: 02/02/2023]
Abstract
The "knockout-rate" prediction holds that essential genes should be more evolutionarily conserved than are nonessential genes. This is because negative (purifying) selection acting on essential genes is expected to be more stringent than that for nonessential genes, which are more functionally dispensable and/or redundant. However, a recent survey of evolutionary distances between Saccharomyces cerevisiae and Caenorhabditis elegans proteins did not reveal any difference between the rates of evolution for essential and nonessential genes. An analysis of mouse and rat orthologous genes also found that essential and nonessential genes evolved at similar rates when genes thought to evolve under directional selection were excluded from the analysis. In the present study, we combine genomic sequence data with experimental knockout data to compare the rates of evolution and the levels of selection for essential versus nonessential bacterial genes. In contrast to the results obtained for eukaryotic genes, essential bacterial genes appear to be more conserved than are nonessential genes over both relatively short (microevolutionary) and longer (macroevolutionary) time scales.
|
research-article |
23 |
345 |
4
|
Piriyapongsa J, Mariño-Ramírez L, Jordan IK. Origin and evolution of human microRNAs from transposable elements. Genetics 2007; 176:1323-37. [PMID: 17435244 PMCID: PMC1894593 DOI: 10.1534/genetics.107.072553] [Citation(s) in RCA: 259] [Impact Index Per Article: 14.4] [Indexed: 12/19/2022] Open
Abstract
We sought to evaluate the extent of the contribution of transposable elements (TEs) to human microRNA (miRNA) genes along with the evolutionary dynamics of TE-derived human miRNAs. We found 55 experimentally characterized human miRNA genes that are derived from TEs, and these TE-derived miRNAs have the potential to regulate thousands of human genes. Sequence comparisons revealed that TE-derived human miRNAs are less conserved, on average, than non-TE-derived miRNAs. However, there are 18 TE-derived miRNAs that are relatively conserved, and 14 of these are related to the ancient L2 and MIR families. Comparison of miRNA vs. mRNA expression patterns for TE-derived miRNAs and their putative target genes showed numerous cases of anti-correlated expression that are consistent with regulation via mRNA degradation. In addition to the known human miRNAs that we show to be derived from TE sequences, we predict an additional 85 novel TE-derived miRNA genes. TE sequences are typically disregarded in genomic surveys for miRNA genes and target sites; this is a mistake. Our results indicate that TEs provide a natural mechanism for the origination of miRNAs that can contribute to regulatory divergence between species as well as a rich source for the discovery of as yet unknown miRNA genes.
|
Research Support, Non-U.S. Gov't |
18 |
259 |
5
|
Piriyapongsa J, Jordan IK. A family of human microRNA genes from miniature inverted-repeat transposable elements. PLoS One 2007; 2:e203. [PMID: 17301878 PMCID: PMC1784062 DOI: 10.1371/journal.pone.0000203] [Citation(s) in RCA: 247] [Impact Index Per Article: 13.7] [Received: 10/26/2006] [Accepted: 01/21/2007] [Indexed: 12/26/2022] Open
Abstract
While hundreds of novel microRNA (miRNA) genes have been discovered in the last few years alone, the origin and evolution of these non-coding regulatory sequences remain largely obscure. In this report, we demonstrate that members of a recently discovered family of human miRNA genes, hsa-mir-548, are derived from Made1 transposable elements. Made1 elements are short miniature inverted-repeat transposable elements (MITEs), which consist of two 37 base pair (bp) terminal inverted repeats that flank 6 bp of internal sequence. Thus, Made1 elements are nearly perfect palindromes, and when expressed as RNA they form highly stable hairpin loops. Apparently, these Made1-related structures are recognized by the RNA interference enzymatic machinery and processed to form 22 bp mature miRNA sequences. Consistent with their origin from MITEs, hsa-mir-548 genes are primate-specific and have many potential paralogs in the human genome. There are more than 3,500 putative hsa-mir-548 target genes; analysis of their expression profiles and functional affinities suggests cancer-related regulatory roles for hsa-mir-548. Taken together, the characteristics of Made1 elements, and MITEs in general, point to a specific mechanism for the generation of numerous small regulatory RNAs and target sites throughout the genome. The evolutionary lineage-specific nature of MITEs could also provide for the generation of novel regulatory phenotypes related to species diversification. Finally, we propose that MITEs may represent an evolutionary link between siRNAs and miRNAs.
|
Research Support, Non-U.S. Gov't |
18 |
247 |
6
|
Jordan IK, Kondrashov FA, Adzhubei IA, Wolf YI, Koonin EV, Kondrashov AS, Sunyaev S. A universal trend of amino acid gain and loss in protein evolution. Nature 2005; 433:633-8. [PMID: 15660107 DOI: 10.1038/nature03306] [Citation(s) in RCA: 233] [Impact Index Per Article: 11.7] [Received: 09/14/2004] [Accepted: 12/15/2004] [Indexed: 11/08/2022]
Abstract
Amino acid composition of proteins varies substantially between taxa and, thus, can evolve. For example, proteins from organisms with (G + C)-rich (or (A + T)-rich) genomes contain more (or fewer) amino acids encoded by (G + C)-rich codons. However, no universal trends in ongoing changes of amino acid frequencies have been reported. We compared sets of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions. Cys, Met, His, Ser and Phe accrue in at least 14 taxa, whereas Pro, Ala, Glu and Gly are consistently lost. The same nine amino acids are currently accrued or lost in human proteins, as shown by analysis of non-synonymous single-nucleotide polymorphisms. All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code; conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late. Thus, expansion of initially under-represented amino acids, which began over 3,400 million years ago, apparently continues to this day.
|
|
20 |
233 |
7
|
Dunn J, Qiu H, Kim S, Jjingo D, Hoffman R, Kim CW, Jang I, Son DJ, Kim D, Pan C, Fan Y, Jordan IK, Jo H. Flow-dependent epigenetic DNA methylation regulates endothelial gene expression and atherosclerosis. J Clin Invest 2014; 124:3187-99. [PMID: 24865430 PMCID: PMC4071393 DOI: 10.1172/jci74792] [Citation(s) in RCA: 233] [Impact Index Per Article: 21.2] [Received: 12/16/2013] [Accepted: 03/28/2014] [Indexed: 12/17/2022] Open
Abstract
In atherosclerosis, plaques preferentially develop in arterial regions of disturbed blood flow (d-flow), which alters endothelial gene expression and function. Here, we determined that d-flow regulates genome-wide DNA methylation patterns in a DNA methyltransferase-dependent (DNMT-dependent) manner. Induction of d-flow by partial carotid ligation surgery in a murine model induced DNMT1 in arterial endothelium. In cultured endothelial cells, DNMT1 was enhanced by oscillatory shear stress (OS), and reduction of DNMT with either the inhibitor 5-aza-2'-deoxycytidine (5Aza) or siRNA markedly reduced OS-induced endothelial inflammation. Moreover, administration of 5Aza reduced lesion formation in 2 mouse models of atherosclerosis. Using both reduced representation bisulfite sequencing (RRBS) and microarray, we determined that d-flow in the carotid artery resulted in hypermethylation within the promoters of 11 mechanosensitive genes and that 5Aza treatment restored normal methylation patterns. Of the identified genes, HoxA5 and Klf3 encode transcription factors that contain cAMP response elements, suggesting that the methylation status of these loci could serve as a mechanosensitive master switch in gene expression. Together, our results demonstrate that d-flow controls epigenomic DNA methylation patterns in a DNMT-dependent manner, which in turn alters endothelial gene expression and induces atherosclerosis.
MESH Headings
- Animals
- Apolipoproteins E/deficiency
- Apolipoproteins E/genetics
- Atherosclerosis/genetics
- Atherosclerosis/metabolism
- Atherosclerosis/physiopathology
- Azacitidine/analogs & derivatives
- Azacitidine/pharmacology
- DNA (Cytosine-5-)-Methyltransferase 1
- DNA (Cytosine-5-)-Methyltransferases/antagonists & inhibitors
- DNA (Cytosine-5-)-Methyltransferases/genetics
- DNA (Cytosine-5-)-Methyltransferases/metabolism
- DNA Methylation
- Decitabine
- Disease Models, Animal
- Endothelial Cells/drug effects
- Endothelial Cells/metabolism
- Epigenesis, Genetic
- Gene Expression Regulation
- Homeodomain Proteins/genetics
- Human Umbilical Vein Endothelial Cells
- Humans
- Kruppel-Like Transcription Factors/genetics
- Mice
- Mice, Inbred C57BL
- Mice, Knockout
- Phosphoproteins/genetics
- Plaque, Atherosclerotic/etiology
- Plaque, Atherosclerotic/genetics
- Plaque, Atherosclerotic/physiopathology
- Promoter Regions, Genetic
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Regional Blood Flow
- Stress, Mechanical
- Transcription Factors
|
Research Support, N.I.H., Extramural |
11 |
233 |
8
|
Piriyapongsa J, Jordan IK. Dual coding of siRNAs and miRNAs by plant transposable elements. RNA (NEW YORK, N.Y.) 2008; 14:814-21. [PMID: 18367716 PMCID: PMC2327354 DOI: 10.1261/rna.916708] [Citation(s) in RCA: 195] [Impact Index Per Article: 11.5] [Received: 11/10/2007] [Accepted: 02/15/2008] [Indexed: 05/18/2023]
Abstract
We recently proposed a specific model whereby miRNAs encoded from short nonautonomous DNA-type TEs known as MITEs evolved from corresponding ancestral full-length (autonomous) elements that originally encoded short interfering RNAs (siRNAs). Our miRNA-origins model predicts that evolutionary intermediates may exist as TEs that encode both siRNAs and miRNAs, and we analyzed Arabidopsis thaliana and Oryza sativa (rice) genomic sequence and expression data to test this prediction. We found a number of examples of individual plant TE insertions that encode both siRNAs and miRNAs. We show evidence that these dual coding TEs can be expressed as readthrough transcripts from the intronic regions of spliced RNA messages. These TE transcripts can fold to form the hairpin (stem-loop) structures characteristic of miRNA genes along with longer double-stranded RNA regions that typically are processed as siRNAs. Taken together with a recent study showing Drosha-independent processing of miRNAs from Drosophila introns, our results indicate that ancestral miRNAs could have evolved from TEs prior to the full elaboration of the miRNA biogenesis pathway. Later, as the specific miRNA biogenesis pathway evolved, and numerous other expressed inverted repeat regions came to be recognized by the miRNA processing endonucleases, the host gene-related regulatory functions of miRNAs emerged. In this way, host genomes were afforded an additional level of regulatory complexity as a by-product of TE defense mechanisms. The siRNA-to-miRNA evolutionary transition is representative of a number of other regulatory mechanisms that evolved to silence TEs and were later co-opted to serve as regulators of host gene expression.
MESH Headings
- Arabidopsis/genetics
- Base Sequence
- Computational Biology
- DNA Transposable Elements/genetics
- DNA, Plant/chemistry
- DNA, Plant/genetics
- Evolution, Molecular
- Genes, Plant
- MicroRNAs/chemistry
- MicroRNAs/genetics
- Models, Genetic
- Models, Molecular
- Molecular Sequence Data
- Nucleic Acid Conformation
- Oryza/genetics
- Plants/genetics
- RNA, Plant/chemistry
- RNA, Plant/genetics
- RNA, Small Interfering/chemistry
- RNA, Small Interfering/genetics
|
research-article |
17 |
195 |
9
|
Jordan IK, Wolf YI, Koonin EV. No simple dependence between protein evolution rate and the number of protein-protein interactions: only the most prolific interactors tend to evolve slowly. BMC Evol Biol 2003; 3:1. [PMID: 12515583 PMCID: PMC140311 DOI: 10.1186/1471-2148-3-1] [Citation(s) in RCA: 165] [Impact Index Per Article: 7.5] [Received: 11/10/2002] [Accepted: 01/06/2003] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND It has been suggested that rates of protein evolution are influenced, to a great extent, by the proportion of amino acid residues that are directly involved in protein function. In agreement with this hypothesis, recent work has shown a negative correlation between evolutionary rates and the number of protein-protein interactions. However, the extent to which the number of protein-protein interactions influences evolutionary rates remains unclear. Here, we address this question at several different levels of evolutionary relatedness. RESULTS Manually curated data on the number of protein-protein interactions among Saccharomyces cerevisiae proteins was examined for possible correlation with evolutionary rates between S. cerevisiae and Schizosaccharomyces pombe orthologs. Only a very weak negative correlation between the number of interactions and evolutionary rate of a protein was observed. Furthermore, no relationship was found between a more general measure of the evolutionary conservation of S. cerevisiae proteins, based on the taxonomic distribution of their homologs, and the number of protein-protein interactions. However, when the proteins from yeast were assorted into discrete bins according to the number of interactions, it turned out that 6.5% of the proteins with the greatest number of interactions evolved, on average, significantly slower than the rest of the proteins. Comparisons were also performed using protein-protein interaction data obtained with high-throughput analysis of Helicobacter pylori proteins. No convincing relationship between the number of protein-protein interactions and evolutionary rates was detected, either for comparisons of orthologs from two completely sequenced H. pylori strains or for comparisons of H. pylori and Campylobacter jejuni orthologs, even when the proteins were classified into bins by the number of interactions. 
CONCLUSION The currently available comparative-genomic data do not support the hypothesis that the evolutionary rates of the majority of proteins substantially depend on the number of protein-protein interactions they are involved in. However, a small fraction of yeast proteins with the largest number of interactions (the hubs of the interaction network) tend to evolve slower than the bulk of the proteins.
|
Comparative Study |
22 |
165 |
10
|
Jordan IK, Mariño-Ramírez L, Wolf YI, Koonin EV. Conservation and coevolution in the scale-free human gene coexpression network. Mol Biol Evol 2004; 21:2058-70. [PMID: 15282333 DOI: 10.1093/molbev/msh222] [Citation(s) in RCA: 153] [Impact Index Per Article: 7.3] [Indexed: 12/13/2022] Open
Abstract
The role of natural selection in biology is well appreciated. Recently, however, a critical role for physical principles of network self-organization in biological systems has been revealed. Here, we employ a systems level view of genome-scale sequence and expression data to examine the interplay between these two sources of order, natural selection and physical self-organization, in the evolution of human gene regulation. The topology of a human gene coexpression network, derived from tissue-specific expression profiles, shows scale-free properties that imply evolutionary self-organization via preferential node attachment. Genes with numerous coexpressed partners (the hubs of the coexpression network) evolve more slowly on average than genes with fewer coexpressed partners, and genes that are coexpressed show similar rates of evolution. Thus, the strength of selective constraints on gene sequences is affected by the topology of the gene coexpression network. This connection is strong for the coding regions and 3' untranslated regions (UTRs), but the 5' UTRs appear to evolve under a different regime. Surprisingly, we found no connection between the rate of gene sequence divergence and the extent of gene expression profile divergence between human and mouse. This suggests that distinct modes of natural selection might govern sequence versus expression divergence, and we propose a model, based on rapid, adaptation-driven divergence and convergent evolution of gene expression patterns, for how natural selection could influence gene expression divergence.
|
Journal Article |
21 |
153 |
11
|
Jordan IK, Wolf YI, Koonin EV. Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol Biol 2004; 4:22. [PMID: 15238160 PMCID: PMC481058 DOI: 10.1186/1471-2148-4-22] [Citation(s) in RCA: 139] [Impact Index Per Article: 6.6] [Received: 03/26/2004] [Accepted: 07/06/2004] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Gene duplication is an important mechanism that can lead to the emergence of new functions during evolution. The impact of duplication on the mode of gene evolution has been the subject of several theoretical and empirical comparative-genomic studies. It has been shown that, shortly after the duplication, genes seem to experience a considerable relaxation of purifying selection. RESULTS Here we demonstrate two opposite effects of gene duplication on evolutionary rates. Sequence comparisons between paralogs show that, in accord with previous observations, a substantial acceleration in the evolution of paralogs occurs after duplication, presumably due to relaxation of purifying selection. The effect of gene duplication on evolutionary rate was also assessed by sequence comparison between orthologs that have paralogs (duplicates) and those that do not (singletons). It is shown that, in eukaryotes, duplicates, on average, evolve significantly slower than singletons. Eukaryotic ortholog evolutionary rates for duplicates are also negatively correlated with the number of paralogs per gene and the strength of selection between paralogs. A tally of annotated gene functions shows that duplicates tend to be enriched for proteins with known functions, particularly those involved in signaling and related cellular processes; by contrast, singletons include an over-abundance of poorly characterized proteins. CONCLUSIONS These results suggest that whether or not a gene duplicate is retained by selection depends critically on the pre-existing functional utility of the protein encoded by the ancestral singleton. Duplicates of genes of a higher biological import, which are subject to strong functional constraints on the sequence, are retained relatively more often. 
Thus, the evolutionary trajectory of duplicated genes appears to be determined by two opposing trends, namely, the post-duplication rate acceleration and the generally slow evolutionary rate owing to the high level of functional constraints.
MESH Headings
- Animals
- Base Composition/genetics
- DNA/genetics
- DNA, Archaeal/genetics
- DNA, Bacterial/genetics
- Evolution, Molecular
- Genes/genetics
- Genes/physiology
- Genes, Archaeal/genetics
- Genes, Archaeal/physiology
- Genes, Bacterial/genetics
- Genes, Bacterial/physiology
- Genes, Duplicate/genetics
- Genes, Duplicate/physiology
- Genes, Fungal/genetics
- Genes, Fungal/physiology
- Genes, Insect/genetics
- Genes, Insect/physiology
- Gram-Negative Bacteria/genetics
- Gram-Positive Bacteria/genetics
- Humans
- Mice
- Mutation/genetics
- Sequence Homology, Nucleic Acid
|
Journal Article |
21 |
139 |
12
|
Bick AG, Metcalf GA, Mayo KR, Lichtenstein L, Rura S, Carroll RJ, Musick A, Linder JE, Jordan IK, Nagar SD, Sharma S, Meller R, Basford M, Boerwinkle E, Cicek MS, Doheny KF, Eichler EE, Gabriel S, Gibbs RA, Glazer D, Harris PA, Jarvik GP, Philippakis A, Rehm HL, Roden DM, Thibodeau SN, Topper S, Blegen AL, Wirkus SJ, Wagner VA, Meyer JG, Cicek MS, Muzny DM, Venner E, Mawhinney MZ, Griffith SML, Hsu E, Ling H, Adams MK, Walker K, Hu J, Doddapaneni H, Kovar CL, Murugan M, Dugan S, Khan Z, Boerwinkle E, Lennon NJ, Austin-Tse C, Banks E, Gatzen M, Gupta N, Henricks E, Larsson K, McDonough S, Harrison SM, Kachulis C, Lebo MS, Neben CL, Steeves M, Zhou AY, Smith JD, Frazar CD, Davis CP, Patterson KE, Wheeler MM, McGee S, Lockwood CM, Shirts BH, Pritchard CC, Murray ML, Vasta V, Leistritz D, Richardson MA, Buchan JG, Radhakrishnan A, Krumm N, Ehmen BW, Schwartz S, Aster MMT, Cibulskis K, Haessly A, Asch R, Cremer A, Degatano K, Shergill A, Gauthier LD, Lee SK, Hatcher A, Grant GB, Brandt GR, Covarrubias M, Banks E, Able A, Green AE, Carroll RJ, Zhang J, Condon HR, Wang Y, Dillon MK, Albach CH, Baalawi W, Choi SH, Wang X, Rosenthal EA, Ramirez AH, Lim S, Nambiar S, Ozenberger B, Wise AL, Lunt C, Ginsburg GS, Denny JC. Genomic data in the All of Us Research Program. Nature 2024; 627:340-346. [PMID: 38374255 PMCID: PMC10937371 DOI: 10.1038/s41586-023-06957-x] [Citation(s) in RCA: 129] [Impact Index Per Article: 129.0] [Received: 07/22/2022] [Accepted: 12/08/2023] [Indexed: 02/21/2024]
Abstract
Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics [1-4]. The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health [5,6]. Here we describe the programme's genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.
|
research-article |
1 |
129 |
13
|
Jordan IK, Makarova KS, Spouge JL, Wolf YI, Koonin EV. Lineage-specific gene expansions in bacterial and archaeal genomes. Genome Res 2001; 11:555-65. [PMID: 11282971 PMCID: PMC311027 DOI: 10.1101/gr.gr-1660r] [Citation(s) in RCA: 122] [Impact Index Per Article: 5.1] [Indexed: 11/25/2022]
Abstract
Gene duplication is an important mechanistic antecedent to the evolution of new genes and novel biochemical functions. In an attempt to assess the contribution of gene duplication to genome evolution in archaea and bacteria, clusters of related genes that appear to have expanded subsequent to the diversification of the major prokaryotic lineages (lineage-specific expansions) were analyzed. Analysis of 21 completely sequenced prokaryotic genomes shows that lineage-specific expansions comprise a substantial fraction (approximately 5%-33%) of their coding capacities. A positive correlation exists between the fraction of the genes taken up by lineage-specific expansions and the total number of genes in a genome. Consistent with the notion that lineage-specific expansions are made up of relatively recently duplicated genes, >90% of the detected clusters consist of only two to four genes. The more common smaller clusters tend to include genes with higher pairwise similarity (as reflected by average score density) than larger clusters. Regardless of size, cluster members tend to be located more closely on bacterial chromosomes than expected by chance, which could reflect a history of tandem gene duplication. In addition to the small clusters, almost all genomes also contain rare large clusters of size ≥20. Several examples of the potential adaptive significance of these large clusters are explored. The presence or absence of clusters and their related genes was used as the basis for the construction of a similarity graph for completely sequenced prokaryotic genomes. The topology of the resulting graph seems to reflect a combined effect of common ancestry, horizontal transfer, and lineage-specific gene loss.
|
research-article |
24 |
122 |
14
|
Bowen NJ, Jordan IK, Epstein JA, Wood V, Levin HL. Retrotransposons and their recognition of pol II promoters: a comprehensive survey of the transposable elements from the complete genome sequence of Schizosaccharomyces pombe. Genome Res 2003; 13:1984-97. [PMID: 12952871 PMCID: PMC403668 DOI: 10.1101/gr.1191603] [Citation(s) in RCA: 117] [Impact Index Per Article: 5.3] [Indexed: 11/25/2022]
Abstract
The complete DNA sequence of the genome of Schizosaccharomyces pombe provides the opportunity to investigate the entire complement of transposable elements (TEs), their association with specific sequences, their chromosomal distribution, and their evolution. Using homology-based sequence identification, we found that the sequenced strain of S. pombe contained only one family of full-length transposons. This family, Tf2, consisted of 13 full-length copies of a long terminal repeat (LTR) retrotransposon. We found that LTR-LTR recombination of previously existing transposons had resulted in extensive populations of solo LTRs. These included 35 solo LTRs of Tf2, as well as 139 solo LTRs from other Tf families. Phylogenetic analysis of solo Tf LTRs reveals that Tf1 and Tf2 were the most recently active elements within the genome. The solo LTRs also served as footprints for previous insertion events by the Tf retrotransposons. Analysis of 186 genomic insertion events revealed a close association with RNA polymerase II promoters. These insertions clustered in the promoter-proximal regions of genes, upstream of protein coding regions by 100 to 400 nucleotides. The association of Tf insertions with pol II promoters was very similar to the preference previously observed for Tf1 integration. We found that the recently active Tf elements were absent from centromeres and pericentromeric regions of the genome containing tandem tRNA gene clusters. In addition, our analysis revealed that chromosome III has twice the density of insertion events compared to the other two chromosomes. Finally we describe a novel repetitive sequence, wtf, which was also preferentially located on chromosome III, and was often located near solo LTRs of Tf elements.
|
research-article |
22 |
117 |
15
|
Rogozin IB, Basu MK, Jordan IK, Pavlov YI, Koonin EV. APOBEC4, a new member of the AID/APOBEC family of polynucleotide (deoxy)cytidine deaminases predicted by computational analysis. Cell Cycle 2005; 4:1281-5. [PMID: 16082223 DOI: 10.4161/cc.4.9.1994] [Citation(s) in RCA: 105] [Impact Index Per Article: 5.3] [Indexed: 11/19/2022] Open
Abstract
Using iterative database searches, we identified a new subfamily of the AID/APOBEC family of RNA/DNA editing cytidine deaminases. The new subfamily, which is represented by readily identifiable orthologs in mammals, chicken, and frog, but not fishes, was designated APOBEC4. The zinc-coordinating motifs involved in catalysis and the secondary structure of the APOBEC4 deaminase domain are evolutionarily conserved, suggesting that APOBEC4 proteins are active polynucleotide (deoxy)cytidine deaminases. In reconstructed maximum likelihood phylogenetic trees, APOBEC4 forms a distinct clade with a high statistical support. APOBEC4 and APOBEC1 are joined in a moderately supported cluster clearly separated from AID, APOBEC2 and APOBEC3 subfamilies. In mammals, APOBEC4 is expressed primarily in testis which suggests the possibility that it is an editing enzyme for mRNAs involved in spermatogenesis.
|
|
20 |
105 |
16
|
Jordan IK, Mariño-Ramírez L, Koonin EV. Evolutionary significance of gene expression divergence. Gene 2005; 345:119-26. [PMID: 15716085 PMCID: PMC1859841 DOI: 10.1016/j.gene.2004.11.034] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.2] [Received: 09/15/2004] [Revised: 11/08/2004] [Accepted: 11/15/2004] [Indexed: 11/25/2022]
Abstract
Recent large-scale studies of evolutionary changes in gene expression among mammalian species have led to the proposal that gene expression divergence may be neutral with respect to organismic fitness. Here, we employ a comparative analysis of mammalian gene sequence divergence and gene expression divergence to test the hypothesis that the evolution of gene expression is predominantly neutral. Two models of neutral gene expression evolution are considered: (1) purely neutral evolution (i.e., no selective constraint) of gene expression levels and patterns and (2) neutral evolution accompanied by selective constraint. With respect to purely neutral evolution, levels of change in gene expression between human-mouse orthologs are correlated with levels of gene sequence divergence that are determined largely by purifying selection. In contrast, evolutionary changes of tissue-specific gene expression profiles do not show such a correlation with sequence divergence. However, divergence of both gene expression levels and profiles is significantly lower for orthologous human-mouse gene pairs than for pairs of randomly chosen human and mouse genes. These data clearly point to the action of selective constraint on gene expression divergence and are inconsistent with the purely neutral model; however, there is likely to be a neutral component in the evolution of gene expression, particularly in tissues where the expression of a given gene is low and functionally irrelevant. The model of neutral evolution with selective constraint predicts a regular, clock-like accumulation of gene expression divergence. However, relative rate tests of the divergence among human-mouse-rat orthologous gene sets reveal clock-like evolution for gene sequence divergence, and to a lesser extent for gene expression level divergence, but not for the divergence of tissue-specific gene expression profiles. Taken together, these results indicate that gene expression divergence is subject to the effects of purifying selective constraint and suggest that it might also be substantially influenced by positive Darwinian selection.
|
Comparative Study |
20 |
104 |
17
|
Conley AB, Piriyapongsa J, Jordan IK. Retroviral promoters in the human genome. Bioinformatics 2008; 24:1563-7. [PMID: 18535086 DOI: 10.1093/bioinformatics/btn243] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.6] [Indexed: 12/22/2022]
Abstract
MOTIVATION: Endogenous retrovirus (ERV) elements have been shown to contribute promoter sequences that can initiate transcription of adjacent human genes. However, the extent to which retroviral sequences initiate transcription within the human genome is currently unknown. We analyzed genome sequence and high-throughput expression data to systematically evaluate the presence of retroviral promoters in the human genome. RESULTS: We report the existence of 51,197 ERV-derived promoter sequences that initiate transcription within the human genome, including 1743 cases where transcription is initiated from ERV sequences that are located in gene proximal promoter or 5' untranslated regions (UTRs). A total of 114 of the ERV-derived transcription start sites can be demonstrated to drive transcription of 97 human genes, producing chimeric transcripts that are initiated within ERV long terminal repeat (LTR) sequences and read-through into known gene sequences. ERV promoters drive tissue-specific and lineage-specific patterns of gene expression and contribute to expression divergence between paralogs. These data illustrate the potential of retroviral sequences to regulate human transcription on a large scale, consistent with a substantial effect of ERVs on the function and evolution of the human genome.
|
Research Support, Non-U.S. Gov't |
17 |
95 |
18
|
Jordan IK, Matyunina LV, McDonald JF. Evidence for the recent horizontal transfer of long terminal repeat retrotransposon. Proc Natl Acad Sci U S A 1999; 96:12621-5. [PMID: 10535972 PMCID: PMC23018 DOI: 10.1073/pnas.96.22.12621] [Citation(s) in RCA: 95] [Impact Index Per Article: 3.7] [Indexed: 11/18/2022] Open
Abstract
The evolutionary dynamics existing between transposable elements (TEs) and their host genomes have been likened to an "arms race." The selfish drive of TEs to replicate, in turn, elicits the evolution of host-mediated regulatory mechanisms aimed at repressing transpositional activity. It has been postulated that horizontal (cross-species) transfer may be one effective strategy by which TEs and other selfish genes can escape host-mediated silencing mechanisms over evolutionary time; however, to date, the most definitive evidence that TEs horizontally transfer between species has been limited to class II or DNA-type elements. Evidence that the more numerous and widely distributed retroelements may also be horizontally transferred between species has been more ambiguous. In this paper, we report definitive evidence for a recent horizontal transfer of the copia long terminal repeat retrotransposon between Drosophila melanogaster and Drosophila willistoni.
|
research-article |
26 |
95 |
19
|
Rogozin IB, Spiridonov AN, Sorokin AV, Wolf YI, Jordan IK, Tatusov RL, Koonin EV. Purifying and directional selection in overlapping prokaryotic genes. Trends Genet 2002; 18:228-32. [PMID: 12047938 DOI: 10.1016/s0168-9525(02)02649-5] [Citation(s) in RCA: 94] [Impact Index Per Article: 4.1] [Indexed: 11/17/2022]
Abstract
In overlapping genes, the same DNA sequence codes for two proteins using different reading frames. Analysis of overlapping genes can help in understanding the mode of evolution of a coding region from noncoding DNA. We identified 71 pairs of convergent genes, with overlapping 3' ends longer than 15 nucleotides, that are conserved in at least two prokaryotic genomes. Among the overlap regions, we observed a statistically significant bias towards the 123:132 phase (i.e. the second codon base in one gene facing the degenerate third position in the second gene). This phase ensures the least mutual constraint on nonconservative amino acid replacements in both overlapping coding sequences. The excess of this phase is compatible with directional (positive) selection acting on the overlapping coding regions. This could be a general evolutionary mode for genes emerging from noncoding sequences, in which the protein sequence has not been subject to selection.
|
|
23 |
94 |
20
|
Wang J, Geesman GJ, Hostikka SL, Atallah M, Blackwell B, Lee E, Cook PJ, Pasaniuc B, Shariat G, Halperin E, Dobke M, Rosenfeld MG, Jordan IK, Lunyak VV. Inhibition of activated pericentromeric SINE/Alu repeat transcription in senescent human adult stem cells reinstates self-renewal. Cell Cycle 2011; 10:3016-30. [PMID: 21862875 PMCID: PMC3218602 DOI: 10.4161/cc.10.17.17543] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Received: 07/28/2011] [Accepted: 07/28/2011] [Indexed: 01/01/2023] Open
Abstract
Cellular aging is linked to deficiencies in efficient repair of DNA double strand breaks and authentic genome maintenance at the chromatin level. Aging poses a significant threat to adult stem cell function by triggering persistent DNA damage and ultimately cellular senescence. Senescence is often considered to be an irreversible process. Moreover, critical genomic regions engaged in persistent DNA damage accumulation are unknown. Here we report that 65% of naturally occurring repairable DNA damage in self-renewing adult stem cells occurs within transposable elements. Upregulation of Alu retrotransposon transcription upon ex vivo aging causes nuclear cytotoxicity associated with the formation of persistent DNA damage foci and loss of efficient DNA repair in pericentric chromatin. This occurs due to a failure to recruit condensin I and cohesin complexes. Our results demonstrate that the cytotoxicity of induced Alu repeats is functionally relevant for human adult stem cell aging. Stable suppression of Alu transcription can reverse the senescent phenotype, reinstating the cells' self-renewing properties and increasing their plasticity by altering so-called "master" pluripotency regulators.
|
Research Support, N.I.H., Extramural |
14 |
84 |
21
|
Abstract
The availability of multiple complete genome sequences from the same species can facilitate attempts to systematically address basic questions in genome evolution. We refer to such efforts as "microevolutionary genomics". We report the results of comparative analyses of complete intraspecific genome (and proteome) sequences from four bacterial species: Chlamydophila pneumoniae, Escherichia coli, Helicobacter pylori and Neisseria meningitidis. Comparisons of average synonymous (K(s)) and nonsynonymous (K(a)) substitution rates were used to assess the influence of various biological factors on the rate of protein evolution. For example, E. coli experiences the most intense purifying selection of the species analyzed, and this may be due to the relatively larger population size of this species. In addition, essential genes were shown to be more evolutionarily conserved than nonessential genes in E. coli, and duplicated genes have higher rates of evolution than unique genes for all species studied except C. pneumoniae. Different functional categories of genes were shown to evolve at significantly different rates, emphasizing the role of category-specific functional constraints in determining evolutionary rates. Finally, functionally characterized genes tend to be conserved between strains, while uncharacterized genes are over-represented among the unique, strain-specific genes. This suggests the possibility that nonessential genes are responsible for driving the evolutionary diversification between strains.
|
Review |
23 |
84 |
22
|
Mariño-Ramírez L, Lewis KC, Landsman D, Jordan IK. Transposable elements donate lineage-specific regulatory sequences to host genomes. Cytogenet Genome Res 2005; 110:333-41. [PMID: 16093685 PMCID: PMC1803082 DOI: 10.1159/000084965] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.2] [Received: 10/05/2003] [Accepted: 01/13/2004] [Indexed: 12/11/2022] Open
Abstract
The evolutionary implications of transposable element (TE) influences on gene regulation are explored here. An historical perspective is presented to underscore the importance of TE influences on gene regulation with respect to both the discovery of TEs and the early conceptualization of their potential impact on host genome evolution. Evidence that points to a role for TEs in host gene regulation is reviewed, and comparisons between genome sequences are used to demonstrate the fact that TEs are particularly lineage-specific components of their host genomes. Consistent with these two properties of TEs, regulatory effects and evolutionary specificity, human-mouse genome wide sequence comparisons reveal that the regulatory sequences that are contributed by TEs are exceptionally lineage specific. This suggests a particular mechanism by which TEs may drive the diversification of gene regulation between evolutionary lineages.
|
Review |
20 |
83 |
23
|
Norris ET, Wang L, Conley AB, Rishishwar L, Mariño-Ramírez L, Valderrama-Aguirre A, Jordan IK. Genetic ancestry, admixture and health determinants in Latin America. BMC Genomics 2018; 19:861. [PMID: 30537949 PMCID: PMC6288849 DOI: 10.1186/s12864-018-5195-7] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Modern Latin American populations were formed via genetic admixture among ancestral source populations from Africa, the Americas and Europe. We are interested in studying how combinations of genetic ancestry in admixed Latin American populations may impact genomic determinants of health and disease. For this study, we characterized the impact of ancestry and admixture on genetic variants that underlie health- and disease-related phenotypes in population genomic samples from Colombia, Mexico, Peru, and Puerto Rico. RESULTS: We analyzed a total of 347 admixed Latin American genomes along with 1102 putative ancestral source genomes from Africans, Europeans, and Native Americans. We characterized the genetic ancestry, relatedness, and admixture patterns for each of the admixed Latin American genomes, finding a spectrum of ancestry proportions within and between populations. We then identified single nucleotide polymorphisms (SNPs) with anomalous ancestry-enrichment patterns, i.e. SNPs that exist in any given Latin American population at a higher frequency than expected based on the population's genetic ancestry profile. For this set of ancestry-enriched SNPs, we inspected their phenotypic impact on disease, metabolism, and the immune system. All four of the Latin American populations show ancestry-enrichment for a number of shared pathways, yielding evidence of similar selection pressures on these populations during their evolution. For example, all four populations show ancestry-enriched SNPs in multiple genes from immune system pathways, such as the cytokine receptor interaction, T cell receptor signaling, and antigen presentation pathways. We also found SNPs with excess African or European ancestry that are associated with ancestry-specific gene expression patterns and play crucial roles in the immune system and infectious disease responses. Genes from both the innate and adaptive immune system were found to be regulated by ancestry-enriched SNPs with population-specific regulatory effects. CONCLUSIONS: Ancestry-enriched SNPs in Latin American populations have a substantial effect on health- and disease-related phenotypes. The concordant impact observed for the same phenotypes across populations points to a process of adaptive introgression, whereby ancestry-enriched SNPs with specific functional utility appear to have been retained in modern populations by virtue of their effects on health and fitness.
|
Journal Article |
7 |
70 |
24
|
Tsaparas P, Mariño-Ramírez L, Bodenreider O, Koonin EV, Jordan IK. Global similarity and local divergence in human and mouse gene co-expression networks. BMC Evol Biol 2006; 6:70. [PMID: 16968540 PMCID: PMC1601971 DOI: 10.1186/1471-2148-6-70] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.4] [Received: 04/22/2006] [Accepted: 09/12/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A genome-wide comparative analysis of human and mouse gene expression patterns was performed in order to evaluate the evolutionary divergence of mammalian gene expression. Tissue-specific expression profiles were analyzed for 9,105 human-mouse orthologous gene pairs across 28 tissues. Expression profiles were resolved into species-specific coexpression networks, and the topological properties of the networks were compared between species. RESULTS: At the global level, the topological properties of the human and mouse gene coexpression networks are, essentially, identical. For instance, both networks have topologies with small-world and scale-free properties as well as closely similar average node degrees, clustering coefficients, and path lengths. However, the human and mouse coexpression networks are highly divergent at the local level: only a small fraction (<10%) of coexpressed gene pair relationships are conserved between the two species. A series of controls for experimental and biological variance show that most of this divergence does not result from experimental noise. We further show that, while the expression divergence between species is genuinely rapid, expression does not evolve free from selective (functional) constraint. Indeed, the coexpression networks analyzed here are demonstrably functionally coherent as indicated by the functional similarity of coexpressed gene pairs, and this pattern is most pronounced in the conserved human-mouse intersection network. Numerous dense network clusters show evidence of dedicated functions, such as spermatogenesis and immune response, that are clearly consistent with the coherence of the expression patterns of their constituent gene members. CONCLUSION: The dissonance between global versus local network divergence suggests that the interspecies similarity of the global network properties is of limited biological significance, at best, and that the biologically relevant aspects of the architectures of gene coexpression are specific and particular, rather than universal. Nevertheless, there is substantial evolutionary conservation of the local network structure which is compatible with the notion that gene coexpression networks are subject to purifying selection.
|
Research Support, N.I.H., Intramural |
19 |
64 |
25
|
Rishishwar L, Conley AB, Wigington CH, Wang L, Valderrama-Aguirre A, Jordan IK. Ancestry, admixture and fitness in Colombian genomes. Sci Rep 2015. [PMID: 26197429 PMCID: PMC4508918 DOI: 10.1038/srep12376] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 12/17/2022] Open
Abstract
The human dimension of the Columbian Exchange entailed substantial genetic admixture between ancestral source populations from Africa, the Americas and Europe, which had evolved separately for many thousands of years. We sought to address the implications of the creation of admixed American genomes, containing novel allelic combinations, for human health and fitness via analysis of an admixed Colombian population from Medellin. Colombian genomes from Medellin show a wide range of three-way admixture contributions from ancestral source populations. The primary ancestry component for the population is European (average = 74.6%, range = 45.0%–96.7%), followed by Native American (average = 18.1%, range = 2.1%–33.3%) and African (average = 7.3%, range = 0.2%–38.6%). Locus-specific patterns of ancestry were evaluated to search for genomic regions that are enriched across the population for particular ancestry contributions. Adaptive and innate immune system related genes and pathways are particularly over-represented among ancestry-enriched segments, including genes (HLA-B and MAPK10) that are involved in defense against endemic pathogens such as malaria. Genes that encode functions related to skin pigmentation (SCL4A5) and cutaneous glands (EDAR) are also found in regions with anomalous ancestry patterns. These results suggest the possibility that ancestry-specific loci were differentially retained in the modern admixed Colombian population based on their utility in the New World environment.
|
Journal Article |
10 |
63 |