1
|
Xiao CL, Zhu S, He M, Chen D, Zhang Q, Chen Y, Yu G, Liu J, Xie SQ, Luo F, Liang Z, Wang DP, Bo XC, Gu XF, Wang K, Yan GR. N6-Methyladenine DNA Modification in the Human Genome. Mol Cell 2018; 71:306-318.e7. [PMID: 30017583 DOI: 10.1016/j.molcel.2018.06.015] [Citation(s) in RCA: 354] [Impact Index Per Article: 50.6] [Received: 12/06/2017] [Revised: 03/14/2018] [Accepted: 06/07/2018] [Indexed: 01/06/2023]
Abstract
DNA N6-methyladenine (6mA) modification is the most prevalent DNA modification in prokaryotes, but whether it exists in human cells and whether it plays a role in human diseases remain enigmatic. Here, we showed that 6mA is extensively present in the human genome, and we cataloged 881,240 6mA sites accounting for ∼0.051% of the total adenines. [G/C]AGG[C/T] was the most significantly associated motif with 6mA modification. 6mA sites were enriched in the coding regions and mark actively transcribed genes in human cells. DNA 6mA and N6-demethyladenine modification in the human genome were mediated by methyltransferase N6AMT1 and demethylase ALKBH1, respectively. The abundance of 6mA was significantly lower in cancers, accompanied by decreased N6AMT1 and increased ALKBH1 levels, and downregulation of 6mA modification levels promoted tumorigenesis. Collectively, our results demonstrate that DNA 6mA modification is extensively present in human cells and the decrease of genomic DNA 6mA promotes human tumorigenesis.
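As a rough illustration of what the reported [G/C]AGG[C/T] motif denotes, the following minimal Python sketch scans a DNA string for it and checks the arithmetic implied by the abstract's figures. It is not the authors' motif-discovery pipeline; the helper names and the example sequence are hypothetical.

```python
import re

# [G/C]AGG[C/T] read as: G or C, then A-G-G, then C or T.
# A lookahead is used so that hits sharing a base are both reported.
MOTIF = re.compile(r"(?=([GC]AGG[CT]))")

def find_motif_adenines(seq):
    """Return 0-based positions of the adenine in each motif hit (hypothetical helper)."""
    seq = seq.upper()
    # The adenine is the second base of the 5-mer, hence start + 1.
    return [m.start() + 1 for m in MOTIF.finditer(seq)]

def implied_total(count, fraction):
    """Total implied by a count that makes up the given fraction of it."""
    return count / fraction

# Hypothetical sequence with two hits that share one base (GAGGC and CAGGT).
print(find_motif_adenines("GAGGCAGGT"))  # [1, 5]

# 881,240 sites at ~0.051% implies roughly 1.7 billion adenines surveyed.
print(round(implied_total(881_240, 0.00051)))
```

The regular expression only tests sequence context; whether a matching adenine is actually methylated is an experimental question that the sketch does not address.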
|
Research Support, Non-U.S. Gov't |
7 |
354 |
2
|
Abstract
Do data from the Encyclopedia of DNA Elements (ENCODE) project render the notion of junk DNA obsolete? Here, I review older arguments for junk grounded in the C-value paradox and propose a thought experiment to challenge ENCODE's ontology. Specifically, what would we expect for the number of functional elements (as ENCODE defines them) in genomes much larger than our own genome? If the number were to stay more or less constant, it would seem sensible to consider the rest of the DNA of larger genomes to be junk or, at least, assign it a different sort of role (structural rather than informational). If, however, the number of functional elements were to rise significantly with C-value, then (i) organisms with genomes larger than our genome are more complex phenotypically than we are, (ii) ENCODE's definition of functional element identifies many sites that would not be considered functional or phenotype-determining by standard uses in biology, or (iii) the same phenotypic functions are often determined in a more diffuse fashion in larger-genomed organisms. Good cases can be made for propositions ii and iii. A larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon, is needed.
|
research-article |
12 |
250 |
3
|
Abstract
In vivo, the human genome folds into a characteristic ensemble of 3D structures. The mechanism driving the folding process remains unknown. We report a theoretical model for chromatin (Minimal Chromatin Model) that explains the folding of interphase chromosomes and generates chromosome conformations consistent with experimental data. The energy landscape of the model was derived by using the maximum entropy principle and relies on two experimentally derived inputs: a classification of loci into chromatin types and a catalog of the positions of chromatin loops. First, we trained our energy function using the Hi-C contact map of chromosome 10 from human GM12878 lymphoblastoid cells. Then, we used the model to perform molecular dynamics simulations producing an ensemble of 3D structures for all GM12878 autosomes. Finally, we used these 3D structures to generate contact maps. We found that simulated contact maps closely agree with experimental results for all GM12878 autosomes. The ensemble of structures resulting from these simulations exhibited unknotted chromosomes, phase separation of chromatin types, and a tendency for open chromatin to lie at the periphery of chromosome territories.
|
Research Support, Non-U.S. Gov't |
9 |
227 |
4
|
Worldwide carrier frequency and genetic prevalence of autosomal recessive inherited retinal diseases. Proc Natl Acad Sci U S A 2020; 117:2710-2716. [PMID: 31964843 DOI: 10.1073/pnas.1913179117] [Citation(s) in RCA: 225] [Impact Index Per Article: 45.0] [Indexed: 01/05/2023] Open
Abstract
One of the major questions in human genetics is what percentage of individuals in the general population carry a disease-causing mutation. Based on publicly available information on genotypes from six main world populations, we created a database including data on 276,921 sequence variants, present within 187 genes associated with autosomal recessive (AR) inherited retinal diseases (IRDs). Assessment of these variants revealed that 10,044 were categorized as disease-causing mutations. We developed an algorithm to compute the gene-specific prevalence of disease, as well as the mutational burden in healthy subjects. We found that the genetic prevalence of AR-IRDs corresponds approximately to 1 case in 1,380 individuals, with 5.5 million people expected to be affected worldwide. In addition, we calculated that unaffected carriers of mutations are numerous, ranging from 1 in 2.26 individuals in Europeans to 1 in 3.50 individuals in the Finnish population. Our analysis indicates that about 2.7 billion people worldwide (36% of the population) are healthy carriers of at least one mutation that can cause AR-IRD, a value that is probably the highest across any group of Mendelian conditions in humans.
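The abstract does not describe the algorithm itself, but the kind of calculation involved can be sketched under a textbook assumption. The Python sketch below assumes Hardy-Weinberg equilibrium and independence between genes, and uses hypothetical allele frequencies. It is an illustration only, not the authors' method. It also checks that the abstract's headline figures are mutually consistent.

```python
from math import prod

def gene_prevalence(q):
    """Expected frequency of affected individuals (two pathogenic alleles) for one gene."""
    return q * q

def gene_carrier_rate(q):
    """Expected frequency of unaffected heterozygous carriers for one gene."""
    return 2 * q * (1 - q)

def overall(qs):
    """Approximate totals over genes, where qs holds each gene's summed
    pathogenic allele frequency (hypothetical inputs)."""
    # Summing per-gene prevalences ignores the rare overlap between genes.
    prevalence = sum(gene_prevalence(q) for q in qs)
    # Probability of carrying at least one mutation, assuming independence.
    carriers = 1 - prod(1 - gene_carrier_rate(q) for q in qs)
    return prevalence, carriers

# Consistency check on the figures quoted in the abstract.
implied_population = 2.7e9 / 0.36                # about 7.5 billion people
expected_affected = implied_population / 1380    # about 5.4 million people

print(overall([0.01, 0.02]))
print(round(implied_population), round(expected_affected))
```

The implied figure of about 5.4 million affected individuals is in line with the roughly 5.5 million reported, given rounding in the quoted percentages.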
|
Research Support, Non-U.S. Gov't |
5 |
225 |
5
|
Kallberg Y, Oppermann U, Jörnvall H, Persson B. Short-chain dehydrogenase/reductase (SDR) relationships: a large family with eight clusters common to human, animal, and plant genomes. Protein Sci 2002; 11:636-41. [PMID: 11847285 PMCID: PMC2373483 DOI: 10.1110/ps.26902] [Citation(s) in RCA: 177] [Impact Index Per Article: 7.7] [Indexed: 10/17/2022]
Abstract
The progress in genome characterizations has opened new routes for studying enzyme families. The availability of the human genome enabled us to delineate the large family of short-chain dehydrogenase/reductase (SDR) members. Although the human genome releases are not yet final, we have already found 63 members. We have also compared these SDR forms with those of three model organisms: Caenorhabditis elegans, Drosophila melanogaster, and Arabidopsis thaliana. We detect eight SDR ortholog clusters in a cross-genome comparison. Four of these clusters represent extended SDR forms, a subgroup found in all life forms. The other four are classical SDRs with activities involved in cellular differentiation and signalling. We also find 18 SDR genes that are present only in the human genome of the four genomes studied, reflecting enzyme forms specific to mammals. Close to half of these gene products represent steroid dehydrogenases, emphasizing the regulatory importance of these enzymes.
|
research-article |
23 |
177 |
6
|
Kruse CS, Goswamy R, Raval Y, Marawi S. Challenges and Opportunities of Big Data in Health Care: A Systematic Review. JMIR Med Inform 2016; 4:e38. [PMID: 27872036 PMCID: PMC5138448 DOI: 10.2196/medinform.5359] [Citation(s) in RCA: 160] [Impact Index Per Article: 17.8] [Received: 11/19/2015] [Revised: 07/27/2016] [Accepted: 09/28/2016] [Indexed: 11/18/2022] Open
Abstract
Background: Big data analytics offers promise in many business sectors, and health care is looking at big data to provide answers to many age-related issues, particularly dementia and chronic disease management.
Objective: The purpose of this review was to summarize the challenges faced by big data analytics and the opportunities that big data opens in health care.
Methods: A total of 3 searches were performed for publications between January 1, 2010 and January 1, 2016 (PubMed/MEDLINE, CINAHL, and Google Scholar), and an assessment was made on content germane to big data in health care. From the results of the searches in research databases and Google Scholar (N=28), the authors summarized content and identified 9 and 14 themes under the categories Challenges and Opportunities, respectively. We rank-ordered and analyzed the themes based on the frequency of occurrence.
Results: The top challenges were issues of data structure, security, data standardization, storage and transfers, and managerial skills such as data governance. The top opportunities revealed were quality improvement, population management and health, early detection of disease, data quality, structure, and accessibility, improved decision making, and cost reduction.
Conclusions: Big data analytics has the potential for positive impact and global implications; however, it must overcome some legitimate obstacles.
|
Journal Article |
9 |
160 |
7
|
Telonis AG, Loher P, Honda S, Jing Y, Palazzo J, Kirino Y, Rigoutsos I. Dissecting tRNA-derived fragment complexities using personalized transcriptomes reveals novel fragment classes and unexpected dependencies. Oncotarget 2016; 6:24797-822. [PMID: 26325506 PMCID: PMC4694795 DOI: 10.18632/oncotarget.4695] [Citation(s) in RCA: 141] [Impact Index Per Article: 15.7] [Received: 06/10/2015] [Accepted: 06/20/2015] [Indexed: 12/21/2022] Open
Abstract
We analyzed transcriptomic data from 452 healthy men and women representing five different human populations and two races, and from 311 breast cancer samples from The Cancer Genome Atlas. Our studies revealed numerous constitutive, distinct fragments with overlapping sequences and quantized lengths that persist across dozens of individuals and arise from the genomic loci of all nuclear and mitochondrial human transfer RNAs (tRNAs). Surprisingly, we discovered that the tRNA fragments' length, starting and ending points, and relative abundance depend on gender, population, race and also on amino acid identity, anticodon, genomic locus, tissue, disease, and disease subtype. Moreover, the length distribution of mitochondrially-encoded tRNAs differs from that of nuclearly-encoded tRNAs, and the specifics of these distributions depend on tissue. Notably, tRNA fragments from the same anticodon do not have correlated abundances. We also report on a novel category of tRNA fragments that significantly contribute to the differences we observe across tissues, genders, populations, and races: these fragments, referred to as i-tRFs, are abundant in human tissues, wholly internal to the respective mature tRNA, and can straddle the anticodon. HITS-CLIP data analysis revealed that tRNA fragments are loaded on Argonaute in a cell-dependent manner, suggesting cell-dependent functional roles through the RNA interference pathway. We validated experimentally two i-tRF molecules: the first was found in 21 of 22 tested breast tumor and adjacent normal samples and was differentially abundant between health and disease, whereas the second was found in all eight tested breast cancer cell lines.
|
Research Support, Non-U.S. Gov't |
9 |
141 |
8
|
Population-based 3D genome structure analysis reveals driving forces in spatial genome organization. Proc Natl Acad Sci U S A 2016; 113:E1663-72. [PMID: 26951677 DOI: 10.1073/pnas.1512577113] [Citation(s) in RCA: 127] [Impact Index Per Article: 14.1] [Indexed: 11/18/2022] Open
Abstract
Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.
|
Research Support, U.S. Gov't, Non-P.H.S. |
9 |
127 |
9
|
Dynamic maps of UV damage formation and repair for the human genome. Proc Natl Acad Sci U S A 2017; 114:6758-6763. [PMID: 28607063 DOI: 10.1073/pnas.1706522114] [Citation(s) in RCA: 121] [Impact Index Per Article: 15.1] [Indexed: 11/18/2022] Open
Abstract
Formation and repair of UV-induced DNA damage in human cells are affected by cellular context. To study factors influencing damage formation and repair genome-wide, we developed a highly sensitive single-nucleotide resolution damage mapping method [high-sensitivity damage sequencing (HS-Damage-seq)]. Damage maps of both cyclobutane pyrimidine dimers (CPDs) and pyrimidine-pyrimidone (6-4) photoproducts [(6-4)PPs] from UV-irradiated cellular and naked DNA revealed that the effect of transcription factor binding on bulky adduct formation varies, depending on the specific transcription factor, damage type, and strand. We also generated time-resolved UV damage maps of both CPDs and (6-4)PPs by HS-Damage-seq and compared them to the complementary repair maps of the human genome obtained by excision repair sequencing to gain insight into factors that affect UV-induced DNA damage and repair and ultimately UV carcinogenesis. The combination of the two methods revealed that, whereas UV-induced damage is virtually uniform throughout the genome, repair is affected by chromatin states, transcription, and transcription factor binding, in a manner that depends on the type of DNA damage.
|
Journal Article |
8 |
121 |
10
|
Transcriptional response to stress in the dynamic chromatin environment of cycling and mitotic cells. Proc Natl Acad Sci U S A 2013; 110:E3388-97. [PMID: 23959860 DOI: 10.1073/pnas.1305275110] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.9] [Indexed: 02/07/2023] Open
Abstract
Heat shock factors (HSFs) are the master regulators of transcription under protein-damaging conditions, acting in an environment where the overall transcription is silenced. We determined the genomewide transcriptional program that is rapidly provoked by HSF1 and HSF2 under acute stress in human cells. Our results revealed the molecular mechanisms that maintain cellular homeostasis, including HSF1-driven induction of polyubiquitin genes, as well as HSF1- and HSF2-mediated expression patterns of cochaperones, transcriptional regulators, and signaling molecules. We also characterized the genomewide transcriptional response to stress in mitotic cells, where the chromatin is tightly compacted. We found a radically limited binding and transactivating capacity of HSF1, leaving mitotic cells highly susceptible to proteotoxicity. In contrast, HSF2 occupied hundreds of loci in the mitotic cells and also localized to the condensed chromatin in meiosis. These results highlight the importance of the cell cycle phase in transcriptional responses and identify the specific mechanisms for HSF1 and HSF2 in transcriptional orchestration. Moreover, we propose that HSF2 is an epigenetic regulator directing transcription throughout cell cycle progression.
|
Research Support, Non-U.S. Gov't |
12 |
119 |
11
|
Ormond KE, Mortlock DP, Scholes DT, Bombard Y, Brody LC, Faucett WA, Garrison NA, Hercher L, Isasi R, Middleton A, Musunuru K, Shriner D, Virani A, Young CE. Human Germline Genome Editing. Am J Hum Genet 2017; 101:167-176. [PMID: 28777929 PMCID: PMC5544380 DOI: 10.1016/j.ajhg.2017.06.012] [Citation(s) in RCA: 108] [Impact Index Per Article: 13.5] [Indexed: 12/24/2022] Open
Abstract
With CRISPR/Cas9 and other genome-editing technologies, successful somatic and germline genome editing are becoming feasible. To respond, an American Society of Human Genetics (ASHG) workgroup developed this position statement, which was approved by the ASHG Board in March 2017. The workgroup included representatives from the UK Association of Genetic Nurses and Counsellors, Canadian Association of Genetic Counsellors, International Genetic Epidemiology Society, and US National Society of Genetic Counselors. These groups, as well as the American Society for Reproductive Medicine, Asia Pacific Society of Human Genetics, British Society for Genetic Medicine, Human Genetics Society of Australasia, Professional Society of Genetic Counselors in Asia, and Southern African Society for Human Genetics, endorsed the final statement. The statement includes the following positions. (1) At this time, given the nature and number of unanswered scientific, ethical, and policy questions, it is inappropriate to perform germline gene editing that culminates in human pregnancy. (2) Currently, there is no reason to prohibit in vitro germline genome editing on human embryos and gametes, with appropriate oversight and consent from donors, to facilitate research on the possible future clinical applications of gene editing. There should be no prohibition on making public funds available to support this research. (3) Future clinical application of human germline genome editing should not proceed unless, at a minimum, there is (a) a compelling medical rationale, (b) an evidence base that supports its clinical use, (c) an ethical justification, and (d) a transparent public process to solicit and incorporate stakeholder input.
|
Review |
8 |
108 |
12
|
Genetic variation across the human olfactory receptor repertoire alters odor perception. Proc Natl Acad Sci U S A 2019; 116:9475-9480. [PMID: 31040214 DOI: 10.1073/pnas.1804106115] [Citation(s) in RCA: 102] [Impact Index Per Article: 17.0] [Indexed: 12/30/2022] Open
Abstract
Humans use a family of more than 400 olfactory receptors (ORs) to detect odors, but there is currently no model that can predict olfactory perception from receptor activity patterns. Genetic variation in human ORs is abundant and alters receptor function, allowing us to examine the relationship between receptor function and perception. We sequenced the OR repertoire in 332 individuals and examined how genetic variation affected 276 olfactory phenotypes, including the perceived intensity and pleasantness of 68 odorants at two concentrations, detection thresholds of three odorants, and general olfactory acuity. Genetic variation in a single OR was frequently associated with changes in odorant perception, and we validated 10 cases in which in vitro OR function correlated with in vivo odorant perception using a functional assay. In 8 of these 10 cases, reduced receptor function was associated with reduced intensity perception. In addition, we used participant genotypes to quantify genetic ancestry and found that, in combination with single OR genotype, age, and gender, we can explain between 10% and 20% of the perceptual variation in 15 olfactory phenotypes, highlighting the importance of single OR genotype, ancestry, and demographic factors in the variation of olfactory perception.
|
Research Support, Non-U.S. Gov't |
6 |
102 |
13
|
Integrated platform for genome-wide screening and construction of high-density genetic interaction maps in mammalian cells. Proc Natl Acad Sci U S A 2013; 110:E2317-26. [PMID: 23739767 DOI: 10.1073/pnas.1307002110] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.8] [Indexed: 01/10/2023] Open
Abstract
A major challenge of the postgenomic era is to understand how human genes function together in normal and disease states. In microorganisms, high-density genetic interaction (GI) maps are a powerful tool to elucidate gene functions and pathways. We have developed an integrated methodology based on pooled shRNA screening in mammalian cells for genome-wide identification of genes with relevant phenotypes and systematic mapping of all GIs among them. We recently demonstrated the potential of this approach in an application to pathways controlling the susceptibility of human cells to the toxin ricin. Here we present the complete quantitative framework underlying our strategy, including experimental design, derivation of quantitative phenotypes from pooled screens, robust identification of hit genes using ultra-complex shRNA libraries, parallel measurement of tens of thousands of GIs from a single double-shRNA experiment, and construction of GI maps. We describe the general applicability of our strategy. Our pooled approach enables rapid screening of the same shRNA library in different cell lines and under different conditions to determine a range of different phenotypes. We illustrate this strategy here for single- and double-shRNA libraries. We compare the roles of genes for susceptibility to ricin and Shiga toxin in different human cell lines and reveal both toxin-specific and cell line-specific pathways. We also present GI maps based on growth and ricin-resistance phenotypes, and we demonstrate how such a comparative GI mapping strategy enables functional dissection of physical complexes and context-dependent pathways.
|
Research Support, Non-U.S. Gov't |
12 |
94 |
14
|
Cooper DN, Bacolla A, Férec C, Vasquez KM, Kehrer-Sawatzki H, Chen JM. On the sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease. Hum Mutat 2011; 32:1075-99. [PMID: 21853507 PMCID: PMC3177966 DOI: 10.1002/humu.21557] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.4] [Received: 03/27/2011] [Accepted: 06/17/2011] [Indexed: 12/21/2022]
Abstract
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The human genome is now recognized to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here, we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of noncanonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair and may serve to increase mutation frequencies in a generalized fashion (i.e., both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease.
|
Research Support, N.I.H., Extramural |
14 |
90 |
15
|
Abstract
Polymorphic inversions are a type of structural variant that is difficult to analyze owing to their balanced nature and the location of breakpoints within complex repeated regions. So far, only a handful of inversions have been studied in detail in humans, and current knowledge about their possible functional effects is still limited. However, inversions have been related to phenotypic changes and adaptation in multiple species. In this review, we summarize the evidence for the functional impact of inversions in the human genome. First, given that inversions have been shown to inhibit recombination in heterokaryotypes, chromosomes displaying different orientations are expected to evolve independently, and this may lead to distinct gene-expression patterns. Second, inversions have a role as disease-causing mutations, both by directly affecting gene structure or regulation in different ways and by predisposing to other secondary rearrangements in the offspring of inversion carriers. Finally, several inversions show signals of having been selected during human evolution. These findings illustrate the potential of inversions to have phenotypic consequences in humans as well, and emphasize the importance of their inclusion in genome-wide association studies.
|
Review |
10 |
79 |
16
|
Flasch DA, Macia Á, Sánchez L, Ljungman M, Heras SR, García-Pérez JL, Wilson TE, Moran JV. Genome-wide de novo L1 Retrotransposition Connects Endonuclease Activity with Replication. Cell 2019; 177:837-851.e28. [PMID: 30955886 DOI: 10.1016/j.cell.2019.02.050] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 09/13/2018] [Revised: 01/10/2019] [Accepted: 02/25/2019] [Indexed: 12/18/2022]
Abstract
L1 retrotransposon-derived sequences comprise approximately 17% of the human genome. Darwinian selective pressures alter L1 genomic distributions during evolution, confounding the ability to determine initial L1 integration preferences. Here, we generated high-confidence datasets of greater than 88,000 engineered L1 insertions in human cell lines that act as proxies for cells that accommodate retrotransposition in vivo. Comparing these insertions to a null model, in which L1 endonuclease activity is the sole determinant dictating L1 integration preferences, demonstrated that L1 insertions are not significantly enriched in genes, transcribed regions, or open chromatin. By comparison, we provide compelling evidence that the L1 endonuclease disproportionately cleaves predominant lagging strand DNA replication templates, while lagging strand 3'-hydroxyl groups may prime endonuclease-independent L1 retrotransposition in a Fanconi anemia cell line. Thus, acquisition of an endonuclease domain, in conjunction with the ability to integrate into replicating DNA, allowed L1 to become an autonomous, interspersed retrotransposon.
|
Research Support, Non-U.S. Gov't |
6 |
77 |
17
|
Abstract
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation "tracks." The annotations, generated by the UCSC Genome Bioinformatics Group and external collaborators, display gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload data as custom annotation tracks in both browsers for research or educational use. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
|
Research Support, N.I.H., Extramural |
16 |
67 |
18
|
Black EM, Giunta S. Repetitive Fragile Sites: Centromere Satellite DNA As a Source of Genome Instability in Human Diseases. Genes (Basel) 2018; 9:E615. [PMID: 30544645 PMCID: PMC6315641 DOI: 10.3390/genes9120615] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Received: 11/05/2018] [Revised: 12/03/2018] [Accepted: 12/03/2018] [Indexed: 12/31/2022] Open
Abstract
Maintenance of an intact genome is essential for cellular and organismal homeostasis. The centromere is a specialized chromosomal locus required for faithful genome inheritance at each round of cell division. Human centromeres are composed of large tandem arrays of repetitive alpha-satellite DNA, which are often sites of aberrant rearrangements that may lead to chromosome fusions and genetic abnormalities. While the centromere has an essential role in chromosome segregation during mitosis, the long and repetitive nature of the highly identical repeats has greatly hindered in-depth genetic studies, and complete annotation of all human centromeres is still lacking. Here, we review our current understanding of human centromere genetics and epigenetics as well as recent investigations into the role of centromere DNA in disease, with a special focus on cancer, aging, and human immunodeficiency-centromeric instability-facial anomalies (ICF) syndrome. We also highlight the causes and consequences of genomic instability at these large repetitive arrays and describe the possible sources of centromere fragility. The novel connection between alpha-satellite DNA instability and human pathological conditions emphasizes the importance of obtaining a truly complete human genome assembly and accelerating our understanding of centromere repeats' role in physiology and beyond.
|
Review |
7 |
66 |
19
|
Govindarajan R, Duraiyan J, Kaliyappan K, Palanisamy M. Microarray and its applications. J Pharm Bioallied Sci 2012; 4:S310-2. [PMID: 23066278 PMCID: PMC3467903 DOI: 10.4103/0975-7406.100283] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 12/01/2011] [Revised: 01/02/2012] [Accepted: 01/26/2012] [Indexed: 11/15/2022] Open
Abstract
Microarray technology is one of the most recent advances used in cancer research; it supports pharmacological approaches to treating various diseases, including oral lesions. Microarrays help in analyzing large numbers of samples, whether previously recorded or new, and can also be used to test the incidence of a particular marker in tumors. Until recently, the use of microarrays in dentistry has been very limited, but as the technology becomes more affordable, its use may increase. Here, we discuss the various techniques and applications of the microarray, or DNA chip.
|
Journal Article |
13 |
65 |
20
|
Shin SI, Ham S, Park J, Seo SH, Lim CH, Jeon H, Huh J, Roh TY. Z-DNA-forming sites identified by ChIP-Seq are associated with actively transcribed regions in the human genome. DNA Res 2016; 23:477-486. [PMID: 27374614 PMCID: PMC5066173 DOI: 10.1093/dnares/dsw031] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 09/03/2015] [Accepted: 06/03/2016] [Indexed: 01/08/2023]
Abstract
Z-DNA, a left-handed double-helical DNA, is structurally different from the most abundant form, B-DNA. Z-DNA is known to play a significant role in transcription and genome stability, but the biological meaning and positions of Z-DNA-forming sites (ZFSs) in the human genome have not been fully explored. To obtain a genome-wide map of ZFSs, Zaa with two Z-DNA-binding domains was used for ChIP-Seq analysis. A total of 391 ZFSs were found and their functions were examined in vivo. A large portion of ZFSs were enriched in promoter regions and contained sequences with high potential to form Z-DNA. Genes containing ZFSs were occupied by RNA polymerase II at the promoters and showed high levels of expression. Moreover, ZFSs were significantly related to active histone marks such as H3K4me3 and H3K9ac. The association of Z-DNA with active transcription was confirmed by a reporter assay system. Overall, our results suggest that Z-DNA formation depends on chromatin structure as well as sequence composition, and is associated with active transcription in human cells. The global information about ZFS positioning will provide a useful resource for further understanding of DNA structure-dependent transcriptional regulation.
|
Journal Article |
9 |
64 |
21
|
Lant JT, Berg MD, Heinemann IU, Brandl CJ, O'Donoghue P. Pathways to disease from natural variations in human cytoplasmic tRNAs. J Biol Chem 2019; 294:5294-5308. [PMID: 30643023 DOI: 10.1074/jbc.rev118.002982] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Indexed: 12/21/2022]
Abstract
Perfectly accurate translation of mRNA into protein is not a prerequisite for life. Resulting from errors in protein synthesis, mistranslation occurs in all cells, including human cells. The human genome encodes >600 tRNA genes, providing both the raw material for genetic variation and a buffer to ensure that resulting translation errors occur at tolerable levels. On the basis of data from the 1000 Genomes Project, we highlight the unanticipated prevalence of mistranslating tRNA variants in the human population and review studies on synthetic and natural tRNA mutations that cause mistranslation or de-regulate protein synthesis. Although mitochondrial tRNA variants are well known to drive human diseases, including developmental disorders, few studies have revealed a role for human cytoplasmic tRNA mutants in disease. In the context of the unexpectedly large number of tRNA variants in the human population, the emerging literature suggests that human diseases may be affected by natural tRNA variants that cause mistranslation or de-regulate tRNA expression and nucleotide modification. This review highlights examples relevant to genetic disorders, cancer, and neurodegeneration in which cytoplasmic tRNA variants directly cause or exacerbate disease and disease-linked phenotypes in cells, animal models, and humans. In the near future, tRNAs may be recognized as useful genetic markers to predict the onset or severity of human disease.
|
Review |
6 |
52 |
22
|
Graur D. An Upper Limit on the Functional Fraction of the Human Genome. Genome Biol Evol 2017; 9:1880-1885. [PMID: 28854598 PMCID: PMC5570035 DOI: 10.1093/gbe/evx121] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Accepted: 07/06/2017] [Indexed: 12/13/2022]
Abstract
For the human population to maintain a constant size from generation to generation, an increase in fertility must compensate for the reduction in the mean fitness of the population caused, among other factors, by deleterious mutations. The required increase in fertility due to this mutational load depends on the number of sites in the genome that are functional, the mutation rate, and the fraction of deleterious mutations among all mutations in functional regions. These dependencies and the fact that there exists a maximum tolerable replacement level fertility can be used to put an upper limit on the fraction of the human genome that can be functional. Mutational load considerations lead to the conclusion that the functional fraction within the human genome cannot exceed 15%.
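The dependence described in this abstract can be illustrated with a minimal sketch. It assumes the classical multiplicative mutational-load model, in which mean fitness is exp(-U) and U is the expected number of new deleterious mutations per diploid genome per generation. This is a simplification chosen for illustration and is not necessarily the exact model used in the paper. The function names and all parameter values are hypothetical.

```python
import math


def deleterious_rate(haploid_sites, functional_fraction,
                     mutation_rate, deleterious_fraction):
    """Expected new deleterious mutations per diploid genome per generation (U).

    Assumption: mutations hit functional sites uniformly, and the factor 2
    accounts for the two genome copies of a diploid individual.
    """
    return (2 * haploid_sites * functional_fraction
            * mutation_rate * deleterious_fraction)


def required_fertility(u):
    """Offspring per couple needed to keep population size constant.

    Assumption: mean fitness is exp(-u), so the baseline of 2 offspring
    per couple must be scaled up by 1 / exp(-u).
    """
    return 2.0 * math.exp(u)


def max_functional_fraction(max_fertility, haploid_sites,
                            mutation_rate, deleterious_fraction):
    """Largest functional fraction whose load a given fertility can offset."""
    if max_fertility < 2.0:
        raise ValueError("fertility below 2 per couple cannot sustain the population")
    u_max = math.log(max_fertility / 2.0)  # invert required_fertility
    per_unit_fraction = 2 * haploid_sites * mutation_rate * deleterious_fraction
    return min(1.0, u_max / per_unit_fraction)
```

Calling `max_functional_fraction` with a chosen fertility ceiling, genome size, per-site mutation rate, and deleterious fraction gives the upper bound implied by this simplified model. The bound tightens as the mutation rate or the deleterious fraction increases.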
|
Journal Article |
8 |
51 |
23
|
Rishishwar L, Mariño-Ramírez L, Jordan IK. Benchmarking computational tools for polymorphic transposable element detection. Brief Bioinform 2018; 18:908-918. [PMID: 27524380 DOI: 10.1093/bib/bbw072] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 04/14/2016] [Indexed: 12/19/2022]
Abstract
Transposable elements (TEs) are an important source of human genetic variation with demonstrable effects on phenotype. Recently, a number of computational methods for the detection of polymorphic TE (polyTE) insertion sites from next-generation sequence data have been developed. The use of such tools will become increasingly important as the pace of human genome sequencing accelerates. For this report, we performed a comparative benchmarking and validation analysis of polyTE detection tools in an effort to inform their selection and use by the TE research community. We analyzed a core set of seven tools with respect to ease of use and accessibility, polyTE detection performance and runtime parameters. An experimentally validated set of 893 human polyTE insertions was used for this purpose, along with a series of simulated data sets that allowed us to assess the impact of sequence coverage on tool performance. The recently developed tool MELT showed the best overall performance followed by Mobster and then RetroSeq. PolyTE detection tools can best detect Alu insertion events in the human genome with reduced reliability for L1 insertions and substantially lowered performance for SVA insertions. We also show evidence that different polyTE detection tools are complementary with respect to their ability to detect a complete set of insertion events. Accordingly, a combined approach, coupled with manual inspection of individual results, may yield the best overall performance. In addition to the benchmarking results, we also provide notes on tool installation and usage as well as suggestions for future polyTE detection algorithm development.
|
Journal Article |
7 |
45 |
24
|
Dou Y, Fox-Walsh KL, Baldi PF, Hertel KJ. Genomic splice-site analysis reveals frequent alternative splicing close to the dominant splice site. RNA (NEW YORK, N.Y.) 2006; 12:2047-56. [PMID: 17053087 PMCID: PMC1664720 DOI: 10.1261/rna.151106] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Indexed: 05/07/2023]
Abstract
Alternative pre-mRNA splicing may be the most efficient and widespread mechanism to generate multiple protein isoforms from single genes. Here, we describe the genomic analysis of one of the most frequent types of alternative pre-mRNA splicing, alternative 5'- and 3'-splice-site selection. Using an EST-based alternative splicing database recording >47,000 alternative splicing events, we determined the frequency and location of alternative 5'- and 3'-splice sites within the human genome. The most common alternative splice sites used in the human genome are located within 6 nucleotides (nt) of the dominant splice site. We show that the EST database overrepresents alternative splicing events that maintain the reading frame, thus supporting the concept that RNA quality-control steps ensure that mRNAs that encode potentially harmful protein products are destroyed and do not serve as templates for translation. The most frequent location for alternative 5'-splice sites is 4 nt upstream or downstream from the dominant splice site. Sequence analysis suggests that this preference is a consequence of the U1 snRNP binding sequence at the 5'-splice site, which frequently contains a GU dinucleotide 4 nt downstream from the dominant splice site. Surprisingly, approximately 50% of duplicated 3'-YAG splice junctions are subject to alternative splicing. This high probability of alternative 3'-splice-site activation in close proximity to the dominant 3'-splice site suggests that the second step of splicing may be prone to violate splicing fidelity.
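The reading-frame argument in this abstract can be made concrete with a small sketch. It assumes the alternative splice site lies within coding sequence, so that shifting the site by some number of nucleotides lengthens or shortens the mRNA by the same amount. Under that assumption the downstream frame is kept only when the shift is a multiple of three. The function names are hypothetical and are not taken from the paper.

```python
def preserves_reading_frame(offset_nt: int) -> bool:
    """Return True if a splice-site shift of offset_nt keeps the downstream frame.

    Negative offsets denote upstream sites, positive offsets downstream sites.
    Only the length change modulo 3 matters for the frame.
    """
    return offset_nt % 3 == 0


def classify_offsets(offsets):
    """Label each offset as 'in-frame' or 'frameshift'."""
    return {
        offset: "in-frame" if preserves_reading_frame(offset) else "frameshift"
        for offset in offsets
    }
```

Under this rule a site 3 or 6 nt from the dominant site is frame-preserving, whereas the frequently used 5'-splice-site position at 4 nt is not.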
|
Research Support, N.I.H., Extramural |
19 |
44 |
25
|
Analysis of missense variants in the human genome reveals widespread gene-specific clustering and improves prediction of pathogenicity. Am J Hum Genet 2022; 109:457-470. [PMID: 35120630 PMCID: PMC8948164 DOI: 10.1016/j.ajhg.2022.01.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 08/20/2021] [Accepted: 01/11/2022] [Indexed: 12/11/2022]
Abstract
We used a machine learning approach to analyze the within-gene distribution of missense variants observed in hereditary conditions and cancer. When applied to 840 genes from the ClinVar database, this approach detected a significant non-random distribution of pathogenic and benign variants in 387 (46%) and 172 (20%) genes, respectively, revealing that variant clustering is widespread across the human exome. This clustering likely occurs as a consequence of mechanisms shaping pathogenicity at the protein level, as illustrated by the overlap of some clusters with known functional domains. We then took advantage of these findings to develop a pathogenicity predictor, MutScore, that integrates qualitative features of DNA substitutions with the new additional information derived from this positional clustering. Using a random forest approach, MutScore was able to identify pathogenic missense mutations with very high accuracy, outperforming existing predictive tools, especially for variants associated with autosomal-dominant disease and cancer. Thus, the within-gene clustering of pathogenic and benign DNA changes is an important and previously underappreciated feature of the human exome, which can be harnessed to improve the prediction of pathogenicity and disambiguation of DNA variants of uncertain significance.
|
|
3 |
38 |