51
|
Corless S, Höcker S, Erhardt S. Centromeric RNA and Its Function at and Beyond Centromeric Chromatin. J Mol Biol 2020; 432:4257-4269. [DOI: 10.1016/j.jmb.2020.03.027] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/30/2019] [Revised: 03/26/2020] [Accepted: 03/27/2020] [Indexed: 12/21/2022]
|
52
|
Miga KH. Centromere studies in the era of 'telomere-to-telomere' genomics. Exp Cell Res 2020; 394:112127. [PMID: 32504677 DOI: 10.1016/j.yexcr.2020.112127] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/08/2020] [Revised: 05/23/2020] [Accepted: 05/30/2020] [Indexed: 12/17/2022]
Abstract
We are entering into an exciting era of genomics where truly complete, high-quality assemblies of human chromosomes are available end-to-end, or from 'telomere-to-telomere' (T2T). This technological advance offers a new opportunity to include endogenous human centromeric regions in high-resolution, sequence-based studies. These emerging reference maps are expected to reveal a new functional landscape in the human genome, where centromere proteins, transcriptional regulation, and spatial organization can be examined with base-level resolution across different stages of development and disease. Such studies will depend on innovative assembly methods of extremely long tandem repeats (ETRs), or satellite DNAs, paired with the development of new, orthogonal validation methods to ensure accuracy and completeness. This review reflects the progress in centromere genomics, credited by recent advancements in long-read sequencing and assembly methods. In doing so, I will discuss the challenges that remain and the promise for a new period of scientific discovery for satellite DNA biology and centromere function.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, 95064, USA.
|
53
|
Sullivan LL, Sullivan BA. Genomic and functional variation of human centromeres. Exp Cell Res 2020; 389:111896. [PMID: 32035947 PMCID: PMC7140587 DOI: 10.1016/j.yexcr.2020.111896] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/19/2019] [Revised: 01/29/2020] [Accepted: 02/05/2020] [Indexed: 10/25/2022]
Abstract
Centromeres are central to chromosome segregation and genome stability, and thus their molecular foundations are important for understanding their function and the ways in which they go awry. Human centromeres typically form at large megabase-sized arrays of alpha satellite DNA for which there is little genomic understanding due to its repetitive nature. Consequently, it has been difficult to achieve genome assemblies at centromeres using traditional next generation sequencing approaches, so that centromeres represent gaps in the current human genome assembly. The role of alpha satellite DNA has been debated since centromeres can form, albeit rarely, on non-alpha satellite DNA. Conversely, the simple presence of alpha satellite DNA is not sufficient for centromere function since chromosomes with multiple alpha satellite arrays only exhibit a single location of centromere assembly. Here, we discuss the organization of human centromeres as well as genomic and functional variation in human centromere location, and current understanding of the genomic and epigenetic mechanisms that underlie centromere flexibility in humans.
Affiliation(s)
- Beth A Sullivan
- Department of Molecular Genetics and Microbiology, USA; Division of Human Genetics, Duke University School of Medicine, Durham, NC, 27710, USA.
|
54
|
Abstract
Since the early days of the genome era, the scientific community has relied on a single 'reference' genome for each species, which is used as the basis for a wide range of genetic analyses, including studies of variation within and across species. As sequencing costs have dropped, thousands of new genomes have been sequenced, and scientists have come to realize that a single reference genome is inadequate for many purposes. By sampling a diverse set of individuals, one can begin to assemble a pan-genome: a collection of all the DNA sequences that occur in a species. Here we review efforts to create pan-genomes for a range of species, from bacteria to humans, and we further consider the computational methods that have been proposed in order to capture, interpret and compare pan-genome data. As scientists continue to survey and catalogue the genomic variation across human populations and begin to assemble a human pan-genome, these efforts will increase our power to connect variation to human diversity, disease and beyond.
Affiliation(s)
- Rachel M Sherman
- Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Steven L Salzberg
- Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, MD, USA.
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA.
|
55
|
Chromosome-Level Assembly of Drosophila bifasciata Reveals Important Karyotypic Transition of the X Chromosome. G3-GENES GENOMES GENETICS 2020; 10:891-897. [PMID: 31969429 PMCID: PMC7056972 DOI: 10.1534/g3.119.400922] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
Abstract
The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.
|
56
|
Hori T, Fukagawa T. Artificial generation of centromeres and kinetochores to understand their structure and function. Exp Cell Res 2020; 389:111898. [PMID: 32035949 DOI: 10.1016/j.yexcr.2020.111898] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/03/2019] [Revised: 01/18/2020] [Accepted: 02/05/2020] [Indexed: 01/19/2023]
Abstract
The centromere is an essential genomic region that provides the surface on which the kinetochore forms; the kinetochore binds to the spindle microtubules to mediate chromosome segregation during mitosis and meiosis. Centromeres of most organisms possess highly repetitive sequences, making it difficult to study these loci. However, an unusual centromere called a "neocentromere," which does not contain repetitive sequences, was discovered in a patient and can be generated experimentally. Recent advances in genome biology techniques allow us to analyze centromeric chromatin using neocentromeres. In addition to neocentromeres, artificial kinetochores have been generated on non-centromeric loci, using protein tethering systems. These are powerful tools to understand the mechanisms of centromere specification and kinetochore assembly. In this review, we introduce recent studies utilizing neocentromeres and artificial kinetochores and discuss current problems in centromere biology.
Affiliation(s)
- Tetsuya Hori
- Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan.
- Tatsuo Fukagawa
- Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, 565-0871, Japan.
|
57
|
Louzada S, Lopes M, Ferreira D, Adega F, Escudeiro A, Gama-Carvalho M, Chaves R. Decoding the Role of Satellite DNA in Genome Architecture and Plasticity-An Evolutionary and Clinical Affair. Genes (Basel) 2020; 11:E72. [PMID: 31936645 PMCID: PMC7017282 DOI: 10.3390/genes11010072] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 12/16/2019] [Revised: 12/29/2019] [Accepted: 01/08/2020] [Indexed: 12/11/2022]
Abstract
Repetitive DNA is a major organizational component of eukaryotic genomes, being intrinsically related with their architecture and evolution. Tandemly repeated satellite DNAs (satDNAs) can be found clustered in specific heterochromatin-rich chromosomal regions, building vital structures like functional centromeres and also dispersed within euchromatin. Interestingly, despite their association to critical chromosomal structures, satDNAs are widely variable among species due to their high turnover rates. This dynamic behavior has been associated with genome plasticity and chromosome rearrangements, leading to the reshaping of genomes. Here we present the current knowledge regarding satDNAs in the light of new genomic technologies, and the challenges in the study of these sequences. Furthermore, we discuss how these sequences, together with other repeats, influence genome architecture, impacting its evolution and association with disease.
Affiliation(s)
- Sandra Louzada
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal.
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal.
- Mariana Lopes
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal.
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal.
- Daniela Ferreira
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal.
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal.
- Filomena Adega
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal.
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal.
- Ana Escudeiro
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal.
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal.
- Margarida Gama-Carvalho
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal.
- Raquel Chaves
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal.
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisboa, 1749-016 Lisbon, Portugal.
|
58
|
Applications and Trends of Machine Learning in Genomics and Phenomics for Next-Generation Breeding. PLANTS 2019; 9:plants9010034. [PMID: 31881663 PMCID: PMC7020215 DOI: 10.3390/plants9010034] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 11/05/2019] [Revised: 12/17/2019] [Accepted: 12/23/2019] [Indexed: 12/27/2022]
Abstract
Crops are the major source of food supply and raw materials for the processing industry. A balance between crop production and food consumption is continually threatened by plant diseases and adverse environmental conditions. This leads to serious losses every year and results in food shortages, particularly in developing countries. Presently, cutting-edge technologies for genome sequencing and phenotyping of crops combined with progress in computational sciences are leading a revolution in plant breeding, boosting the identification of the genetic basis of traits at a precision never reached before. In this frame, machine learning (ML) plays a pivotal role in data-mining and analysis, providing relevant information for decision-making towards achieving breeding targets. To this end, we summarize the recent progress in next-generation sequencing and the role of phenotyping technologies in genomics-assisted breeding toward the exploitation of the natural variation and the identification of target genes. We also explore the application of ML in managing big data and predictive models, reporting a case study using microRNAs (miRNAs) to identify genes related to stress conditions.
|
59
|
Golicz AA, Bayer PE, Bhalla PL, Batley J, Edwards D. Pangenomics Comes of Age: From Bacteria to Plant and Animal Applications. Trends Genet 2019; 36:132-145. [PMID: 31882191 DOI: 10.1016/j.tig.2019.11.006] [Citation(s) in RCA: 111] [Impact Index Per Article: 22.2] [Received: 09/30/2019] [Revised: 11/09/2019] [Accepted: 11/12/2019] [Indexed: 02/01/2023]
Abstract
The pangenome refers to the collection of genomic sequences found across an entire species or population rather than in a single individual; a sequence can be core, present in all individuals, or accessory (variable or dispensable), found in only a subset of individuals. While pangenomic studies were first undertaken in bacterial species, developments in genome sequencing and assembly approaches have allowed construction of pangenomes for eukaryotic organisms, including fungi, plants, and animals, as well as two large-scale human pangenome projects. Analysis of these pangenomes revealed key differences, most likely stemming from divergent evolutionary histories, but also surprising similarities.
Affiliation(s)
- Agnieszka A Golicz
- Plant Molecular Biology and Biotechnology Laboratory, Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, VIC, Australia.
- Philipp E Bayer
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia.
- Prem L Bhalla
- Plant Molecular Biology and Biotechnology Laboratory, Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, VIC, Australia.
- Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia.
- David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia.
|
60
|
Abstract
Repetitive DNAs are ubiquitous in eukaryotic genomes and, in many species, comprise the bulk of the genome. Repeats include transposable elements that can self-mobilize and disperse around the genome and tandemly repeated satellite DNAs that increase in copy number due to replication slippage and unequal crossing over. Despite their abundance, repetitive DNAs are often ignored in genomic studies due to technical challenges in identifying, assembling, and quantifying them. New technologies and methods now offer unprecedented power to analyze repetitive DNAs across diverse taxa. Repetitive DNAs are of particular interest because they can represent distinct modes of genome evolution. Some repetitive DNAs form essential genome structures, such as telomeres and centromeres, that are required for proper chromosome maintenance and segregation, while others form piRNA clusters that regulate transposable elements; thus, these elements are expected to evolve under purifying selection. In contrast, other repeats evolve selfishly and cause genetic conflicts with their host species that drive adaptive evolution of host defense systems. The majority of repeats, however, likely accumulate in eukaryotes in the absence of selection due to mechanisms of transposition and unequal crossing over. Even these “neutral” repeats may indirectly influence genome evolution as they reach high abundance. In this Special Issue, the contributing authors explore these questions from a range of perspectives.
|