1
Kwon D, Park N, Wy S, Lee D, Park W, Chai HH, Cho IC, Lee J, Kwon K, Kim H, Moon Y, Kim J, Kim J. Identification and characterization of structural variants related to meat quality in pigs using chromosome-level genome assemblies. BMC Genomics 2024; 25:299. [PMID: 38515031 PMCID: PMC10956321 DOI: 10.1186/s12864-024-10225-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 03/14/2024] [Indexed: 03/23/2024] Open
Abstract
BACKGROUND Many studies have been performed to identify genomic loci and genes associated with meat quality in pigs. However, the full genetic architecture of the trait remains unclear, in part because related structural variations (SVs) have not been accurately identified, owing to the shortage of target breeds, the limitations of sequencing data, and the incompleteness of genome assemblies. The recent generation of a new pig breed with superior meat quality, called Nanchukmacdon, and its chromosome-level genome assembly (the NCMD assembly) has provided new opportunities. RESULTS By applying assembly-based SV calling approaches to various genome assemblies of pigs including Nanchukmacdon, the impact of SVs on meat quality was investigated. In particular, by checking whether SVs were shared with other pig breeds, a total of 13,819 Nanchukmacdon-specific SVs (NSVs) were identified, which potentially affect the unique meat quality of Nanchukmacdon. The regulatory potentials of NSVs for the expression of nearby genes were further examined using transcriptome- and epigenome-based analyses in different tissues. CONCLUSIONS Whole-genome comparisons based on chromosome-level genome assemblies have led to the discovery of SVs affecting meat quality in pigs, and their regulatory potentials were analyzed. The identified NSVs will provide new insights into the genetic architecture underlying meat quality in pigs. Finally, this study confirms the utility of chromosome-level genome assemblies and multi-omics analysis for understanding unique phenotypes.
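The abstract above describes finding breed-specific SVs by checking which SVs are shared with other breeds. A minimal sketch of that kind of filter follows. It is not the authors' pipeline: the `SV` record, the `reciprocal_overlap` and `breed_specific` helpers, and the 50% reciprocal-overlap threshold are all hypothetical choices made for illustration, and every SV is assumed to span at least one reference base.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Hypothetical SV record in reference coordinates."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive; assumed greater than start
    kind: str   # e.g. "DEL", "DUP", "INV"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Fraction of the longer-affected SV covered by the shared interval."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    # Reciprocal overlap: the smaller of the two coverage fractions.
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def breed_specific(target, others, min_overlap=0.5):
    """Keep target SVs that match no SV from the other breeds.

    min_overlap is an assumed threshold, not a value from the paper.
    """
    return [
        sv for sv in target
        if all(reciprocal_overlap(sv, other) < min_overlap for other in others)
    ]


target = [
    SV("chr1", 100, 200, "DEL"),  # overlaps a DEL in another breed
    SV("chr1", 500, 600, "DEL"),  # no counterpart elsewhere
    SV("chr2", 100, 200, "DUP"),  # same interval, but a different SV type
]
others = [SV("chr1", 110, 210, "DEL"), SV("chr2", 100, 200, "DEL")]
specific = breed_specific(target, others)
```

The quadratic scan is adequate for a toy example. For tens of thousands of SVs per breed, an interval index per chromosome would be the more practical choice.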
Affiliation(s)
- Daehong Kwon
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Nayoung Park
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Suyeon Wy
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Daehwan Lee
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Woncheoul Park
  - Animal Genomics and Bioinformatics Division, National Institute of Animal Science, RDA, Wanju, 55365, Republic of Korea
- Han-Ha Chai
  - Animal Genomics and Bioinformatics Division, National Institute of Animal Science, RDA, Wanju, 55365, Republic of Korea
- In-Cheol Cho
  - Subtropical Livestock Research Institute, National Institute of Animal Science, RDA, Jeju, 63242, Republic of Korea
- Jongin Lee
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Kisang Kwon
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Heesun Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Youngbeen Moon
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Juyeon Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea
- Jaebum Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, Republic of Korea.
2
Zhou ZW, Yu ZG, Huang XM, Liu JS, Guo YX, Chen LL, Song JM. GenomeSyn: a bioinformatics tool for visualizing genome synteny and structural variations. J Genet Genomics 2022; 49:1174-1176. [PMID: 35436609 DOI: 10.1016/j.jgg.2022.03.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/04/2022] [Revised: 03/24/2022] [Accepted: 03/29/2022] [Indexed: 01/18/2023]
Affiliation(s)
- Zu-Wen Zhou
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning, Guangxi 530004, China
- Zhi-Guang Yu
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning, Guangxi 530004, China
- Xiao-Ming Huang
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning, Guangxi 530004, China
- Jin-Shen Liu
  - College of Stomatology, Guangxi Medical University, Nanning, Guangxi 530021, China
- Yi-Xiong Guo
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Ling-Ling Chen
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning, Guangxi 530004, China.
- Jia-Ming Song
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning, Guangxi 530004, China.
3
Martinson JW, Bencic DC, Toth GP, Kostich MS, Flick RW, See MJ, Lattier D, Biales AD, Huang W. De Novo Assembly of the Nearly Complete Fathead Minnow Reference Genome Reveals a Repetitive but Compact Genome. Environ Toxicol Chem 2022; 41:448-461. [PMID: 34888930 PMCID: PMC9560796 DOI: 10.1002/etc.5266] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/27/2021] [Revised: 06/28/2021] [Accepted: 12/04/2021] [Indexed: 06/13/2023]
Abstract
The fathead minnow is a widely used model organism in environmental toxicology. The lack of a high-quality fathead minnow reference genome, however, has severely hampered its use in toxicogenomics. We present the de novo assembly and annotation of the fathead minnow genome using long PacBio reads, Bionano and Hi-C scaffolding data, and large RNA-sequencing data sets from different tissues and life stages. The new annotated fathead minnow reference genome has a scaffold N50 of 12.0 Mbp and a complete Benchmarking Universal Single-Copy Orthologs (BUSCO) score of 95.1%. The completeness of annotation for the new reference genome is comparable to that of the zebrafish GRCz11 reference genome. The fathead minnow genome, revealed to be highly repetitive and sharing extensive syntenic regions with the zebrafish genome, has a much more compact gene structure than the zebrafish genome. In particular, comparative genomic analysis with zebrafish, mouse, and human showed that fathead minnow homologous genes are relatively conserved in exon regions but have strikingly shorter intron regions. The new fathead minnow reference genome and annotation data, publicly available from the National Center for Biotechnology Information and the University of California Santa Cruz genome browser, provide an essential resource for aquatic toxicogenomic studies in ecotoxicology and public health. Environ Toxicol Chem 2022;41:448-461. Published 2021. This article is a U.S. Government work and is in the public domain in the USA.
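The scaffold N50 quoted in the abstract above is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths (arbitrary units), not taken from any real assembly.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

Comparing `2 * running` with `total` keeps the arithmetic in integers, which avoids rounding issues when the total length is odd.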
Affiliation(s)
- John W. Martinson
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
- David C. Bencic
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
- Gregory P. Toth
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
- Mitchell S. Kostich
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
- Robert W. Flick
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
- Mary J. See
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
- David Lattier
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
- Adam D. Biales
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Cincinnati, Ohio, USA
- Weichun Huang
  - Center for Computational Toxicology and Exposure, Molecular Indicators Branch, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA
4
de la Haba RR, López-Hermoso C, Sánchez-Porro C, Konstantinidis KT, Ventosa A. Comparative Genomics and Phylogenomic Analysis of the Genus Salinivibrio. Front Microbiol 2019; 10:2104. [PMID: 31572321 PMCID: PMC6749099 DOI: 10.3389/fmicb.2019.02104] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/09/2019] [Accepted: 08/27/2019] [Indexed: 12/02/2022] Open
Abstract
In the genomic era, phylogenetic relationships among prokaryotes can be inferred from the core orthologous genes (OGs) or proteins in order to elucidate their evolutionary history, and current taxonomy should benefit from that. The genus Salinivibrio belongs to the family Vibrionaceae and currently includes only five halophilic species, in spite of the fact that new strains are very frequently isolated from hypersaline environments. Species belonging to this genus have undergone several reclassifications and, moreover, there are many strains of Salinivibrio with available genomes which have not been affiliated with the existing species or have been wrongly designated. Therefore, a phylogenetic study using the available genomic information is necessary to clarify the relationships of existing strains within this genus and to review their taxonomic affiliation. For that purpose, we have also sequenced the first complete genome of a Salinivibrio species, Salinivibrio kushneri AL184T, which was employed as a reference to order the contigs of the draft genomes of the type strains of the current species of this genus, as well as to perform a comparative analysis with all the other available Salinivibrio sp. genomes. The genome of S. kushneri AL184T was assembled in two circular chromosomes (with sizes of 2.84 Mb and 0.60 Mb, respectively), as typically occurs in members of the family Vibrionaceae, with nine complete ribosomal operons, which might explain the fast growth rate of salinivibrios cultured under laboratory conditions. Synteny analysis among the type strains of the genus revealed a high level of genomic conservation in both chromosomes, which allows us to hypothesize a slow speciation process or homogenization events taking place in this group of microorganisms, to be tested experimentally in the future. Phylogenomic and orthologous average nucleotide identity (OrthoANI)/average amino acid identity (AAI) analyses also evidenced the elevated level of genetic relatedness among members of this genus and allowed all the Salinivibrio strains with available genomes to be grouped into seven separate species. A genome-scale attribute study of the salinivibrios identified traits related to polar flagellum, facultatively anaerobic growth and osmotic response, in accordance with the phenotypic features described for species of this genus.
Affiliation(s)
- Rafael R. de la Haba
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Seville, Seville, Spain
- Clara López-Hermoso
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Seville, Seville, Spain
- Cristina Sánchez-Porro
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Seville, Seville, Spain
- Antonio Ventosa
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Seville, Seville, Spain
5
Song G, Lee J, Kim J, Kang S, Lee H, Kwon D, Lee D, Lang GI, Cherry JM, Kim J. Integrative Meta-Assembly Pipeline (IMAP): Chromosome-level genome assembler combining multiple de novo assemblies. PLoS One 2019; 14:e0221858. [PMID: 31454399 PMCID: PMC6711525 DOI: 10.1371/journal.pone.0221858] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/10/2018] [Accepted: 08/18/2019] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Genomic data have become major resources to understand complex mechanisms at fine-scale temporal and spatial resolution in functional and evolutionary genetic studies, including human diseases, such as cancers. Recently, a large number of whole genomes of evolving populations of yeast (Saccharomyces cerevisiae W303 strain) were sequenced in a time-dependent manner to identify temporal evolutionary patterns. For this type of study, a chromosome-level sequence assembly of the strain or population at time zero is required to compare with the genomes derived later. However, there is no fully automated computational approach in experimental evolution studies to establish the chromosome-level genome assembly using unique features of sequencing data. METHODS AND RESULTS In this study, we developed a new software pipeline, the integrative meta-assembly pipeline (IMAP), to build chromosome-level genome sequence assemblies by generating and combining multiple initial assemblies using three de novo assemblers from short-read sequencing data. We significantly improved the continuity and accuracy of the genome assembly using a large collection of sequencing data and hybrid assembly approaches. We validated our pipeline by generating chromosome-level assemblies of yeast strains W303 and SK1, and compared our results with assemblies built using long-read sequencing and various assembly evaluation metrics. We also constructed chromosome-level sequence assemblies of S. cerevisiae strain Sigma1278b, and three commonly used fungal strains: Aspergillus nidulans A713, Neurospora crassa 73, and Thielavia terrestris CBS 492.74, for which long-read sequencing data are not yet available. Finally, we examined the effect of IMAP parameters, such as reference and resolution, on the quality of the final assembly of the yeast strains W303 and SK1. CONCLUSIONS We developed a cost-effective pipeline to generate chromosome-level sequence assemblies using only short-read sequencing data. 
It combines the strengths of reference-guided and meta-assembly approaches. The pipeline is available online at http://github.com/jkimlab/IMAP, together with a Docker image and a Perl script that help users install the IMAP package and its prerequisite programs. Users can use IMAP to easily build a chromosome-level assembly for the genome of their interest.
Affiliation(s)
- Giltae Song
  - School of Computer Science and Engineering, Pusan National University, Busan, South Korea
- Jongin Lee
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Juyeon Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Seokwoo Kang
  - School of Computer Science and Engineering, Pusan National University, Busan, South Korea
- Hoyong Lee
  - School of Computer Science and Engineering, Pusan National University, Busan, South Korea
- Daehong Kwon
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Daehwan Lee
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Gregory I. Lang
  - Department of Biological Sciences, Lehigh University, Bethlehem, PA, United States of America
- J. Michael Cherry
  - Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
- Jaebum Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
6
Kwon D, Lee J, Kim J. GMASS: a novel measure for genome assembly structural similarity. BMC Bioinformatics 2019; 20:147. [PMID: 30885117 PMCID: PMC6423833 DOI: 10.1186/s12859-019-2710-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/26/2018] [Accepted: 03/03/2019] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Thanks to recent advances in next-generation sequencing (NGS) technologies, large amounts of genomic data, in the form of short DNA sequences known as reads, have been accumulating. Diverse assemblers have been developed to generate high-quality de novo assemblies from NGS reads, but their outputs differ substantially because of algorithmic differences. However, there are no well-structured measures for showing the similarity or difference between assemblies. RESULTS We developed a new measure, called the GMASS score, for comparing two genome assemblies in terms of their structure. The GMASS score was developed based on the distribution pattern of the number and coverage of similar regions between a pair of assemblies. The new measure was able to show structural similarity between assemblies when evaluated on simulated assembly datasets. Applying the GMASS score to assemblies in recently published benchmark datasets showed the divergent performance of current assemblers as well as the score's ability to compare assemblies. CONCLUSION The GMASS score is a novel measure for representing structural similarity between two assemblies. It will contribute to the understanding of assembly output and to the development of de novo assemblers.
Affiliation(s)
- Daehong Kwon
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, South Korea
- Jongin Lee
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, South Korea
- Jaebum Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, 05029, South Korea.