1. Moeckel C, Mareboina M, Konnaris MA, Chan CS, Mouratidis I, Montgomery A, Chantzi N, Pavlopoulos GA, Georgakopoulos-Soares I. A survey of k-mer methods and applications in bioinformatics. Comput Struct Biotechnol J 2024; 23:2289-2303. [PMID: 38840832; PMCID: PMC11152613; DOI: 10.1016/j.csbj.2024.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/07/2024]
Abstract
The rapid progression of genomics and proteomics has been driven by the advent of advanced sequencing technologies, large, diverse, and readily available omics datasets, and the evolution of computational data processing capabilities. The vast amount of data generated by these advancements necessitates efficient algorithms to extract meaningful information. K-mers serve as a valuable tool when working with large sequencing datasets, offering several advantages in computational speed and memory efficiency and carrying the potential for intrinsic biological functionality. This review provides an overview of the methods, applications, and significance of k-mers in genomic and proteomic data analyses, as well as the utility of absent sequences, including nullomers and nullpeptides, in disease detection, vaccine development, therapeutics, and forensic science. Therefore, the review highlights the pivotal role of k-mers in addressing current genomic and proteomic problems and underscores their potential for future breakthroughs in research.
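A minimal sketch of the k-mer and absent-sequence ideas summarized above, assuming a plain DNA string over the alphabet A, C, G, T. The function names `count_kmers` and `absent_kmers` and the toy sequence are hypothetical illustrations, not taken from the review; real nullomer analyses run against whole genomes rather than a single short string.

```python
from collections import Counter
from itertools import product


def count_kmers(seq, k):
    """Count every overlapping substring of length k in seq."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = seq.upper()
    # A sequence of length n holds n - k + 1 overlapping k-mers
    # (none when k is longer than the sequence).
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def absent_kmers(seq, k, alphabet="ACGT"):
    """List the k-mers over the alphabet that never occur in seq.

    This is a toy stand-in for a nullomer search (assumption: one short
    sequence is used in place of a genome).
    """
    present = count_kmers(seq, k)
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return sorted(kmer for kmer in all_kmers if kmer not in present)


if __name__ == "__main__":
    toy = "ACGTACGT"  # hypothetical example sequence
    print(count_kmers(toy, 2))
    print(absent_kmers(toy, 2))
```

Enumerating all 4^k candidates is only practical for small k; it is used here because it keeps the definition of an absent k-mer explicit.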
Affiliation(s)
- Camille Moeckel
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Manvita Mareboina
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Maxwell A. Konnaris
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Candace S.Y. Chan
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
- Austin Montgomery
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Nikol Chantzi
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
2. Nestor BJ, Bird T, Severn-Ellis AA, Bayer PE, Ranathunge K, Prodhan MA, Dassanayake M, Batley J, Edwards D, Lambers H, Finnegan PM. Identification and expression analysis of Phosphate Transporter 1 (PHT1) genes in the highly phosphorus-use-efficient Hakea prostrata (Proteaceae). Plant, Cell & Environment 2024; 47:5021-5038. [PMID: 39136390; DOI: 10.1111/pce.15088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Revised: 07/10/2024] [Accepted: 08/02/2024] [Indexed: 11/06/2024]
Abstract
Heavy and costly use of phosphorus (P) fertiliser is often needed to achieve high crop yields, but only a small amount of applied P fertiliser is available to most crop plants. Hakea prostrata (Proteaceae) is endemic to the P-impoverished landscape of southwest Australia and has several P-saving traits. We identified 16 members of the Phosphate Transporter 1 (PHT1) gene family (HpPHT1;1-HpPHT1;12d) in a long-read genome assembly of H. prostrata. Based on phylogenetics, sequence structure and expression patterns, we classified HpPHT1;1 as potentially involved in Pi uptake from soil and HpPHT1;8 and HpPHT1;9 as potentially involved in Pi uptake and root-to-shoot translocation. Three genes, HpPHT1;4, HpPHT1;6 and HpPHT1;8, lacked regulatory PHR1-binding sites (P1BS) in the promoter regions. Available expression data for HpPHT1;6 and HpPHT1;8 indicated they are not responsive to changes in P supply, potentially contributing to the high P sensitivity of H. prostrata. We also discovered a Proteaceae-specific clade of closely-spaced PHT1 genes that lacked conserved genetic architecture among genera, indicating an evolutionary hot spot within the genome. Overall, the genome assembly of H. prostrata provides a much-needed foundation for understanding the genetic mechanisms of novel adaptations to low P soils in southwest Australian plants.
Affiliation(s)
- Benjamin J Nestor
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
  - Centre for Applied Bioinformatics, School of Biological Sciences, The University of Western Australia, Perth, Western Australia, Australia
- Toby Bird
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
- Anita A Severn-Ellis
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
- Philipp E Bayer
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
  - Centre for Applied Bioinformatics, School of Biological Sciences, The University of Western Australia, Perth, Western Australia, Australia
- Kosala Ranathunge
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
- M Asaduzzaman Prodhan
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
  - Centre for Applied Bioinformatics, School of Biological Sciences, The University of Western Australia, Perth, Western Australia, Australia
- Maheshi Dassanayake
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, USA
- Jacqueline Batley
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
- David Edwards
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
  - Centre for Applied Bioinformatics, School of Biological Sciences, The University of Western Australia, Perth, Western Australia, Australia
- Hans Lambers
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
- Patrick M Finnegan
  - School of Biological Sciences, University of Western Australia, Perth, Western Australia, Australia
  - Centre for Applied Bioinformatics, School of Biological Sciences, The University of Western Australia, Perth, Western Australia, Australia
3. Botkin JR, Farmer AD, Young ND, Curtin SJ. Genome assembly of Medicago truncatula accession SA27063 provides insight into spring black stem and leaf spot disease resistance. BMC Genomics 2024; 25:204. [PMID: 38395768; PMCID: PMC10885650; DOI: 10.1186/s12864-024-10112-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 02/10/2024] [Indexed: 02/25/2024]
Abstract
Medicago truncatula, a model legume and alfalfa relative, has served as an essential resource for advancing our understanding of legume physiology, functional genetics, and crop improvement traits. The necrotrophic fungus Ascochyta medicaginicola, the causal agent of spring black stem (SBS) and leaf spot, causes a devastating foliar disease of alfalfa that affects stand survival, yield, and forage quality. Host resistance to SBS disease is poorly understood, and control methods rely on cultural practices. Resistance has been observed in M. truncatula accession SA27063 (HM078), with two recessively inherited quantitative-trait loci (QTL), rnpm1 and rnpm2, previously reported. To shed light on host resistance, we carried out a de novo genome assembly of HM078. The genome, referred to as MtHM078 v1.0, comprises 23 contigs totaling 481.19 Mbp. Notably, this assembly contains a substantial amount of novel centromere-related repeat sequences due to deep long-read sequencing. Genome annotation resulted in 98.4% of BUSCO fabales proteins being complete. The assembly enabled sequence-level analysis of rnpm1 and rnpm2 for gene content, synteny, and structural variation between SBS-resistant accession SA27063 (HM078) and SBS-susceptible accession A17 (HM101). Fourteen candidate genes were identified, and some have been implicated in resistance to necrotrophic fungi. Especially interesting candidates include loss-of-function events in HM078 because they fit the inverse gene-for-gene model, where resistance is recessively inherited. In rnpm1, these include a loss-of-function in a disease resistance gene due to a premature stop codon, and a 10.85 kbp retrotransposon-like insertion disrupting a ubiquitin conjugating E2. In rnpm2, we identified a frameshift mutation causing a loss-of-function in a glycosidase, as well as a missense and frameshift mutation altering an F-box family protein. This study generated a high-quality genome of HM078 and identified promising candidates that, once validated, could be further studied in alfalfa to enhance disease resistance.
Affiliation(s)
- Jacob R Botkin
  - Department of Plant Pathology, University of Minnesota, St. Paul, MN, 55108, USA
- Andrew D Farmer
  - National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Nevin D Young
  - Department of Plant Pathology, University of Minnesota, St. Paul, MN, 55108, USA
- Shaun J Curtin
  - United States Department of Agriculture, Plant Science Research Unit, St Paul, MN, 55108, USA
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
  - Center for Plant Precision Genomics, University of Minnesota, St. Paul, MN, 55108, USA
  - Center for Genome Engineering, University of Minnesota, St. Paul, MN, 55108, USA
4. Pootakham W, Naktang C, Sonthirod C, Kongkachana W, Narong N, Sangsrakru D, Maknual C, Jiumjamrassil D, Chumriang P, Tangphatsornruang S. Chromosome-level genome assembly of Indian mangrove (Ceriops tagal) revealed a genome-wide duplication event predating the divergence of Rhizophoraceae mangrove species. The Plant Genome 2022; 15:e20217. [PMID: 35608212; DOI: 10.1002/tpg2.20217] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/20/2022] [Accepted: 03/30/2022] [Indexed: 06/15/2023]
Abstract
Mangrove ecosystems are unique, highly diverse, provide benefits to humans, and aid in coastal protection. The Indian mangrove, or spurred mangrove, [Ceriops tagal (Perr.) C. B. Rob.] is a member of the Rhizophoraceae family and is commonly found along the intertidal zones in tropical regions in Southeast Asia, southern Asia, and Africa. Here, we present the first high-quality reference genome assembly of the Ceriops species. A preliminary draft assembly, generated from the 10× Genomics linked-read library, was scaffolded using the proximity ligation chromatin contact mapping technique (Hi-C) to obtain a chromosome-scale assembly of 231,919,005 bases with an N50 length of 11,408,429 bases. The benchmarking universal single-copy orthologs (BUSCO) analysis revealed that C. tagal gene predictions recovered 95.8% of the highly conserved orthologs. Phylogenetic analyses suggested that C. tagal diverged from the last common ancestor of flat-leaf spurred mangrove [C. decandra (Griff.) Ding Hou] and C. zippeliana Blume ∼10.4 million yr ago (MYA), and the last common ancestor of genera Ceriops, Kandelia, and Rhizophora diverged from that of genus Bruguiera ∼49.4 MYA. In addition, our analysis of the transversion rate at fourfold-degenerate sites from orthologous gene pairs provided evidence supporting a recent whole-genome duplication in C. tagal. The STRUCTURE and principal component analyses illustrated that C. tagal individuals investigated in this study were the admixture of two subpopulations, the genetic background of which was influenced primarily by location. The availability of genomic and transcriptomic resources and biodiversity data reported in this work will be useful for future studies that may shed light on adaptive evolutions of mangrove species.
Affiliation(s)
- Wirulda Pootakham
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Chaiwat Naktang
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Chutima Sonthirod
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Wasitthee Kongkachana
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Nattapol Narong
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Duangjai Sangsrakru
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Chatree Maknual
  - Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
- Darunee Jiumjamrassil
  - Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
- Pranom Chumriang
  - Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
5. Ruang-Areerate P, Yoocha T, Kongkachana W, Phetchawang P, Maknual C, Meepol W, Jiumjamrassil D, Pootakham W, Tangphatsornruang S. Comparative Analysis and Phylogenetic Relationships of Ceriops Species (Rhizophoraceae) and Avicennia lanata (Acanthaceae): Insight into the Chloroplast Genome Evolution between Middle and Seaward Zones of Mangrove Forests. Biology 2022; 11:383. [PMID: 35336757; PMCID: PMC8945693; DOI: 10.3390/biology11030383] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/16/2022] [Revised: 02/19/2022] [Accepted: 02/24/2022] [Indexed: 02/04/2023]
Abstract
Ceriops and Avicennia are true mangroves in the middle and seaward zones of mangrove forests, respectively. The chloroplast genomes of Ceriops decandra, Ceriops zippeliana, and Ceriops tagal were assembled into lengths of 166,650, 166,083 and 164,432 bp, respectively, whereas Avicennia lanata was 148,264 bp in length. The gene content and gene order are highly conserved among these species. The chloroplast genome contains 125 genes in A. lanata and 129 genes in Ceriops species. Three duplicate genes (rpl2, rpl23, and trnM-CAU) were found in the IR regions of the three Ceriops species, resulting in expansion of the IR regions. The rpl32 gene was lost in C. zippeliana, whereas the infA gene was present in A. lanata. Short repeats (<40 bp) and a lower number of SSRs were found in A. lanata but not in Ceriops species. The phylogenetic analysis supports that all Ceriops species are clustered in Rhizophoraceae and A. lanata is in Acanthaceae. In a search for genes under selective pressures of coastal environments, the rps7 gene was under positive selection compared with non-mangrove species. Finally, two specific primer sets were developed for species identification of the three Ceriops species. Thus, this finding provides insightful genetic information for evolutionary relationships and molecular markers in Ceriops and Avicennia species.
Affiliation(s)
- Panthita Ruang-Areerate
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani 12120, Thailand
- Thippawan Yoocha
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani 12120, Thailand
- Wasitthee Kongkachana
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani 12120, Thailand
- Phakamas Phetchawang
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani 12120, Thailand
- Chatree Maknual
  - Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok 10210, Thailand
- Wijarn Meepol
  - Department of Marine and Coastal Resources, Ranong Mangrove Forest Research Center, Tambon Ngao, Muang District, Ranong 85000, Thailand
- Darunee Jiumjamrassil
  - Marine and Coastal Resources Office 5, 199/6 Khanom, Khanom, Nakhon Si Thammarat 80210, Thailand
- Wirulda Pootakham
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani 12120, Thailand
6. Pootakham W, Naktang C, Sonthirod C, Kongkachana W, Yoocha T, Jomchai N, Maknual C, Chumriang P, Pravinvongvuthi T, Tangphatsornruang S. De novo reference assembly of the upriver orange mangrove (Bruguiera sexangula) genome. Genome Biol Evol 2022; 14:6527208. [PMID: 35148390; PMCID: PMC8872974; DOI: 10.1093/gbe/evac025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 02/01/2022] [Indexed: 11/21/2022]
Abstract
Upriver orange mangrove (Bruguiera sexangula) is a member of the most mangrove-rich taxon (Rhizophoraceae family) and is commonly distributed in the intertidal zones in tropical and subtropical latitudes. In this study, we employed the 10× Genomics linked-read technology to obtain a preliminary de novo assembly of the B. sexangula genome, which was further scaffolded to a pseudomolecule level using the Bruguiera parviflora genome as a reference. The final assembly of the B. sexangula genome contained 260 Mb with an N50 scaffold length of 11,020,310 bases. The assembly comprised 18 pseudomolecules (corresponding to the haploid chromosome number in B. sexangula), covering 204,645,832 bases or 78.6% of the 260-Mb assembly. We predicted a total of 23,978 protein-coding sequences, 17,598 of which were associated with gene ontology terms. Our gene prediction recovered 96.6% of the highly conserved orthologs based on the Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. The chromosome-level assembly presented in this work provides a valuable genetic resource to help strengthen our understanding of mangroves’ physiological and morphological adaptations to the intertidal zones.
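The N50 statistic and the coverage percentage quoted above can be illustrated with a short sketch. N50 is taken here in its usual sense: the length of the scaffold at which the cumulative length, counted from the longest scaffold down, first reaches half of the total assembly size. The function names and the toy scaffold lengths are hypothetical; only the two figures in the final check (204,645,832 bases and 78.6%) come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def implied_total(covered_bases, fraction):
    """Assembly size implied by a covered length and its stated fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return covered_bases / fraction


if __name__ == "__main__":
    toy_scaffolds = [8, 8, 4, 3, 3, 2, 2, 2]  # hypothetical lengths
    print(n50(toy_scaffolds))
    # Figures from the abstract: pseudomolecules cover 204,645,832 bases,
    # stated as 78.6% of the assembly.
    print(implied_total(204_645_832, 0.786))
```

The implied total lands at roughly 260 Mb, consistent with the assembly size stated in the abstract.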
Affiliation(s)
- Wirulda Pootakham
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Chaiwat Naktang
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Chutima Sonthirod
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Wasitthee Kongkachana
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Thippawan Yoocha
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Nukoon Jomchai
  - National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Chatree Maknual
  - Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
- Pranom Chumriang
  - Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
- Tamanai Pravinvongvuthi
  - Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand