1
Alakuş TB. A Novel Repetition Frequency-Based DNA Encoding Scheme to Predict Human and Mouse DNA Enhancers with Deep Learning. Biomimetics (Basel) 2023; 8:218. [PMID: 37366813 DOI: 10.3390/biomimetics8020218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 06/28/2023] Open
Abstract
Recent studies have shown that DNA enhancers have an important role in the regulation of gene expression. They are responsible for different important biological elements and processes such as development, homeostasis, and embryogenesis. However, experimental prediction of these DNA enhancers is time-consuming and costly as it requires laboratory work. Therefore, researchers started to look for alternative ways and started to apply computation-based deep learning algorithms to this field. Yet, the inconsistency and unsuccessful prediction performance of computational-based approaches among various cell lines led to the investigation of these approaches as well. Therefore, in this study, a novel DNA encoding scheme was proposed, and solutions were sought to the problems mentioned and DNA enhancers were predicted with BiLSTM. The study consisted of four different stages for two scenarios. In the first stage, DNA enhancer data were obtained. In the second stage, DNA sequences were converted to numerical representations by both the proposed encoding scheme and various DNA encoding schemes including EIIP, integer number, and atomic number. In the third stage, the BiLSTM model was designed, and the data were classified. In the final stage, the performance of DNA encoding schemes was determined by accuracy, precision, recall, F1-score, CSI, MCC, G-mean, Kappa coefficient, and AUC scores. In the first scenario, it was determined whether the DNA enhancers belonged to humans or mice. As a result of the prediction process, the highest performance was achieved with the proposed DNA encoding scheme, and an accuracy of 92.16% and an AUC score of 0.85 were calculated, respectively. The closest accuracy score to the proposed scheme was obtained with the EIIP DNA encoding scheme and the result was observed as 89.14%. The AUC score of this scheme was measured as 0.87. 
Among the remaining DNA encoding schemes, the atomic number showed an accuracy score of 86.61%, while this rate decreased to 76.96% with the integer scheme. The AUC values of these schemes were 0.84 and 0.82, respectively. In the second scenario, it was determined whether there was a DNA enhancer and, if so, it was decided to which species this enhancer belonged. In this scenario, the highest accuracy score was obtained with the proposed DNA encoding scheme and the result was 84.59%. Moreover, the AUC score of the proposed scheme was determined as 0.92. EIIP and integer DNA encoding schemes showed accuracy scores of 77.80% and 73.68%, respectively, while their AUC scores were close to 0.90. The most ineffective prediction was performed with the atomic number and the accuracy score of this scheme was calculated as 68.27%. Finally, the AUC score of this scheme was 0.81. At the end of the study, it was observed that the proposed DNA encoding scheme was successful and effective in predicting DNA enhancers.
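The baseline encodings named in this abstract (EIIP, atomic number, integer) are per-nucleotide lookup tables, which can be sketched as below. This is a minimal illustration, not the paper's code: the EIIP and atomic-number values are the commonly cited ones, the integer mapping is one of several conventions in use and is an assumption here, and the `encode` and `pad` helpers are hypothetical names. The proposed repetition frequency-based scheme is not reproduced, because the abstract does not define it.

```python
# Per-nucleotide lookup tables for three baseline DNA encoding schemes.
# EIIP: electron-ion interaction pseudopotential values commonly cited for DNA.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}
# Atomic number: sum of the atomic numbers of the atoms in each base.
ATOMIC = {"A": 70, "C": 58, "G": 78, "T": 66}
# Integer: ASSUMED mapping; other orderings appear in the literature.
INTEGER = {"T": 0, "C": 1, "A": 2, "G": 3}


def encode(sequence, table):
    """Map each base of `sequence` to its numeric value in `table`."""
    try:
        return [table[base] for base in sequence.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]!r}") from None


def pad(encoded, length, fill=0):
    """Truncate or right-pad an encoded sequence to a fixed length."""
    return (encoded + [fill] * length)[:length]


if __name__ == "__main__":
    seq = "GATTACA"
    for name, table in (("EIIP", EIIP), ("atomic", ATOMIC), ("integer", INTEGER)):
        print(name, pad(encode(seq, table), 10))
```

Fixed-length numeric vectors like these are the kind of input a sequence model such as a BiLSTM would take; the padding length of 10 is arbitrary and chosen only for the demonstration.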
Affiliation(s)
- Talha Burak Alakuş
- Department of Software Engineering, Faculty of Engineering, Kırklareli University, 39100 Kırklareli, Turkey
2
Hou BH, Tsai YH, Chiang MH, Tsao SM, Huang SH, Chao CP, Chen HM. Cultivar-specific markers, mutations, and chimerisim of Cavendish banana somaclonal variants resistant to Fusarium oxysporum f. sp. cubense tropical race 4. BMC Genomics 2022; 23:470. [PMID: 35752751 PMCID: PMC9233791 DOI: 10.1186/s12864-022-08692-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/03/2021] [Accepted: 06/13/2022] [Indexed: 11/13/2022] Open
Abstract
Background: The selection of tissue culture–derived somaclonal variants of Giant Cavendish banana (Musa spp., Cavendish sub-group AAA) by the Taiwan Banana Research Institute (TBRI) has resulted in several cultivars resistant to Fusarium oxysporum f. sp. cubense tropical race 4 (Foc TR4), a destructive fungus threatening global banana production. However, the mutations in these somaclonal variants have not yet been determined. We performed an RNA-sequencing (RNA-seq) analysis of three TBRI Foc TR4–resistant cultivars: ‘Tai-Chiao No. 5’ (TC5), ‘Tai-Chiao No. 7’ (TC7), and ‘Formosana’ (FM), as well as their susceptible progenitor ‘Pei-Chiao’ (PC), to investigate the sequence variations among them and develop cultivar-specific markers.
Results: A group of single-nucleotide variants (SNVs) specific to one cultivar were identified from the analysis of RNA-seq data and validated using Sanger sequencing from genomic DNA. Several SNVs were further converted into cleaved amplified polymorphic sequence (CAPS) markers or derived CAPS markers that could identify the three Foc TR4–resistant cultivars among 6 local and 5 international Cavendish cultivars. Compared with PC, the three resistant cultivars showed a loss or alteration of heterozygosity in some chromosomal regions, which appears to be a consequence of single-copy chromosomal deletions. Notably, TC7 and FM shared a common deletion region on chromosome 5; however, different TC7 tissues displayed varying degrees of allele ratios in this region, suggesting the presence of chimerism in TC7.
Conclusions: This work demonstrates that reliable SNV markers of tissue culture–derived and propagated banana cultivars with a triploid genome can be developed through RNA-seq data analysis. Moreover, the analysis of sequence heterozygosity can uncover chromosomal deletions and chimerism in banana somaclonal variants. The markers obtained from this study will assist with the identification of TBRI Cavendish somaclonal variants for the quality control of tissue culture propagation, and the protection of breeders’ rights.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08692-5.
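The link between single-copy deletions and altered heterozygosity in a triploid can be illustrated with expected allele fractions. The sketch below rests on a simplifying assumption that is not taken from the paper: read counts reflect chromosome copy number directly. The function names are hypothetical. Under that assumption, a heterozygous site in a triploid shows an alternate-allele fraction near 1/3 or 2/3, and losing one copy shifts it to about 1/2 (altered heterozygosity) or to 0 or 1 (loss of heterozygosity).

```python
def expected_fractions(copies):
    """Alt-allele fractions possible at a site present in `copies` copies."""
    return [k / copies for k in range(copies + 1)]


def nearest_state(alt_reads, total_reads, copies):
    """Return (alt_copies, copies) whose expected fraction best fits the reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    observed = alt_reads / total_reads
    best = min(range(copies + 1), key=lambda k: abs(observed - k / copies))
    return best, copies


def is_heterozygous(alt_copies, copies):
    """A site is heterozygous when both alleles are still present."""
    return 0 < alt_copies < copies


if __name__ == "__main__":
    # Hypothetical read counts for one site before and after a one-copy deletion.
    print(nearest_state(33, 100, copies=3))  # intact triploid region
    print(nearest_state(50, 100, copies=2))  # one copy lost, still heterozygous
    print(nearest_state(2, 100, copies=2))   # one copy lost, heterozygosity lost
```

Observed fractions that sit between these discrete states, and that differ from tissue to tissue, are the kind of signal the abstract interprets as chimerism.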
Affiliation(s)
- Bo-Han Hou
- Agricultural Biotechnology Research Center, Academia Sinica, 11529, Taipei, Taiwan
- Yi-Heng Tsai
- Agricultural Biotechnology Research Center, Academia Sinica, 11529, Taipei, Taiwan
- Ming-Hau Chiang
- Agricultural Biotechnology Research Center, Academia Sinica, 11529, Taipei, Taiwan
- Shu-Ming Tsao
- Agricultural Biotechnology Research Center, Academia Sinica, 11529, Taipei, Taiwan
- Chih-Ping Chao
- Taiwan Banana Research Institute, 90442, Pingtung, Taiwan
- Ho-Ming Chen
- Agricultural Biotechnology Research Center, Academia Sinica, 11529, Taipei, Taiwan.
3
Wyant SR, Rodriguez MF, Carter CK, Parrott WA, Jackson SA, Stupar RM, Morrell PL. Fast neutron mutagenesis in soybean enriches for small indels and creates frameshift mutations. G3 (BETHESDA, MD.) 2022; 12:jkab431. [PMID: 35100358 PMCID: PMC9335934 DOI: 10.1093/g3journal/jkab431] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/05/2021] [Accepted: 11/14/2021] [Indexed: 11/13/2022]
Abstract
The mutagenic effects of ionizing radiation have been used for decades to create novel variants in experimental populations. Fast neutron (FN) bombardment as a mutagen has been especially widespread in plants, with extensive reports describing the induction of large structural variants, i.e., deletions, insertions, inversions, and translocations. However, the full spectrum of FN-induced mutations is poorly understood. We contrast small insertions and deletions (indels) observed in 27 soybean lines subject to FN irradiation with the standing indels identified in 107 diverse soybean lines. We use the same populations to contrast the nature and context (bases flanking a nucleotide change) of single-nucleotide variants. The accumulation of new single-nucleotide changes in FN lines is marginally higher than expected based on spontaneous mutation. In FN-treated lines and in standing variation, C→T transitions and the corresponding reverse complement G→A transitions are the most abundant and occur most frequently in a CpG local context. These data indicate that most SNPs identified in FN lines are likely derived from spontaneous de novo processes in generations following mutagenesis rather than from the FN irradiation mutagen. However, small indels in FN lines differ from standing variants. Short insertions, from 1 to 6 bp, are less abundant than in standing variation. Short deletions are more abundant and prone to induce frameshift mutations that should disrupt the structure and function of encoded proteins. These findings indicate that FN irradiation generates numerous small indels, increasing the abundance of loss-of-function mutations that impact single genes.
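The variant classes discussed in this abstract can be expressed as a few small predicates. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, alleles are given as plain strings in VCF-like style, and the frameshift test is only meaningful for indels that fall inside coding sequence.

```python
PURINES = {"A", "G"}


def is_frameshift(ref, alt):
    """An indel shifts the reading frame when its net length is not a multiple of 3."""
    return (len(alt) - len(ref)) % 3 != 0


def is_transition(ref, alt):
    """Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T)."""
    return ref != alt and ((ref in PURINES) == (alt in PURINES))


def is_cpg_site(prev_base, ref, next_base):
    """True when the reference base is the C or the G of a CpG dinucleotide.

    A C followed by G covers C->T on the forward strand; a G preceded by C
    covers the reverse-complement G->A change.
    """
    return (ref == "C" and next_base == "G") or (ref == "G" and prev_base == "C")


if __name__ == "__main__":
    print(is_frameshift("ATG", "A"))      # 2 bp deletion
    print(is_frameshift("A", "ATGC"))     # 3 bp insertion
    print(is_transition("C", "T"), is_cpg_site("A", "C", "G"))
```

Counting how often each predicate holds in mutagenized lines versus standing variation is one way to make the comparisons described above.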
Affiliation(s)
- Skylar R Wyant
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697, USA
- M Fernanda Rodriguez
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Corey K Carter
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Wayne A Parrott
- Department of Crop and Soil Sciences, University of Georgia, Athens, GA 30602, USA
- Scott A Jackson
- Department of Crop and Soil Sciences, University of Georgia, Athens, GA 30602, USA
- Robert M Stupar
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Peter L Morrell
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
4
Tang S, Liu DX, Lu S, Yu L, Li Y, Lin S, Li L, Du Z, Liu X, Li X, Ma W, Yang QY, Guo L. Development and screening of EMS mutants with altered seed oil content or fatty acid composition in Brassica napus. THE PLANT JOURNAL: FOR CELL AND MOLECULAR BIOLOGY 2020; 104:1410-1422. [PMID: 33048384 DOI: 10.1111/tpj.15003] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/14/2020] [Accepted: 09/10/2020] [Indexed: 06/11/2023]
Abstract
Brassica napus is an important oilseed crop in the world, and the mechanism of seed oil biosynthesis in B. napus remains unclear. In order to study the mechanism of oil biosynthesis and generate germplasms for breeding, an ethyl methanesulfonate (EMS) mutant population with ~100 000 M2 lines was generated using Zhongshuang 11 as the parent line. The EMS-induced genome-wide mutations in M2-M4 plants were assessed. The average number of mutations including single nucleotide polymorphisms and insertion/deletion in M2-M4 was 21 177, 28 675 and 17 915, respectively. The effects of the mutations on gene function were predicted in M2-M4 mutants, respectively. We screened the seeds from 98 113 M2 lines, and 9415 seed oil content and fatty acid mutants were identified. We further confirmed 686 mutants with altered seed oil content and fatty acid in advanced generation (M4 seeds). Five representative M4 mutants with increased oleic acid were re-sequenced, and the potential causal variations in FAD2 and ROD1 genes were identified. This study generated and screened a large scale of B. napus EMS mutant population, and the identified mutants could provide useful genetic resources for the study of oil biosynthesis and genetic improvement of seed oil content and fatty acid composition of B. napus in the future.
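The screening funnel reported in this abstract can be turned into rates with simple arithmetic. The counts below are taken from the abstract; the variable names are illustrative only.

```python
# Counts reported in the abstract.
screened_m2_lines = 98_113   # M2 lines whose seeds were screened
candidate_mutants = 9_415    # oil content / fatty acid mutants identified in M2
confirmed_in_m4 = 686        # mutants confirmed in M4 seeds

hit_rate = candidate_mutants / screened_m2_lines
confirmation_rate = confirmed_in_m4 / candidate_mutants
overall_rate = confirmed_in_m4 / screened_m2_lines

if __name__ == "__main__":
    print(f"candidates per screened line: {hit_rate:.1%}")
    print(f"candidates confirmed in M4:   {confirmation_rate:.1%}")
    print(f"confirmed per screened line:  {overall_rate:.1%}")
```

Roughly 9.6% of screened lines were flagged, about 7.3% of those were confirmed in M4, and about 0.7% of all screened lines ended up as confirmed mutants.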
Affiliation(s)
- Shan Tang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Dong-Xu Liu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Shaoping Lu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Liangqian Yu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Yuqing Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Shengli Lin
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Long Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Zhuolin Du
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Xiao Liu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Xiao Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Wei Ma
- School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Qing-Yong Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Liang Guo
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
5
Graham N, Patil GB, Bubeck DM, Dobert RC, Glenn KC, Gutsche AT, Kumar S, Lindbo JA, Maas L, May GD, Vega-Sanchez ME, Stupar RM, Morrell PL. Plant Genome Editing and the Relevance of Off-Target Changes. PLANT PHYSIOLOGY 2020; 183:1453-1471. [PMID: 32457089 PMCID: PMC7401131 DOI: 10.1104/pp.19.01194] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 10/01/2019] [Accepted: 05/07/2020] [Indexed: 05/12/2023]
Abstract
Site-directed nucleases (SDNs) used for targeted genome editing are powerful new tools to introduce precise genetic changes into plants. Like traditional approaches, such as conventional crossing and induced mutagenesis, genome editing aims to improve crop yield and nutrition. Next-generation sequencing studies demonstrate that across their genomes, populations of crop species typically carry millions of single nucleotide polymorphisms and many copy number and structural variants. Spontaneous mutations occur at rates of ∼10⁻⁸ to 10⁻⁹ per site per generation, while variation induced by chemical treatment or ionizing radiation results in higher mutation rates. In the context of SDNs, an off-target change or edit is an unintended, nonspecific mutation occurring at a site with sequence similarity to the targeted edit region. SDN-mediated off-target changes can contribute to a small number of additional genetic variants compared to those that occur naturally in breeding populations or are introduced by induced-mutagenesis methods. Recent studies show that using computational algorithms to design genome editing reagents can mitigate off-target edits in plants. Finally, crops are subject to strong selection to eliminate off-type plants through well-established multigenerational breeding, selection, and commercial variety development practices. Within this context, off-target edits in crops present no new safety concerns compared to other breeding practices. The current generation of genome editing technologies is already proving useful to develop new plant varieties with consumer and farmer benefits. Genome editing will likely undergo improved editing specificity along with new developments in SDN delivery and increasing genomic characterization, further improving reagent design and application.
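To put the quoted spontaneous mutation rate in context, the expected number of new mutations per genome copy per generation is the per-site rate multiplied by the genome size. The sketch below uses a round, hypothetical genome size of one billion base pairs; that figure and the function name are assumptions for illustration and do not come from the abstract.

```python
def expected_new_mutations(rate_per_site, genome_size_bp, generations=1):
    """Expected count of spontaneous mutations per genome copy."""
    return rate_per_site * genome_size_bp * generations


if __name__ == "__main__":
    genome_size_bp = 1_000_000_000  # hypothetical 1 Gb genome, for illustration
    for rate in (1e-9, 1e-8):       # bounds of the range, i.e. 10^-9 and 10^-8
        count = expected_new_mutations(rate, genome_size_bp)
        print(f"rate {rate:.0e} per site: about {count:.0f} new mutations per generation")
```

With these inputs the background is on the order of 1 to 10 new mutations per genome copy per generation, which is the baseline against which the abstract compares off-target edits.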
Affiliation(s)
- Nathaniel Graham
- Department of Genetics, Cell Biology and Development, University of Minnesota, St. Paul, Minnesota 55108
- Pairwise, Durham, North Carolina 27709
- Gunvant B Patil
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
- Luis Maas
- Enza Zaden Research USA, San Juan Bautista, California 95045
- Robert M Stupar
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
- Peter L Morrell
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
6
Shirdeli M, Orlov YL, Eslami G, Hajimohammadi B, Tabikhanova LE, Ehrampoush MH, Rezvani ME, Fallahzadeh H, Zandi H, Hosseini S, Ahmadian S, Mortazavi S, Fallahi R, Asadi-Yousefabad S. Testing Safety of Genetically Modified Products of Rice: Case Study on Sprague Dawley Rats. RUSS J GENET+ 2019. [DOI: 10.1134/s1022795419080131] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
7
Proposed U.S. regulation of gene-edited food animals is not fit for purpose. NPJ Sci Food 2019; 3:3. [PMID: 31304275 PMCID: PMC6550240 DOI: 10.1038/s41538-019-0035-y] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/09/2019] [Accepted: 02/21/2019] [Indexed: 12/12/2022] Open
Abstract
Dietary DNA is generally regarded as safe to consume, and is a routine ingredient of food obtained from any living organism. Millions of naturally-occurring DNA variations are observed when comparing the genomic sequence of any two healthy individuals of a given species. Breeders routinely select desired traits resulting from this DNA variation to develop new cultivars and varieties of food plants and animals. Regulatory agencies do not evaluate these new varieties prior to commercial release. Gene editing tools now allow plant and animal breeders to precisely introduce useful genetic variation into agricultural breeding programs. The U.S. Department of Agriculture (USDA) announced that it has no plans to place additional regulations on gene-edited plants that could otherwise have been developed through traditional breeding prior to commercialization. However, the U.S. Food and Drug Administration (FDA) has proposed mandatory premarket new animal drug regulatory evaluation for all food animals whose genomes have been intentionally altered using modern molecular technologies including gene editing technologies. This runs counter to U.S. biotechnology policy that regulatory oversight should be triggered by unreasonable risk, and not by the fact that an organism has been modified by a particular process or technique. Breeder intention is not associated with product risk. Harmonizing the regulations associated with gene editing in food species is imperative to allow both plant and animal breeders access to gene editing tools to introduce useful sustainability traits like disease resistance, climate adaptability, and food quality attributes into U.S. agricultural breeding programs.