1
|
Das L, Das JK, Mohapatra S, Nanda S. DNA numerical encoding schemes for exon prediction: a recent history. Nucleosides, Nucleotides & Nucleic Acids 2021; 40:985-1017. [PMID: 34455915 DOI: 10.1080/15257770.2021.1966797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
Bioinformatics is now firmly established as a key discipline in genomics. In recent times, applications of signal processing to exon prediction have gained considerable attention. Exons carry the information that codes for proteins. Proteins are composed of linked constituents known as amino acids, which determine their specific function. Converting the nucleotide character string into a numerical sequence is a prerequisite for analyzing it with signal processing methods. This numeric encoding is a mathematical descriptor of the nucleotides and is based on statistical properties of the structure of nucleic acids. Because the type of encoding strongly affects exon detection accuracy, this paper reviews existing encoding (mapping) schemes. The comparative analysis emphasizes the importance of the genetic code setting of amino acids for applications related to the computational elucidation of exon detection. This work provides useful information for future applications.
Affiliation(s)
- Lopamudra Das
  - School of Electronics Engineering, KIIT, Bhubaneswar, India
- J K Das
  - School of Electronics Engineering, KIIT, Bhubaneswar, India
- S Mohapatra
  - School of Electronics Engineering, KIIT, Bhubaneswar, India
- Sarita Nanda
  - School of Electronics Engineering, KIIT, Bhubaneswar, India
|
2
|
Das L, Das JK, Nanda S. Detection of exon location in eukaryotic DNA using a fuzzy adaptive Gabor wavelet transform. Genomics 2020; 112:4406-4416. [PMID: 32717319 DOI: 10.1016/j.ygeno.2020.07.020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/07/2020] [Revised: 06/25/2020] [Accepted: 07/08/2020] [Indexed: 11/17/2022]
Abstract
The existing model-independent methods for detecting exons in DNA have not proved ideal, because the commonly employed fixed-window-length strategy produces spectral leakage that causes signal noise. The Modified-Gabor-wavelet-transform exploits a multiscale strategy to deal with the issue to some extent. Yet no rule regarding the occurrence of small and large exons has been specified. To overcome this randomness, the scaling factor of the GWT has been adapted based on a fuzzy rule. The genetic code of the nucleotides and the fuzzy behaviors in DNA configuration make a fuzzy approach suitable for this work. Two fuzzy membership functions (large and small) account for the variation in the coding regions. The fuzzy-based learning parameter adaptively tunes the scale factor for fast and precise prediction of exons. The proposed approach has the major advantage of efficiently isolating detailed sub-regions in each exon, demonstrating its efficacy in comparison with existing techniques.
Affiliation(s)
- Lopamudra Das
  - School of Electronics Engineering, KIIT University, Bhubaneswar, India.
- J K Das
  - School of Electronics Engineering, KIIT University, Bhubaneswar, India.
- Sarita Nanda
  - School of Electronics Engineering, KIIT University, Bhubaneswar, India.
|
3
|
Touati R, Haddad-Boubaker S, Ferchichi I, Messaoudi I, Ouesleti AE, Triki H, Lachiri Z, Kharrat M. Comparative genomic signature representations of the emerging COVID-19 coronavirus and other coronaviruses: High identity and possible recombination between Bat and Pangolin coronaviruses. Genomics 2020; 112:4189-4202. [PMID: 32645523 PMCID: PMC7336935 DOI: 10.1016/j.ygeno.2020.07.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/22/2020] [Revised: 06/22/2020] [Accepted: 07/02/2020] [Indexed: 12/24/2022]
Abstract
Coronaviruses are responsible for respiratory diseases in animals and humans. The combination of numerical encoding techniques and digital signal processing methods is becoming increasingly important in handling large genomic data. In this paper, we propose to analyze the SARS-CoV-2 genomic signature using a combination of different nucleotide representations and signal processing tools, with the aim of identifying its genetic origin. The sequence of SARS-CoV-2 was compared with 21 relevant sequences, including Bat, Yak and Pangolin coronavirus sequences. In addition, we developed a new algorithm to locate the nucleotide modifications. The results show that the Bat and Pangolin coronaviruses were the most closely related to SARS-CoV-2, with 96% and 86% identity, respectively, along the whole genome. Within the S gene sequence, the Pangolin sequence presents the highest local nucleotide identity. These findings suggest the genesis of SARS-CoV-2 through evolution from Bat and Pangolin strains. This study offers new ways to automatically characterize viruses.
Affiliation(s)
- Rabeb Touati
  - University of Tunis El Manar, LR99ES10 Human Genetics Laboratory, Faculty of Medicine of Tunis, Tunisia; University of Tunis El Manar, SITI Laboratory, National School of Engineers of Tunis, BP 37, le Belvédère, 1002 Tunis, Tunisie.
- Sondes Haddad-Boubaker
  - University of Tunis El Manar, Laboratory of Clinical Virology, WHO Regional Reference Laboratory for Poliomyelitis and Measles for EMRO region, Institut Pasteur de Tunis, 13 place Pasteur, BP74 1002 le Belvédère, Tunis, Tunisie
- Imen Ferchichi
  - University of Tunis El Manar, LR99ES10 Human Genetics Laboratory, Faculty of Medicine of Tunis, Tunisia
- Imen Messaoudi
  - University of Carthage, Higher Institute of Information Technologies and Communications, Industrial Computing Department, Tunisia; University of Tunis El Manar, SITI Laboratory, National School of Engineers of Tunis, BP 37, le Belvédère, 1002 Tunis, Tunisie
- Afef Elloumi Ouesleti
  - University of Carthage, National School of Engineers of Carthage, Electrical Engineering Department, Tunisia; University of Tunis El Manar, SITI Laboratory, National School of Engineers of Tunis, BP 37, le Belvédère, 1002 Tunis, Tunisie
- Henda Triki
  - University of Tunis El Manar, Laboratory of Clinical Virology, WHO Regional Reference Laboratory for Poliomyelitis and Measles for EMRO region, Institut Pasteur de Tunis, 13 place Pasteur, BP74 1002 le Belvédère, Tunis, Tunisie
- Zied Lachiri
  - University of Tunis El Manar, SITI Laboratory, National School of Engineers of Tunis, BP 37, le Belvédère, 1002 Tunis, Tunisie
- Maher Kharrat
  - University of Tunis El Manar, LR99ES10 Human Genetics Laboratory, Faculty of Medicine of Tunis, Tunisia
|