Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Goel N, Singh S, Aseri TC. A comparative analysis of soft computing techniques for gene prediction. Anal Biochem 2013;438:14-21. [PMID: 23529114 DOI: 10.1016/j.ab.2013.03.015] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2012] [Revised: 03/05/2013] [Accepted: 03/14/2013] [Indexed: 11/17/2022]

For:	Goel N, Singh S, Aseri TC. A comparative analysis of soft computing techniques for gene prediction. Anal Biochem 2013;438:14-21. [PMID: 23529114 DOI: 10.1016/j.ab.2013.03.015] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2012] [Revised: 03/05/2013] [Accepted: 03/14/2013] [Indexed: 11/17/2022]

Number

Cited by Other Article(s)

Fernandez-Castillo E, Barbosa-Santillán LI, Falcon-Morales L, Sánchez-Escobar JJ. Deep Splicer: A CNN Model for Splice Site Prediction in Genetic Sequences. Genes (Basel) 2022;13:907. [PMID: 35627292 PMCID: PMC9141016 DOI: 10.3390/genes13050907] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Revised: 05/12/2022] [Accepted: 05/13/2022] [Indexed: 02/05/2023] Open

Abstract

Many living organisms have DNA in their cells that is responsible for their biological features. DNA is an organic molecule of two complementary strands of four different nucleotides wound up in a double helix. These nucleotides are adenine (A), thymine (T), guanine (G), and cytosine (C). Genes are DNA sequences containing the information to synthesize proteins. The genes of higher eukaryotic organisms contain coding sequences, known as exons and non-coding sequences, known as introns, which are removed on splice sites after the DNA is transcribed into RNA. Genome annotation is the process of identifying the location of coding regions and determining their function. This process is fundamental for understanding gene structure; however, it is time-consuming and expensive when done by biochemical methods. With technological advances, splice site detection can be done computationally. Although various software tools have been developed to predict splice sites, they need to improve accuracy and reduce false-positive rates. The main goal of this research was to generate Deep Splicer, a deep learning model to identify splice sites in the genomes of humans and other species. This model has good performance metrics and a lower false-positive rate than the currently existing tools. Deep Splicer achieved an accuracy between 93.55% and 99.66% on the genetic sequences of different organisms, while Splice2Deep, another splice site detection tool, had an accuracy between 90.52% and 98.08%. Splice2Deep surpassed Deep Splicer on the accuracy obtained after evaluating C. elegans genomic sequences (97.88% vs. 93.62%) and A. thaliana (95.40% vs. 94.93%); however, Deep Splicer's accuracy was better for H. sapiens (98.94% vs. 97.15%) and D. melanogaster (97.14% vs. 92.30%). The rate of false positives was 0.11% for human genetic sequences and 0.25% for other species' genetic sequences. Another splice prediction tool, Splice Finder, had between 1% and 3% of false positives for human sequences, while other species' sequences had around 4% and 10%.

Collapse

Varliero G, Wray J, Malandain C, Barker G. PhyloPrimer: a taxon-specific oligonucleotide design platform. PeerJ 2021;9:e11120. [PMID: 33986979 PMCID: PMC8098674 DOI: 10.7717/peerj.11120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2020] [Accepted: 02/25/2021] [Indexed: 11/26/2022] Open

Abstract

Many environmental and biomedical biomonitoring and detection studies aim to explore the presence of specific organisms or gene functionalities in microbiome samples. In such cases, when the study hypotheses can be answered with the exploration of a small number of genes, a targeted PCR-approach is appropriate. However, due to the complexity of environmental microbial communities, the design of specific primers is challenging and can lead to non-specific results. We designed PhyloPrimer, the first user-friendly platform to semi-automate the design of taxon-specific oligos (i.e., PCR primers) for a gene of interest. The main strength of PhyloPrimer is the ability to retrieve and align GenBank gene sequences matching the user’s input, and to explore their relationships through an online dynamic tree. PhyloPrimer then designs oligos specific to the gene sequences selected from the tree and uses the tree non-selected sequences to look for and maximize oligo differences between targeted and non-targeted sequences, therefore increasing oligo taxon-specificity (positive/negative consensus approach). Designed oligos are then checked for the presence of secondary structure with the nearest-neighbor (NN) calculation and the presence of off-target matches with in silico PCR tests, also processing oligos with degenerate bases. Whilst the main function of PhyloPrimer is the design of taxon-specific oligos (down to the species level), the software can also be used for designing oligos to target a gene without any taxonomic specificity, for designing oligos from preselected sequences and for checking predesigned oligos. We validated the pipeline on four commercially available microbial mock communities using PhyloPrimer to design genus- and species-specific primers for the detection of Streptococcus species in the mock communities. The software performed well on these mock microbial communities and can be found at https://www.cerealsdb.uk.net/cerealgenomics/phyloprimer.

Collapse

Abdi Y, Feizi-Derakhshi MR. Hybrid multi-objective evolutionary algorithm based on Search Manager framework for big data optimization problems. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2019.105991] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Splice sites detection using chaos game representation and neural network. Genomics 2019;112:1847-1852. [PMID: 31704313 DOI: 10.1016/j.ygeno.2019.10.018] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2018] [Revised: 03/18/2019] [Accepted: 10/29/2019] [Indexed: 11/23/2022]

Mihi A, Boucenna N, Benmahammmed K. Prediction of DNA sequences using adaptative neuro-fuzzy inference system. INT J BIOMATH 2018. [DOI: 10.1142/s179352451850047x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Pucker B, Holtgräwe D, Weisshaar B. Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Res Notes 2017;10:667. [PMID: 29202864 PMCID: PMC5716242 DOI: 10.1186/s13104-017-2985-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2017] [Accepted: 11/23/2017] [Indexed: 12/26/2022] Open

Chowdhury B, Garai A, Garai G. An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm. BMC Bioinformatics 2017;18:460. [PMID: 29065853 PMCID: PMC5655831 DOI: 10.1186/s12859-017-1874-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2017] [Accepted: 10/17/2017] [Indexed: 01/06/2023] Open

Singh A, Mishra A, Khosravi A, Khandelwal G, Jayaram B. Physico-chemical fingerprinting of RNA genes. Nucleic Acids Res 2017;45:e47. [PMID: 27932456 PMCID: PMC5397174 DOI: 10.1093/nar/gkw1236] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2016] [Accepted: 11/29/2016] [Indexed: 12/13/2022] Open

Jabbar SF, Hamed RI, Alwan AH. The potential of nonparametric model in foundation bearing capacity prediction. Neural Comput Appl 2017. [DOI: 10.1007/s00521-017-2916-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Chan KL, Rosli R, Tatarinova TV, Hogan M, Firdaus-Raih M, Low ETL. Seqping: gene prediction pipeline for plant genomes using self-training gene models and transcriptomic data. BMC Bioinformatics 2017;18:1426. [PMID: 28466793 PMCID: PMC5333190 DOI: 10.1186/s12859-016-1426-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion.

RESULTS

We present an automated gene prediction pipeline, Seqping that uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 nucleotide structure).

CONCLUSIONS

Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.

Collapse

Huang Y, Chen SY, Deng F. Well-characterized sequence features of eukaryote genomes and implications for ab initio gene prediction. Comput Struct Biotechnol J 2016;14:298-303. [PMID: 27536341 PMCID: PMC4975701 DOI: 10.1016/j.csbj.2016.07.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2016] [Revised: 07/06/2016] [Accepted: 07/12/2016] [Indexed: 12/31/2022] Open

Singh S, Kaur S, Goel N. A Review of Computational Intelligence Methods for Eukaryotic Promoter Prediction. NUCLEOSIDES NUCLEOTIDES & NUCLEIC ACIDS 2016;34:449-62. [PMID: 26158565 DOI: 10.1080/15257770.2015.1013126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

Cao R, Cheng J. Deciphering the association between gene function and spatial gene-gene interactions in 3D human genome conformation. BMC Genomics 2015;16:880. [PMID: 26511362 PMCID: PMC4625479 DOI: 10.1186/s12864-015-2093-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2015] [Accepted: 10/15/2015] [Indexed: 01/17/2023] Open

Abstract

BACKGROUND

A number of factors have been investigated in the context of gene function prediction and analysis, such as sequence identity, gene expressions, and gene co-evolution. However, three-dimensional (3D) conformation of the genome has not been tapped to analyse gene function, probably largely due to lack of genome conformation data until recently.

METHODS

We construct the genome-wide spatial gene-gene interaction networks for three different human B-cells or cell lines from their chromosomal contact data generated by the Hi-C chromosome conformation capturing technique. The G-SESAME and Fast-SemSim are used to calculate function similarity between interacted / non-interacted genes. The Gene Ontology statistics computed from the gene-gene interaction networks is used for gene function prediction.

RESULTS

We compare the function similarity of gene pairs that do not spatially interact and that have interactions. We find that genes that have strong spatial interactions tend to have highly similar function in terms of biological process, molecular function and cellular component of the Gene Ontology. And even though the level of gene-gene interactions generally have no or weak correlation with either sequential genomic distance or sequence identity between genes, the interacted genes with high function similarity tend to have stronger interactions, somewhat shorter genomic distance and significantly higher sequence identity. And combining genomic distance or sequence identity with spatial gene-gene interaction information informs gene-gene function similarity much better than using either one of them alone, suggesting gene-gene interaction information is largely complementary with genomic distance and sequence identity in the context of gene function analysis. We develop and evaluate a new gene function prediction method based on gene-gene interacting networks, which can predict gene function well for a large number of human genes.

CONCLUSIONS

In this work, we demonstrate that the spatial conformation of the human genome is relevant to gene function similarity and is useful for gene function prediction.

Collapse

Nasiri J, Naghavi M, Rad SN, Yolmeh T, Shirazi M, Naderi R, Nasiri M, Ahmadi S. Gene identification programs in bread wheat: a comparison study. NUCLEOSIDES NUCLEOTIDES & NUCLEIC ACIDS 2014;32:529-54. [PMID: 24124688 DOI: 10.1080/15257770.2013.832773] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]