Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kaur A, Chauhan APS, Aggarwal AK. Prediction of Enhancers in DNA Sequence Data using a Hybrid CNN-DLSTM Model. IEEE/ACM Trans Comput Biol Bioinform 2023;20:1327-1336. [PMID: 35417351 DOI: 10.1109/tcbb.2022.3167090] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Indexed: 05/04/2023]

Cited by Other Article(s)

Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024;46:e2300210. [PMID: 38715516 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]

Hu W, Li Y, Wu Y, Guan L, Li M. A deep learning model for DNA enhancer prediction based on nucleotide position aware feature encoding. iScience 2024;27:110030. [PMID: 38868182 PMCID: PMC11167433 DOI: 10.1016/j.isci.2024.110030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 04/23/2024] [Accepted: 05/16/2024] [Indexed: 06/14/2024] Open

Sinha R, Pal RK, De RK. A novel method addressing NGS-based mappability bias for sensitive detection of DNA alterations. J Bioinform Comput Biol 2024;22:2450009. [PMID: 39030667 DOI: 10.1142/s0219720024500094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/21/2024]

Ren Y, Li C, Nanayakkara Sapugahawatte D, Zhu C, Spänig S, Jamrozy D, Rothen J, Daubenberger CA, Bentley SD, Ip M, Heider D. Predicting hosts and cross-species transmission of Streptococcus agalactiae by interpretable machine learning. Comput Biol Med 2024;171:108185. [PMID: 38401454 DOI: 10.1016/j.compbiomed.2024.108185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 02/13/2024] [Accepted: 02/18/2024] [Indexed: 02/26/2024]

Abstract

BACKGROUND

Streptococcus agalactiae, commonly known as Group B Streptococcus (GBS), exhibits a broad host range, manifesting as both a beneficial commensal and an opportunistic pathogen across various species. In humans, it poses significant risks, causing neonatal sepsis and meningitis, along with severe infections in adults. Additionally, it impacts livestock by inducing mastitis in bovines and contributing to epidemic mortality in fish populations. Despite its wide host spectrum, the mechanisms enabling GBS to adapt to specific hosts remain inadequately elucidated. Therefore, a rapid and accurate method that differentiates GBS strains associated with particular animal hosts based on genome-wide information holds immense potential. Such a tool would not only bolster identification and containment efforts during GBS outbreaks but also deepen our comprehension of the bacteria's host adaptations spanning humans, livestock, and other natural animal reservoirs.

METHODS AND RESULTS

Here, we developed three machine learning models based on genome-wide mutation data: random forest (RF), logistic regression (LR), and support vector machine (SVM). These models enabled precise prediction of the host origin of GBS, accurately distinguishing between human, bovine, fish, and pig hosts. Moreover, we conducted an interpretable machine learning analysis using SHapley Additive exPlanations (SHAP) and variant annotation to uncover the most influential genomic features and associated genes for each host. Additionally, by meticulously examining misclassified samples, we gained valuable insights into the dynamics of host transmission and the potential for zoonotic infections.

CONCLUSIONS

Our study underscores the effectiveness of random forest (RF) and logistic regression (LR) models based on mutation data for accurately predicting GBS host origins. Additionally, we identify the key features associated with each GBS host, thereby enhancing our understanding of the bacteria's host-specific adaptations.

Jiang J, Pei H, Li J, Li M, Zou Q, Lv Z. FEOpti-ACVP: identification of novel anti-coronavirus peptide sequences based on feature engineering and optimization. Brief Bioinform 2024;25:bbae037. [PMID: 38366802 PMCID: PMC10939380 DOI: 10.1093/bib/bbae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 12/27/2023] [Accepted: 01/17/2024] [Indexed: 02/18/2024] Open

Ramakrishnan A, Wangensteen G, Kim S, Nestler EJ, Shen L. DeepRegFinder: deep learning-based regulatory elements finder. Bioinformatics Advances 2024;4:vbae007. [PMID: 38343388 PMCID: PMC10858349 DOI: 10.1093/bioadv/vbae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 12/06/2023] [Accepted: 01/12/2024] [Indexed: 06/15/2024]

Raes A, Athanasiou G, Azari-Dolatabad N, Sadeghi H, Gonzalez Andueza S, Arcos JL, Cerquides J, Chaitanya Pavani K, Opsomer G, Bogado Pascottini O, Smits K, Angel-Velez D, Van Soom A. Manual versus deep learning measurements to evaluate cumulus expansion of bovine oocytes and its relationship with embryo development in vitro. Comput Biol Med 2024;168:107785. [PMID: 38056209 DOI: 10.1016/j.compbiomed.2023.107785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 11/20/2023] [Accepted: 11/28/2023] [Indexed: 12/08/2023]

Abstract

Cumulus expansion is an important indicator of oocyte maturation and has been suggested to be indicative of greater oocyte developmental capacity. Although multiple methods have been described to assess cumulus expansion, none of them is considered a gold standard. Additionally, these methods are subjective and time-consuming. In this manuscript, the reliability of three cumulus expansion measurement methods was assessed, and a deep learning model was created to automatically perform the measurement. Cumulus expansion of 232 cumulus-oocyte complexes was evaluated by three independent observers using three methods: (1) measurement of the cumulus area, (2) measurement of three distances between the zona pellucida and outer cumulus, and (3) scoring cumulus expansion on a 5-point Likert scale. The reliability of the methods was calculated in terms of intraclass-correlation coefficients (ICC) for both inter- and intra-observer agreements. The area method resulted in the best overall inter-observer agreement with an ICC of 0.89 versus 0.54 and 0.30 for the 3-distance and scoring methods, respectively. Therefore, the area method served as the base to create a deep learning model, AI-xpansion, which reaches a human-level performance in terms of average rank, bias and variance. To evaluate the accuracy of the methods, the results of cumulus expansion calculations were linked to embryonic development. Cumulus expansion had increased significantly in oocytes that achieved successful embryo development when measured by AI-xpansion, the area- or 3-distance method, while this was not the case for the scoring method. Measuring the area is the most reliable method to manually evaluate cumulus expansion, whilst deep learning automatically performs the calculation with human-level precision and high accuracy and could therefore be a valuable prospective tool for embryologists.

Affiliation(s)

Annelies Raes, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Georgios Athanasiou, Artificial Intelligence Research Institute (IIIA-CSIC), 08193, Bellaterra, Spain; Department of Computer Science, Universitat Autonoma de Barcelona, Spain.
Nima Azari-Dolatabad, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Hafez Sadeghi, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Sebastian Gonzalez Andueza, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Josep Lluis Arcos, Artificial Intelligence Research Institute (IIIA-CSIC), 08193, Bellaterra, Spain.
Jesus Cerquides, Artificial Intelligence Research Institute (IIIA-CSIC), 08193, Bellaterra, Spain.
Krishna Chaitanya Pavani, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Geert Opsomer, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Osvaldo Bogado Pascottini, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Katrien Smits, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.
Daniel Angel-Velez, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium; Research Group in Animal Sciences-INCA-CES, Universidad CES, Medellin, 050021, Colombia.
Ann Van Soom, Department of Internal Medicine, Reproduction and Population Medicine, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium.

Ma J, Kong D, Wu F, Bao L, Yuan J, Liu Y. Densely connected convolutional networks for ultrasound image based lesion segmentation. Comput Biol Med 2024;168:107725. [PMID: 38006827 DOI: 10.1016/j.compbiomed.2023.107725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 11/03/2023] [Accepted: 11/15/2023] [Indexed: 11/27/2023]

Zhu J, Yang Y. Imputation for Single-cell RNA-seq Data with Non-negative Matrix Factorization and Transfer Learning. J Bioinform Comput Biol 2023;21:2350029. [PMID: 38248911 DOI: 10.1142/s0219720023500294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2024]

Lazaros K, Vlamos P, Vrahatis AG. Methods for cell-type annotation on scRNA-seq data: A recent overview. J Bioinform Comput Biol 2023;21:2340002. [PMID: 37743364 DOI: 10.1142/s0219720023400024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]

Liu Y, Wang Z, Yuan H, Zhu G, Zhang Y. HEAP: a task adaptive-based explainable deep learning framework for enhancer activity prediction. Brief Bioinform 2023;24:bbad286. [PMID: 37539835 DOI: 10.1093/bib/bbad286] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/15/2023] [Revised: 07/05/2023] [Accepted: 07/21/2023] [Indexed: 08/05/2023] Open

Phan LT, Oh C, He T, Manavalan B. A comprehensive revisit of the machine-learning tools developed for the identification of enhancers in the human genome. Proteomics 2023;23:e2200409. [PMID: 37021401 DOI: 10.1002/pmic.202200409] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/10/2023] [Revised: 03/18/2023] [Accepted: 03/27/2023] [Indexed: 04/07/2023]

Alakuş TB. A Novel Repetition Frequency-Based DNA Encoding Scheme to Predict Human and Mouse DNA Enhancers with Deep Learning. Biomimetics (Basel) 2023;8:218. [PMID: 37366813 DOI: 10.3390/biomimetics8020218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 06/28/2023] Open

Abstract

Recent studies have shown that DNA enhancers have an important role in the regulation of gene expression. They are involved in important biological processes such as development, homeostasis, and embryogenesis. However, experimental identification of these DNA enhancers is time-consuming and costly as it requires laboratory work. Therefore, researchers have started to apply computation-based deep learning algorithms to this field. Yet the inconsistent and often unsuccessful prediction performance of computation-based approaches across various cell lines has led to the investigation of these approaches as well. Therefore, in this study, a novel DNA encoding scheme was proposed to address these problems, and DNA enhancers were predicted with BiLSTM. The study consisted of four stages for two scenarios. In the first stage, DNA enhancer data were obtained. In the second stage, DNA sequences were converted to numerical representations by both the proposed encoding scheme and various DNA encoding schemes including EIIP, integer number, and atomic number. In the third stage, the BiLSTM model was designed, and the data were classified. In the final stage, the performance of DNA encoding schemes was determined by accuracy, precision, recall, F1-score, CSI, MCC, G-mean, Kappa coefficient, and AUC scores. In the first scenario, it was determined whether the DNA enhancers belonged to humans or mice. The highest performance was achieved with the proposed DNA encoding scheme, with an accuracy of 92.16% and an AUC score of 0.85. The closest accuracy to the proposed scheme was obtained with the EIIP DNA encoding scheme, at 89.14%, with an AUC score of 0.87. Among the remaining DNA encoding schemes, the atomic number scheme showed an accuracy of 86.61%, while this rate decreased to 76.96% with the integer scheme. The AUC values of these schemes were 0.84 and 0.82, respectively. In the second scenario, it was determined whether a sequence was a DNA enhancer and, if so, to which species the enhancer belonged. In this scenario, the highest accuracy was obtained with the proposed DNA encoding scheme, at 84.59%, with an AUC score of 0.92. The EIIP and integer DNA encoding schemes showed accuracy scores of 77.80% and 73.68%, respectively, while their AUC scores were close to 0.90. The least effective prediction was obtained with the atomic number scheme, with an accuracy of 68.27% and an AUC score of 0.81. At the end of the study, it was observed that the proposed DNA encoding scheme was successful and effective in predicting DNA enhancers.
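As a rough illustration of the baseline encodings named in this abstract, the sketch below maps a DNA string to numbers under EIIP, integer, and atomic-number schemes. The EIIP and atomic-number values are the ones commonly quoted in the literature; the integer mapping is one of several conventions in use and is an assumption here. The helper name `encode` is hypothetical, and the proposed repetition-frequency scheme is not reproduced because the abstract does not define it.

```python
# Commonly quoted nucleotide-to-number tables (not taken from the cited study).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}
ATOMIC = {"A": 70, "C": 58, "G": 78, "T": 66}
# Assumption: one frequently used integer convention; other orderings exist.
INTEGER = {"T": 0, "C": 1, "A": 2, "G": 3}


def encode(sequence, table):
    """Return the numeric representation of a DNA sequence under one table.

    Raises KeyError for symbols outside A/C/G/T (for example N).
    """
    return [table[base] for base in sequence.upper()]


if __name__ == "__main__":
    for name, table in (("EIIP", EIIP), ("integer", INTEGER), ("atomic", ATOMIC)):
        print(name, encode("GATTACA", table))
```

Each encoding yields one number per base, so a sequence of length n becomes a length-n vector that a recurrent model such as a BiLSTM could consume directly.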

Wu P, Nie Z, Huang Z, Zhang X. CircPCBL: Identification of Plant CircRNAs with a CNN-BiGRU-GLT Model. Plants (Basel) 2023;12:1652. [PMID: 37111874 PMCID: PMC10143888 DOI: 10.3390/plants12081652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 04/10/2023] [Accepted: 04/13/2023] [Indexed: 06/19/2023]

Abstract

Circular RNAs (circRNAs), which are produced post-splicing of pre-mRNAs, are strongly linked to the emergence of several tumor types. The initial stage in conducting follow-up studies involves identifying circRNAs. Currently, animals are the primary target of most established circRNA recognition technologies. However, the sequence features of plant circRNAs differ from those of animal circRNAs, so these technologies cannot be used to detect plant circRNAs. For example, there are non-GT/AG splicing signals at circRNA junction sites and few reverse complementary sequences and repetitive elements in the flanking intron sequences of plant circRNAs. In addition, there have been few studies on circRNAs in plants, and thus it is urgent to create a plant-specific method for identifying circRNAs. In this study, we propose CircPCBL, a deep-learning approach that only uses raw sequences to distinguish between circRNAs found in plants and other lncRNAs. CircPCBL comprises two separate detectors: a CNN-BiGRU detector and a GLT detector. The CNN-BiGRU detector takes in the one-hot encoding of the RNA sequence as the input, while the GLT detector uses k-mer (k = 1-4) features. The output matrices of the two submodels are then concatenated and ultimately pass through a fully connected layer to produce the final output. To verify the generalization performance of the model, we evaluated CircPCBL using several datasets, and the results revealed that it had an F1 of 85.40% on the validation dataset composed of six different plant species and 85.88%, 75.87%, and 86.83% on the three cross-species independent test sets composed of Cucumis sativus, Populus trichocarpa, and Gossypium raimondii, respectively. With an accuracy of 90.9% and 90%, respectively, CircPCBL successfully predicted ten of the eleven experimentally reported circRNAs of Poncirus trifoliata and nine of the ten lncRNAs of rice on the real set. CircPCBL could potentially contribute to the identification of circRNAs in plants. In addition, it is remarkable that CircPCBL also achieved an average accuracy of 94.08% on the human datasets, which is also an excellent result, implying its potential application in animal datasets. Ultimately, CircPCBL is available as a web server, from which the data and source code can also be downloaded free of charge.
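The two input representations mentioned in this abstract, one-hot encoding and k-mer features for k = 1-4, can be sketched as follows. This is a minimal illustration and not the CircPCBL implementation: the function names are hypothetical, and the use of normalized k-mer frequencies (rather than raw counts) and the conversion of T to U are assumptions.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; T is mapped to U below (assumption)


def one_hot(sequence):
    """One row per base, with a 1 in the column of that base.

    Symbols outside the alphabet become an all-zero row.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in sequence.upper().replace("T", "U"):
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def kmer_features(sequence, k_values=(1, 2, 3, 4)):
    """Concatenate normalized k-mer frequencies for each k in k_values."""
    sequence = sequence.upper().replace("T", "U")
    features = []
    for k in k_values:
        windows = max(len(sequence) - k + 1, 0)
        counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
        for i in range(windows):
            kmer = sequence[i:i + k]
            if kmer in counts:  # skip windows containing unknown symbols
                counts[kmer] += 1
        features.extend(c / windows if windows else 0.0 for c in counts.values())
    return features


if __name__ == "__main__":
    print(one_hot("ACGU"))
    print(len(kmer_features("ACGUACGUAC")))
```

With k = 1-4 over a four-letter alphabet the feature vector has 4 + 16 + 64 + 256 = 340 entries regardless of sequence length, whereas the one-hot matrix grows with the sequence.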

Sokhansanj BA, Rosen GL. Predicting COVID-19 disease severity from SARS-CoV-2 spike protein sequence by mixed effects machine learning. Comput Biol Med 2022;149:105969. [PMID: 36041271 PMCID: PMC9384346 DOI: 10.1016/j.compbiomed.2022.105969] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/06/2022] [Revised: 07/11/2022] [Accepted: 08/13/2022] [Indexed: 11/17/2022]