1
Teragawa S, Wang L, Liu Y. DeepPGD: A Deep Learning Model for DNA Methylation Prediction Using Temporal Convolution, BiLSTM, and Attention Mechanism. Int J Mol Sci 2024; 25:8146. [PMID: 39125714 PMCID: PMC11311892 DOI: 10.3390/ijms25158146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/07/2024] [Accepted: 06/25/2024] [Indexed: 08/12/2024]
Abstract
As part of the field of DNA methylation identification, this study tackles the challenge of enhancing recognition performance by introducing a specialized deep learning framework called DeepPGD. DNA methylation, a crucial biological modification, plays a vital role in gene expression analyses, cellular differentiation, and the study of disease progression. However, accurately and efficiently identifying DNA methylation sites remains a pivotal concern in the field of bioinformatics. This paper addresses detecting whether a DNA site is methylated, which is a binary classification problem. To address this, our research aimed to develop a deep learning algorithm capable of more precisely identifying these sites. The DeepPGD framework combined a dual residual structure involving temporal convolutional networks (TCNs) and bidirectional long short-term memory (BiLSTM) networks to effectively extract intricate DNA structural and sequence features. Additionally, to meet the practical requirements of DNA methylation identification, extensive experiments were conducted across a variety of biological species. The experimental results highlighted DeepPGD's exceptional performance across multiple evaluation metrics, including accuracy, Matthews' correlation coefficient (MCC), and the area under the curve (AUC). In comparison to other algorithms in the same domain, DeepPGD demonstrated superior classification and predictive capabilities across various biological species datasets. This significant advancement in algorithmic prowess not only offers substantial technical support, but also holds potential for research and practical implementation within the DNA methylation identification domain. Moreover, the DeepPGD framework shows potential for application in genomics research, biomedicine, and disease diagnostics, among other fields.
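The abstract names accuracy and Matthews' correlation coefficient (MCC) as evaluation metrics for the binary methylation task. As a minimal sketch, both can be computed from confusion-matrix counts as shown here; the function names and the toy labels are illustrative assumptions and are not taken from DeepPGD.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = methylated)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews' correlation coefficient; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), mcc(*counts))
```

Unlike accuracy, MCC uses all four confusion-matrix cells, so it stays informative when methylated and unmethylated sites are imbalanced.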
Affiliation(s)
- Shoryu Teragawa
  - School of Software, Dalian University of Technology, Dalian 116024, China
- Lei Wang
  - School of Software, Dalian University of Technology, Dalian 116024, China
- Yi Liu
  - School of Engineering, University of Southern Queensland, 487-535 West Street, Toowoomba, QLD 4350, Australia
2
Wei L, Zou Q, Zeng X. Editorial: Artificial intelligence in drug discovery and development. Methods 2024; 226:133-137. [PMID: 38582311 DOI: 10.1016/j.ymeth.2024.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2024]
Affiliation(s)
- Leyi Wei
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao 999078, China
  - School of Software, Shandong University, Jinan 250101, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
3
Akay A, Reddy HN, Galloway R, Kozyra J, Jackson AW. Predicting DNA toehold-mediated strand displacement rate constants using a DNA-BERT transformer deep learning model. Heliyon 2024; 10:e28443. [PMID: 38560216 PMCID: PMC10981123 DOI: 10.1016/j.heliyon.2024.e28443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 03/15/2024] [Accepted: 03/19/2024] [Indexed: 04/04/2024]
Abstract
Dynamic DNA nanotechnology is driving exciting developments in molecular computing, cargo delivery, sensing and detection. Combining this innovative area of research with the progress made in machine learning will aid in the design of sophisticated DNA machinery. Herein, we present a novel framework based on a transformer architecture and a deep learning model which can predict the rate constant of toehold-mediated strand displacement, the underlying process in dynamic DNA nanotechnology. Initially, a dataset of 4450 DNA sequences and corresponding rate constants was generated in silico using KinDA. Subsequently, a 1D convolutional neural network was trained using specific local features and DNA-BERT sequence embedding to produce predicted rate constants. As a result, the newly trained deep learning model predicted toehold-mediated strand displacement rate constants with a root mean square error of 0.76 during testing. These findings demonstrate that DNA-BERT can improve prediction accuracy, negating the need for extensive computational simulations or experimentation. Finally, the impact of various local features during model training is discussed, and a detailed comparison between the one-hot encoder and DNA-BERT sequence representation methods is presented.
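The abstract compares one-hot encoding with DNA-BERT embeddings and reports a root mean square error. As a rough sketch of the two simpler pieces, a one-hot encoder and an RMSE function might look like this; the A/C/G/T column order, the function names, and the example values are assumptions made for illustration and are not taken from the paper.

```python
import math

NUCLEOTIDES = "ACGT"  # assumed column order for the one-hot vectors


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element 0/1 rows, one row per base."""
    rows = []
    for base in sequence.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([1 if base == n else 0 for n in NUCLEOTIDES])
    return rows


def rmse(predicted, observed):
    """Root mean square error between two equal-length numeric sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


print(one_hot("ACGT"))
print(rmse([1.0, 2.0], [1.0, 4.0]))
```

One-hot rows carry no context beyond the base itself, which is the kind of limitation a learned embedding is meant to address.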
Affiliation(s)
- Ali Akay
  - Nanovery Limited, United Kingdom
  - Universita Degli Studi di Trento, Italy
4
Yin Z, Lyu J, Zhang G, Huang X, Ma Q, Jiang J. SoftVoting6mA: An improved ensemble-based method for predicting DNA N6-methyladenine sites in cross-species genomes. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:3798-3815. [PMID: 38549308 DOI: 10.3934/mbe.2024169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
DNA N6-methyladenine (6mA) is an epigenetic modification that plays a pivotal role in biological processes including gene expression, DNA replication, repair, and recombination. Precise identification of 6mA sites is therefore fundamental to understanding its function, but remains challenging. We proposed an improved ensemble-based method for predicting DNA N6-methyladenine sites in cross-species genomes called SoftVoting6mA. SoftVoting6mA selected four encodings (electron-ion-interaction pseudo potential, one-hot encoding, Kmer, and pseudo dinucleotide composition) from 15 types of encoding to represent DNA sequences by comparing their performance. Similarly, SoftVoting6mA combined four learning algorithms using the soft voting strategy. Five-fold cross-validation and independent tests showed that SoftVoting6mA achieved state-of-the-art performance. To enhance accessibility, a user-friendly web server is provided at http://www.biolscience.cn/SoftVoting6mA/.
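Soft voting, the combination strategy named in the abstract, averages the class probabilities produced by several classifiers and picks the class with the highest average. A minimal sketch follows; the function name, the optional weights, and the four probability vectors are hypothetical, and the actual SoftVoting6mA base learners are not reproduced here.

```python
def soft_vote(model_probs, weights=None):
    """Average per-class probabilities across models; return (label, averaged)."""
    if not model_probs:
        raise ValueError("need at least one model output")
    n_classes = len(model_probs[0])
    if any(len(probs) != n_classes for probs in model_probs):
        raise ValueError("all models must report the same number of classes")
    if weights is None:
        weights = [1.0] * len(model_probs)  # unweighted average by default
    total = sum(weights)
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged


# Four hypothetical classifiers scoring one sequence as [P(non-6mA), P(6mA)].
label, averaged = soft_vote([[0.6, 0.4], [0.3, 0.7], [0.4, 0.6], [0.9, 0.1]])
print(label, averaged)
```

In this toy case a majority (hard) vote would tie at two models each, while soft voting resolves the tie by using how confident each model is.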
Affiliation(s)
- Zhaoting Yin
  - College of Information Science and Engineering, Shaoyang University, Shaoyang 422000, China
- Jianyi Lyu
  - College of Information Science and Engineering, Shaoyang University, Shaoyang 422000, China
- Guiyang Zhang
  - College of Information Science and Engineering, Shaoyang University, Shaoyang 422000, China
- Xiaohong Huang
  - College of Information Science and Engineering, Shaoyang University, Shaoyang 422000, China
- Qinghua Ma
  - College of Information Science and Engineering, Hohai University, Nanjing 210000, China
  - Faculty of Information Technology, University of Jyvaskyla, Jyvaskyla, Finland
- Jinyun Jiang
  - College of Information Science and Engineering, Shaoyang University, Shaoyang 422000, China
5
Huang G, Huang X, Luo W. 6mA-StackingCV: an improved stacking ensemble model for predicting DNA N6-methyladenine site. BioData Min 2023; 16:34. [PMID: 38012796 PMCID: PMC10680251 DOI: 10.1186/s13040-023-00348-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Accepted: 11/04/2023] [Indexed: 11/29/2023]
Abstract
DNA N6-adenine methylation (N6-methyladenine, 6mA) plays a key regulatory role in cellular processes. Precisely recognizing 6mA sites is important for further exploring its biological functions. Although many computational methods for 6mA site prediction have been developed over the past decades, there is still considerable room for improvement. We presented a cross validation-based stacking ensemble model for 6mA site prediction, called 6mA-StackingCV. 6mA-StackingCV is a type of meta-learning algorithm, which uses the output of cross validation as input to the final classifier. 6mA-StackingCV achieved state-of-the-art performance in the Rosaceae independent test. Extensive tests demonstrated the stability and flexibility of 6mA-StackingCV. We implemented 6mA-StackingCV as a user-friendly web application, which allows one to restrictively choose representations or learning algorithms. This application is freely available at http://www.biolscience.cn/6mA-stackingCV/ . The source code and experimental data are available at https://github.com/Xiaohong-source/6mA-stackingCV .
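The core idea described here, using cross-validation output as input to a final classifier, can be sketched as building out-of-fold predictions from each base learner. The code below is only an illustration under stated assumptions: the nearest-class-mean base learner, the interleaved fold assignment, and the toy data are all hypothetical and are not the learners or data used by 6mA-StackingCV.

```python
def kfold_indices(n_samples, k):
    """Split sample indices into k interleaved folds."""
    return [list(range(start, n_samples, k)) for start in range(k)]


def make_centroid_fitter(feature):
    """Hypothetical toy base learner: nearest class mean on a single feature."""
    def fit(X, y):
        zeros = [row[feature] for row, label in zip(X, y) if label == 0]
        ones = [row[feature] for row, label in zip(X, y) if label == 1]
        if not zeros or not ones:
            raise ValueError("each training fold needs both classes")
        mean0 = sum(zeros) / len(zeros)
        mean1 = sum(ones) / len(ones)
        return lambda row: 1.0 if abs(row[feature] - mean1) < abs(row[feature] - mean0) else 0.0
    return fit


def out_of_fold_features(X, y, base_fitters, k=3):
    """Return one meta-feature per base learner, predicted only on held-out folds."""
    n = len(X)
    meta = [[0.0] * len(base_fitters) for _ in range(n)]
    for held_out in kfold_indices(n, k):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        for j, fit in enumerate(base_fitters):
            predict = fit(X_train, y_train)  # trained without the held-out fold
            for i in held_out:
                meta[i][j] = predict(X[i])
    return meta


# Toy data, invented for illustration: two features, binary labels.
X = [[0.1, 5], [0.2, 4], [0.9, 1], [0.8, 2], [0.15, 6], [0.85, 0]]
y = [0, 0, 1, 1, 0, 1]
meta = out_of_fold_features(X, y, [make_centroid_fitter(0), make_centroid_fitter(1)])
print(meta)
```

A final classifier would then be trained on `meta` and `y`. Because every meta-feature comes from a model that never saw that sample, the final classifier is not fitted to leaked training labels.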
Affiliation(s)
- Guohua Huang
  - School of Information Technology and Administration, Hunan University of Finance and Economics, Changsha, China
  - College of Information Science and Engineering, Shaoyang University, Shaoyang, Hunan, 422000, China
- Xiaohong Huang
  - College of Information Science and Engineering, Shaoyang University, Shaoyang, Hunan, 422000, China
- Wei Luo
  - College of Information Science and Engineering, Shaoyang University, Shaoyang, Hunan, 422000, China
6
Le NQK, Xu L. Optimizing Hyperparameter Tuning in Machine Learning to Improve the Predictive Performance of Cross-Species N6-Methyladenosine Sites. ACS OMEGA 2023; 8:39420-39426. [PMID: 37901522 PMCID: PMC10600906 DOI: 10.1021/acsomega.3c05074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 09/28/2023] [Indexed: 10/31/2023]
Abstract
DNA N6-methyladenosine (6mA) modification carries significant epigenetic information and plays a pivotal role in biological functions, thereby profoundly impacting human development. Precise and reliable detection of 6mA sites is integral to understanding the mechanisms underpinning DNA modification. The present methods, primarily experimental, used to identify specific molecular sites are often time-intensive and costly. Consequently, the rise of computer-based methods aimed at identifying 6mA sites provides a welcome alternative. Our research introduces a novel model to discern DNA 6mA sites in cross-species genomes. This model, developed through machine learning, utilizes extracted sequence information. Hyperparameter tuning was employed to ascertain the most effective feature combination and model implementation, thereby garnering vital information from sequences. Our model demonstrated superior accuracy compared to the existing models when tested using five-fold cross-validation. Thus, our study substantiates the reliability and efficiency of our model as a valuable tool for supplementing experimental research.
Affiliation(s)
- Nguyen Quoc Khanh Le
  - Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan
  - Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei 110, Taiwan
  - AIBioMed Research Group, Taipei Medical University, Taipei 110, Taiwan
  - Translational Imaging Research Center, Taipei Medical University Hospital, Taipei 110, Taiwan
- Ling Xu
  - NUS-ISS, National University of Singapore, Singapore, 119615, Singapore
7
Xiang S, Zhang T, Wu M. M6ATMR: identifying N6-methyladenosine sites through RNA sequence similarity matrix reconstruction guided by Transformer. PeerJ 2023; 11:e15899. [PMID: 37719113 PMCID: PMC10501384 DOI: 10.7717/peerj.15899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 07/24/2023] [Indexed: 09/19/2023]
Abstract
Numerous studies have focused on the classification of N6-methyladenosine (m6A) modification sites in RNA sequences, treating it as a multi-feature extraction task. In these studies, the incorporation of physicochemical properties of nucleotides has been applied to enhance recognition efficacy. However, the introduction of excessive supplementary information may introduce noise to the RNA sequence features, and the utilization of sequence similarity information remains underexplored. In this research, we present a novel method for RNA m6A modification site recognition called M6ATMR. Our approach relies solely on sequence information, leveraging Transformer to guide the reconstruction of the sequence similarity matrix, thereby enhancing feature representation. Initially, M6ATMR encodes RNA sequences using 3-mers to generate the sequence similarity matrix. Meanwhile, Transformer is applied to extract sequence structure graphs for each RNA sequence. Subsequently, to capture low-dimensional representations of similarity matrices and structure graphs, we introduce a graph self-correlation convolution block. These representations are then fused and reconstructed through the local-global fusion block. Notably, we adopt iteratively updated sequence structure graphs to continuously optimize the similarity matrix, thereby constraining the end-to-end feature extraction process. Finally, we employ the random forest (RF) algorithm for identifying m6A modification sites based on the reconstructed features. Experimental results demonstrate that M6ATMR achieves promising performance by solely utilizing RNA sequences for m6A modification site identification. Our proposed method can be considered an effective complement to existing RNA m6A modification site recognition approaches.
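The first step described in this abstract, encoding RNA sequences as 3-mers and deriving a sequence similarity matrix, can be illustrated roughly as follows. This is a sketch under assumptions: the abstract does not say which similarity measure M6ATMR uses, so cosine similarity over 3-mer count vectors is substituted here, and all function names and example sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3, alphabet="ACGU"):
    """Count overlapping k-mers and return a fixed-order count vector."""
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocabulary]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def similarity_matrix(sequences, k=3):
    """Pairwise cosine similarity between the k-mer vectors of all sequences."""
    vectors = [kmer_counts(seq, k) for seq in sequences]
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Made-up RNA fragments for illustration only.
matrix = similarity_matrix(["ACGUA", "ACGUA", "UUUUU"])
for row in matrix:
    print(row)
```

Because the vocabulary is fixed at 4^k entries, every sequence maps to a vector of the same length regardless of how long the sequence is.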
Affiliation(s)
- Shuang Xiang
  - Changjiang Water Resources and Hydropower Development Group, Wuhan, Hubei, China
- Te Zhang
  - Changjiang Water Resources and Hydropower Development Group, Wuhan, Hubei, China
- Minghao Wu
  - Changjiang Water Resources and Hydropower Development Group, Wuhan, Hubei, China
8
Berson E, Sreenivas A, Phongpreecha T, Perna A, Grandi FC, Xue L, Ravindra NG, Payrovnaziri N, Mataraso S, Kim Y, Espinosa C, Chang AL, Becker M, Montine KS, Fox EJ, Chang HY, Corces MR, Aghaeepour N, Montine TJ. Whole genome deconvolution unveils Alzheimer's resilient epigenetic signature. Nat Commun 2023; 14:4947. [PMID: 37587197 PMCID: PMC10432546 DOI: 10.1038/s41467-023-40611-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/02/2023] [Accepted: 08/03/2023] [Indexed: 08/18/2023]
Abstract
Assay for Transposase Accessible Chromatin by sequencing (ATAC-seq) accurately depicts the chromatin regulatory state and altered mechanisms guiding gene expression in disease. However, bulk sequencing entangles information from different cell types and obscures cellular heterogeneity. To address this, we developed Cellformer, a deep learning method that deconvolutes bulk ATAC-seq into cell type-specific expression across the whole genome. Cellformer enables cost-effective cell type-specific open chromatin profiling in large cohorts. Applied to 191 bulk samples from 3 brain regions, Cellformer identifies cell type-specific gene regulatory mechanisms involved in resilience to Alzheimer's disease, a state seen in an uncommon group of cognitively healthy individuals who harbor a high pathological load of Alzheimer's disease. Cell type-resolved chromatin profiling unveils cell type-specific pathways and nominates potential epigenetic mediators underlying resilience that may illuminate therapeutic opportunities to limit the cognitive impact of the disease. Cellformer is freely available to facilitate future investigations using high-throughput bulk ATAC-seq data.
Affiliation(s)
- Eloise Berson
  - Department of Pathology, Stanford University, Stanford, CA, USA
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Anjali Sreenivas
  - Department of Pathology, Stanford University, Stanford, CA, USA
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
- Thanaphong Phongpreecha
  - Department of Pathology, Stanford University, Stanford, CA, USA
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Amalia Perna
  - Department of Pathology, Stanford University, Stanford, CA, USA
- Fiorella C Grandi
  - Gladstone Institute of Neurological Disease, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Lei Xue
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
- Neal G Ravindra
  - Department of Pathology, Stanford University, Stanford, CA, USA
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Neelufar Payrovnaziri
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
- Samson Mataraso
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
- Yeasul Kim
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
- Camilo Espinosa
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
- Alan L Chang
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
- Martin Becker
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
- Edward J Fox
  - Department of Pathology, Stanford University, Stanford, CA, USA
- Howard Y Chang
  - Center for Personal Dynamic Regulomes, Stanford University School of Medicine, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
- M Ryan Corces
  - Gladstone Institute of Neurological Disease, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Nima Aghaeepour
  - Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Department of Pediatrics, Stanford University, Stanford, CA, USA
9
Hu W, Guan L, Li M. Prediction of DNA Methylation based on Multi-dimensional feature encoding and double convolutional fully connected convolutional neural network. PLoS Comput Biol 2023; 19:e1011370. [PMID: 37639434 PMCID: PMC10461834 DOI: 10.1371/journal.pcbi.1011370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/18/2023] [Indexed: 08/31/2023]
Abstract
DNA methylation is critically important to the regulation of gene expression because it affects the stability of DNA and changes the structure of chromosomes. Identifying DNA methylation modification sites lays a solid basis for gaining more insight into their biological functions. Existing machine learning-based methods of predicting DNA methylation have not fully exploited the hidden multidimensional information in DNA sequences, which significantly limits their prediction accuracy. In addition, most models have been built for a single methylation type. To address these issues, this study proposed a deep learning-based method for DNA methylation site prediction, termed the MEDCNN model. The MEDCNN model extracts feature information from gene sequences in three dimensions (i.e., positional information, biological information, and chemical information). Moreover, the proposed method employs a convolutional neural network with double convolutional layers and double fully connected layers, and iteratively updates its parameters by gradient descent on the cross-entropy loss function to increase prediction accuracy. The MEDCNN model can also predict different types of DNA methylation sites. The experimental results indicate that the deep learning method based on coding from multiple dimensions outperformed single coding methods, and that the MEDCNN model was highly applicable and outperformed existing models in predicting DNA methylation across different species. These findings suggest that the MEDCNN model can be effective in predicting DNA methylation sites.
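The training signal mentioned in this abstract, gradient descent on a cross-entropy loss, can be illustrated on a much smaller model. The sketch below uses a single logistic unit rather than the MEDCNN network, which is not reproduced here; the function names, learning rate, and two-sample toy data are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def predict(weights, bias, X):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) for row in X]


def gradient_step(weights, bias, X, y, lr=0.1):
    """One gradient-descent update of a logistic unit under mean cross-entropy."""
    errors = [p - t for p, t in zip(predict(weights, bias, X), y)]
    n = len(X)
    grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(len(weights))]
    grad_b = sum(errors) / n
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b


# Toy data, invented for illustration: one feature, two samples.
X = [[0.0], [1.0]]
y = [0, 1]
w, b = [0.0], 0.0
loss_before = binary_cross_entropy(y, predict(w, b, X))
w, b = gradient_step(w, b, X, y, lr=1.0)
loss_after = binary_cross_entropy(y, predict(w, b, X))
print(loss_before, loss_after)
```

For a sigmoid output under cross-entropy, the gradient with respect to the pre-activation reduces to the prediction error `p - y`, which is why the update above needs no explicit derivative of the sigmoid.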
Affiliation(s)
- Wenxing Hu
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, China
- Lixin Guan
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, China
- Mengshan Li
  - College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, China
10
Zeng J, Li W, Zheng B, Xiao L, Zhang X, Zhong Q, Liang S, Wang J, Huang Y, Qin C. ACMTR: Attention-guided, combined multi-scale, transformer reasoning-based network for 3D CT pelvic functional bone marrow segmentation. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
11
Brain tumor segmentation of the FLAIR MRI images using novel ResUnet. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
12
Machine Learning: Using Xception, a Deep Convolutional Neural Network Architecture, to Implement Pectus Excavatum Diagnostic Tool from Frontal-View Chest X-rays. Biomedicines 2023; 11:biomedicines11030760. [PMID: 36979738 PMCID: PMC10045358 DOI: 10.3390/biomedicines11030760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Revised: 02/17/2023] [Accepted: 02/19/2023] [Indexed: 03/06/2023]
Abstract
Pectus excavatum (PE), a chest-wall deformity that can compromise cardiopulmonary function, cannot be detected by a radiologist through frontal chest radiography without a lateral view or chest computed tomography. This study aims to train a convolutional neural network (CNN), a deep learning architecture with powerful image processing ability, for PE screening through frontal chest radiography, which is the most common imaging test in current hospital practice. Posteroanterior-view chest images of PE and normal patients were collected from our hospital to build the database. Among them, 80% were used as the training set for the established CNN algorithm, Xception, whereas the remaining 20% formed a test set for model performance evaluation. The area under the receiver operating characteristic curve of our diagnostic artificial intelligence model ranged from 0.976 to 1. The test accuracy of the model reached 0.989, and the sensitivity and specificity were 96.66 and 96.64, respectively. Our study is the first to prove that a CNN can be trained as a diagnostic tool for PE using frontal chest X-rays, which is not possible with the human eye. It offers a convenient way to screen potential candidates for the surgical repair of PE, primarily using available image examinations.
13
Improving brain tumor classification performance with an effective approach based on new deep learning model named 3ACL from 3D MRI data. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/11/2022]
14
PISDGAN: Perceive image structure and details for laryngeal image enhancement. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
15
Lieberman B, Kong JD, Gusinow R, Asgary A, Bragazzi NL, Choma J, Dahbi SE, Hayashi K, Kar D, Kawonga M, Mbada M, Monnakgotla K, Orbinski J, Ruan X, Stevenson F, Wu J, Mellado B. Big data- and artificial intelligence-based hot-spot analysis of COVID-19: Gauteng, South Africa, as a case study. BMC Med Inform Decis Mak 2023; 23:19. [PMID: 36703133 PMCID: PMC9879257 DOI: 10.1186/s12911-023-02098-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2022] [Accepted: 01/02/2023] [Indexed: 01/27/2023]
Abstract
The coronavirus disease 2019 (COVID-19) has developed into a pandemic. Data-driven techniques can be used to inform and guide public health decision- and policy-makers. In generalizing the spread of a virus over a large area, such as a province, it must be assumed that the transmission occurs as a stochastic process. It is therefore very difficult for policy and decision makers to understand and visualize the location-specific dynamics of the virus on a more granular level. A primary concern is exposing local virus hot-spots, in order to inform and implement non-pharmaceutical interventions. A hot-spot is defined as an area experiencing exponential growth relative to the generalised growth of the pandemic. This paper uses the first and second waves of the COVID-19 epidemic in Gauteng Province, South Africa, as a case study. The study aims to provide a data-driven methodology and comprehensive case study to expose location-specific virus dynamics within a given area. The methodology uses an unsupervised Gaussian Mixture model to cluster cases at a desired granularity. This is combined with an epidemiological analysis to quantify each cluster's severity, progression and whether it can be defined as a hot-spot.
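The hot-spot definition above, exponential growth relative to the generalised growth of the pandemic, can be made concrete with a simple growth-rate comparison. This is a sketch under assumptions: the paper's actual epidemiological analysis and its Gaussian Mixture clustering are not reproduced, the log-linear least-squares fit and the "cluster rate exceeds overall rate" criterion are stand-ins chosen for illustration, and the case counts are invented.

```python
import math


def growth_rate(counts):
    """Estimate r in counts[t] ~ c * exp(r * t) by least squares on log(counts)."""
    if len(counts) < 2 or any(c <= 0 for c in counts):
        raise ValueError("need at least two strictly positive counts")
    n = len(counts)
    logs = [math.log(c) for c in counts]
    t_mean = (n - 1) / 2
    y_mean = sum(logs) / n
    covariance = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(logs))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    return covariance / variance


def is_hot_spot(cluster_counts, overall_counts):
    """Assumed criterion: the cluster grows faster than the overall epidemic."""
    return growth_rate(cluster_counts) > growth_rate(overall_counts)


# Invented daily case counts: the cluster doubles daily, the province grows 10% daily.
cluster = [10, 20, 40, 80, 160]
province = [100, 110, 121, 133.1, 146.41]
print(growth_rate(cluster), growth_rate(province), is_hot_spot(cluster, province))
```

In practice the counts for each cluster would come from the clustering step, and a real analysis would also need to handle zero counts and reporting noise, which this sketch rejects or ignores.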
Collapse
Affiliation(s)
- Benjamin Lieberman
- grid.11951.3d0000 0004 1937 1135School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa ,Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Jude Dzevela Kong
- grid.21100.320000 0004 1936 9430Department of Mathematics and Statistics, York University, Toronto, Canada ,Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Roy Gusinow
- grid.11951.3d0000 0004 1937 1135School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa ,Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Ali Asgary
- grid.21100.320000 0004 1936 9430Disaster and Emergency Management, School of Administrative Studies and Advanced Disaster, Emergency and Rapid-response Simulation, York University, Toronto, Canada ,Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Nicola Luigi Bragazzi
- Department of Mathematics and Statistics, York University, Toronto, Canada; Laboratory for Industrial and Applied Mathematics (LIAM), York University, Toronto, Canada; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Joshua Choma
- School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Salah-Eddine Dahbi
- School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Kentaro Hayashi
- School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Deepak Kar
- School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Mary Kawonga
- School of Public Health, University of the Witwatersrand, Johannesburg, South Africa; Gauteng Provincial Department of Health, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Mduduzi Mbada
- Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada; Gauteng Office of the Premier, Johannesburg, South Africa
| | - Kgomotso Monnakgotla
- School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - James Orbinski
- Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada; Dahdaleh Institute for Global Health Research, York University, Toronto, Canada
| | - Xifeng Ruan
- School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Finn Stevenson
- School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Jianhong Wu
- Department of Mathematics and Statistics, York University, Toronto, Canada; Laboratory for Industrial and Applied Mathematics (LIAM), York University, Toronto, Canada; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada
| | - Bruce Mellado
- School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa; Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), Toronto, Canada; iThemba LABS, National Research Foundation, Somerset West, South Africa
| |
|
16
|
Han K, Wang J, Wang Y, Zhang L, Yu M, Xie F, Zheng D, Xu Y, Ding Y, Wan J. A review of methods for predicting DNA N6-methyladenine sites. Brief Bioinform 2023; 24:6887111. [PMID: 36502371 DOI: 10.1093/bib/bbac514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Revised: 10/07/2022] [Accepted: 10/27/2022] [Indexed: 12/14/2022] Open
Abstract
Deoxyribonucleic acid (DNA) N6-methyladenine (6mA) plays a vital role in various biological processes, and the accurate identification of 6mA sites can provide a more comprehensive understanding of its biological effects. There are several methods for 6mA site prediction. With the continuous development of technology, traditional techniques, with their high costs and low efficiency, are gradually being replaced by computational methods. Widely used computational methods fall into two categories: traditional machine learning and deep learning. We first list some existing experimental methods for predicting 6mA sites, then analyze the general process from sequence input to results in computational methods and review existing model architectures. Finally, we summarize and compare the results to help subsequent researchers choose the most suitable method for their work.
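As an illustration of the "sequence input" stage mentioned in this abstract, the sketch below one-hot encodes a DNA window before it would be passed to a model. The abstract does not name a specific encoding, so one-hot encoding, the function name `one_hot_dna`, and the example window are all assumptions made here for illustration only.

```python
# Minimal sketch (assumption): one-hot encoding as the first step of a
# sequence-to-prediction pipeline. The function name is hypothetical.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}

def one_hot_dna(seq):
    """Return one 4-element row per base; unknown bases (e.g. N) become all zeros."""
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in seq.upper()]

# Hypothetical 5-nt window centred on a candidate adenine
encoded = one_hot_dna("gcAtn")
```

Each row has exactly one non-zero entry for a known base, so the matrix keeps positional information without implying any ordering among the four nucleotides.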
Affiliation(s)
- Ke Han
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China; College of Pharmacy, Harbin University of Commerce, Harbin, 150076, China
| | - Jianchun Wang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Yu Wang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Lei Zhang
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Mengyao Yu
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Fang Xie
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Dequan Zheng
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Yaoqun Xu
- School of Computer and Information Engineering, Heilongjiang Provincial Key Laboratory of Electronic Commerce and Information Processing, Harbin University of Commerce, Harbin, 150028, China
| | - Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, China
| | - Jie Wan
- Laboratory for Space Environment and Physical Sciences, Harbin Institute of Technology, Harbin, 150001, China
| |
|
17
|
Yuan Q, Chen K, Yu Y, Le NQK, Chua MCH. Prediction of anticancer peptides based on an ensemble model of deep learning and machine learning using ordinal positional encoding. Brief Bioinform 2023; 24:6987656. [PMID: 36642410 DOI: 10.1093/bib/bbac630] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Received: 08/03/2022] [Revised: 12/01/2022] [Accepted: 12/28/2022] [Indexed: 01/17/2023] Open
Abstract
Anticancer peptides (ACPs) are peptides that have been demonstrated to have anticancer activities. Using ACPs to prevent cancer could be a viable alternative to conventional cancer treatments because they are safer and display higher selectivity. Because ACP identification is highly lab-limited, expensive and lengthy, this study proposes a computational method to predict ACPs from sequence information. The process includes the input of the peptide sequences, feature extraction in terms of ordinal encoding with positional information and handcrafted features, and finally feature selection. The whole model comprises two modules, one using deep learning and one using machine learning algorithms. The deep learning module contains two channels: bidirectional long short-term memory (BiLSTM) and convolutional neural network (CNN). Light Gradient Boosting Machine (LightGBM) is used in the machine learning module. Finally, the classification results of the three models, one per path, are combined by voting in the model ensemble layer. This study provides insights into ACP prediction utilizing a novel method and presents a promising performance. It uses a benchmark dataset for further exploration and improvement compared with previous studies. Our final model has an accuracy of 0.7895, sensitivity of 0.8153 and specificity of 0.7676, an improvement of at least 2% over state-of-the-art studies in all metrics. Hence, this paper presents a novel method that can potentially predict ACPs more effectively and efficiently. The work and source codes are made available to the community of researchers and developers at https://github.com/khanhlee/acp-ope/.
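The voting step and the reported metrics can be sketched as follows. This is a minimal illustration assuming simple majority voting over hard binary labels; the abstract does not specify the voting rule, and the function names and the toy predictions below are hypothetical, not taken from the authors' repository.

```python
def majority_vote(*model_predictions):
    """Per-sample majority vote over binary (0/1) labels from several models."""
    return [int(2 * sum(votes) > len(votes)) for votes in zip(*model_predictions)]

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical per-sample predictions from the three paths
bilstm_pred = [1, 0, 1, 1, 0, 0]
cnn_pred = [1, 1, 0, 1, 0, 0]
lightgbm_pred = [0, 0, 1, 1, 0, 1]
labels = [1, 0, 1, 1, 1, 0]

ensemble_pred = majority_vote(bilstm_pred, cnn_pred, lightgbm_pred)
sensitivity, specificity = sensitivity_specificity(labels, ensemble_pred)
```

With an odd number of binary voters there are no ties, which is one reason a three-model ensemble is convenient for hard voting.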
Affiliation(s)
- Qitong Yuan
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| | - Keyi Chen
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| | - Yimin Yu
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| | - Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, 250 Wuxing St, 106, Taipei, Taiwan; Research Center for Artificial Intelligence in Medicine, Taipei Medical University, 250 Wuxing St, 106, Taipei, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, 252 Wuxing St, 110, Taipei, Taiwan
| | - Matthew Chin Heng Chua
- Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, 119615, Singapore, Singapore
| |
|
18
|
Prasad B, Bjourson AJ, Shukla P. muSignAl: An algorithm to search for multiple omic signatures with similar predictive performance. Proteomics 2023; 23:e2200252. [PMID: 36076312 DOI: 10.1002/pmic.202200252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2022] [Revised: 08/29/2022] [Accepted: 09/05/2022] [Indexed: 01/19/2023]
Abstract
Multidimensional omic datasets often have correlated features, which makes it possible to discover multiple biological signatures with similar predictive performance for a phenotype. However, their exploration is limited by low sample size and by the exponential nature of the combinatorial search, which leads to high computational cost. To address these issues, we have developed an algorithm, muSignAl (multiple signature algorithm), which selects multiple signatures with similar predictive performance while systematically bypassing the requirement of exploring all combinations of features. We demonstrate the workflow of this algorithm with an example proteomics dataset. muSignAl is applicable in various bioinformatics-driven explorations, such as understanding the relationship between multiple biological feature sets and phenotypes, and the discovery and development of biomarker panels, while providing the opportunity to optimise their development cost with the help of multiple equally good signatures. Source code of muSignAl is freely available at https://github.com/ShuklaLab/muSignAl.
Affiliation(s)
- Bodhayan Prasad
- Personalised Medicine Centre, School of Medicine, Ulster University, C-TRIC Building, Altnagelvin Area Hospital, Glenshane Road, Londonderry, BT47 6SB, UK
| | - Anthony J Bjourson
- Personalised Medicine Centre, School of Medicine, Ulster University, C-TRIC Building, Altnagelvin Area Hospital, Glenshane Road, Londonderry, BT47 6SB, UK
| | - Priyank Shukla
- Personalised Medicine Centre, School of Medicine, Ulster University, C-TRIC Building, Altnagelvin Area Hospital, Glenshane Road, Londonderry, BT47 6SB, UK
| |
|
19
|
Zeng W, Gautam A, Huson DH. MuLan-Methyl-multiple transformer-based language models for accurate DNA methylation prediction. Gigascience 2022; 12:giad054. [PMID: 37489753 PMCID: PMC10367125 DOI: 10.1093/gigascience/giad054] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/19/2023] [Revised: 05/09/2023] [Accepted: 07/18/2023] [Indexed: 07/26/2023] Open
Abstract
Transformer-based language models are successfully used to address a wide range of text-related tasks. DNA methylation is an important epigenetic mechanism, and its analysis provides valuable insights into gene regulation and biomarker identification. Several deep learning-based methods have been proposed to identify DNA methylation, and each seeks to strike a balance between computational effort and accuracy. Here, we introduce MuLan-Methyl, a deep learning framework for predicting DNA methylation sites, which is based on 5 popular transformer-based language models. The framework identifies methylation sites for 3 different types of DNA methylation: N6-adenine, N4-cytosine, and 5-hydroxymethylcytosine. Each of the employed language models is adapted to the task using the "pretrain and fine-tune" paradigm. Pretraining is performed on a custom corpus of DNA fragments and taxonomy lineages using self-supervised learning. Fine-tuning aims at predicting the DNA methylation status of each type. The 5 models are used to collectively predict the DNA methylation status. We report excellent performance of MuLan-Methyl on a benchmark dataset. Moreover, we argue that the model captures characteristic differences between different species that are relevant for methylation. This work demonstrates that language models can be successfully adapted to applications in biological sequence analysis and that joint utilization of different language models improves model performance. MuLan-Methyl is open source, and we provide a web server that implements the approach.
Affiliation(s)
- Wenhuan Zeng
- Algorithms in Bioinformatics, Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
| | - Anupam Gautam
- Algorithms in Bioinformatics, Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
- International Max Planck Research School “From Molecules to Organisms”, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Cluster of Excellence: EXC 2124: Controlling Microbes to Fight Infection, University of Tübingen, 72076 Tübingen, Germany
| | - Daniel H Huson
- Algorithms in Bioinformatics, Institute for Bioinformatics and Medical Informatics, University of Tübingen, 72076 Tübingen, Germany
- International Max Planck Research School “From Molecules to Organisms”, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Cluster of Excellence: EXC 2124: Controlling Microbes to Fight Infection, University of Tübingen, 72076 Tübingen, Germany
| |
|
20
|
Construction of a machine learning-based artificial neural network for discriminating PANoptosis related subgroups to predict prognosis in low-grade gliomas. Sci Rep 2022; 12:22119. [PMID: 36543888 PMCID: PMC9770564 DOI: 10.1038/s41598-022-26389-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/09/2022] [Accepted: 12/14/2022] [Indexed: 12/24/2022] Open
Abstract
The poor prognosis of gliomas necessitates the search for biomarkers for predicting clinical outcomes. Recent studies have shown that PANoptosis plays an important role in tumor progression. However, the role of PANoptosis in gliomas has not been fully clarified. Low-grade gliomas (LGGs) from the TCGA and CGGA databases were classified into two PANoptosis patterns based on the expression of PANoptosis-related genes (PRGs) using a consensus clustering method, after which the differentially expressed genes (DEGs) between the two PANoptosis patterns were defined as the PANoptosis-related gene signature. Subsequently, LGGs were separated into two PANoptosis-related gene clusters with distinct prognoses based on this signature. Univariate and multivariate Cox regression analyses confirmed the prognostic value of the PANoptosis-related gene cluster, based on which a nomogram model was constructed to predict the prognosis of LGGs. The ESTIMATE algorithm, MCP-counter and the CIBERSORT algorithm were used to explore the distinct characteristics of the tumor microenvironment (TME) between the two PANoptosis-related gene clusters. Furthermore, an artificial neural network (ANN) model based on machine learning methods was developed to discriminate between the PANoptosis-related gene clusters. Two external datasets were used to verify the performance of the ANN model. The Human Protein Atlas website and western blotting were used to confirm the expression of the featured genes involved in the ANN model. We developed a machine learning-based ANN model for discriminating PANoptosis-related subgroups, with implications for predicting prognosis in gliomas.
|
21
|
Chen B, Xi Y, Zhao J, Hong Y, Tian S, Zhai X, Chen Q, Ren X, Fan L, Xie X, Jiang C. m5C regulator-mediated modification patterns and tumor microenvironment infiltration characterization in colorectal cancer: One step closer to precision medicine. Front Immunol 2022; 13:1049435. [PMID: 36532062 PMCID: PMC9751490 DOI: 10.3389/fimmu.2022.1049435] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/20/2022] [Accepted: 11/09/2022] [Indexed: 12/04/2022] Open
Abstract
Background: The RNA modification 5-methylcytosine (m5C) is one of the most prevalent post-transcriptional modifications, with increasing evidence demonstrating its extensive involvement in the tumorigenesis and progression of various cancers. Colorectal cancer (CRC) is the third most common cancer and second leading cause of cancer-related deaths worldwide. However, the role of m5C modulators in shaping tumor microenvironment (TME) heterogeneity and regulating immune cell infiltration in CRC requires further clarification. Results: The transcriptomic sequencing data of 18 m5C regulators and clinical data of patients with CRC were obtained from The Cancer Genome Atlas (TCGA) and systematically evaluated. We found that 16 m5C regulators were differentially expressed between CRC and normal tissues. Unsupervised cluster analysis was then performed and revealed two distinct m5C modification patterns that yielded different clinical prognoses and biological functions in CRC. We demonstrated that the m5C score constructed from eight m5C-related genes showed excellent prognostic performance, with a subsequent independent analysis confirming its predictive ability in the CRC cohort. We then developed a nomogram containing five clinical risk factors and the m5C risk score and found that the m5C score exhibited high prognostic prediction accuracy and favorable clinical applicability. Moreover, the CRC patients with a low m5C score were characterized by a "hot" TME exhibiting increased immune cell infiltration and higher immune checkpoint expression. These characteristics were highlighted as potential identifiers of suitable candidates for anticancer immunotherapy. Although the high m5C score represented the non-inflammatory phenotype, the CRC patients in this group exhibited a high level of sensitivity to molecular-targeted therapy.
Conclusion: Our comprehensive analysis indicated that the novel m5C clusters and scoring system accurately reflected the distinct prognostic signature, clinicopathological characteristics, immunological phenotypes, and stratifying therapeutic opportunities of CRC. Our findings, therefore, offer valuable insights into factors that may be targeted in the development of precision medicine-based therapeutic strategies for CRC.
Affiliation(s)
- Baoxiang Chen
- Department of Colorectal and Anal Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China; Clinical Center of Intestinal and Colorectal Diseases of Hubei Province (Zhongnan Hospital of Wuhan University), Wuhan, China; Hubei Key Laboratory of Intestinal and Colorectal Diseases (Zhongnan Hospital of Wuhan University), Wuhan, China
| | - Yiqing Xi
- Department of Breast and Thyroid Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Jianhong Zhao
- Hubei Key Laboratory of Cell Homeostasis, College of Life Sciences, Wuhan University, Wuhan, China; Frontier Science Center for Immunology and Metabolism, Medical Research Institute, Wuhan University, Wuhan, China
| | - Yuntian Hong
- Department of Colorectal and Anal Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China; Clinical Center of Intestinal and Colorectal Diseases of Hubei Province (Zhongnan Hospital of Wuhan University), Wuhan, China; Hubei Key Laboratory of Intestinal and Colorectal Diseases (Zhongnan Hospital of Wuhan University), Wuhan, China
| | - Shunhua Tian
- Department of Colorectal and Anal Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China; Clinical Center of Intestinal and Colorectal Diseases of Hubei Province (Zhongnan Hospital of Wuhan University), Wuhan, China; Hubei Key Laboratory of Intestinal and Colorectal Diseases (Zhongnan Hospital of Wuhan University), Wuhan, China
| | - Xiang Zhai
- Department of Colorectal and Anal Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China; Clinical Center of Intestinal and Colorectal Diseases of Hubei Province (Zhongnan Hospital of Wuhan University), Wuhan, China; Hubei Key Laboratory of Intestinal and Colorectal Diseases (Zhongnan Hospital of Wuhan University), Wuhan, China
| | - Quanjiao Chen
- CAS Key Laboratory of Special Pathogens and Biosafety, CAS Center for Influenza Research and Early Warning, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
| | - Xianghai Ren
- Department of Colorectal and Anal Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China; Clinical Center of Intestinal and Colorectal Diseases of Hubei Province (Zhongnan Hospital of Wuhan University), Wuhan, China; Hubei Key Laboratory of Intestinal and Colorectal Diseases (Zhongnan Hospital of Wuhan University), Wuhan, China; *Correspondence: Congqing Jiang; Xiaoyu Xie; Lifang Fan; Xianghai Ren
| | - Lifang Fan
- Department of Pathology, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China; *Correspondence: Congqing Jiang; Xiaoyu Xie; Lifang Fan; Xianghai Ren
| | - Xiaoyu Xie
- Department of Colorectal and Anal Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China; Clinical Center of Intestinal and Colorectal Diseases of Hubei Province (Zhongnan Hospital of Wuhan University), Wuhan, China; Hubei Key Laboratory of Intestinal and Colorectal Diseases (Zhongnan Hospital of Wuhan University), Wuhan, China; *Correspondence: Congqing Jiang; Xiaoyu Xie; Lifang Fan; Xianghai Ren
| | - Congqing Jiang
- Department of Colorectal and Anal Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China; Clinical Center of Intestinal and Colorectal Diseases of Hubei Province (Zhongnan Hospital of Wuhan University), Wuhan, China; Hubei Key Laboratory of Intestinal and Colorectal Diseases (Zhongnan Hospital of Wuhan University), Wuhan, China; *Correspondence: Congqing Jiang; Xiaoyu Xie; Lifang Fan; Xianghai Ren
| |
|
22
|
Zeng Z, Luo M, Li Y, Li J, Huang Z, Zeng Y, Yuan Y, Wang M, Liu Y, Gong Y, Xie C. Prediction of radiosensitivity and radiocurability using a novel supervised artificial neural network. BMC Cancer 2022; 22:1243. [PMID: 36451111 PMCID: PMC9713966 DOI: 10.1186/s12885-022-10339-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 11/21/2022] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND: Radiotherapy has been widely used to treat various cancers, but its efficacy depends on the individual involved. Traditional gene-based machine-learning models have been widely used to predict radiosensitivity. However, emerging and more powerful models, such as artificial neural networks (ANNs), are still rarely used in gene-based radiosensitivity prediction. In addition, ANNs may overfit and learn biologically irrelevant features. METHODS: We developed a novel ANN with Selective Connection based on Gene Patterns (namely ANN-SCGP) to predict radiosensitivity and radiocurability. We used gene patterns (gene similarity or gene interaction information) to control the "on-off" state of the first layer of weights, enabling the low-dimensional features to learn the gene pattern information. ANN-SCGP was trained and tested in 82 cell lines and 1,101 patients from 11 pan-cancer cohorts. RESULTS: For survival fraction at 2 Gy, the root mean squared error (RMSE) of ANN-SCGP's predictions was the smallest among all algorithms (mean RMSE: 0.1587-0.1654). For radiocurability, ANN-SCGP achieved the largest and second-largest C-index in 12/20 and 4/20 tests, respectively. The low-dimensional output of ANN-SCGP reproduced the patterns of gene similarity. Moreover, the pan-cancer analysis indicated that immune signals and DNA damage responses were associated with radiocurability. CONCLUSIONS: As a model that includes gene pattern information, ANN-SCGP had prediction abilities superior to those of traditional models. Our work provided novel insights into radiosensitivity and radiocurability.
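The "on-off" control of first-layer weights described in this abstract can be pictured as an element-wise binary mask applied to a dense layer. The sketch below is a simplified illustration under that assumption; the function name, the mask layout, and the toy numbers are hypothetical and do not reproduce the authors' ANN-SCGP implementation.

```python
def masked_linear(x, weights, mask):
    """Compute y[j] = sum_i x[i] * weights[i][j] * mask[i][j] for one sample.

    mask[i][j] is 1 if input gene i may connect to hidden unit j, else 0,
    so a masked-out weight never contributes to the hidden unit.
    """
    n_out = len(weights[0])
    return [
        sum(x[i] * weights[i][j] * mask[i][j] for i in range(len(x)))
        for j in range(n_out)
    ]

# Hypothetical setup: 4 genes, 2 hidden units, each unit tied to one gene pattern
mask = [[1, 0], [1, 0], [0, 1], [0, 1]]
weights = [[0.5, 0.9], [-0.2, 0.3], [0.7, 0.1], [0.4, -0.4]]
expression = [2.0, 1.0, 3.0, 0.5]

hidden = masked_linear(expression, weights, mask)
```

Here the first hidden unit sees only genes 0 and 1 and the second only genes 2 and 3, so each low-dimensional feature summarizes a single gene pattern. In a trainable network the same mask would also be applied to the gradients or re-applied after each update so that switched-off connections stay off.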
Affiliation(s)
- Zihang Zeng
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Maoling Luo
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Yangyi Li
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Jiali Li
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Zhengrong Huang
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China; Department of Biological Repositories, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Yuxin Zeng
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Yu Yuan
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Mengqin Wang
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Yuying Liu
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China
| | - Yan Gong
- Department of Biological Repositories, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China; Tumor Precision Diagnosis and Treatment Technology and Translational Medicine, Hubei Engineering Research Center, Zhongnan Hospital of Wuhan University, Wuhan, 430071 Hubei, China
| | - Conghua Xie
- Department of Radiation and Medical Oncology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071 Hubei, China; Hubei Key Laboratory of Tumor Biological Behaviors, Zhongnan Hospital of Wuhan University, Wuhan, 430071 Hubei, China; Hubei Cancer Clinical Study Center, Zhongnan Hospital of Wuhan University, Wuhan, 430071 Hubei, China
| |
|
23
|
Cuproptosis-Related LncRNA Signature for Predicting Prognosis of Hepatocellular Carcinoma: A Comprehensive Analysis. DISEASE MARKERS 2022; 2022:3265212. [PMID: 36452343 PMCID: PMC9705118 DOI: 10.1155/2022/3265212] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/16/2022] [Revised: 10/21/2022] [Accepted: 10/25/2022] [Indexed: 11/23/2022]
Abstract
Hepatocellular carcinoma (HCC) is one of the most common malignant tumors worldwide and has a poor prognosis. Cuproptosis is a novel mode of cell death that has only recently been discovered. Considering the critical role of lncRNAs in liver cancer development, the aim of this study was to construct a prognostic signature based on cuproptosis-related lncRNAs (CRlncRNAs). We downloaded RNA-sequencing data and corresponding clinical information of patients with HCC from The Cancer Genome Atlas (TCGA) database. To verify the robustness of the model, we added an external validation set obtained from the Gene Expression Omnibus (GEO): GSE40144. In addition, we identified the cuproptosis-related genes (CRGs) based on previous reports. Pearson correlation analysis, univariate Cox regression, and least absolute shrinkage and selection operator (LASSO) Cox regression analysis were utilized to screen for genes associated with prognosis. On this basis, multivariate Cox regression and stepAIC were used to further construct and optimize the prognostic model. The simplified signature with the lowest Akaike information criterion (AIC) value was considered the prognostic signature. Seven different algorithms were used to perform immune infiltration analysis. The single-sample Gene Set Enrichment Analysis (ssGSEA) algorithm was utilized to find the difference in immune function between the high- and low-risk groups. Finally, in vitro experiments were performed by quantitative real-time PCR (qRT-PCR) analysis using HCC cell lines to validate the expression of prognostic genes. We identified 3 lncRNAs (CYTOR, LINC00205, and LINC01184) as independent risk factors for HCC. The receiver operating characteristic (ROC) curves calculated that the AUC at 1, 3, and 5 years reached 0.717, 0.633, and 0.607, respectively. 
The expression levels of 41 immune checkpoints differed significantly between the high- and low-risk groups, as did sensitivity to immunotherapy. The risk model could also serve as a promising predictor of immunotherapeutic response, as verified by the TIDE algorithm (p < 0.001). Overall, we propose a signature based on CRlncRNAs that can be used to predict the prognosis of HCC patients and that was validated in an external cohort and in vitro experiments.
|
24
|
Development and Validation of Novel Deep-Learning Models Using Multiple Data Types for Lung Cancer Survival. Cancers (Basel) 2022; 14:cancers14225562. [PMID: 36428655 PMCID: PMC9688689 DOI: 10.3390/cancers14225562] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/21/2022] [Revised: 11/03/2022] [Accepted: 11/10/2022] [Indexed: 11/16/2022] Open
Abstract
A well-established lung-cancer-survival-prediction model that relies on multiple data types, multiple novel machine-learning algorithms, and external testing is absent in the literature. This study aims to address this gap and determine the critical factors of lung cancer survival. We selected non-small-cell lung cancer patients from a retrospective dataset of the Taipei Medical University Clinical Research Database and Taiwan Cancer Registry between January 2008 and December 2018. All patients were monitored from the index date of cancer diagnosis until the event of death. Variables, including demographics, comorbidities, medications, laboratories, and patient gene tests, were used. Nine machine-learning algorithms with various modes were used. The performance of the algorithms was measured by the area under the receiver operating characteristic curve (AUC). In total, 3714 patients were included. The best performance of the artificial neural network (ANN) model was achieved when integrating all variables with the AUC, accuracy, precision, recall, and F1-score of 0.89, 0.82, 0.91, 0.75, and 0.65, respectively. The most important features were cancer stage, cancer size, age of diagnosis, smoking, drinking status, EGFR gene, and body mass index. Overall, the ANN model improved predictive performance when integrating different data types.
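Because this abstract ranks models by the area under the ROC curve, a short sketch of how an AUC can be computed may help. It uses the rank interpretation of AUC (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half). The function name and the toy scores are assumptions for illustration and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = event) and model scores
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
auc = roc_auc(y_true, scores)
```

This pairwise form is quadratic in the number of samples, which is fine for a worked example; for large cohorts a sort-based implementation is the usual choice.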
|
25
|
A Novel Deep Transfer Learning Approach Based on Depth-Wise Separable CNN for Human Posture Detection. INFORMATION 2022. [DOI: 10.3390/info13110520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
Human posture classification (HPC) is the process of identifying a human pose from a still or moving image recorded by a digital camera. This makes it easier to keep a record of people’s postures, which is useful in many applications. Complications in the imaged scene, such as occlusion and the camera view angle, make HPC difficult. Consequently, the development of a reliable HPC system is essential. This study proposes “DeneSVM”, a deep transfer learning-based classification model that extracts features from image datasets to detect and classify human postures. The model is intended to classify four primary postures: lying, bending, sitting, and standing. The Silhouettes for Human Posture Recognition dataset was used to train, validate, test, and analyze the proposed model. The DeneSVM model attained the highest test precision (94.72%), validation accuracy (93.79%), and training accuracy (97.06%). When the proposed model was evaluated on the testing dataset, it also achieved a good accuracy of 95%.
26
Chen G, Liu ZP. Inferring causal gene regulatory network via GreyNet: From dynamic grey association to causation. Front Bioeng Biotechnol 2022; 10:954610. [PMID: 36237217 PMCID: PMC9551017 DOI: 10.3389/fbioe.2022.954610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Accepted: 08/15/2022] [Indexed: 11/23/2022]
Abstract
A gene regulatory network (GRN) provides abundant information on gene interactions, which helps to explain pathology, predict clinical outcomes, and identify drug targets. Existing high-throughput experiments provide rich time-series gene expression data for reconstructing the GRN and gaining further insight into how organisms respond to external stimuli. Numerous machine-learning methods have been proposed to infer gene regulatory networks. Nevertheless, machine learning, especially deep learning, is generally a “black box” that lacks interpretability, and causality has not been well recognized in GRN inference procedures. In this article, we introduce grey theory integrated with the adaptive sliding window technique to flexibly capture instant gene–gene interactions in the uncertain regulatory system. Then, we incorporate generalized multivariate Granger causality regression methods to transform the dynamic grey association into causation and generate directional regulatory links. We evaluate our model on the DREAM4 in silico benchmark dataset and on real-world hepatocellular carcinoma (HCC) time-series data. We achieved competitive results on DREAM4 compared with other state-of-the-art algorithms and obtained a meaningful GRN structure on the HCC data.
Affiliation(s)
- Guangyi Chen
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
- Center for Intelligent Medicine, Shandong University, Jinan, Shandong, China
- *Correspondence: Zhi-Ping Liu,
27
Kha QH, Tran TO, Nguyen TTD, Nguyen VN, Than K, Le NQK. An interpretable deep learning model for classifying adaptor protein complexes from sequence information. Methods 2022; 207:90-96. [PMID: 36174933 DOI: 10.1016/j.ymeth.2022.09.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Revised: 08/19/2022] [Accepted: 09/22/2022] [Indexed: 11/15/2022]
Abstract
Adaptor proteins (APs) are a family of proteins that aid in intracellular membrane trafficking, and their impairments or defects are closely related to various disorders. Traditional methods to identify and classify APs require time and complex techniques; machine learning and computational approaches later advanced them to facilitate the AP recognition task. However, most studies focused on recognizing individual members of the AP family, or on separating APs in general from non-APs, and lacked a comprehensive strategy to distinguish the complexes of AP subtypes. Herein, we propose a method for a new task, discriminating the AP complexes within the AP family, using an interpretable deep neural network architecture on sequence-based encoding features. This work also introduces a benchmark data set of AP complexes originating from the UniProt and GeneOntology databases. To assess the robustness of our proposed method, we compared its performance to various machine learning algorithms and feature extraction strategies. Furthermore, the model's predictions were interpreted using t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), and SHapley Additive exPlanations (SHAP) analysis to show the distribution of AP complexes on optimal features. The promising performance of our architecture can assist scientists not only in distinguishing AP complexes but also in analyzing protein sequences in general. Moreover, we have made our work publicly available on GitHub: https://github.com/khanhlee/adaptor-dnn.
Affiliation(s)
- Quang-Hien Kha
- International Master/Ph.D. Program in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Thi-Oanh Tran
- International Ph.D. Program for Cell Therapy and Regeneration Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Trinh-Trung-Duong Nguyen
- Personalised Medicine Cluster, Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Van-Nui Nguyen
- University of Information and Communication Technology, Thai Nguyen University, Thai Nguyen, Viet Nam
- Khoat Than
- School of Information and Communication Technology, Hanoi University of Science and Technology, Viet Nam
- Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 106, Taiwan; Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei 106, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei 110, Taiwan.
28
Zhu X, Cheang I, Xu F, Gao R, Liao S, Yao W, Zhou Y, Zhang H, Li X. Long-term prognostic value of inflammatory biomarkers for patients with acute heart failure: Construction of an inflammatory prognostic scoring system. Front Immunol 2022; 13:1005697. [PMID: 36189198 PMCID: PMC9520349 DOI: 10.3389/fimmu.2022.1005697] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/28/2022] [Accepted: 09/01/2022] [Indexed: 02/05/2023]
Abstract
Objective Systemic inflammation is associated with a poor prognosis in acute heart failure (AHF). This study aimed to assess the long-term prognostic value of combining accessible inflammatory markers in relation to all-cause mortality in patients with AHF. Methods Consecutive patients with AHF who were hospitalized between March 2012 and April 2016 at the Department of Cardiology of the First Affiliated Hospital of Nanjing Medical University were enrolled in this prospective study. The LASSO regression model was used to select the most valuable inflammatory biomarkers to develop an inflammatory prognostic scoring (IPS) system. The Kaplan-Meier method, multivariate Cox regression, and time-dependent ROC analysis were used to assess the relationship between inflammatory markers and AHF prognosis. A random survival forest model was used to estimate the relative importance of each inflammatory marker in the prognostic risks of AHF. Results A total of 538 patients with AHF were included in the analysis (mean age, 61.1 ± 16.0 years; 357 [66.4%] men). During a median follow-up of 34 months, there were 227 all-cause deaths (42.2%). C-reactive protein (CRP), red blood cell distribution width (RDW) and neutrophil-to-lymphocyte ratio (NLR) were incorporated into the IPS system (IPS = 0.301×CRP + 0.263×RDW + 0.091×NLR). A higher IPS meant a significantly worse long-term prognosis in Kaplan-Meier analysis, with 0.301 points as the optimal cut-off value (P log-rank <0.001). IPS remained an independent prognostic factor associated with an increased risk of all-cause mortality among patients with AHF in multivariate Cox regression models with a full adjustment of the other significant covariables. Random forest variable importance and minimal depth analysis further validated that the IPS system was the most predictive for all-cause mortality in patients with AHF.
Conclusions Inflammatory biomarkers were associated with the risk of all-cause mortality in patients with AHF, while IPS significantly improved the predictive power of the model and could be used as a practical tool for individualized risk stratification of patients with AHF.
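The IPS formula quoted in this abstract is a plain weighted sum, which can be sketched as follows. This is only an illustration of the arithmetic: the function name and the input values are hypothetical, and the abstract does not state the units or any scaling applied to CRP, RDW, and NLR before scoring, so the example should not be read as the authors' scoring procedure.

```python
def inflammatory_prognostic_score(crp: float, rdw: float, nlr: float) -> float:
    """Weighted sum using the coefficients quoted in the abstract.

    Assumption: inputs are already in whatever units/scaling the study used;
    the abstract does not specify them.
    """
    return 0.301 * crp + 0.263 * rdw + 0.091 * nlr


# Hypothetical input values, chosen only to show the calculation.
example_score = inflammatory_prognostic_score(crp=10.0, rdw=14.0, nlr=4.0)
print(round(example_score, 3))
```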
Affiliation(s)
- Xu Zhu
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Iokfai Cheang
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Fang Xu
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Rongrong Gao
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Shengen Liao
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Wenming Yao
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Yanli Zhou
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Haifeng Zhang
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
- Department of Cardiology, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Suzhou, China
- Xinli Li
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Jiangsu Province Hospital, Nanjing, China
29
Prediction of Histological Grades and Ki-67 Expression of Hepatocellular Carcinoma Based on Sonazoid Contrast Enhanced Ultrasound Radiomics Signatures. Diagnostics (Basel) 2022; 12:diagnostics12092175. [PMID: 36140576 PMCID: PMC9497787 DOI: 10.3390/diagnostics12092175] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Revised: 09/04/2022] [Accepted: 09/05/2022] [Indexed: 01/27/2023]
Abstract
Objectives: Histopathological tumor grade and Ki-67 expression level are key aspects concerning the prognosis of patients with hepatocellular carcinoma (HCC) lesions. The aim of this study was to investigate whether a radiomics model derived from Sonazoid contrast-enhanced ultrasound (S-CEUS) images could predict histological grades and Ki-67 expression of HCC lesions. Methods: This prospective study included 101 (training cohort: n = 71; validation cohort: n = 30) patients with surgical resection and histopathologically confirmed HCC lesions. Radiomics features were extracted from the B mode and Kupffer phase of S-CEUS images. Maximum relevance minimum redundancy (MRMR) and least absolute shrinkage and selection operator (LASSO) were used for feature selection, and a stepwise multivariate logit regression model was trained for prediction. Model accuracy, sensitivity, and specificity in both training and testing datasets were used to evaluate performance. Results: The prediction model derived from Kupffer phase images (CE-model) displayed a significantly better performance in the prediction of stage III HCC patients, with an area under the receiver operating characteristic curve (AUROC) of 0.908 in the training dataset and 0.792 in the testing set. The CE-model demonstrated generalizability in identifying HCC patients with elevated Ki-67 expression (>10%) with a training AUROC of 0.873 and testing AUROC of 0.768, with noticeably higher specificity of 92.3% and 80.0% in training and testing datasets, respectively. Conclusions: The radiomics model constructed from the Kupffer phase of S-CEUS images has the potential for predicting Ki-67 expression and histological stages in patients with HCC.
30
Lee Y, Son J, Song M. BertSRC: transformer-based semantic relation classification. BMC Med Inform Decis Mak 2022; 22:234. [PMID: 36068535 PMCID: PMC9446816 DOI: 10.1186/s12911-022-01977-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Accepted: 08/11/2022] [Indexed: 11/13/2022]
Abstract
The relationships between biomedical entities are complex, and many of them have not yet been identified. For many biomedical research areas, including drug discovery, it is of paramount importance to identify the relationships that have already been established through a comprehensive literature survey. However, manually searching through the literature is difficult as the number of biomedical publications continues to increase. Therefore, the relation classification task, which automatically mines meaningful relations from the literature, is spotlighted in the field of biomedical text mining. By applying relation classification techniques to the accumulated biomedical literature, existing semantic relations between biomedical entities that can help to infer previously unknown relationships are efficiently grasped. To develop semantic relation classification models, which are a type of supervised machine learning, it is essential to construct a training dataset that is manually annotated by biomedical experts with semantic relations among biomedical entities. Any advanced model must be trained on a dataset of reliable quality and meaningful scale to be deployed in the real world and to assist biologists in their research. In addition, as the number of such public datasets increases, the performance of machine learning algorithms can be accurately revealed and compared by using those datasets as a benchmark for model development and improvement. In this paper, we aim to build such a dataset. Along with that, to validate the usability of the dataset as training data for relation classification models and to improve the performance of the relation extraction task, we built a relation classification model based on Bidirectional Encoder Representations from Transformers (BERT) trained on our dataset, applying our newly proposed fine-tuning methodology.
In experiments comparing performance among several models based on different deep learning algorithms, our model with the proposed fine-tuning methodology showed the best performance. The experimental results show that the constructed training dataset is an important information resource for the development and evaluation of semantic relation extraction models. Furthermore, relation extraction performance can be improved by integrating our proposed fine-tuning methodology. Therefore, this can lead to the promotion of future text mining research in the biomedical field.
Affiliation(s)
- Yeawon Lee
- Department of Library and Information Science, Yonsei University, Seoul, South Korea
- Jinseok Son
- Department of Digital Analytics, Yonsei University, Seoul, South Korea
- Min Song
- Department of Library and Information Science, Yonsei University, Seoul, South Korea.
31
Gromada K. Unsupervised SAR Imagery Feature Learning with Median Filter-Based Loss Value. SENSORS (BASEL, SWITZERLAND) 2022; 22:6519. [PMID: 36080978 PMCID: PMC9460378 DOI: 10.3390/s22176519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 08/22/2022] [Accepted: 08/24/2022] [Indexed: 06/15/2023]
Abstract
The scarcity of open SAR (Synthetic Aperture Radar) imagery databases (especially labeled ones) and of pre-trained neural networks creates a need for heavy data generation, augmentation, or transfer learning. This paper describes the characteristics of SAR imagery, the limitations related to it, and a small set of available databases. Comprehensive data augmentation methods for training neural networks are presented, and a novel filter-based method is proposed. The new method limits the effect of speckle noise, which is very strong in SAR imagery. The improvement in the dataset is clearly visible in the loss values. The main advantage comes from better-developed feature detectors under filter-based training, as shown in the layer-wise feature analysis. The author has attached the trained neural networks for open use, which enables quicker implementation of CNN-based solutions.
Affiliation(s)
- Krzysztof Gromada
- Institute of Automatic Control and Robotics, Warsaw University of Technology, A. Boboli 8 St., 02-525 Warsaw, Poland
32
Salehi A, Wang L, Coates PJ, Norberg Spaak L, Gu X, Sgaramella N, Nylander K. Reiterative modeling of combined transcriptomic and proteomic features refines and improves the prediction of early recurrence in squamous cell carcinoma of head and neck. Comput Biol Med 2022; 149:105991. [PMID: 36007290 DOI: 10.1016/j.compbiomed.2022.105991] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/22/2022] [Revised: 07/11/2022] [Accepted: 08/14/2022] [Indexed: 11/24/2022]
Abstract
BACKGROUND Patients with squamous cell carcinoma of the head and neck (SCCHN) have a high risk of recurrence. We aimed to develop machine learning methods to identify transcriptomic and proteomic features that provide accurate classification models for predicting the risk of early recurrence in SCCHN patients. METHODS Clinical, genomic, transcriptomic and proteomic features distinguishing recurrence risk were examined in SCCHN patients from The Cancer Genome Atlas (TCGA). Recurrence within one year after treatment was classified as high-risk and no recurrence as low-risk. RESULTS No significant differences in individual clinicopathological characteristics, mutation profiles or mRNA expression patterns were seen between the groups using conventional statistical analysis. Using the machine learning algorithm extreme gradient boosting (XGBoost), ten proteins (RAD50, 4E-BP1, MYH11, MAP2K1, BECN1, NF2, RAB25, ERRFI1, KDR, SERPINE1) and five mRNAs (PLAUR, DKK1, AXIN2, ANG and VEGFA) made the greatest contribution to classification. These features were used to build improved models in XGBoost, achieving the best discrimination performance when combining transcriptomic and proteomic data, with an accuracy of 0.939 and an Area Under the ROC Curve (AUC) of 0.951. CONCLUSIONS This study highlights the use of machine learning to identify transcriptomic and proteomic factors that play important roles in predicting the risk of recurrence in patients with SCCHN, and to refine such models through iterative cycles to enhance their accuracy, thereby aiding the introduction of personalized treatment regimens.
Affiliation(s)
- Amir Salehi
- Department of Medical Biosciences/Pathology, Umeå University, Umeå, Sweden
- Lixiao Wang
- Department of Medical Biosciences/Pathology, Umeå University, Umeå, Sweden
- Philip J Coates
- Research Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Brno, 656 53, Czech Republic
- Lena Norberg Spaak
- Department of Medical Biosciences/Pathology, Umeå University, Umeå, Sweden
- Xiaolian Gu
- Department of Medical Biosciences/Pathology, Umeå University, Umeå, Sweden
- Nicola Sgaramella
- Department of Medical Biosciences/Pathology, Umeå University, Umeå, Sweden
- Karin Nylander
- Department of Medical Biosciences/Pathology, Umeå University, Umeå, Sweden.
33
Predictive Modelling in Clinical Bioinformatics: Key Concepts for Startups. BIOTECH 2022; 11:biotech11030035. [PMID: 35997343 PMCID: PMC9397027 DOI: 10.3390/biotech11030035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Revised: 07/30/2022] [Accepted: 08/03/2022] [Indexed: 11/17/2022]
Abstract
Clinical bioinformatics is a newly emerging field that applies bioinformatics techniques to facilitate the identification of diseases, the discovery of biomarkers, and therapy decisions. Mathematical modelling is part of bioinformatics analysis pipelines and a fundamental step in extracting clinical insights from the genomes, transcriptomes and proteomes of patients. Often, the chosen modelling techniques rely on statistical, machine learning or deterministic approaches. Research that combines bioinformatics with modelling techniques has been generating innovative biomedical technology, algorithms and models with biotech applications, attracting private investment to develop new businesses; however, startups that emerge from these technologies have had difficulty implementing clinical bioinformatics pipelines, protecting their technology and generating profit. In this commentary, we discuss the main concepts that startups should know to apply predictive modelling successfully in clinical bioinformatics. We focus on key modelling concepts, provide some successful examples and briefly discuss the choice of modelling framework. We also highlight some aspects to take into account for a successful implementation of cost-effective bioinformatics from a business perspective.
34
Kazemi N, Gholizadeh N, Musilek P. Selective Microwave Zeroth-Order Resonator Sensor Aided by Machine Learning. SENSORS (BASEL, SWITZERLAND) 2022; 22:s22145362. [PMID: 35891042 PMCID: PMC9323907 DOI: 10.3390/s22145362] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/29/2022] [Revised: 07/03/2022] [Accepted: 07/15/2022] [Indexed: 06/13/2023]
Abstract
Microwave sensors are principally sensitive to effective permittivity, and hence not selective to a specific material under test (MUT). In this work, a highly compact microwave planar sensor based on zeroth-order resonance is designed to operate at three distant frequencies of 3.5, 4.3, and 5 GHz, with the size of only λg-min/8 per resonator. This resonator is deployed to characterize liquid mixtures with one desired MUT (here water) combined with an interfering material (e.g., methanol, ethanol, or acetone) with various concentrations (0%:10%:100%). To achieve a sensor with selectivity to water, a convolutional neural network (CNN) is used to recognize different concentrations of water regardless of the host medium. To obtain a high accuracy of this classification, Style-GAN is utilized to generate a reliable sensor response for concentrations between water and the host medium (methanol, ethanol, and acetone). A high accuracy of 90.7% is achieved using CNN for selectively discriminating water concentrations.
Affiliation(s)
- Nazli Kazemi
- Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada; (N.K.); (N.G.)
- Nastaran Gholizadeh
- Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada; (N.K.); (N.G.)
- Petr Musilek
- Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada; (N.K.); (N.G.)
- Applied Cybernetics, University of Hradec Králové, 500 03 Hradec Králové, Czech Republic
35
Price Forecast for Mexican Red Spiny Lobster (Panulirus spp.) Using Artificial Neural Networks (ANNs). APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12126044] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/05/2023]
Abstract
The selling price is one of the essential variables in decision making for fishers regarding the catching of a fishing resource. In the case of the Pacific Mexican lobster fishery, the price uncertainty at the beginning of the season translates into the suboptimal utilization of this resource. This work aims to predict the export price of Mexican red lobster (Panulirus) in a fishing season using demand-related market variables including price, main competitors, main buyers, and product quantities exported/imported in the market. We used the monthly export price from 2006 to 2018 for the main importer, China. As a method for price forecasting, artificial neural networks (ANNs), with and without exogenous variables (NARX, NAR), were used as an autoregressive model, while the same information was analyzed with an ARIMAX model for comparative purposes. It was found that ANNs are a useful tool that yielded better predictive power when forecasting Mexican lobster export prices compared to ARIMAX models. The predictive power was evaluated by comparing the mean square errors (MSE) of 15 models. The MSE of ANNs (73.07) was lower than that of the four ARIMAX models (88.1). It is concluded that neural networks are a valuable tool for accurately predicting prices relative to real values, an aspect of great interest for application in fishery resource management.
36
A Hierarchical Approach toward Prediction of Human Biological Age from Masked Facial Image Leveraging Deep Learning Techniques. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12115306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
The lifestyle of humans has changed noticeably since the contagious COVID-19 disease struck globally. People should wear a face mask as a protective measure to curb the spread of the contagious disease. Consequently, real-world applications (i.e., electronic customer relationship management) dealing with human ages extracted from face images must migrate to a robust system proficient to estimate the age of a person wearing a face mask. In this paper, we proposed a hierarchical age estimation model from masked facial images in a group-to-specific manner rather than a single regression model because age progression across different age groups is quite dissimilar. Our intention was to squeeze the feature space among limited age classes so that the model could fairly discern age. We generated a synthetic masked face image dataset over the IMDB-WIKI face image dataset to train and validate our proposed model due to the absence of a benchmark masked face image dataset with real age annotations. We somewhat mitigated the data sparsity problem of the large public IMDB-WIKI dataset using off-the-shelf down-sampling and up-sampling techniques as required. The age estimation task was fully modeled like a deep classification problem, and expected ages were formulated from SoftMax probabilities. We performed a classification task by deploying multiple low-memory and higher-accuracy-based convolutional neural networks (CNNs). Our proposed hierarchical framework demonstrated marginal improvement in terms of mean absolute error (MAE) compared to the one-off model approach for masked face real age estimation. Moreover, this research is perhaps the maiden attempt to estimate the real age of a person from his/her masked face image.
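The step described above, where an expected age is formulated from SoftMax probabilities, can be sketched as a probability-weighted average over age classes. This is a minimal illustration under stated assumptions: the function names, the age classes, and the logits are hypothetical, and the abstract does not say how the group-level and within-group classifiers are combined, so only the expectation itself is shown.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_age(logits: Sequence[float], ages: Sequence[float]) -> float:
    """Expected age: sum over classes of P(class) * age represented by the class."""
    if len(logits) != len(ages):
        raise ValueError("logits and ages must have the same length")
    probs = softmax(logits)
    return sum(p * a for p, a in zip(probs, ages))


# Hypothetical age classes and classifier outputs, for illustration only.
age_classes = [20, 30, 40]
uniform_logits = [0.0, 0.0, 0.0]
print(expected_age(uniform_logits, age_classes))
```

With equal logits every class is equally likely, so the estimate is the mean of the age classes; a sharply peaked output moves the estimate toward the age of the dominant class.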
37
Sokhansanj BA, Rosen GL. Mapping Data to Deep Understanding: Making the Most of the Deluge of SARS-CoV-2 Genome Sequences. mSystems 2022; 7:e0003522. [PMID: 35311562 PMCID: PMC9040592 DOI: 10.1128/msystems.00035-22] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 02/27/2022] [Indexed: 12/22/2022]
Abstract
Next-generation sequencing has been essential to the global response to the COVID-19 pandemic. As of January 2022, nearly 7 million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences are available to researchers in public databases. Sequence databases are an abundant resource from which to extract biologically relevant and clinically actionable information. As the pandemic has gone on, SARS-CoV-2 has rapidly evolved, involving complex genomic changes that challenge current approaches to classifying SARS-CoV-2 variants. Deep sequence learning could be a powerful way to build complex sequence-to-phenotype models. Unfortunately, while it can be predictive, deep learning typically produces "black box" models that cannot directly provide biological and clinical insight. Researchers should therefore consider implementing emerging methods for visualizing and interpreting deep sequence models. Finally, researchers should address important data limitations, including (i) global sequencing disparities, (ii) insufficient sequence metadata, and (iii) screening artifacts due to poor sequence quality control.
Affiliation(s)
- Bahrad A. Sokhansanj
- Drexel University, Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical & Computer Engineering, College of Engineering, Philadelphia, Pennsylvania, USA
- Gail L. Rosen
- Drexel University, Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical & Computer Engineering, College of Engineering, Philadelphia, Pennsylvania, USA
38
Tang X, Zheng P, Li X, Wu H, Wei DQ, Liu Y, Huang G. Deep6mAPred: A CNN and Bi-LSTM-based deep learning method for predicting DNA N6-methyladenosine sites across plant species. Methods 2022; 204:142-150. [PMID: 35477057 DOI: 10.1016/j.ymeth.2022.04.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/18/2022] [Revised: 04/16/2022] [Accepted: 04/20/2022] [Indexed: 12/11/2022]
Abstract
DNA N6-methyladenine (6mA) is a key DNA modification that plays versatile roles in cellular processes, including regulation of gene expression, DNA repair, and DNA replication. DNA 6mA is closely associated with many diseases in mammals and with the growth and development of plants. Precisely detecting DNA 6mA sites is of great importance for exploring 6mA functions. Although many computational methods have been presented for DNA 6mA prediction, a wide gap remains in practical application. We present a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM)-based deep learning method (Deep6mAPred) for predicting DNA 6mA sites across plant species. Deep6mAPred stacks the CNNs and the Bi-LSTMs in parallel rather than in series. Deep6mAPred also employs the attention mechanism to improve the representations of sequences. Deep6mAPred reached an accuracy of 0.9556 on the independent rice dataset, far outperforming the state-of-the-art methods. Tests across plant species showed that Deep6mAPred has a remarkable advantage over the state-of-the-art methods. We developed a user-friendly web application for DNA 6mA prediction, which is freely available at http://106.13.196.152:7001/ for all scientific researchers. Deep6mAPred would enrich the tools available for predicting DNA 6mA sites and speed up the exploration of DNA modification.
Affiliation(s)
- Xingyu Tang
  - School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
- Peijie Zheng
  - School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
- Xueyong Li
  - School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
- Hongyan Wu
  - The Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Dong-Qing Wei
  - The Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yuewu Liu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha, Hunan 410081, China
- Guohua Huang
  - School of Electrical Engineering, Shaoyang University, Shaoyang, Hunan 422000, China
39
Convolutional Neural Networks for Mechanistic Driver Detection in Atrial Fibrillation. Int J Mol Sci 2022; 23:ijms23084216. [PMID: 35457044 PMCID: PMC9032062 DOI: 10.3390/ijms23084216] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/29/2022] [Revised: 04/04/2022] [Accepted: 04/04/2022] [Indexed: 02/04/2023]
Abstract
The maintaining and initiating mechanisms of atrial fibrillation (AF) remain controversial. Deep learning is emerging as a powerful tool to better understand AF and improve its treatment, which remains suboptimal. This paper aims to provide a solution to automatically identify rotational activity drivers in endocardial electrograms (EGMs) with convolutional recurrent neural networks (CRNNs). The CRNN model was compared with two other state-of-the-art methods (SimpleCNN and attention-based time-incremental convolutional neural network (ATI-CNN)) for different input signals (unipolar EGMs, bipolar EGMs, and unipolar local activation times), sampling frequencies, and signal lengths. The proposed CRNN obtained a detection score, based on the Matthews correlation coefficient, of 0.680, compared with 0.401 for ATI-CNN and 0.118 for SimpleCNN, with bipolar EGMs as input signals exhibiting better overall performance. No significant differences were found with respect to signal length or sampling frequency. The proposed architecture opens the way for new ablation strategies and driver detection methods to better understand the AF problem and its treatment.
40
Xiao X, Shao YT, Luo ZT, Qiu WR. m5C-HPromoter: An Ensemble Deep Learning Predictor for Identifying 5-methylcytosine Sites in Human Promoters. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220330150259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Aims:
This paper aims to identify 5-methylcytosine sites in human promoters.
Background:
Aberrant DNA methylation patterns are often associated with tumor development: hypermethylation inhibits the expression of tumor suppressor genes, and hypomethylation stimulates the expression of certain oncogenes. Most DNA methylation occurs in the CpG islands of gene promoter regions.
Objective:
Therefore, a comprehensive display of the methylation status of the promoter regions of human genes is extremely important for understanding cancer pathogenesis and the function of post-transcriptional modification.
Method:
This paper constructed three human promoter methylation datasets, a total of 3 million sample sequences, of small cell lung cancer, non-small cell lung cancer, and hepatocellular carcinoma from Cancer Cell Line Encyclopedia (CCLE) database. Frequency-based One-Hot Encoding was used to encode the sample sequence, and an innovative stacking-based ensemble deep learning classifier was applied to establish the m5C-HPromoter predictor.
Result:
Averaged over 10 runs of 5-fold cross-validation, m5C-HPromoter obtained Accuracy (Acc) = 0.9270, Matthews correlation coefficient (MCC) = 0.7234, Sensitivity (Sn) = 0.9123, and Specificity (Sp) = 0.9290.
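For readers unfamiliar with the four metrics reported above, the following is a minimal sketch of how Acc, MCC, Sn, and Sp are conventionally derived from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are illustrative assumptions; they are not taken from the paper, which does not publish its confusion matrices here.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute Acc, MCC, Sn and Sp from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives
    and false negatives. Undefined ratios fall back to 0.0.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "MCC": mcc, "Sn": sn, "Sp": sp}


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=90, tn=80, fp=20, fn=10)
print(metrics)
```

Because MCC uses all four cells of the confusion matrix, it can be noticeably lower than accuracy on imbalanced data, which is one plausible reason the reported MCC (0.7234) sits well below the reported Acc (0.9270).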
Affiliation(s)
- Xuan Xiao
  - Department of Computer, Jing-De-Zhen Ceramic Institute, 333046, Jing-De-Zhen, China
- Yu-Tao Shao
  - Department of Computer, Jing-De-Zhen Ceramic Institute, 333046, Jing-De-Zhen, China
- Zhen-Tao Luo
  - Department of Computer, Jing-De-Zhen Ceramic Institute, 333046, Jing-De-Zhen, China
- Wang-Ren Qiu
  - Department of Computer, Jing-De-Zhen Ceramic Institute, 333046, Jing-De-Zhen, China
41
Deep 3D Neural Network for Brain Structures Segmentation Using Self-Attention Modules in MRI Images. SENSORS 2022; 22:s22072559. [PMID: 35408173 PMCID: PMC9002763 DOI: 10.3390/s22072559] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/07/2022] [Revised: 03/15/2022] [Accepted: 03/21/2022] [Indexed: 01/03/2023]
Abstract
In recent years, the use of deep learning-based models for developing advanced healthcare systems has been growing due to the results they can achieve. However, the majority of the proposed deep learning models rely largely on convolutional and pooling operations, which lose valuable data and focus on local information. In this paper, we propose a deep learning-based approach that uses both global and local features, which are important in the medical image segmentation process. To train the architecture, we extracted three-dimensional (3D) blocks from the full-resolution magnetic resonance image and sent them through a set of successive convolutional neural network (CNN) layers free of pooling operations to extract local information. We then sent the resulting feature maps to successive layers of self-attention modules to obtain the global context, whose output was dispatched to a decoder pipeline composed mostly of upsampling layers. The model was trained using the Mindboggle-101 dataset. The experimental results showed that the self-attention modules allow segmentation with a higher Mean Dice Score of 0.90 ± 0.036 compared with other UNet-based approaches. The average segmentation time was approximately 0.038 s per brain structure. The proposed model allows tackling the brain structure segmentation task properly. Exploiting the global context that the self-attention modules incorporate allows for more precise and faster segmentation. We segmented 37 brain structures and, to the best of our knowledge, this is the largest number of structures segmented under a 3D approach using attention mechanisms.
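The Mean Dice Score quoted above can be illustrated with a minimal sketch, assuming the standard Dice definition, 2|A ∩ B| / (|A| + |B|), averaged over structure labels. The abstract does not state how the authors computed it, so the helper names `dice_score` and `mean_dice` and the toy label maps below are hypothetical, and the flattened lists stand in for 3D voxel volumes.

```python
def dice_score(pred_mask, truth_mask):
    """Dice overlap between two equal-length boolean masks."""
    if len(pred_mask) != len(truth_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, truth_mask) if p and t)
    size = sum(1 for p in pred_mask if p) + sum(1 for t in truth_mask if t)
    # Two empty masks agree perfectly by convention.
    return 2 * intersection / size if size else 1.0


def mean_dice(pred_labels, truth_labels, structures):
    """Average the per-structure Dice score over the given label ids."""
    scores = [
        dice_score([p == s for p in pred_labels],
                   [t == s for t in truth_labels])
        for s in structures
    ]
    return sum(scores) / len(scores)


# Toy flattened label maps: 0 = background, 1 and 2 = two structures.
pred = [1, 1, 2, 2, 0, 0]
truth = [1, 1, 1, 2, 0, 0]
print(mean_dice(pred, truth, structures=[1, 2]))
```

In this toy case structure 1 scores 0.8 and structure 2 scores about 0.667, so the mean is about 0.733; averaging per structure keeps small structures from being swamped by large ones.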
42
Inferring Time-Lagged Causality Using the Derivative of Single-Cell Expression. Int J Mol Sci 2022; 23:ijms23063348. [PMID: 35328768 PMCID: PMC8948830 DOI: 10.3390/ijms23063348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Revised: 01/07/2022] [Accepted: 01/11/2022] [Indexed: 12/10/2022]
Abstract
Many computational methods have been developed to infer causality among genes using cross-sectional gene expression data, such as single-cell RNA sequencing (scRNA-seq) data. However, due to the limitations of scRNA-seq technologies, time-lagged causal relationships may be missed by existing methods. In this work, we propose a method, called causal inference with time-lagged information (CITL), to infer time-lagged causal relationships from scRNA-seq data by assessing the conditional independence between the changing and current expression levels of genes. CITL estimates the changing expression levels of genes by “RNA velocity”. We demonstrate the accuracy and stability of CITL for inferring time-lagged causality on simulation data against other leading approaches. We have applied CITL to real scRNA data and inferred 878 pairs of time-lagged causal relationships. Furthermore, we showed that the number of regulatory relationships identified by CITL was significantly more than that expected by chance. We provide an R package and a command-line tool of CITL for different usage scenarios.
43
Peng Z, Maciel-Guerra A, Baker M, Zhang X, Hu Y, Wang W, Rong J, Zhang J, Xue N, Barrow P, Renney D, Stekel D, Williams P, Liu L, Chen J, Li F, Dottorini T. Whole-genome sequencing and gene sharing network analysis powered by machine learning identifies antibiotic resistance sharing between animals, humans and environment in livestock farming. PLoS Comput Biol 2022; 18:e1010018. [PMID: 35333870 PMCID: PMC8986120 DOI: 10.1371/journal.pcbi.1010018] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/19/2021] [Revised: 04/06/2022] [Accepted: 03/14/2022] [Indexed: 01/26/2023]
Abstract
Anthropogenic environments, such as those created by intensive farming of livestock, have been proposed to provide ideal selection pressure for the emergence of antimicrobial-resistant Escherichia coli bacteria and antimicrobial resistance genes (ARGs) and their spread to humans. Here, we performed a longitudinal study in a large-scale commercial poultry farm in China, collecting E. coli isolates from both farm and slaughterhouse, targeting animals, carcasses, workers and their households and environment. By using whole-genome phylogenetic analysis and network analysis based on single nucleotide polymorphisms (SNPs), we found highly interrelated non-pathogenic and pathogenic E. coli strains with phylogenetic intermixing, and a high prevalence of shared multidrug resistance profiles amongst livestock, humans and the environment. Through an original data processing pipeline which combines omics, machine learning, gene sharing network and mobile genetic elements analysis, we investigated the resistance to 26 different antimicrobials and identified 361 genes associated with antimicrobial resistance (AMR) phenotypes; 58 of these were known AMR-associated genes and 35 were associated with multidrug resistance. We uncovered an extensive network of genes, correlated with AMR phenotypes, shared among livestock, humans, farm and slaughterhouse environments. We also found several human, livestock and environmental isolates sharing closely related mobile genetic elements carrying ARGs across host species and environments. In a scenario where no consensus exists on how antibiotic use in livestock may affect antibiotic resistance in the human population, our findings provide novel insights into the broader epidemiology of antimicrobial resistance in livestock farming. Moreover, our original data analysis method has the potential to uncover AMR transmission pathways when applied to the study of other pathogens active in other anthropogenic environments characterised by complex interconnections between host species.

Livestock have been suggested as an important source of antimicrobial-resistant (AMR) Escherichia coli, capable of infecting humans and carrying resistance to drugs used in human medicine. China has a large intensive livestock farming industry, poultry being the second most important source of meat in the country, and is the largest user of antibiotics for food production in the world. Here we studied antimicrobial resistance gene overlap between E. coli isolates collected from humans, livestock and their shared environments in a large-scale Chinese poultry farm and associated slaughterhouse. By using a computational approach that integrates machine learning, whole-genome sequencing, gene sharing network and mobile genetic elements analysis, we characterized the E. coli community structure, antimicrobial resistance phenotypes and the genetic relatedness of non-pathogenic and pathogenic E. coli strains. We uncovered the network of genes, associated with AMR, shared across host species (animals and workers) and environments (farm and slaughterhouse). Our approach opens up new avenues for the development of fast, affordable and effective computational solutions that provide novel insights into the broader epidemiology of antimicrobial resistance in livestock farming.
Affiliation(s)
- Zixin Peng
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Alexandre Maciel-Guerra
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Michelle Baker
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Xibin Zhang
  - Qingdao Tian run Food Co., Ltd, New Hope, Beijing, People’s Republic of China
- Yue Hu
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Wei Wang
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Jia Rong
  - Qingdao Tian run Food Co., Ltd, New Hope, Beijing, People’s Republic of China
- Jing Zhang
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Ning Xue
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Paul Barrow
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
  - School of Veterinary Medicine, University of Surrey, Guildford, Surrey, United Kingdom
- David Renney
  - Nimrod Veterinary Products Limited, Moreton-in-Marsh, United Kingdom
- Dov Stekel
  - School of Biosciences, University of Nottingham, Sutton Bonington, United Kingdom
- Paul Williams
  - Biodiscovery Institute and School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
- Longhai Liu
  - Qingdao Tian run Food Co., Ltd, New Hope, Beijing, People’s Republic of China
- Junshi Chen
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Fengqin Li (corresponding author)
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Tania Dottorini (corresponding author)
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom