1
|
Zhang Y, Li R, Zou G, Guo Y, Wu R, Zhou Y, Chen H, Zhou R, Lavigne R, Bergen PJ, Li J, Li J. Discovery of Antimicrobial Lysins from the "Dark Matter" of Uncharacterized Phages Using Artificial Intelligence. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024:e2404049. [PMID: 38899839 DOI: 10.1002/advs.202404049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 05/29/2024] [Indexed: 06/21/2024]
Abstract
The rapid rise of antibiotic resistance and slow discovery of new antibiotics have threatened global health. While novel phage lysins have emerged as potential antibacterial agents, experimental screening methods for novel lysins pose significant challenges due to the enormous workload. Here, the first unified software package, namely DeepLysin, is developed to employ artificial intelligence for mining the vast genome reservoirs ("dark matter") for novel antibacterial phage lysins. Putative lysins are computationally screened from uncharacterized Staphylococcus aureus phages and 17 novel lysins are randomly selected for experimental validation. Seven candidates exhibit excellent in vitro antibacterial activity, with LLysSA9 exceeding that of the best-in-class alternative. The efficacy of LLysSA9 is further demonstrated in mouse bloodstream and wound infection models. Therefore, this study demonstrates the potential of integrating computational and experimental approaches to expedite the discovery of new antibacterial proteins for combating increasing antimicrobial resistance.
Affiliation(s)
- Yue Zhang
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
| | - Runze Li
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
| | - Geng Zou
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
| | - Yating Guo
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
| | - Renwei Wu
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
| | - Yang Zhou
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
| | - Huanchun Chen
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
| | - Rui Zhou
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
| | - Rob Lavigne
- Department of Biosystems, Laboratory of Gene Technology, KU Leuven, Leuven, 3001, Belgium
| | - Phillip J Bergen
- Monash Biomedicine Discovery Institute, Department of Microbiology, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, 3800, Australia
| | - Jian Li
- Monash Biomedicine Discovery Institute, Department of Microbiology, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, 3800, Australia
| | - Jinquan Li
- National Key Laboratory of Agricultural Microbiology, Key Laboratory of Environment Correlative Dietology, College of Biomedicine and Health, Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, College of Food Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, 430070, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518000, China
| |
|
2
|
Chen J, Sun C, Dong Y, Jin M, Lai S, Jia L, Zhao X, Wang H, Gao NL, Bork P, Liu Z, Chen W, Zhao X. Efficient Recovery of Complete Gut Viral Genomes by Combined Short- and Long-Read Sequencing. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2305818. [PMID: 38240578 PMCID: PMC10987132 DOI: 10.1002/advs.202305818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 12/01/2023] [Indexed: 04/04/2024]
Abstract
Current metagenome-assembled human gut phage catalogs contain mostly fragmented genomes. Here, a comprehensive gut virome detection procedure is developed involving virus-like particle (VLP) enrichment from ≈500 g of feces and combined short- and long-read sequencing. Applied to 135 samples, a Chinese Gut Virome Catalog (CHGV) is assembled consisting of 21,499 non-redundant viral operational taxonomic units (vOTUs) that are significantly longer than those obtained by short-read sequencing and contain ≈35% (7,675) complete genomes, a proportion ≈nine times higher than that in the Gut Virome Database (GVD, ≈4%, 1,443). Interestingly, the majority (≈60%, 13,356) of the CHGV vOTUs are obtained by either long-read or hybrid assemblies, with little overlap with those assembled from only the short-read data. With this dataset, the vast diversity of the gut virome is elucidated, including the identification of 32% (6,962) novel vOTUs compared to public gut virome databases, dozens of phages that are more prevalent than the crAssphages and/or Gubaphages, and several viral clades that are more diverse than the two. Finally, the functional capacities of the CHGV-encoded proteins are characterized and a viral-host interaction network is constructed to facilitate future research and applications.
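The percentages quoted in this abstract can be checked against the raw counts it gives. The short Python sketch below is only an arithmetic check, not part of the authors' pipeline; the variable names are illustrative, and the GVD share is taken as the rounded "≈4%" from the abstract because the GVD total is not stated there.

```python
# Counts quoted in the abstract (variable names are illustrative).
total_votus = 21_499        # non-redundant CHGV vOTUs
complete = 7_675            # complete genomes in CHGV
long_or_hybrid = 13_356     # vOTUs from long-read or hybrid assemblies
novel = 6_962               # vOTUs novel relative to public databases
gvd_complete = 1_443        # complete genomes in GVD
gvd_complete_share = 0.04   # "≈4%" as quoted; the GVD total is not given


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


complete_pct = pct(complete, total_votus)              # 35.7
long_or_hybrid_pct = pct(long_or_hybrid, total_votus)  # 62.1
novel_pct = pct(novel, total_votus)                    # 32.4

# Ratio of the *shares* of complete genomes (CHGV vs. GVD).
share_ratio = round(complete / total_votus / gvd_complete_share, 1)  # 8.9
# Ratio of the raw *counts* of complete genomes.
count_ratio = round(complete / gvd_complete, 1)                      # 5.3

print(complete_pct, long_or_hybrid_pct, novel_pct, share_ratio, count_ratio)
```

Under these numbers, the "≈nine times" figure matches the ratio of shares (about 8.9) rather than the ratio of raw counts (about 5.3).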
Affiliation(s)
- Jingchao Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Chuqing Sun
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Yanqi Dong
- Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China
| | - Menglu Jin
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- College of Life Science, Henan Normal University, Xinxiang, Henan 453007, China
| | - Senying Lai
- Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China
| | - Longhao Jia
- Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China
| | - Xueyang Zhao
- College of Life Science, Henan Normal University, Xinxiang, Henan 453007, China
| | - Huarui Wang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Na L. Gao
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Department of Laboratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan 430071, China
| | - Peer Bork
- European Molecular Biology Laboratory, Structural and Computational Biology Unit, 69117 Heidelberg, Germany
- Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
- Yonsei Frontier Lab (YFL), Yonsei University, 03722 Seoul, South Korea
- Department of Bioinformatics, Biocenter, University of Würzburg, 97070 Würzburg, Germany
| | - Zhi Liu
- Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, 430074 Wuhan, China
| | - Wei‐Hua Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- College of Life Science, Henan Normal University, Xinxiang, Henan 453007, China
- Institution of Medical Artificial Intelligence, Binzhou Medical University, Yantai 264003, China
| | - Xing‐Ming Zhao
- Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain‐Inspired Intelligence, Fudan University, Shanghai 200433, China
- MOE Key Laboratory of Computational Neuroscience and Brain‐Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- State Key Laboratory of Medical Neurobiology, Institute of Brain Science, Fudan University, Shanghai 200433, China
- International Human Phenome Institutes (Shanghai), Shanghai 200433, China
| |
|
3
|
Minot M, Reddy ST. Meta learning addresses noisy and under-labeled data in machine learning-guided antibody engineering. Cell Syst 2024; 15:4-18.e4. [PMID: 38194961 DOI: 10.1016/j.cels.2023.12.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 07/21/2023] [Accepted: 12/07/2023] [Indexed: 01/11/2024]
Abstract
Machine learning-guided protein engineering is rapidly progressing; however, collecting high-quality, large datasets remains a bottleneck. Directed evolution and protein engineering studies often require extensive experimental processes to eliminate noise and label protein sequence-function data. Meta learning has proven effective in other fields in learning from noisy data via bi-level optimization given the availability of a small dataset with trusted labels. Here, we leverage meta learning approaches to overcome noisy and under-labeled data and expedite workflows in antibody engineering. We generate yeast display antibody mutagenesis libraries and screen them for target antigen binding followed by deep sequencing. We then create representative learning tasks, including learning from noisy training data, positive and unlabeled learning, and learning out of distribution properties. We demonstrate that meta learning has the potential to reduce experimental screening time and improve the robustness of machine learning models by training with noisy and under-labeled training data.
Affiliation(s)
- Mason Minot
- ETH Zurich, Department of Biosystems Science and Engineering, Basel 4056, Switzerland
| | - Sai T Reddy
- ETH Zurich, Department of Biosystems Science and Engineering, Basel 4056, Switzerland.
| |
|
4
|
Wang D, Shang J, Lin H, Liang J, Wang C, Sun Y, Bai Y, Qu J. Identifying ARG-carrying bacteriophages in a lake replenished by reclaimed water using deep learning techniques. Water Research 2024; 248:120859. [PMID: 37976954 DOI: 10.1016/j.watres.2023.120859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 09/16/2023] [Accepted: 11/10/2023] [Indexed: 11/19/2023]
Abstract
As important mobile genetic elements, phages support the spread of antibiotic resistance genes (ARGs). Previous analyses of metaviromes or metagenome-assembled genomes (MAGs) failed to assess the extent of ARGs transferred by phages, particularly in the generation of antibiotic-resistant pathogens. Therefore, we have developed a bioinformatic pipeline that utilizes deep learning techniques to identify ARG-carrying phages and predict their hosts, with a special focus on pathogens. Using this method, we discovered that the predominant types of ARGs carried by temperate phages in a typical landscape lake, which is fully replenished by reclaimed water, were related to multidrug resistance and β-lactam antibiotics. MAGs containing virulence factors (VFs) were predicted to serve as hosts for these ARG-carrying phages, which suggests that the phages may have the potential to transfer ARGs. In silico analysis showed a significant positive correlation between temperate phages and host pathogens (R = 0.503, p < 0.001), which was later confirmed by qPCR. Interestingly, these MAGs were found to be more abundant than those containing both ARGs and VFs, especially in December and March. Seasonal variations were observed in the abundance of phages harboring ARGs (from 5.62% to 21.02%) and chromosomes harboring ARGs (from 18.01% to 30.94%). In contrast, the abundance of plasmids harboring ARGs remained unchanged. In summary, this study leverages deep learning to analyze phage-transferred ARGs and demonstrates an alternative, metagenomics-based method to track the production of potential antibiotic-resistant pathogens that can be extended to microbiological risk assessment.
Affiliation(s)
- Donglin Wang
- Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Jiayu Shang
- Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
| | - Hui Lin
- Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Jinsong Liang
- School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, China
| | - Chenchen Wang
- School of Environmental and Municipal Engineering, Tianjin Chengjian University, Tianjin 300384, China; Tianjin Key Laboratory of Aquatic Science and Technology, Tianjin Chengjian University, Tianjin 300384, China
| | - Yanni Sun
- Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China.
| | - Yaohui Bai
- Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China.
| | - Jiuhui Qu
- Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| |
|
5
|
Chen XG, Yang X, Li C, Lin X, Zhang W. Non-coding RNA identification with pseudo RNA sequences and feature representation learning. Comput Biol Med 2023; 165:107355. [PMID: 37639767 DOI: 10.1016/j.compbiomed.2023.107355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 07/16/2023] [Accepted: 08/12/2023] [Indexed: 08/31/2023]
Abstract
Distinguishing non-coding RNAs (ncRNAs) from coding RNAs is very important in bioinformatics. Although many methods have been proposed for solving this task, it remains highly challenging to further improve the accuracy of ncRNA identification. In this paper, we propose a coding potential predictor using feature representation learning based on pseudo RNA sequences named CPPFLPS. In this method, we use the pseudo RNA sequences generated by simulating RNA sequence mutations as new samples for data augmentation, and six string operations simulating RNA sequence mutations are considered: base replacement, base insertion, base deletion, subsequence reversion, subsequence repetition and subsequence deletion. In the feature representation learning framework, different types of pseudo RNA sequences are added to the training set to form new training sets that can be used to train baseline classifiers, thus obtaining baseline models. The resulting labels of these baseline models are used as feature vectors to represent RNA sequences, and the resulting feature vectors acquired after feature selection are used to train a predictive model for distinguishing ncRNAs from coding RNAs. Our method achieves better performance compared with that of existing state-of-the-art methods. The implementation of the proposed method is available at https://github.com/chenxgscuec/CPPFLPS.
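The six string operations named in this abstract can be illustrated with a minimal Python sketch. This is not the CPPFLPS implementation (see the authors' repository for that): the function names, the `_pick_span` helper, the `max_len` span limit, and the choice of one random edit per call are all assumptions made for illustration, since the abstract does not state positions, lengths, or mutation rates.

```python
import random

BASES = "ACGU"


def base_replacement(seq, rng):
    """Replace one randomly chosen base with a different base."""
    i = rng.randrange(len(seq))
    new_base = rng.choice([b for b in BASES if b != seq[i]])
    return seq[:i] + new_base + seq[i + 1:]


def base_insertion(seq, rng):
    """Insert one random base at a random position."""
    i = rng.randrange(len(seq) + 1)
    return seq[:i] + rng.choice(BASES) + seq[i:]


def base_deletion(seq, rng):
    """Delete one randomly chosen base."""
    i = rng.randrange(len(seq))
    return seq[:i] + seq[i + 1:]


def _pick_span(seq, rng, max_len):
    """Hypothetical helper: pick a random [start, end) span of length 2..max_len."""
    length = rng.randint(2, min(max_len, len(seq)))
    start = rng.randrange(len(seq) - length + 1)
    return start, start + length


def subsequence_reversion(seq, rng, max_len=5):
    """Reverse the order of bases inside a random span."""
    s, e = _pick_span(seq, rng, max_len)
    return seq[:s] + seq[s:e][::-1] + seq[e:]


def subsequence_repetition(seq, rng, max_len=5):
    """Duplicate a random span immediately after itself."""
    s, e = _pick_span(seq, rng, max_len)
    return seq[:e] + seq[s:e] + seq[e:]


def subsequence_deletion(seq, rng, max_len=5):
    """Remove a random span."""
    s, e = _pick_span(seq, rng, max_len)
    return seq[:s] + seq[e:]


OPERATIONS = {
    "base_replacement": base_replacement,
    "base_insertion": base_insertion,
    "base_deletion": base_deletion,
    "subsequence_reversion": subsequence_reversion,
    "subsequence_repetition": subsequence_repetition,
    "subsequence_deletion": subsequence_deletion,
}


def augment(seq, rng):
    """Return one pseudo sequence per operation, keyed by operation name."""
    return {name: op(seq, rng) for name, op in OPERATIONS.items()}


example = "AUGGCUACGUAG"
pseudo = augment(example, random.Random(0))
for name, mutated in pseudo.items():
    print(f"{name:24s} {mutated}")
```

In the framework the abstract describes, each type of pseudo sequence would be added to the training set to train a separate baseline classifier; the sketch covers only the augmentation step.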
Affiliation(s)
- Xian-Gan Chen
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Xiaofei Yang
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Chenhong Li
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Xianguang Lin
- School of Biomedical Engineering, South-Central Minzu University, Wuhan, 430074, China; Hubei Key Laboratory of Medical Information Analysis and Tumor Diagnosis & Treatment, South-Central Minzu University, Wuhan, 430074, China; Key Laboratory of Cognitive Science(South-Central Minzu University), State Ethnic Affairs Commission, Wuhan, 430074, China.
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China.
| |
|
6
|
Pan J, You Z, You W, Zhao T, Feng C, Zhang X, Ren F, Ma S, Wu F, Wang S, Sun Y. PTBGRP: predicting phage-bacteria interactions with graph representation learning on microbial heterogeneous information network. Brief Bioinform 2023; 24:bbad328. [PMID: 37742053 DOI: 10.1093/bib/bbad328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 08/14/2023] [Accepted: 08/30/2023] [Indexed: 09/25/2023] Open
Abstract
Identifying potential bacteriophage (phage) candidates to treat bacterial infections plays an essential role in the research of human pathogens. Computational approaches are recognized as a valid way to predict interactions between bacteria and their target phages. However, most current methods utilize only lower-order biological information without considering higher-order connectivity patterns, which could help improve predictive accuracy. Therefore, we developed a novel microbial heterogeneous interaction network (MHIN)-based model called PTBGRP to predict new phages for bacterial hosts. Specifically, PTBGRP first constructs an MHIN by integrating phage-bacteria interaction (PBI) and six bacteria-bacteria interaction networks with their biological attributes. Then, different representation learning methods are deployed to extract higher-level biological features and lower-level topological features from the MHIN. Finally, PTBGRP employs a deep neural network as the classifier to predict unknown PBI pairs based on the fused biological information. Experimental results demonstrated that PTBGRP achieves the best performance on the corresponding ESKAPE pathogens and PBI dataset when compared with state-of-the-art methods. In addition, case studies of Klebsiella pneumoniae and Staphylococcus aureus further indicate that the consideration of rich heterogeneous information enables PTBGRP to accurately predict PBI from a more comprehensive perspective. The webserver of the PTBGRP predictor is freely available at http://120.77.11.78/PTBGRP/.
Affiliation(s)
- Jie Pan
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Zhuhong You
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
| | - Wencai You
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Tian Zhao
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Chenlu Feng
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Xuexia Zhang
- North China Pharmaceutical Group, Shijiazhuang 050015, Hebei, China
- National Microbial Medicine Engineering & Research Center, Shijiazhuang 050015, Hebei, China
| | - Fengzhi Ren
- North China Pharmaceutical Group, Shijiazhuang 050015, Hebei, China
- National Microbial Medicine Engineering & Research Center, Shijiazhuang 050015, Hebei, China
| | - Sanxing Ma
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Fan Wu
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Shiwei Wang
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| | - Yanmei Sun
- Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China
| |
|
7
|
Qiu S, Liu R, Liang Y. GR-m6A: Prediction of N6-methyladenosine sites in mammals with molecular graph and residual network. Comput Biol Med 2023; 163:107202. [PMID: 37450964 DOI: 10.1016/j.compbiomed.2023.107202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 06/14/2023] [Accepted: 06/25/2023] [Indexed: 07/18/2023]
Abstract
RNA N6-methyladenosine (m6A), which is produced by the methylation of the N6 position of eukaryotic adenine, is a relatively common post-transcriptional modification on the surface of the molecule, which frequently plays a crucial role in biological processes. Biological experimental methods to identify m6A have been studied and implemented in recent years, but they cannot be widely adopted due to drawbacks such as the time and cost of reagents and equipment. Therefore, researchers have proposed computational strategies for identifying m6A sites, but these strategies do not account for the mechanism of methylation occurrence or the structure of RNA molecules. This study, therefore, proposes a novel deep learning model, GR-m6A, which predicts m6A sites by extracting features from the physicochemical properties and spatial structure of molecules via residual networks. In GR-m6A, each RNA base string is represented by SMILES as two matrices comprising topology structural information and node attributes with molecular physicochemical characteristics. The feature encoding matrix was then obtained by fusing the topology matrix and the node matrix in accordance with the graph convolutional network principle. The more discriminative features were then extracted from the encoding matrix using the residual neural network, and predictions were made using a multilayer perceptron. As evident from the 5-fold cross-validation and independent validation, the GR-m6A model outperformed other existing methods. Thus, we hope that GR-m6A can aid researchers in predicting mammalian m6A loci. The source code and database are available at https://github.com/YingLiangjxau/GR-m6A.
Affiliation(s)
- Shi Qiu
- College of Engineering, Jiangxi Agricultural University, Nanchang 310045, Jiangxi, China.
| | - Renxin Liu
- College of Engineering, Jiangxi Agricultural University, Nanchang 310045, Jiangxi, China.
| | - Ying Liang
- College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 310045, Jiangxi, China.
| |
|
8
|
Jia J, Lei R, Qin L, Wu G, Wei X. iEnhancer-DCSV: Predicting enhancers and their strength based on DenseNet and improved convolutional block attention module. Front Genet 2023; 14:1132018. [PMID: 36936423 PMCID: PMC10014624 DOI: 10.3389/fgene.2023.1132018] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/26/2022] [Accepted: 02/13/2023] [Indexed: 03/06/2023] Open
Abstract
Enhancers play a crucial role in controlling gene transcription and expression. Therefore, bioinformatics places considerable emphasis on predicting enhancers and their strength. Quick and accurate computational techniques are vital because conventional biomedical experiments are too slow and too expensive. This paper proposes a new predictor called iEnhancer-DCSV built on a modified densely connected convolutional network (DenseNet) and an improved convolutional block attention module (CBAM). Sequences were encoded using one-hot and nucleotide chemical property (NCP) coding. DenseNet was used to extract advanced features from the raw encoding. The channel attention and spatial attention modules were used to evaluate the significance of the advanced features, which were then input into a fully connected neural network to yield the prediction probabilities. Finally, ensemble learning was applied to the final classification results via voting. According to the experimental results on the test set, the first layer of enhancer recognition achieved an accuracy of 78.95%, and the Matthews correlation coefficient value was 0.5809. The second layer of enhancer strength prediction achieved an accuracy of 80.70%, and the Matthews correlation coefficient value was 0.6609. The iEnhancer-DCSV method can be found at https://github.com/leirufeng/iEnhancer-DCSV. It is easy to obtain the desired results without using the complex mathematical formulas involved.
Affiliation(s)
- Jianhua Jia
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China
- *Correspondence: Jianhua Jia; Rufeng Lei
| | - Rufeng Lei
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China
- *Correspondence: Jianhua Jia; Rufeng Lei
| | - Lulu Qin
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China
| | - Genqiang Wu
- School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, China
| | - Xin Wei
- Business School, Jiangxi Institute of Fashion Technology, Nanchang, China
| |
|