1
Ao C, Ye X, Sakurai T, Zou Q, Yu L. m5U-SVM: identification of RNA 5-methyluridine modification sites based on multi-view features of physicochemical features and distributed representation. BMC Biol 2023; 21:93. [PMID: 37095510 PMCID: PMC10127088 DOI: 10.1186/s12915-023-01596-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 12/20/2022] [Accepted: 04/12/2023] [Indexed: 04/26/2023] Open
Abstract
BACKGROUND: RNA 5-methyluridine (m5U) modifications are obtained by methylation at the C5 position of uridine catalyzed by pyrimidine methylation transferase, which is related to the development of human diseases. Accurate identification of m5U modification sites from RNA sequences can contribute to the understanding of their biological functions and the pathogenesis of related diseases. Compared with traditional experimental methods, easy-to-use computational methods based on machine learning can identify modification sites from RNA sequences efficiently and quickly. Despite the good performance of these computational methods, they have some drawbacks and limitations.
RESULTS: In this study, we developed a novel predictor, m5U-SVM, based on multi-view features and machine learning algorithms, to identify m5U modification sites from RNA sequences. The method uses four traditional physicochemical features and distributed representation features. The four physicochemical features were fused and then optimized with a two-step procedure combining LightGBM and IFS, and the distributed representation features were then fused with the optimized physicochemical features to obtain the new multi-view features. The best-performing classifier, a support vector machine, was identified by screening different machine learning algorithms. In comparison experiments, the proposed model outperformed the existing state-of-the-art tool.
CONCLUSIONS: m5U-SVM provides an effective tool that successfully captures sequence-related attributes of modifications and can accurately predict m5U modification sites from RNA sequences. The identification of m5U modification sites helps to understand and explore the related biological processes and functions.
Affiliation(s)
- Chunyan Ao
  - School of Computer Science and Technology, Xidian University, Xi'an, China
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Liang Yu
  - School of Computer Science and Technology, Xidian University, Xi'an, China
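The feature-selection step in the m5U-SVM abstract above (a LightGBM ranking followed by IFS, read here as incremental feature selection) can be illustrated with a minimal sketch. The feature names, the `toy_score` function and its coefficients are hypothetical stand-ins. In practice the ranking would come from a trained model's feature importances and the score from cross-validating a classifier.

```python
def incremental_feature_selection(ranked_features, score_fn):
    """Evaluate growing prefixes of a ranked feature list and keep the best one."""
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]          # top-k features by rank
        score = score_fn(subset)
        if score > best_score:                # strict '>' keeps the smaller subset on ties
            best_subset, best_score = list(subset), score
    return best_subset, best_score


# Hypothetical stand-in for a cross-validated accuracy: informative
# features add 0.1 each, and every feature costs 0.01.
INFORMATIVE = {"f3", "f1", "f7"}


def toy_score(subset):
    gain = sum(0.1 for name in subset if name in INFORMATIVE)
    return 0.5 + gain - 0.01 * len(subset)


ranked = ["f3", "f1", "f7", "f2", "f5"]      # assumed importance order
selected, score = incremental_feature_selection(ranked, toy_score)
print(selected, round(score, 2))
```

With this toy score the first three features are kept, because adding the two uninformative ones only lowers the score.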
2
Fan Y, Wu Q, Cui H, Lu W, Ren W. Stochastic simulation of seawater intrusion in the Longkou area of China based on the Monte Carlo method. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:22063-22077. [PMID: 36280633 DOI: 10.1007/s11356-022-23767-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 10/18/2022] [Indexed: 06/16/2023]
Abstract
Seawater intrusion is a common groundwater pollution problem, which has a great impact on the ecological environment and economic development. In this paper, a numerical simulation model of variable-density groundwater was constructed to simulate and predict future seawater intrusion in Longkou city, Shandong Province, China. The influence of the uncertainty of the model's sensitive parameters on the simulation results was evaluated using the Monte Carlo method. To reduce the computational load of repeatedly calling the simulation model, a surrogate model was established using the support vector regression (SVR) method. After training, the correlation coefficient R² of the input-output relationship between the SVR surrogate model and the seawater intrusion simulation model reached 0.9957, with an average relative error of 0.2%, indicating that the surrogate model has a high fitting accuracy. Stochastic simulations showed that seawater intrusion in the Longkou area will gradually worsen at a slow rate, and the increase of seawater intrusion in the study area after 30 years was expected to range from -6.03% to 7.37% at the 80% confidence level.
Affiliation(s)
- Yue Fan
  - Key Laboratory of Geotechnical Mechanics and Engineering of Ministry of Water Resources, Changjiang River Scientific Research Institute, Wuhan, 430010, Hubei, China
- Qinghua Wu
  - Key Laboratory of Geotechnical Mechanics and Engineering of Ministry of Water Resources, Changjiang River Scientific Research Institute, Wuhan, 430010, Hubei, China
- Haodong Cui
  - Key Laboratory of Geotechnical Mechanics and Engineering of Ministry of Water Resources, Changjiang River Scientific Research Institute, Wuhan, 430010, Hubei, China
- Wenxi Lu
  - College of New Energy and Environment, Jilin University, Changchun, 130021, China
- Wanli Ren
  - State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan, 430078, China
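The workflow in the seawater-intrusion abstract above (sample the uncertain parameters, evaluate a cheap surrogate instead of the full simulator, read off an interval) can be sketched as follows. Everything numeric here is an assumption for illustration: `surrogate_increase` is a made-up linear stand-in for the trained SVR surrogate, and the two input distributions are invented, not taken from the paper.

```python
import random
import statistics


def surrogate_increase(conductivity, recharge):
    """Hypothetical stand-in for a trained surrogate model.

    Returns an illustrative percentage change in intruded area; the linear
    form and coefficients are assumptions, not the paper's SVR model.
    """
    return 0.5 + 4.0 * (conductivity - 1.0) - 3.0 * (recharge - 1.0)


def monte_carlo_interval(n_samples=10_000, level=0.80, seed=0):
    """Propagate input uncertainty and return a central interval and the mean."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Assumed uncertainty: normalised multipliers scattered around 1.0.
        conductivity = rng.gauss(1.0, 0.1)
        recharge = rng.gauss(1.0, 0.1)
        outputs.append(surrogate_increase(conductivity, recharge))
    outputs.sort()
    tail = (1.0 - level) / 2.0                     # 10% in each tail for level=0.80
    lower = outputs[int(tail * n_samples)]
    upper = outputs[int((1.0 - tail) * n_samples) - 1]
    return lower, upper, statistics.fmean(outputs)


lower, upper, mean = monte_carlo_interval()
print(f"80% interval: [{lower:.2f}%, {upper:.2f}%], mean {mean:.2f}%")
```

The surrogate is what makes ten thousand evaluations affordable. With a full variable-density groundwater simulator in the loop, each sample would cost a complete model run.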
3
Research on lung nodule recognition algorithm based on deep feature fusion and MKL-SVM-IPSO. Sci Rep 2022; 12:17403. [PMID: 36257988 PMCID: PMC9579155 DOI: 10.1038/s41598-022-22442-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/19/2022] [Accepted: 10/14/2022] [Indexed: 01/10/2023] Open
Abstract
A lung CAD system can provide auxiliary third-party opinions for doctors and improve the accuracy of lung nodule recognition. The selection and fusion of nodule features and the advancement of recognition algorithms are crucial to improving lung CAD systems. Based on the HDL model, this paper focuses on three key algorithms of the lung CAD system: feature extraction, feature fusion and nodule recognition. First, CBAM is embedded into VGG16 and VGG19 to construct the feature extraction models AE-VGG16 and AE-VGG19, so that the network pays more attention to the key feature information in the nodule description. Then, feature dimensionality reduction based on PCA and feature fusion based on CCA are performed sequentially on the extracted deep features to obtain low-dimensional fusion features. Finally, the fusion features are input into the proposed MKL-SVM-IPSO model, which uses the improved Particle Swarm Optimization algorithm to speed up training and obtain the globally optimal parameter group. The public dataset LUNA16 was selected for the experiment. The results show that the proposed lung CAD system reaches a lung nodule recognition accuracy of 99.56%, with a sensitivity of 99.3% and an F1-score of 0.9965, which can reduce the possibility of false detection and missed detection of nodules.
4
Wani IM, Arora S. Computer-aided diagnosis systems for osteoporosis detection: a comprehensive survey. Med Biol Eng Comput 2020; 58:1873-1917. [PMID: 32583141 DOI: 10.1007/s11517-020-02171-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/13/2019] [Accepted: 03/26/2020] [Indexed: 12/18/2022]
Abstract
Computer-aided diagnosis (CAD) has revolutionized the field of medical diagnosis. CAD systems help improve treatment potential and increase survival rates by diagnosing diseases early in an efficient, timely, and cost-effective way. Automatic segmentation has enabled radiologists to segment the region of interest successfully and so improve the diagnosis of diseases from medical images, which manual segmentation cannot do as efficiently. The aim of this paper is to survey vision-based CAD systems, focusing especially on segmentation techniques for the pathological bone disease known as osteoporosis. Osteoporosis is a condition in which the mineral density of the bones decreases and they become porous, making them susceptible to fractures from a small injury or a fall. The article covers the image acquisition techniques used to obtain medical images for osteoporosis diagnosis. It also discusses the advanced machine learning paradigms employed in segmentation for osteoporosis. Other image processing steps, such as feature extraction and classification, are also briefly described. Finally, the paper gives future directions for improving osteoporosis diagnosis and presents the proposed architecture.
Affiliation(s)
- Insha Majeed Wani
  - School of Computer Science and Engineering, SMVDU, Katra, J&K, India
- Sakshi Arora
  - School of Computer Science and Engineering, SMVDU, Katra, J&K, India
5
Bio-inspired weighed quantum particle swarm optimization and smooth support vector machine ensembles for identification of abnormalities in medical data. SN APPLIED SCIENCES 2019. [DOI: 10.1007/s42452-019-1179-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/26/2022] Open
6
Methane Detection Based on Improved Chicken Algorithm Optimization Support Vector Machine. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9091761] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022]
Abstract
Methane, known as a flammable and explosive gas, is the main component of marsh gas, firedamp, and rock gas. Therefore, it is important to be able to detect methane concentration safely and effectively. At present, many models have been proposed to enhance the performance of methane predictions. However, the traditional models displayed inevitable shortcomings in parameter optimization in our experiment, which resulted in poor prediction performance. Accordingly, the improved chicken swarm algorithm optimized support vector machine (ICSO-SVM) was proposed to predict the concentration of methane precisely. The traditional chicken swarm optimization algorithm (CSO) easily falls into a local optimum due to its characteristics, so the ICSO algorithm was developed. In ICSO, the position update formula for the chicks depends not only on the rooster of the same subgroup but also on the roosters of other subgroups. Therefore, the ICSO algorithm more easily avoids falling into a local extremum. In this paper, the following work has been done. The sample data were obtained using a methane detection system designed by us. To verify the validity of the ICSO algorithm, the ICSO, CSO, genetic algorithm (GA), and particle swarm optimization (PSO) algorithms were tested, and the four models were applied to methane concentration prediction. The results showed that the ICSO algorithm had the best convergence effect, relative error percentage, and average mean squared error when the four models were applied to predict methane concentration. The results also showed that the average mean squared error values of the ICSO-SVM model were smaller than those of the other three models, that the ICSO-SVM model has better stability, and that the average recovery rate of the ICSO-SVM is much closer to 100%. Therefore, the ICSO-SVM model can efficiently predict methane concentration.
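The idea of tuning SVM hyperparameters with a swarm optimizer can be sketched with plain particle swarm optimization, one of the baselines named in the abstract above. This is not the ICSO algorithm, whose update rules are not given here. The objective `validation_error` is a hypothetical stand-in for a cross-validated SVM error over two assumed hyperparameters (log C and log gamma), and the swarm constants are common textbook defaults rather than values from the paper.

```python
import random


def validation_error(log_c, log_gamma):
    """Hypothetical stand-in for an SVM's cross-validated error.

    A smooth bowl with its minimum at (1.0, -2.0); a real run would train
    and score a model at this point instead.
    """
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def pso(objective, bounds, n_particles=20, n_iters=100, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise `objective` inside box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # each particle's best position
    best_val = [objective(*p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]     # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp the particle back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(*pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


best, err = pso(validation_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(f"best log C = {best[0]:.3f}, best log gamma = {best[1]:.3f}, error = {err:.2e}")
```

Searching in log space is a design choice of this sketch: C and gamma typically span several orders of magnitude, so a uniform box over their logarithms covers the range more evenly.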
7
Zhao D, Liu H, Zheng Y, He Y, Lu D, Lyu C. Whale optimized mixed kernel function of support vector machine for colorectal cancer diagnosis. J Biomed Inform 2019; 92:103124. [PMID: 30796977 DOI: 10.1016/j.jbi.2019.103124] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/17/2018] [Revised: 01/15/2019] [Accepted: 02/04/2019] [Indexed: 12/17/2022]
Abstract
The microarray technique is a prevalent method for the classification and prediction of colorectal cancer (CRC). Nevertheless, microarray data suffer from the curse of dimensionality when feature genes of the disease are selected from imbalanced samples, which causes low prediction accuracy. Hence, it is of vital significance to build proper models that can avoid these problems and predict CRC more accurately. In this paper, we use an ensemble model to classify samples into healthy and CRC groups and improve prediction performance. The proposed model is composed of three functional modules. The first module removes redundant genes: the main feature genes are selected using the minimum redundancy maximum relevance (mRMR) method to reduce the dimensionality of the features, thereby improving the prediction results. The second module addresses the problem caused by imbalanced data using the hybrid sampling algorithm RUSBoost. The third module focuses on optimizing the classification algorithm. We use a mixed kernel function (MKF) based support vector machine (SVM) model to classify an unknown sample as a healthy individual or a CRC patient, and the Whale Optimization Algorithm (WOA) is then applied to find the optimal parameters of the proposed MKF-SVM. The final results show that the proposed model achieves higher G-means than other comparable models. We conclude that the RUSBoost wrapping WOA + MKF-SVM model can be applied to improve the predictive performance for colorectal cancer based on imbalanced data.
Affiliation(s)
- Dandan Zhao
  - School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
- Hong Liu
  - School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
- Yuanjie Zheng
  - School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
- Yanlin He
  - School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
- Dianjie Lu
  - School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
- Chen Lyu
  - School of Information Science and Engineering, Shandong Normal University, Jinan City, China; Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan City, China
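Two ingredients of the colorectal cancer abstract above, a mixed kernel function and the G-mean metric, can be illustrated with a short sketch. The abstract does not say which kernels are mixed. The convex combination of an RBF and a polynomial kernel below is one common construction and is an assumption here, as are all default parameter values. An optimizer such as WOA would tune `weight`, `gamma` and `degree`.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: a local similarity that decays with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: a global similarity based on the dot product."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def mixed_kernel(x, y, weight=0.7, gamma=0.5, degree=2, coef0=1.0):
    """Convex combination of the RBF and polynomial kernels (assumed form)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * rbf_kernel(x, y, gamma)
            + (1.0 - weight) * poly_kernel(x, y, degree, coef0))


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


k = mixed_kernel((1.0, 0.0), (1.0, 0.0))
score = g_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
print(f"mixed kernel value: {k:.2f}, G-mean: {score:.4f}")
```

A non-negative weighted sum of two valid kernels is itself a valid kernel, which is why the weight is restricted to [0, 1]. G-mean suits imbalanced data because a classifier that ignores the minority class scores zero regardless of its overall accuracy.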