1
Xin K, Wei X, Shao J, Chen F, Liu Q, Liu B. Establishment of a novel tumor neoantigen prediction tool for personalized vaccine design. Hum Vaccin Immunother 2024; 20:2300881. [PMID: 38214336 DOI: 10.1080/21645515.2023.2300881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 12/28/2023] [Indexed: 01/13/2024] Open
Abstract
The personalized neoantigen nanovaccine (PNVAC) platform that we previously established for patients with gastric cancer exhibited a promising anti-tumor immune response. However, limited by the capability of traditional neoantigen prediction tools, a portion of the epitopes failed to induce a specific immune response. To identify more neoantigens and optimize our PNVAC platform, we developed a novel neoantigen prediction model, NUCC. This prediction tool, trained through a deep learning approach, exhibits better neoantigen prediction performance than other prediction tools, not only on two independent epitope datasets but also on an entirely new epitope dataset that we constructed from scratch, comprising 25 patients with advanced gastric cancer and 150 candidate mutant peptides, 13 of which proved to be neoantigens in an in vitro immunogenicity test. Our work lays the foundation for future improvement of our PNVAC platform for gastric cancer.
Affiliation(s)
- Kai Xin
  - Department of Oncology, Nanjing Drum Tower Hospital Clinical College of Nanjing University of Chinese Medicine, Nanjing, Jiangsu Province, China
- Xiao Wei
  - Department of Pathology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, Jiangsu Province, China
- Jie Shao
  - Department of Oncology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, Jiangsu Province, China
- Fangjun Chen
  - Department of Oncology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, Jiangsu Province, China
- Qin Liu
  - Department of Oncology, Nanjing Drum Tower Hospital Clinical College of Nanjing University of Chinese Medicine, Nanjing, Jiangsu Province, China
  - Department of Oncology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, Jiangsu Province, China
- Baorui Liu
  - Department of Oncology, Nanjing Drum Tower Hospital Clinical College of Nanjing University of Chinese Medicine, Nanjing, Jiangsu Province, China
  - Department of Oncology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, Jiangsu Province, China
2
Nopour R. Establishment of prediction model for mortality risk of pancreatic cancer: a retrospective study. BMC Med Inform Decis Mak 2024; 24:181. [PMID: 38937795 PMCID: PMC11210158 DOI: 10.1186/s12911-024-02590-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Accepted: 06/25/2024] [Indexed: 06/29/2024] Open
Abstract
BACKGROUND AND AIM Pancreatic cancer (PC) has a high prevalence and mortality rate relative to other cancers. Despite the low survival rate of this cancer type, early prediction of the disease plays a crucial role in decreasing the mortality rate and improving prognosis. So, this study. MATERIALS AND METHODS In this retrospective study, we used 654 alive and dead PC cases to establish the prediction model for PC. Six chosen machine learning algorithms and prognostic factors were used to build the prediction models. The importance of the predictive factors was assessed using the relative importance of a high-performing algorithm. RESULTS XG-Boost, with an AU-ROC of 0.933 (95% CI = [0.906-0.958]) in internal validation and an AU-ROC of 0.836 (95% CI = [0.789-0.865]) in external validation, was considered the best-performing model for predicting the mortality risk of PC. Tumor size, smoking, and chemotherapy were considered the most influential factors for prediction. CONCLUSION XG-Boost performed best in predicting the mortality risk of PC patients, so this model can support the clinical solutions that doctors apply in healthcare environments to decrease the mortality risk of these patients.
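As a reading aid for the AU-ROC values quoted in this abstract, the following is a minimal sketch of how the area under the ROC curve can be computed from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy labels and scores are illustrative assumptions, not data or code from the cited study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: iterable of 0/1 outcomes (1 = event, e.g. death).
    scores: iterable of predicted risks, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four patients, two events.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which 0.933 and 0.836 should be read.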
Affiliation(s)
- Raoof Nopour
  - Department of Health Information Management, Student Research Committee, School of Health Management and Information Sciences Branch, Iran University of Medical Sciences, Tehran, Iran.
3
Tran HN, Nguyen PXQ, Guo F, Wang J. Prediction of Protein-Protein Interactions Based on Integrating Deep Learning and Feature Fusion. Int J Mol Sci 2024; 25:5820. [PMID: 38892007 PMCID: PMC11172432 DOI: 10.3390/ijms25115820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 04/27/2024] [Accepted: 04/29/2024] [Indexed: 06/21/2024] Open
Abstract
Understanding protein-protein interactions (PPIs) helps to identify protein functions and to develop other important applications such as drug preparation and protein-disease relationship identification. Deep-learning-based approaches are being intensely researched for PPI determination to reduce the cost and time of previous testing methods. In this work, we integrate deep learning with feature fusion, harnessing the strengths of both handcrafted features and protein sequence embedding. The accuracies of the proposed model using five-fold cross-validation on the Yeast core and Human datasets are 96.34% and 99.30%, respectively. In the task of predicting interactions in important PPI networks, our model correctly predicted all interactions in one-core, Wnt-related, and cancer-specific networks. The experimental results on cross-species datasets, including Caenorhabditis elegans, Helicobacter pylori, Homo sapiens, Mus musculus, and Escherichia coli, also show that our feature fusion method helps increase the generalization capability of the PPI prediction model.
Affiliation(s)
- Jianxin Wang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China (F.G.)
4
Yang S, Wang X, Huan R, Deng M, Kong Z, Xiong Y, Luo T, Jin Z, Liu J, Chu L, Han G, Zhang J, Tan Y. Machine learning unveils immune-related signature in multicenter glioma studies. iScience 2024; 27:109317. [PMID: 38500821 PMCID: PMC10946333 DOI: 10.1016/j.isci.2024.109317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 01/11/2024] [Accepted: 02/17/2024] [Indexed: 03/20/2024] Open
Abstract
In glioma molecular subtyping, existing biomarkers are limited, prompting the development of new ones. We present a multicenter study-derived consensus immune-related and prognostic gene signature (CIPS) using an optimal risk score model and 101 algorithms. CIPS, an independent risk factor, showed stable and powerful predictive performance for overall and progression-free survival, surpassing traditional clinical variables. The risk score correlated significantly with the immune microenvironment, indicating potential sensitivity to immunotherapy. High-risk groups exhibited distinct chemotherapy drug sensitivity. Seven signature genes, including IGFBP2 and TNFRSF12A, were validated by qRT-PCR, with higher expression in tumors and prognostic relevance. TNFRSF12A, upregulated in GBM, demonstrated inhibitory effects on glioma cell proliferation, migration, and invasion. CIPS emerges as a robust tool for enhancing individual glioma patient outcomes, while IGFBP2 and TNFRSF12A pose as promising tumor markers and therapeutic targets.
Affiliation(s)
- Sha Yang
  - Guizhou University Medical College, Guiyang 550025, Guizhou Province, China
- Xiang Wang
  - Department of Neurosurgery, the Affiliated Hospital of Guizhou Medical University, Guiyang 550004, China
- Renzheng Huan
  - Department of Neurosurgery, The Second Affiliated Hospital of Chongqing Medical University, Chongqing 400010, China
- Mei Deng
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
- Zhuo Kong
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
- Yunbiao Xiong
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
- Tao Luo
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
- Zheng Jin
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
- Jian Liu
  - Guizhou University Medical College, Guiyang 550025, Guizhou Province, China
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
- Liangzhao Chu
  - Department of Neurosurgery, the Affiliated Hospital of Guizhou Medical University, Guiyang 550004, China
- Guoqiang Han
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
- Jiqin Zhang
  - Department of Anesthesiology, Guizhou Provincial People’s Hospital, Guiyang, China
- Ying Tan
  - Department of Neurosurgery, Guizhou Provincial People’s Hospital, Guiyang, China
5
Okita J, Nakata T, Uchida H, Kudo A, Fukuda A, Ueno T, Tanigawa M, Sato N, Shibata H. Development and validation of a machine learning model to predict time to renal replacement therapy in patients with chronic kidney disease. BMC Nephrol 2024; 25:101. [PMID: 38493099 PMCID: PMC10943785 DOI: 10.1186/s12882-024-03527-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 02/28/2024] [Indexed: 03/18/2024] Open
Abstract
BACKGROUND Predicting time to renal replacement therapy (RRT) is important in patients at high risk for end-stage kidney disease. We developed and validated machine learning models for predicting the time to RRT and compared their accuracy with that of conventional prediction methods that use the rate of estimated glomerular filtration rate (eGFR) decline. METHODS Data of adult chronic kidney disease (CKD) patients who underwent hemodialysis at Oita University Hospital from April 2016 to March 2021 were extracted from electronic medical records (N = 135). A new machine learning predictor was compared with the established prediction method that uses the eGFR decline rate, and the accuracy of the prediction models was determined using the coefficient of determination (R²). The data were preprocessed and split into training and validation datasets. We created multiple machine learning models using the training data and evaluated their accuracy using the validation data. Furthermore, we predicted the time to RRT using a conventional prediction method that uses the eGFR decline rate for patients whose eGFR had been measured three or more times in two years, and evaluated its accuracy. RESULTS The least absolute shrinkage and selection operator regression model exhibited moderate accuracy, with an R² of 0.60. By contrast, the accuracy of the conventional prediction method was extremely low, with an R² of -17.1. CONCLUSIONS The significance of this study is that it shows that machine learning can predict time to RRT moderately well as a continuous value from data at a single time point. This approach outperforms the conventional prediction method that uses eGFR time series data and presents new avenues for CKD treatment.
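The negative R² reported for the conventional method may look surprising, so here is a minimal sketch of the standard definition, R² = 1 - SS_res / SS_tot, showing that the value drops below zero whenever predictions are worse than simply predicting the mean of the observed values. The function name `r_squared` and the numbers (hypothetical months to RRT) are illustrative assumptions and are not taken from the cited study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed times to RRT (months) for three patients.
observed = [10, 20, 30]

print(r_squared(observed, [10, 20, 30]))  # perfect predictions
print(r_squared(observed, [20, 20, 20]))  # always predicting the mean
print(r_squared(observed, [40, 50, 60]))  # systematically far off: negative R^2
```

Under this definition R² has no lower bound, so a strongly negative value indicates predictions much further from the observations than a constant mean predictor would be.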
Affiliation(s)
- Jun Okita
  - Department of Endocrinology, Metabolism, Rheumatology and Nephrology, Faculty of Medicine, Oita University, 8795593, 1-1 idaigaoka Hasama-cho, Yufu-shi, Oita-ken, Japan
- Takeshi Nakata
  - Department of Endocrinology, Metabolism, Rheumatology and Nephrology, Faculty of Medicine, Oita University, 8795593, 1-1 idaigaoka Hasama-cho, Yufu-shi, Oita-ken, Japan.
- Hiroki Uchida
  - Department of Endocrinology, Metabolism, Rheumatology and Nephrology, Faculty of Medicine, Oita University, 8795593, 1-1 idaigaoka Hasama-cho, Yufu-shi, Oita-ken, Japan
- Akiko Kudo
  - Department of Endocrinology, Metabolism, Rheumatology and Nephrology, Faculty of Medicine, Oita University, 8795593, 1-1 idaigaoka Hasama-cho, Yufu-shi, Oita-ken, Japan
- Akihiro Fukuda
  - Department of Endocrinology, Metabolism, Rheumatology and Nephrology, Faculty of Medicine, Oita University, 8795593, 1-1 idaigaoka Hasama-cho, Yufu-shi, Oita-ken, Japan
- Tamio Ueno
  - Department of Medical Technology and Sciences, School of Health Sciences at Fukuoka, International University of Health and Welfare, Okawa, Japan
- Masato Tanigawa
  - Department of Biophysics, Faculty of Medicine, Oita University, Oita, Japan
- Noboru Sato
  - Department of Healthcare AI Data Science, Faculty of Medicine, Oita University, Oita, Japan
- Hirotaka Shibata
  - Department of Endocrinology, Metabolism, Rheumatology and Nephrology, Faculty of Medicine, Oita University, 8795593, 1-1 idaigaoka Hasama-cho, Yufu-shi, Oita-ken, Japan
6
Banerjee S, Sengupta A, Ghosh SK, Banerjee R. CDH1 gene as biomarker towards breast cancer prediction. J Biomol Struct Dyn 2024:1-14. [PMID: 38373072 DOI: 10.1080/07391102.2024.2316770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 02/03/2024] [Indexed: 02/21/2024]
Abstract
Breast cancer is considered to arise from genetic aberrations. Among the several genes expressed, cadherin 1, type 1 (CDH1) has been found to be responsible in several ways for controlling metabolic order in humans. Deregulation of the function of the protein E-cadherin, expressed from CDH1, plays an important role in lobular breast cancer. To understand the root cause of this recent claim, we focus on the CDH1 gene and ask whether the genetic information translated as a result of any deviation/alteration/modification in its sequence is related to the occurrence of the different types of breast cancer. Towards this end, quantitative analysis of different biophysical and biochemical properties of the CDH1 gene at the genomic and proteomic levels, based on the available genomic (cDNA) sequences of the CDH1 gene (obtained from the COSMIC Database for 78 patients suffering from various types of breast cancer), clearly emphasizes that alteration/modification in the sequence of the CDH1 gene can be detrimental. Furthermore, random forest, K-nearest neighbour and stochastic gradient descent (SGD) algorithms are applied to the derived dataset to classify the types of breast cancer and to validate our hypothesis regarding the acute role of CDH1 as a potential biomarker for breast cancer. Analysis of the mutated CDH1 gene sequences and their related parameters using the aforesaid machine learning techniques clearly establishes that the CDH1 gene can play a deterministic role in predicting the chances of occurrence of different types of breast cancer with an accuracy of > 90%. Such an observation opens a new paradigm in the diagnostic approach to breast cancer. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Srijan Banerjee
  - Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, Nadia, West Bengal, India
- Antara Sengupta
  - Department of Computer Science and Engineering, University of Calcutta, Kolkata, West Bengal, India
- Shankar Kumar Ghosh
  - Department of Computer Science and Engineering, Shiv Nadar Institution of Eminence, Delhi, India
- Raja Banerjee
  - Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, Nadia, West Bengal, India
7
Xu W, Yang X, Guan Y, Cheng X, Wang Y. Integrative approach for predicting drug-target interactions via matrix factorization and broad learning systems. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:2608-2625. [PMID: 38454698 DOI: 10.3934/mbe.2024115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
In the drug discovery process, time and cost are the most typical problems resulting from the experimental screening of drug-target interactions (DTIs). To address these limitations, many computational methods have been developed to achieve more accurate predictions. However, identifying DTIs mostly relies on separate learning tasks with drug and target features that neglect the interaction representation between drugs and targets. In addition, the lack of these relationships may lead to greatly impaired performance in the prediction of DTIs. Aiming to capture comprehensive drug-target representations and simplify the network structure, we propose an integrative approach with a convolution broad learning system for DTI prediction (ConvBLS-DTI) to reduce the impact of data sparsity and incompleteness. First, given the lack of known interactions for the drug and target, the weighted K-nearest known neighbors (WKNKN) method was used as a preprocessing strategy for unknown drug-target pairs. Second, a neighborhood regularized logistic matrix factorization (NRLMF) was applied to extract features of the updated drug-target interaction information, which focused more on the known interaction pair parties. Then, a broad learning network incorporating a convolutional neural network was established to predict DTIs, which can make classification more effective using a different perspective. Finally, based on four benchmark datasets in three scenarios, the overall performance of ConvBLS-DTI outperformed some mainstream methods. The test results demonstrate that our model achieves improved prediction performance in terms of the area under the receiver operating characteristic curve and the precision-recall curve.
Affiliation(s)
- Wanying Xu
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- Xixin Yang
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
  - School of Automation, Qingdao University, Qingdao 266071, China
- Yuanlin Guan
  - Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
  - School of Mechanical & Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
- Xiaoqing Cheng
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- Yu Wang
  - College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
8
Morís DI, de Moura J, Aslani S, Jacob J, Novo J, Ortega M. Multi-task localization of the hemidiaphragms and lung segmentation in portable chest X-ray images of COVID-19 patients. Digit Health 2024; 10:20552076231225853. [PMID: 38313365 PMCID: PMC10836150 DOI: 10.1177/20552076231225853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 12/05/2023] [Indexed: 02/06/2024] Open
Abstract
Background COVID-19 can cause long-term symptoms in patients after they overcome the disease. Given that this disease mainly damages the respiratory system, these symptoms are often related to breathing problems that can be caused by an affected diaphragm. Diaphragmatic function can be assessed with imaging modalities such as computerized tomography or chest X-ray. However, this process must be performed by expert clinicians through manual visual inspection. Moreover, during the pandemic, clinicians were asked to prioritize the use of portable devices to prevent the risk of cross-contamination. Nevertheless, the captures from these devices are of lower quality. Objectives The automatic quantification of diaphragmatic function can determine the damage caused by COVID-19 in each patient and assess their evolution during the recovery period, a task that could also be complemented with lung segmentation. Methods We propose a novel multi-task, fully automatic methodology to simultaneously localize the position of the hemidiaphragms and segment the lung boundaries with a convolutional architecture using portable chest X-ray images of COVID-19 patients. To that end, the hemidiaphragms' landmarks are located by adapting the paradigm of heatmap regression. Results The methodology is exhaustively validated with four analyses, achieving an accuracy of 82.31% ± 2.78% when localizing the hemidiaphragms' landmarks and a Dice score of 0.9688 ± 0.0012 in lung segmentation. Conclusions The results demonstrate that the model is able to perform both tasks simultaneously, making it a helpful tool for clinicians despite the lower quality of the portable chest X-ray images.
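To make the Dice score quoted for lung segmentation concrete, here is a minimal sketch of the usual definition for binary masks, Dice = 2·|A ∩ B| / (|A| + |B|). The function name `dice_score`, the flattened-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the cited study's own evaluation code is not shown in the source.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    # Foreground pixel counts of each mask, added together.
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical 2x2 masks, flattened row by row.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
print(dice_score(predicted, reference))
```

The score ranges from 0 (no overlap) to 1 (identical masks), so a value such as 0.9688 indicates near-complete overlap between predicted and reference lung regions.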
Affiliation(s)
- Daniel I Morís
  - Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain
  - Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain
- Joaquim de Moura
  - Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain
  - Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain
- Shahab Aslani
  - Department of Computer Science, Centre for Medical Image Computing, University College London, UK
- Joseph Jacob
  - Department of Computer Science, Centre for Medical Image Computing, University College London, UK
  - Satsuma Lab, Centre for Medical Image Computing, University College London, UK
- Jorge Novo
  - Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain
  - Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain
- Marcos Ortega
  - Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain
  - Grupo VARPA, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain
9
Lee SJ, Kim JH. Applying Sequential Pattern Mining to Investigate the Temporal Relationships between Commonly Occurring Internal Medicine Diseases and Intervals for the Risk of Concurrent Disease in Canine Patients. Animals (Basel) 2023; 13:3359. [PMID: 37958114 PMCID: PMC10647901 DOI: 10.3390/ani13213359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 09/29/2023] [Accepted: 10/27/2023] [Indexed: 11/15/2023] Open
Abstract
Sequential pattern mining (SPM) is a data mining technique used for identifying common association rules in multiple sequential datasets and patterns in ordered events. In this study, we aimed to identify the relationships between commonly occurring internal medicine diseases in canine patients. We obtained medical records of dogs referred to the Konkuk University Veterinary Medicine Teaching Hospital. The data used for SPM included comorbidities and intervals between the diagnoses of internal medicine diseases. Additionally, we estimated the 3-year risk of developing an additional disease after the initial diagnosis of a commonly occurring veterinary internal medicine disease using logistic regression. We identified 547 canine patients diagnosed with ≥ 1 internal medicine disease. The SPM-based analysis assessed comorbidities and intervals for each of the five most common internal medical diseases, including hyperadrenocorticism, myxomatous mitral valve disease, canine atopic dermatitis, chronic kidney disease, and chronic pancreatitis. The highest values of the association rule were 3.01%, 6.02%, 3.9%, 4.1%, and 4.84%, and the shortest intervals were 1.64, 13.14, 5.37, 17.02, and 1.7 days, respectively. This study proposes that SPM is an effective technique for identifying common associations and temporal relationships between internal medicine diseases, and can be used to assess the probability of additional admission due to the development of the subsequent disease that may be diagnosed in canine patients. The results of this study will help veterinarians suggest appropriate preventive measures or other medical treatments for canine patients with medical conditions that have not yet been diagnosed, but are likely to develop in the short term.
Affiliation(s)
- Suk-Jun Lee
  - Department of Business Management, Kwangwoon University, 536 Nuri-Hall, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Republic of Korea
- Jung-Hyun Kim
  - Department of Veterinary Internal Medicine, College of Veterinary Medicine, Konkuk University, #120 Neungdong-ro, Gwangjin-gu, Seoul 05029, Republic of Korea
10
Ramakrishnaiah Y, Morris AP, Dhaliwal J, Philip M, Kuhlmann L, Tyagi S. Linc2function: A Comprehensive Pipeline and Webserver for Long Non-Coding RNA (lncRNA) Identification and Functional Predictions Using Deep Learning Approaches. EPIGENOMES 2023; 7:22. [PMID: 37754274 PMCID: PMC10528440 DOI: 10.3390/epigenomes7030022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 09/02/2023] [Accepted: 09/11/2023] [Indexed: 09/28/2023] Open
Abstract
Long non-coding RNAs (lncRNAs), comprising a significant portion of the human transcriptome, serve as vital regulators of cellular processes and potential disease biomarkers. However, the function of most lncRNAs remains unknown, and furthermore, existing approaches have focused on gene-level investigation. Our work emphasizes the importance of transcript-level annotation to uncover the roles of specific transcript isoforms. We propose that understanding the mechanisms of lncRNA in pathological processes requires solving their structural motifs and interactomes. A complete lncRNA annotation first involves discriminating them from their coding counterparts and then predicting their functional motifs and target bio-molecules. Current in silico methods mainly perform primary-sequence-based discrimination using a reference model, limiting their comprehensiveness and generalizability. We demonstrate that integrating secondary structure and interactome information, in addition to using transcript sequence, enables a comprehensive functional annotation. Annotating lncRNA for newly sequenced species is challenging due to inconsistencies in functional annotations, specialized computational techniques, limited accessibility to source code, and the shortcomings of reference-based methods for cross-species predictions. To address these challenges, we developed a pipeline for identifying and annotating transcript sequences at the isoform level. We demonstrate the effectiveness of the pipeline by comprehensively annotating the lncRNA associated with two specific disease groups. The source code of our pipeline is available under the MIT license for local use by researchers to make new predictions using the pre-trained models or to re-train models on new sequence datasets. Non-technical users can access the pipeline through a web server setup.
Affiliation(s)
- Yashpal Ramakrishnaiah
  - Central Clinical School, Monash University, Melbourne, VIC 3000, Australia
  - School of Computing Technologies, Royal Melbourne Institute of Technology University, Melbourne, VIC 3000, Australia
- Adam P. Morris
  - Monash Data Futures Institute, Monash University, Clayton, VIC 3800, Australia
- Jasbir Dhaliwal
  - School of Computing Technologies, Royal Melbourne Institute of Technology University, Melbourne, VIC 3000, Australia
- Melcy Philip
  - Central Clinical School, Monash University, Melbourne, VIC 3000, Australia
- Levin Kuhlmann
  - Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
- Sonika Tyagi
  - Central Clinical School, Monash University, Melbourne, VIC 3000, Australia
  - School of Computing Technologies, Royal Melbourne Institute of Technology University, Melbourne, VIC 3000, Australia
11
Chen D, Wang R, Jiang Y, Xing Z, Sheng Q, Liu X, Wang R, Xie H, Zhao L. Application of artificial neural network in daily prediction of bleeding in ICU patients treated with anti-thrombotic therapy. BMC Med Inform Decis Mak 2023; 23:171. [PMID: 37653495 PMCID: PMC10470146 DOI: 10.1186/s12911-023-02274-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 08/23/2023] [Indexed: 09/02/2023] Open
Abstract
OBJECTIVES Anti-thrombotic therapy is the basis of thrombosis prevention and treatment. Bleeding is the main adverse event of anti-thrombosis. Existing laboratory indicators cannot accurately reflect real-time coagulation function. It is necessary to develop tools that dynamically evaluate the risks and benefits of anti-thrombosis in order to prescribe accurate anti-thrombotic therapy. METHODS The prediction model, for daily prediction of bleeding risk in ICU patients treated with anti-thrombotic therapy, was built using recurrent neural networks, a deep learning algorithm, and the model results and performance were compared with those of clinicians. RESULTS There was no statistically significant difference at baseline. ROC curves of the four models in the validation and test sets were drawn. In the validation set, the one-layer GRU had a larger AUC (0.9462; 95% CI, 0.9147-0.9778). In the test set, the ROC curve showed the superiority of the two-layer LSTM over the one-layer GRU, with the former achieving an AUC of 0.8391 (95% CI, 0.7786-0.8997). The one-layer GRU in the test set had better specificity (sensitivity 0.5942; specificity 0.9300). The Fleiss' kappa values of junior clinicians, senior clinicians, and machine learning classifiers were 0.0984, 0.4562, and 0.8012, respectively. CONCLUSIONS Recurrent neural networks were first applied for daily prediction of bleeding risk in ICU patients treated with anti-thrombotic therapy. Deep learning classifiers are more reliable and consistent than human classifiers. The machine learning classifier showed strong reliability. The deep learning algorithm significantly outperformed human classifiers in prediction time.
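The agreement statistic in this abstract (Fleiss' kappa) compares the observed agreement among several raters with the agreement expected by chance. The following is a minimal sketch of the textbook formula, assuming every case is rated by the same number of raters; the function name `fleiss_kappa` and the count tables are illustrative assumptions, not the study's data or implementation.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of raters who assigned case i to category j.
    Every row must sum to the same number of raters (at least 2).
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Per-case agreement: fraction of rater pairs that agree on the case.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of each category.
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    expected = sum(p * p for p in shares)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical table: 2 cases, 3 raters, categories (bleeding, no bleeding).
print(fleiss_kappa([[3, 0], [0, 3]]))  # unanimous on every case
print(fleiss_kappa([[2, 1], [1, 2]]))  # split votes on every case
```

Values near 1 indicate agreement well above chance and values near 0 indicate agreement no better than chance, which is how the reported 0.0984, 0.4562, and 0.8012 can be compared.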
Affiliation(s)
- Daonan Chen
  - Department of Critical Care Medicine, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 650 New Songjiang Road, Songjiang, Shanghai, 201600, China
- Rui Wang
  - Department of Critical Care Medicine, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 650 New Songjiang Road, Songjiang, Shanghai, 201600, China
- Yihan Jiang
  - Department of Critical Care Medicine, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 650 New Songjiang Road, Songjiang, Shanghai, 201600, China
- Zijian Xing
  - Deepwise Artificial Intelligence Laboratory, Beijing, China
- Qiuyang Sheng
  - Deepwise Artificial Intelligence Laboratory, Beijing, China
- Xiaoqing Liu
  - Deepwise Artificial Intelligence Laboratory, Beijing, China
- Ruilan Wang
  - Department of Critical Care Medicine, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 650 New Songjiang Road, Songjiang, Shanghai, 201600, China
- Hui Xie
  - Department of Critical Care Medicine, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 650 New Songjiang Road, Songjiang, Shanghai, 201600, China.
- Lina Zhao
  - Department of Critical Care Medicine, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 650 New Songjiang Road, Songjiang, Shanghai, 201600, China.
12
Champa-Bujaico E, Díez-Pascual AM, García-Díaz P. Synthesis and Characterization of Polyhydroxyalkanoate/Graphene Oxide/Nanoclay Bionanocomposites: Experimental Results and Theoretical Predictions via Machine Learning Models. Biomolecules 2023; 13:1192. [PMID: 37627257 PMCID: PMC10452513 DOI: 10.3390/biom13081192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 07/20/2023] [Accepted: 07/27/2023] [Indexed: 08/27/2023] Open
Abstract
Predicting the mechanical properties of multiscale nanocomposites requires simulations that are costly from a practical viewpoint and time consuming. The use of algorithms for property prediction can reduce the extensive experimental work, saving time and costs. To assess this, ternary poly(hydroxybutyrate-co-hydroxyvalerate) (PHBV)-based bionanocomposites reinforced with graphene oxide (GO) and montmorillonite nanoclay were prepared herein via an environmentally friendly electrochemical process followed by solution casting. The aim was to evaluate the effectiveness of different Machine Learning (ML) models, namely Artificial Neural Network (ANN), Decision Tree (DT), and Support Vector Machine (SVM), in predicting their mechanical properties. The algorithms' input data were the Young's modulus, tensile strength, and elongation at break for various concentrations of the nanofillers (GO and nanoclay). The correlation coefficient (R2), mean absolute error (MAE), and mean square error (MSE) were used as statistical indicators to assess the performance of the models. The results demonstrated that ANN and SVM are useful for estimating the Young's modulus and elongation at break, with MSE values in the range of 0.64-1.0% and 0.14-0.28%, respectively. On the other hand, DT was more suitable for predicting the tensile strength, with the indicated error in the range of 0.02-9.11%. This study paves the way for the application of ML models as confident tools for predicting the mechanical properties of polymeric nanocomposites reinforced with different types of nanofiller, with a view to using them in practical applications such as biomedicine.
Affiliation(s)
- Elizabeth Champa-Bujaico
  - Universidad de Alcalá, Departamento de Teoría de la Señal y Comunicaciones, Ctra. Madrid-Barcelona Km. 33.6, 28805 Alcalá de Henares, Madrid, Spain; (E.C.-B.); (P.G.-D.)
- Ana M. Díez-Pascual
  - Universidad de Alcalá, Facultad de Ciencias, Departamento de Química Analítica, Química Física e Ingeniería Química, Ctra. Madrid-Barcelona Km. 33.6, 28805 Alcalá de Henares, Madrid, Spain
- Pilar García-Díaz
  - Universidad de Alcalá, Departamento de Teoría de la Señal y Comunicaciones, Ctra. Madrid-Barcelona Km. 33.6, 28805 Alcalá de Henares, Madrid, Spain; (E.C.-B.); (P.G.-D.)