1
Song R, Wang X, Zhang J, Chen S, Zhou J. GATDE: A graph attention network with diffusion-enhanced protein-protein interaction for cancer classification. Methods 2024; 231:70-77. [PMID: 39303774 DOI: 10.1016/j.ymeth.2024.09.003] [Received: 05/31/2024] [Revised: 08/11/2024] [Accepted: 09/04/2024] [Indexed: 09/22/2024]
Abstract
Cancer classification is crucial for effective patient treatment, and recent years have seen various methods emerge based on protein expression levels. However, existing methods oversimplify by assuming uniform interaction strengths and neglecting intermediate influences among proteins. Addressing these limitations, GATDE employs a graph attention network enhanced with diffusion on protein-protein interactions. By constructing a weighted protein-protein interaction network, GATDE captures the diversity of these interactions and uses a diffusion process to assess multi-hop influences between proteins. This information is subsequently incorporated into the graph attention network, resulting in precise cancer classification. Experimental results on breast cancer and pan-cancer datasets demonstrate that GATDE surpasses current leading methods. Additionally, in-depth case studies further validate the effectiveness of the diffusion process and the attention mechanism, highlighting GATDE's robustness and potential for real-world applications.
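The multi-hop idea in this abstract can be illustrated with a small sketch. The abstract does not say which diffusion kernel GATDE uses, so the code below assumes a truncated personalized-PageRank series, sum over k of alpha * (1 - alpha)^k * T^k, where T is the row-normalized weighted interaction matrix. The function names, the `alpha` and `steps` parameters, and the three-protein network are all hypothetical.

```python
def row_normalize(weights):
    """Turn a weighted adjacency matrix into a row-stochastic transition matrix."""
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total if total else 0.0 for w in row])
    return normalized


def matmul(a, b):
    """Plain dense matrix product for small lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def diffuse(weights, alpha=0.15, steps=10):
    """Assumed kernel: truncated series sum_k alpha * (1 - alpha)**k * T**k."""
    n = len(weights)
    transition = row_normalize(weights)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # T**0
    result = [[0.0] * n for _ in range(n)]
    for k in range(steps + 1):
        coeff = alpha * (1 - alpha) ** k
        for i in range(n):
            for j in range(n):
                result[i][j] += coeff * power[i][j]
        power = matmul(power, transition)
    return result


# Hypothetical chain A - B - C with unequal interaction strengths.
ppi = [
    [0.0, 0.9, 0.0],
    [0.9, 0.0, 0.3],
    [0.0, 0.3, 0.0],
]
scores = diffuse(ppi)
```

Here `scores[0][2]` is positive even though `ppi[0][2]` is zero, which is the kind of intermediate influence the abstract describes. How such scores would feed into the attention layers is not specified in the abstract and is not modeled here.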
Affiliation(s)
- Ruike Song
  - College of Software, Nankai University, Tianjin, China.
- Xiaofeng Wang
  - College of Software, Nankai University, Tianjin, China.
- Jiahao Zhang
  - College of Software, Nankai University, Tianjin, China.
- Shengquan Chen
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China.
- Jianyu Zhou
  - College of Software, Nankai University, Tianjin, China.

2
Ojha A, Zhao SJ, Akpunonu B, Zhang JT, Simo KA, Liu JY. Gap-App: A sex-distinct AI-based predictor for pancreatic ductal adenocarcinoma survival as a web application open to patients and physicians. bioRxiv 2024:2024.06.04.597246. [PMID: 38895246 PMCID: PMC11185613 DOI: 10.1101/2024.06.04.597246] [Indexed: 06/21/2024]
Abstract
In this study, using RNA-Seq gene expression data and advanced machine learning techniques, we identified distinct gene expression profiles between male and female pancreatic ductal adenocarcinoma (PDAC) patients. Building upon this insight, we developed sex-specific 3-year survival predictive models along with a single comprehensive model. These sex-specific models outperformed the single general model despite the smaller sample sizes. We further refined our models by using the most important features extracted from these initial models. The refined sex-specific predictive models achieved improved accuracies of 92.62% for males and 91.96% for females, respectively, versus an accuracy of 87.84% from the refined comprehensive model, further highlighting the value of sex-specific analysis. Based on these findings, we created Gap-App, a web application that enables the use of individual gene expression profiles combined with sex information for personalized survival predictions. Gap-App, the first online tool aiming to bridge the gap between complex genomic data and clinical application and facilitating more precise and individualized cancer care, marks a significant advancement in personalized prognosis. The study not only underscores the importance of acknowledging sex differences in personalized prognosis, but also sets the stage for the shift from traditional one-size-fits-all to more personalized and targeted medicine. The GAP-App service is freely available at www.gap-app.org.
Affiliation(s)
- Anuj Ojha
  - Department of Medicine, College of Medicine, University of Toledo, Toledo, OH, USA
  - Department of Bioengineering, College of Engineering, University of Toledo, Toledo, OH, USA
- Shu-Jun Zhao
  - Department of Medicine, College of Medicine, University of Toledo, Toledo, OH, USA
  - Department of Bioengineering, College of Engineering, University of Toledo, Toledo, OH, USA
- Basil Akpunonu
  - Department of Medicine, College of Medicine, University of Toledo, Toledo, OH, USA
- Jian-Ting Zhang
  - Department of Cell and Cancer Biology, College of Medicine, University of Toledo, Toledo, OH, USA
- Kerri A. Simo
  - Department of Surgery, College of Medicine, University of Toledo, Toledo, OH, USA
  - ProMedica Health System, ProMedica Cancer Institute, Toledo, OH, USA
- Jing-Yuan Liu
  - Department of Medicine, College of Medicine, University of Toledo, Toledo, OH, USA
  - Department of Cell and Cancer Biology, College of Medicine, University of Toledo, Toledo, OH, USA
  - Department of Bioengineering, College of Engineering, University of Toledo, Toledo, OH, USA

3
Shams A. Leveraging State-of-the-Art AI Algorithms in Personalized Oncology: From Transcriptomics to Treatment. Diagnostics (Basel) 2024; 14:2174. [PMID: 39410578 PMCID: PMC11476216 DOI: 10.3390/diagnostics14192174] [Received: 08/07/2024] [Revised: 09/17/2024] [Accepted: 09/23/2024] [Indexed: 10/20/2024]
Abstract
BACKGROUND: Continuous breakthroughs in computational algorithms have positioned AI-based models as some of the most sophisticated technologies in the healthcare system. AI contributes to advancing various medical fields involving data interpretation and monitoring, imaging screening and diagnosis, and treatment response and survival prediction. Despite advances in clinical oncology, more effort is needed to tailor therapeutic plans to each patient's unique transcriptomic profile within the precision/personalized oncology framework. Furthermore, the standard analysis method is not compatible with the comprehensive deciphering of significant data streams, thus precluding the prediction of accurate treatment options. METHODOLOGY: We proposed a novel approach that includes obtaining different tumour tissues and preparing RNA samples for comprehensive transcriptomic interpretation using specifically trained, programmed, and optimized AI-based models for extracting large data volumes, refining, and analyzing them. Next, the transcriptomic results will be scanned against an expansive drug library to predict the response of each target to the tested drugs. The obtained target-drug combination/s will then be validated using in vitro and in vivo experimental models. Finally, the best treatment combination option/s will be introduced to the patient. We also provided a comprehensive review discussing AI models' recent innovations and implementations to aid in molecular diagnosis and treatment planning. RESULTS: The expected transcriptomic analysis generated by the AI-based algorithms will provide an inclusive genomic profile for each patient, containing statistical and bioinformatics analyses, identification of the dysregulated pathways, detection of the targeted genes, and recognition of molecular biomarkers. Subjecting these results to the prediction and pairing AI-based processes will result in statistical graphs presenting each target's likely response rate to various treatment options. Different in vitro and in vivo investigations will further validate the selection of the target-drug pairs. CONCLUSIONS: Leveraging AI models will provide more rigorous manipulation of large-scale datasets on specific cancer care paths. Such a strategy would shape treatment according to each patient's demand, thus fortifying the avenue of personalized/precision medicine. Undoubtedly, this will assist in improving the oncology domain and alleviate the burden of clinicians in the coming decade.
Affiliation(s)
- Anwar Shams
  - Department of Pharmacology, College of Medicine, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia; Tel.: +00966-548638099
  - Research Center for Health Sciences, Deanship of Graduate Studies and Scientific Research, Taif University, Taif 26432, Saudi Arabia
  - High Altitude Research Center, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia

4
Abidalkareem A, Ibrahim AK, Abd M, Rehman O, Zhuang H. Identification of Gene Expression in Different Stages of Breast Cancer with Machine Learning. Cancers (Basel) 2024; 16:1864. [PMID: 38791943 PMCID: PMC11120052 DOI: 10.3390/cancers16101864] [Received: 03/20/2024] [Revised: 05/01/2024] [Accepted: 05/09/2024] [Indexed: 05/26/2024]
Abstract
Determining the tumor origin in humans is vital in clinical applications of molecular diagnostics. Metastatic cancer is usually a very aggressive disease with limited diagnostic procedures, despite the fact that many protocols have been evaluated for their effectiveness in prognostication. Research has shown that dysregulation in miRNAs (a class of non-coding, regulatory RNAs) is remarkably involved in oncogenic conditions. This research paper aims to develop a machine learning model that processes an array of miRNAs in 1097 metastatic tissue samples from patients who suffered from various stages of breast cancer. The suggested machine learning model is fed with miRNA quantitative read count data taken from The Cancer Genome Atlas Data Repository. Two main feature-selection techniques have been used, namely Neighborhood Component Analysis and Minimum Redundancy Maximum Relevance, to identify the most discriminant and relevant miRNAs for their up-regulated and down-regulated states. These miRNAs are then validated as biological identifiers for each of the four cancer stages in breast tumors. Both machine learning algorithms yield performance scores that are significantly higher than the traditional fold-change (FC) approach, particularly in earlier stages of cancer, with Neighborhood Component Analysis and Minimum Redundancy Maximum Relevance achieving accuracy scores of up to 0.983 and 0.931, respectively, compared to 0.920 for the FC method. This study underscores the potential of advanced feature-selection methods in enhancing the accuracy of cancer stage identification, paving the way for improved diagnostic and therapeutic strategies in oncology.
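For readers unfamiliar with the fold-change baseline mentioned in this abstract, the following is a minimal sketch of ranking features by absolute log2 fold change between two groups of read counts. It is not the authors' pipeline: the pseudocount, the function names, and the feature names `mir_a` and `mir_b` are assumptions made for illustration, and the counts are invented.

```python
import math


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean counts (group_b over group_a); the pseudocount avoids log(0)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


def rank_by_fold_change(counts_a, counts_b):
    """Order feature names by absolute log2 fold change, largest first."""
    scores = {name: log2_fold_change(counts_a[name], counts_b[name])
              for name in counts_a}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)


# Invented read counts for two hypothetical miRNAs in two stage groups.
early_stage = {"mir_a": [7, 7], "mir_b": [3, 3]}
late_stage = {"mir_a": [31, 31], "mir_b": [3, 3]}

ranking = rank_by_fold_change(early_stage, late_stage)
change = log2_fold_change(early_stage["mir_a"], late_stage["mir_a"])
```

A fold-change ranking scores each feature in isolation, whereas the feature-selection methods named in the abstract also account for relevance to the class label and, for Minimum Redundancy Maximum Relevance, redundancy between features.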
Affiliation(s)
- Ali Abidalkareem
  - EECS Department, Florida Atlantic University, Boca Raton, FL 33431, USA
- Ali K. Ibrahim
  - EECS Department, Florida Atlantic University, Boca Raton, FL 33431, USA
  - Harbor Branch Oceanographic Institute, Florida Atlantic University, Fort Pierce, FL 34946, USA
- Moaed Abd
  - Ocean and Mechanical Engineering Department, Florida Atlantic University, Boca Raton, FL 33431, USA
- Oneeb Rehman
  - EECS Department, Florida Atlantic University, Boca Raton, FL 33431, USA
- Hanqi Zhuang
  - EECS Department, Florida Atlantic University, Boca Raton, FL 33431, USA

5
Khanna NN, Singh M, Maindarkar M, Kumar A, Johri AM, Mentella L, Laird JR, Paraskevas KI, Ruzsa Z, Singh N, Kalra MK, Fernandes JFE, Chaturvedi S, Nicolaides A, Rathore V, Singh I, Teji JS, Al-Maini M, Isenovic ER, Viswanathan V, Khanna P, Fouda MM, Saba L, Suri JS. Polygenic Risk Score for Cardiovascular Diseases in Artificial Intelligence Paradigm: A Review. J Korean Med Sci 2023; 38:e395. [PMID: 38013648 PMCID: PMC10681845 DOI: 10.3346/jkms.2023.38.e395] [Received: 07/31/2023] [Accepted: 10/15/2023] [Indexed: 11/29/2023]
Abstract
Cardiovascular disease (CVD) related mortality and morbidity heavily strain society. The relationship between external risk factors and our genetics has not been well established. It is widely acknowledged that environmental influence and individual behaviours play a significant role in CVD vulnerability, leading to the development of polygenic risk scores (PRS). We employed the PRISMA search method to locate pertinent research and literature to extensively review artificial intelligence (AI)-based PRS models for CVD risk prediction. Furthermore, we analyzed and compared conventional vs. AI-based solutions for PRS. We summarized the recent advances in our understanding of the use of AI-based PRS for risk prediction of CVD. Our study proposes three hypotheses: i) Multiple genetic variations and risk factors can be incorporated into AI-based PRS to improve the accuracy of CVD risk prediction. ii) AI-based PRS for CVD circumvents the drawbacks of conventional PRS calculators by incorporating a larger variety of genetic and non-genetic components, allowing for more precise and individualised risk estimations. iii) Using AI approaches, it is possible to significantly reduce the dimensionality of huge genomic datasets, resulting in more accurate and effective disease risk prediction models. Our study highlighted that the AI-PRS model outperformed traditional PRS calculators in predicting CVD risk. Furthermore, using AI-based methods to calculate PRS may increase the precision of risk predictions for CVD and have significant ramifications for individualized prevention and treatment plans.
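As background for the comparison this abstract draws, a conventional polygenic risk score is commonly computed as a weighted sum of risk-allele dosages, with per-variant effect sizes as the weights. The sketch below shows only that baseline calculation; the function name and all numbers are hypothetical, and it says nothing about how the reviewed AI-based models are built.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(effect_sizes) != len(dosages):
        raise ValueError("need exactly one dosage per effect size")
    return sum(beta * dose for beta, dose in zip(effect_sizes, dosages))


# Invented per-variant effect sizes and one person's allele counts.
effect_sizes = [0.5, 0.25, -0.125]
dosages = [2, 1, 0]
score = polygenic_risk_score(effect_sizes, dosages)
```

Because this baseline is linear in the dosages, it cannot represent interactions between variants or non-genetic factors, which is the limitation the abstract's second hypothesis addresses.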
Affiliation(s)
- Narendra N Khanna
  - Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi, India
  - Asia Pacific Vascular Society, New Delhi, India
- Manasvi Singh
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
  - Bennett University, Greater Noida, India
- Mahesh Maindarkar
  - Asia Pacific Vascular Society, New Delhi, India
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
  - School of Bioengineering Sciences and Research, Maharashtra Institute of Technology's Art, Design and Technology University, Pune, India
- Amer M Johri
  - Department of Medicine, Division of Cardiology, Queen's University, Kingston, Canada
- Laura Mentella
  - Department of Medicine, Division of Cardiology, University of Toronto, Toronto, Canada
- John R Laird
  - Heart and Vascular Institute, Adventist Health St. Helena, St. Helena, CA, USA
- Zoltan Ruzsa
  - Invasive Cardiology Division, University of Szeged, Szeged, Hungary
- Narpinder Singh
  - Department of Food Science and Technology, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India
- Seemant Chaturvedi
  - Department of Neurology & Stroke Program, University of Maryland, Baltimore, MD, USA
- Andrew Nicolaides
  - Vascular Screening and Diagnostic Centre and University of Nicosia Medical School, Cyprus
- Vijay Rathore
  - Nephrology Department, Kaiser Permanente, Sacramento, CA, USA
- Inder Singh
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
- Jagjit S Teji
  - Ann and Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Mostafa Al-Maini
  - Allergy, Clinical Immunology and Rheumatology Institute, Toronto, ON, Canada
- Esma R Isenovic
  - Department of Radiobiology and Molecular Genetics, National Institute of The Republic of Serbia, University of Belgrade, Beograd, Serbia
- Puneet Khanna
  - Department of Anaesthesiology, AIIMS, New Delhi, India
- Mostafa M Fouda
  - Department of Electrical and Computer Engineering, Idaho State University, Pocatello, ID, USA
- Luca Saba
  - Department of Radiology, Azienda Ospedaliero Universitaria, Cagliari, Italy
- Jasjit S Suri
  - Asia Pacific Vascular Society, New Delhi, India
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
  - Department of Computer Engineering, Graphic Era Deemed to be University, Dehradun, India.

6
Cava C, D'Antona S, Maselli F, Castiglioni I, Porro D. From genetic correlations of Alzheimer's disease to classification with artificial neural network models. Funct Integr Genomics 2023; 23:293. [PMID: 37682415 PMCID: PMC10491691 DOI: 10.1007/s10142-023-01228-4] [Received: 06/28/2023] [Revised: 08/30/2023] [Accepted: 09/03/2023] [Indexed: 09/09/2023]
Abstract
Sporadic Alzheimer's disease (AD) is a complex neurological disorder characterized by many risk loci with potential associations with different traits and diseases. AD, characterized by a progressive loss of neuronal functions, manifests with different symptoms such as decline in memory, movement, coordination, and speech. The mechanisms underlying the onset of AD are not always fully understood, but involve a multiplicity of factors. Early diagnosis of AD plays a central role as it can offer the possibility of early treatment, which can slow disease progression. Currently, the methods of diagnosis are cognitive testing, neuroimaging, or cerebrospinal fluid analysis, which can be time-consuming, expensive, invasive, and not always accurate. In the present study, we performed a genetic correlation analysis using genome-wide association statistics from a large study of AD and UK Biobank to examine the association of AD with other human traits and disorders. In addition, because the hippocampus, a part of the cerebral cortex, could play a central role in several traits that are associated with AD, we analyzed the gene expression profiles of the hippocampus of AD patients using 4 different artificial neural network models. We found 65 traits correlated with AD, grouped into 9 clusters: medical conditions, fluid intelligence, education, anthropometric measures, employment status, activity, diet, lifestyle, and sexuality. The comparison of the 4 neural network models along with feature selection methods on 5 Alzheimer's gene expression datasets showed that the simple basic neural network model obtains better performance (66% accuracy) than the more complex methods with dropout and weight regularization of the network.
Affiliation(s)
- Claudia Cava
  - Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy.
  - Department of Science, Technology and Society, University School for Advanced Studies IUSS Pavia, Palazzo del Broletto, Piazza Della Vittoria 15, 27100, Pavia, Italy.
- Salvatore D'Antona
  - Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy
- Francesca Maselli
  - Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy
- Isabella Castiglioni
  - Department of Physics "Giuseppe Occhialini", University of Milan-Bicocca Piazza Dell'Ateneo Nuovo, 20126, Milan, Italy
- Danilo Porro
  - Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy
  - NBFC, National Biodiversity Future Center, 90133, Palermo, Italy

7
Beaude A, Rafiee Vahid M, Augé F, Zehraoui F, Hanczar B. AttOmics: attention-based architecture for diagnosis and prognosis from omics data. Bioinformatics 2023; 39:i94-i102. [PMID: 37387182 DOI: 10.1093/bioinformatics/btad232] [Indexed: 07/01/2023]
Abstract
MOTIVATION The increasing availability of high-throughput omics data allows for considering a new medicine centered on individual patients. Precision medicine relies on exploiting these high-throughput data with machine-learning models, especially the ones based on deep-learning approaches, to improve diagnosis. Due to the high-dimensional small-sample nature of omics data, current deep-learning models end up with many parameters and have to be fitted with a limited training set. Furthermore, interactions between molecular entities inside an omics profile are not patient specific but are the same for all patients. RESULTS In this article, we propose AttOmics, a new deep-learning architecture based on the self-attention mechanism. First, we decompose each omics profile into a set of groups, where each group contains related features. Then, by applying the self-attention mechanism to the set of groups, we can capture the different interactions specific to a patient. The results of different experiments carried out in this article show that our model can accurately predict the phenotype of a patient with fewer parameters than deep neural networks. Visualizing the attention maps can provide new insights into the essential groups for a particular phenotype. AVAILABILITY AND IMPLEMENTATION The code and data are available at https://forge.ibisc.univ-evry.fr/abeaude/AttOmics. TCGA data can be downloaded from the Genomic Data Commons Data Portal.
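The two steps in this abstract (split a profile into groups, then apply self-attention across the groups) can be sketched in a few lines. This is a simplified illustration, not the AttOmics architecture: it assumes fixed-size contiguous groups and uses the raw group vectors as queries, keys, and values, with no learned projections. All names and numbers are hypothetical.

```python
import math


def split_into_groups(profile, group_size):
    """Cut a flat feature vector into consecutive groups of equal size."""
    return [profile[i:i + group_size] for i in range(0, len(profile), group_size)]


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(groups):
    """Scaled dot-product attention with queries = keys = values = groups."""
    dim = len(groups[0])
    weights = []
    for query in groups:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in groups]
        weights.append(softmax(scores))
    outputs = [[sum(w * value[j] for w, value in zip(row, groups))
                for j in range(dim)]
               for row in weights]
    return outputs, weights


# Invented expression values for one patient, grouped two features at a time.
profile = [0.1, 0.4, 2.0, 1.5, 0.3, 0.2]
groups = split_into_groups(profile, 2)
outputs, attention = self_attention(groups)
```

Because the attention weights are computed from each patient's own values, a different profile yields a different `attention` matrix, which is how the interactions become patient specific.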
Affiliation(s)
- Aurélien Beaude
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
  - Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
- Milad Rafiee Vahid
  - Sanofi R&D Data and Data Science, Artificial Intelligence & Deep Analytics, Omics Data Science, 450 Water Street, Cambridge, MA 02142, United States
- Franck Augé
  - Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
- Farida Zehraoui
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
- Blaise Hanczar
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France

8
Tripathy G, Sharaff A. AEGA: enhanced feature selection based on ANOVA and extended genetic algorithm for online customer review analysis. The Journal of Supercomputing 2023; 79:1-30. [PMID: 37359344 PMCID: PMC10031171 DOI: 10.1007/s11227-023-05179-2] [Accepted: 03/07/2023] [Indexed: 06/28/2023]
Abstract
Sentiment analysis involves extricating and interpreting people's views, feelings, beliefs, etc., about diverse actualities such as services, goods, and topics. People intend to investigate the users' opinions on the online platform to achieve better performance. Regardless, the high-dimensional feature set in an online review study affects the interpretation of classification. Several studies have implemented different feature selection techniques; however, getting a high accuracy with a very minimal number of features is yet to be accomplished. This paper develops an effective hybrid approach based on an enhanced genetic algorithm (GA) and analysis of variance (ANOVA) to achieve this purpose. To beat the local minima convergence problem, this paper uses a unique two-phase crossover and impressive selection approach, gaining high exploration and fast convergence of the model. The use of ANOVA drastically reduces the feature size to minimize the computational burden of the model. Experiments are performed to estimate the algorithm performance using different conventional classifiers and algorithms like GA, Particle Swarm Optimization (PSO), Recursive Feature Elimination (RFE), Random Forest, ExtraTree, AdaBoost, GradientBoost, and XGBoost. The proposed novel approach gives impressive results using the Amazon Review dataset with an accuracy of 78.60 %, F1 score of 79.38 %, and an average precision of 0.87, and the Restaurant Customer Review dataset with an accuracy of 77.70 %, F1 score of 78.24 %, and average precision of 0.89 as compared to other existing algorithms. The result shows that the proposed model outperforms other algorithms with nearly 45 and 42% fewer features for the Amazon Review and Restaurant Customer Review datasets.
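The ANOVA filtering step described in this abstract can be illustrated with a one-way F-statistic computed per feature, keeping the highest-scoring columns. This sketch covers only that filter, under the assumption that features are ranked by F-statistic; the paper's genetic algorithm with two-phase crossover is not reproduced. Function names and data are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature; `groups` holds values per class."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        # Perfectly separated or constant feature: avoid division by zero.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_features(rows, labels, keep):
    """Return the indices of the `keep` columns with the largest F-statistic."""
    classes = sorted(set(labels))
    scores = []
    for col in range(len(rows[0])):
        groups = [[row[col] for row, y in zip(rows, labels) if y == c]
                  for c in classes]
        scores.append(anova_f(groups))
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return ranked[:keep]


# Invented data: column 0 separates the classes, column 1 does not.
rows = [[1, 7], [2, 3], [3, 5], [4, 4], [5, 6], [6, 5]]
labels = [0, 0, 0, 1, 1, 1]
selected = select_top_features(rows, labels, keep=1)
```

A filter like this is cheap because each feature is scored once, which fits the abstract's point that ANOVA shrinks the feature set before the more expensive search runs.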
Affiliation(s)
- Gyananjaya Tripathy
  - Department of Computer Science and Engineering, National Institute of Technology, Raipur, Chhattisgarh 492010 India
- Aakanksha Sharaff
  - Department of Computer Science and Engineering, National Institute of Technology, Raipur, Chhattisgarh 492010 India

9
Georgieva O. An Iterative Unsupervised Method for Gene Expression Differentiation. Genes (Basel) 2023; 14:412. [PMID: 36833339 PMCID: PMC9956932 DOI: 10.3390/genes14020412] [Received: 12/20/2022] [Revised: 01/24/2023] [Accepted: 02/01/2023] [Indexed: 02/09/2023]
Abstract
For several decades, understanding gene activity and its role in organisms' lives has been a research focus of scientists in different areas. A part of these investigations is the analysis of gene expression data for selecting differentially expressed genes. Methods that identify the genes of interest have been proposed based on statistical data analysis. The problem is that there is no good agreement among them, as different results are produced by distinct methods. By taking advantage of unsupervised data analysis, an iterative clustering procedure that finds differentially expressed genes shows promising results. In the present paper, a comparative study of the clustering methods applied for gene expression analysis is presented to explicate the choice of the clustering algorithm implemented in the method. An investigation of different distance measures is provided to reveal those that increase the efficiency of the method in finding the real data structure. Further, the method is improved by incorporating an additional aggregation measure based on the standard deviation of the expression levels. Its usage increases the gene distinction, as additional differentially expressed genes are found. The method is summarized in a detailed procedure. The significance of the method is demonstrated by an analysis of two mouse strain data sets. The differentially expressed genes defined by the proposed method are compared with those selected by well-known statistical methods applied to the same data set.
Affiliation(s)
- Olga Georgieva
  - Faculty of Mathematics and Informatics, Sofia University "St. Kliment Ohridski", 125 Tsarigradsko Shosse Blvd., bl. 2, 1113 Sofia, Bulgaria

10
Alakus TB, Baykara M. Comparison of Monkeypox and Wart DNA Sequences with Deep Learning Model. Applied Sciences 2022; 12:10216. [DOI: 10.3390/app122010216] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 09/01/2023]
Abstract
Following COVID-19, monkeypox has emerged and, within a short time, has been reported almost everywhere in the world. Monkeypox causes symptoms such as fever, chills, and headache. In addition, rashes appear on the skin and lumps form. Early diagnosis and treatment of monkeypox, which is a contagious disease, are of great importance. Expert interpretation and clinical examination are usually needed to detect monkeypox, which may slow the treatment process. Furthermore, monkeypox is sometimes confused with warts, which leads to incorrect diagnosis and treatment. Because of these disadvantages, in this study, the DNA sequences of HPV causing warts and MPV causing monkeypox were analyzed and the classification of these sequences was performed with a deep learning algorithm. The study consisted of four stages. In the first stage, DNA sequences of viruses that cause warts and monkeypox were obtained. In the second stage, these sequences were mapped using various DNA-mapping methods. In the third stage, the mapped sequences were classified using a deep learning algorithm. In the last stage, the performances of the DNA-mapping methods were compared by calculating accuracy and F1-score. At the end of the study, an average accuracy of 96.08% and an F1-score of 99.83% were obtained. These results showed that these two diseases can be effectively classified according to their DNA sequences.
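The second and fourth stages in this abstract (numeric mapping of sequences, then scoring with accuracy and F1) can be sketched as follows. The abstract does not name the mapping methods it compares, so the integer assignment below is an assumed example of one simple scheme, and the labels are invented; none of this reproduces the study's figures.

```python
# Assumed integer assignment; the study's actual mappings are not given in the abstract.
INTEGER_MAP = {"A": 0, "C": 1, "G": 2, "T": 3}


def integer_mapping(sequence):
    """Map each nucleotide to a fixed integer so a model can consume the sequence."""
    return [INTEGER_MAP[base] for base in sequence.upper()]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy over all samples and F1-score for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1


encoded = integer_mapping("acgt")
# Invented labels: 1 for one virus class, 0 for the other.
accuracy, f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Comparing mapping methods then amounts to repeating the same train-and-score loop with a different encoding function and tabulating the two metrics.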

11
Automated Recognition of Cancer Tissues through Deep Learning Framework from the Photoacoustic Specimen. Contrast Media & Molecular Imaging 2022; 2022:4356744. [PMID: 36017020 PMCID: PMC9385293 DOI: 10.1155/2022/4356744] [Received: 06/08/2022] [Revised: 06/26/2022] [Accepted: 07/15/2022] [Indexed: 11/30/2022]
Abstract
The fast advancement of biomedical research technology has expanded and enhanced the spectrum of diagnostic instruments. Various research groups have found optical imaging, ultrasonic imaging, and magnetic resonance imaging to create multifunctional devices that are critical for biomedical activities. Multispectral photoacoustic imaging, which integrates the ideas of optical and ultrasonic technologies, is one of the most essential instruments. At the same time, early cancer identification is becoming increasingly important in order to minimize fatality. Deep learning (DL) techniques have recently advanced to the point where they can be used to diagnose and classify cancer using biological images. This paper describes a hybrid optimization method that combines deep transfer learning-based cancer detection with multispectral photoacoustic imaging. The goal of the PS-ACO-RNN approach is to use ultrasound images to detect and classify the presence of cancer. Bilateral filtration (BF) is used as a noise removal approach in image processing. In addition, lightweight LEDNet models are used to segment the biological images. A feature extractor based on the particle swarm with ant colony optimization (PS-ACO) paradigm is also used. Finally, a recurrent neural network (RNN) model assigns appropriate class labels to the biological images. The effectiveness of the PS-ACO-RNN technique is verified using a benchmark database, and test results show that the PS-ACO-RNN approach works better than current approaches.

12
A Survival Status Classification Model for Osteosarcoma Patients Based on E-CNN-SVM and Multisource Data Fusion. Computational Intelligence and Neuroscience 2022; 2022:9464182. [PMID: 35855803 PMCID: PMC9288314 DOI: 10.1155/2022/9464182] [Received: 06/05/2022] [Revised: 06/27/2022] [Accepted: 06/28/2022] [Indexed: 12/01/2022]
Abstract
Traditional algorithms have the following drawbacks: (1) they only focus on a certain aspect of genetic data or local feature data of osteosarcoma patients, and the extracted feature information is not considered as a whole; (2) they do not equalize the sample data between categories; (3) the generalization ability of the model is weak, making it difficult to classify the survival status of osteosarcoma patients well. In this context, this paper designs a survival status prediction model for osteosarcoma patients based on E-CNN-SVM and multisource data fusion, taking into full consideration the small number of samples, high dimensionality, and interclass imbalance of osteosarcoma patients' genetic data. First, the model fuses four gene sequencing datasets highly correlated with bone tumors, using the random forest algorithm for dimensionality reduction, and then equalizes the data using a hybrid sampling method combining the SMOTE algorithm and the TomekLink algorithm; second, the CNN model with the incentive module is used to further extract features from the data for more accurate extraction of characteristic information; finally, the data are passed to the SVM model to further improve the stability and classification performance of the model. The model has been demonstrated to be more effective in improving the accuracy of the classification of patients with osteosarcoma.
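The hybrid sampling step in this abstract combines two standard ideas: SMOTE creates synthetic minority samples by interpolating between a sample and a neighbor, and Tomek links flag pairs of mutual nearest neighbors from different classes as candidates for removal. The sketch below shows both primitives in isolation, under the assumption of Euclidean distance; the function names and toy points are hypothetical, and the paper's full pipeline (random forest reduction, CNN, SVM) is not reproduced.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor(index, points):
    """Index of the closest other point (the first one wins on ties)."""
    others = [j for j in range(len(points)) if j != index]
    return min(others, key=lambda j: squared_distance(points[index], points[j]))


def tomek_links(points, labels):
    """Pairs (i, j) with i < j that are mutual nearest neighbors with different labels."""
    nearest = [nearest_neighbor(i, points) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nearest)
            if i < j and nearest[j] == i and labels[i] != labels[j]]


def smote_point(sample, neighbor, rng):
    """One synthetic sample on the segment between a minority sample and its neighbor."""
    u = rng.random()
    return tuple(s + u * (n - s) for s, n in zip(sample, neighbor))


# Invented 2-D points: the two classes touch near x = 1.
points = [(0.0, 0.0), (1.0, 0.0), (1.2, 0.0), (5.0, 0.0)]
labels = [0, 0, 1, 1]

links = tomek_links(points, labels)
synthetic = smote_point(points[2], points[3], random.Random(0))
```

Oversampling first and cleaning boundary pairs afterwards is one common way to combine the two steps; the abstract does not state the order or the parameters the authors used.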
|
13
|
Hilal AM, Malibari AA, Obayya M, Alzahrani JS, Alamgeer M, Mohamed A, Motwakel A, Yaseen I, Hamza MA, Zamani AS. Feature Subset Selection with Optimal Adaptive Neuro-Fuzzy Systems for Bioinformatics Gene Expression Classification. Computational Intelligence and Neuroscience 2022; 2022:1698137. [PMID: 35607459 PMCID: PMC9124108 DOI: 10.1155/2022/1698137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 04/20/2022] [Accepted: 04/27/2022] [Indexed: 01/28/2023]
Abstract
Recently, bioinformatics and computational biology applications such as gene expression analysis, cellular restoration, medical image processing, protein structure examination, and medical data classification have used fuzzy systems to offer effective solutions and decisions. Recent developments that combine fuzzy systems with artificial intelligence techniques enable the design of effective microarray gene expression classification models. In this context, this study introduces a novel feature subset selection with optimal adaptive neuro-fuzzy inference system (FSS-OANFIS) for gene expression classification. The major aim of the FSS-OANFIS model is to detect and classify gene expression data. To accomplish this, the FSS-OANFIS model designs an improved grey wolf optimizer-based feature selection (IGWO-FS) model to derive an optimal subset of features. In addition, the OANFIS model is employed for gene classification, and the parameters of the ANFIS model are tuned using the coyote optimization algorithm (COA). Applying the IGWO-FS and COA techniques helps to achieve better microarray gene expression classification outcomes. The FSS-OANFIS model was experimentally validated on the Leukemia, Prostate, DLBCL Stanford, and Colon Cancer datasets. The proposed FSS-OANFIS model achieved a maximum classification accuracy of 89.47%.
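As a rough illustration of how a grey wolf optimizer can search over feature subsets, the sketch below implements a basic binary grey wolf optimizer with a sigmoid transfer function. It is the plain algorithm, not the improved variant (IGWO-FS) of the study, whose modifications are not described in this abstract. The function names `binary_gwo` and `toy_fitness`, the set `INFORMATIVE`, and all parameter defaults are hypothetical; in practice the fitness would be a classifier's validation error plus a penalty on subset size.

```python
import math
import random


def binary_gwo(fitness, n_features, n_wolves=8, n_iter=30, seed=0):
    """Minimise fitness(mask) over 0/1 feature masks (needs n_wolves >= 3)."""
    rng = random.Random(seed)
    wolves = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_wolves)]
    best_mask, best_score = None, math.inf
    for it in range(n_iter + 1):
        ranked = sorted(wolves, key=fitness)
        alpha, beta, delta = ranked[:3]  # the three best wolves lead the pack
        if fitness(alpha) < best_score:
            best_mask, best_score = list(alpha), fitness(alpha)
        if it == n_iter:
            break  # the last pass only evaluates the final population
        a = 2.0 * (1 - it / n_iter)  # exploration factor decays from 2 towards 0
        new_wolves = []
        for wolf in wolves:
            new_wolf = []
            for d in range(n_features):
                moves = 0.0
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - wolf[d])
                    moves += leader[d] - A * D
                x = moves / 3.0  # continuous position: mean of the three guided moves
                prob = 1.0 / (1.0 + math.exp(-10 * (x - 0.5)))  # sigmoid transfer
                new_wolf.append(1 if rng.random() < prob else 0)
            new_wolves.append(new_wolf)
        wolves = new_wolves
    return best_mask, best_score


INFORMATIVE = {0, 3, 5}  # hypothetical indices of the useful features


def toy_fitness(mask):
    """Toy objective: penalise missed informative features and extra features."""
    missed = sum(1 for i in INFORMATIVE if not mask[i])
    extra = sum(mask) - sum(mask[i] for i in INFORMATIVE)
    return missed + 0.1 * extra


mask, score = binary_gwo(toy_fitness, n_features=8)
```

Because the best mask seen so far is tracked separately from the population, the returned score can never be worse than the best wolf of the initial random population.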
Affiliation(s)
- Anwer Mustafa Hilal
  - Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam Bin Abdulaziz University, AlKharj, Saudi Arabia
- Areej A. Malibari
  - Department of Industrial and Systems Engineering, College of Engineering, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Marwa Obayya
  - Department of Biomedical Engineering, College of Engineering, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Jaber S. Alzahrani
  - Department of Industrial Engineering, College of Engineering Alqunfudah, Umm Al-Qura University, Mecca, Saudi Arabia
- Mohammad Alamgeer
  - Department of Information Systems, College of Science & Art Mahayil, King Khalid University, Abha, Saudi Arabia
- Abdullah Mohamed
  - Research Centre, Future University, Egypt, New Cairo 11845, Egypt
- Abdelwahed Motwakel
  - Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam Bin Abdulaziz University, AlKharj, Saudi Arabia
- Ishfaq Yaseen
  - Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam Bin Abdulaziz University, AlKharj, Saudi Arabia
- Manar Ahmed Hamza
  - Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam Bin Abdulaziz University, AlKharj, Saudi Arabia
- Abu Sarwar Zamani
  - Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam Bin Abdulaziz University, AlKharj, Saudi Arabia
|
14
|
Saxena A, Rubens M, Ramamoorthy V, Zhang Z, Ahmed MA, McGranaghan P, Das S, Veledar E. A Brief Overview of Adaptive Designs for Phase I Cancer Trials. Cancers (Basel) 2022; 14:cancers14061566. [PMID: 35326715 PMCID: PMC8946506 DOI: 10.3390/cancers14061566] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/17/2022] [Revised: 03/16/2022] [Accepted: 03/17/2022] [Indexed: 12/18/2022] Open
Abstract
Simple Summary: Phase I cancer trials are important in new drug development for testing the safety and optimal dosage of cancer drugs, which are usually toxic. Understanding the biostatistical methodologies behind these designs is important for developing phase I studies that are safe for participants and that use optimal dosages for better outcomes. Several phase I designs are currently being refined and modified for better outcomes, and newer designs are continuously being developed. In this review article, we describe several important phase I study designs to provide a brief overview of existing methods. Our review could be helpful to members of the research community who intend to obtain a better yet concise summary of existing methods.
Abstract: Phase I studies are used to estimate the dose-toxicity profile of drugs and to select appropriate doses for successive studies. However, the literature on statistical methods used for phase I studies is extensive. The objective of this review is to provide a concise summary of existing and emerging techniques for selecting dosages that are appropriate for phase I cancer trials. Many advanced statistical studies have proposed novel and robust methods for adaptive designs that have shown significant advantages over conventional dose-finding methods. An increasing number of phase I cancer trials use adaptive designs, particularly during the early phases of the study. In this review, we describe nonparametric and algorithm-based designs such as the traditional 3 + 3, accelerated titration, Bayesian algorithm-based design, up-and-down design, and isotonic design. In addition, we describe parametric model-based designs such as the continual reassessment method, escalation with overdose control, and Bayesian decision-theoretic and optimal design. Ongoing studies continue to focus on improving and refining the existing models as well as developing newer methods. This study would help readers to assimilate core concepts and compare different phase I statistical methods under one banner. Nevertheless, other evolving methods require future reviews.
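For readers unfamiliar with the traditional 3 + 3 design named in this abstract, the following minimal simulation sketches one commonly described version of its escalation rule. It is an illustration under stated assumptions rather than the review's own specification: the function name `three_plus_three` and the `tox_probs` argument (assumed true probabilities of a dose-limiting toxicity at each dose level) are hypothetical, and variants of the design that de-escalate or expand the previous dose before declaring the maximum tolerated dose are not modelled.

```python
import random


def three_plus_three(tox_probs, seed=0):
    """Simulate one trial under a simplified traditional 3 + 3 rule.

    tox_probs[i] is the assumed true probability of a dose-limiting toxicity
    (DLT) at dose level i. Returns the index of the dose declared the maximum
    tolerated dose (MTD), or None if even the lowest dose is too toxic.
    """
    rng = random.Random(seed)

    def dlts_in_cohort(level, size=3):
        # Each patient independently experiences a DLT with the dose's probability.
        return sum(rng.random() < tox_probs[level] for _ in range(size))

    level = 0
    while True:
        dlts = dlts_in_cohort(level)
        if dlts == 1:
            dlts += dlts_in_cohort(level)  # 1 of 3: treat three more at this dose
        if dlts >= 2:
            # Two or more DLTs among 3 or 6 patients: stop, MTD is the dose below.
            return level - 1 if level > 0 else None
        if level == len(tox_probs) - 1:
            return level  # highest planned dose tolerated
        level += 1
```

Running the function over many seeds with a fixed `tox_probs` vector gives an empirical distribution of the selected dose, which is the kind of operating characteristic used to compare rule-based designs with model-based ones such as the continual reassessment method.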
Affiliation(s)
- Anshul Saxena
  - Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA
  - Robert Stempel College of Public Health & Social Work, Florida International University, Miami, FL 33199, USA
  - Correspondence: (A.S.); (P.M.)
- Muni Rubens
  - Miami Cancer Institute, Baptist Health South Florida, Miami, FL 33176, USA
- Venkataraghavan Ramamoorthy
  - Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA
- Zhenwei Zhang
  - Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA
- Md Ashfaq Ahmed
  - Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA
- Peter McGranaghan
  - Miami Cancer Institute, Baptist Health South Florida, Miami, FL 33176, USA
  - Department of Internal Medicine and Cardiology, Charité—Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, 10117 Berlin, Germany
  - Correspondence: (A.S.); (P.M.)
- Sankalp Das
  - Wellness and Employee Health, Baptist Health South Florida, Miami, FL 33176, USA
- Emir Veledar
  - Center for Advanced Analytics, Baptist Health South Florida, Miami, FL 33176, USA
  - Robert Stempel College of Public Health & Social Work, Florida International University, Miami, FL 33199, USA
|