1
|
Bai Z, Bartelo N, Aslam M, Murphy EA, Hale CR, Blachere NE, Parveen S, Spolaore E, DiCarlo E, Gravallese EM, Smith MH, Frank MO, Jiang CS, Zhang H, Pyrgaki C, Lewis MJ, Sikandar S, Pitzalis C, Lesnak JB, Mazhar K, Price TJ, Malfait AM, Miller RE, Zhang F, Goodman S, Darnell RB, Wang F, Orange DE. Synovial fibroblast gene expression is associated with sensory nerve growth and pain in rheumatoid arthritis. Sci Transl Med 2024; 16:eadk3506. [PMID: 38598614 DOI: 10.1126/scitranslmed.adk3506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 03/21/2024] [Indexed: 04/12/2024]
Abstract
It has been presumed that rheumatoid arthritis (RA) joint pain is related to inflammation in the synovium; however, recent studies reveal that pain scores in patients do not correlate with synovial inflammation. We developed a machine-learning approach (graph-based gene expression module identification or GbGMI) to identify an 815-gene expression module associated with pain in synovial biopsy samples from patients with established RA who had limited synovial inflammation at arthroplasty. We then validated this finding in an independent cohort of synovial biopsy samples from patients who had early untreated RA with little inflammation. Single-cell RNA sequencing analyses indicated that most of these 815 genes were most robustly expressed by lining layer synovial fibroblasts. Receptor-ligand interaction analysis predicted cross-talk between human lining layer fibroblasts and human dorsal root ganglion neurons expressing calcitonin gene-related peptide (CGRP+). Both RA synovial fibroblast culture supernatant and netrin-4, which is abundantly expressed by lining fibroblasts and was within the GbGMI-identified pain-associated gene module, increased the branching of pain-sensitive murine CGRP+ dorsal root ganglion neurons in vitro. Imaging of solvent-cleared synovial tissue with little inflammation from humans with RA revealed CGRP+ pain-sensing neurons encasing blood vessels growing into synovial hypertrophic papilla. Together, these findings support a model whereby synovial lining fibroblasts express genes associated with pain that enhance the growth of pain-sensing neurons into regions of synovial hypertrophy in RA.
Affiliation(s)
- Zilong Bai
  - Weill Cornell Medicine, New York, NY 10065, USA
- Caryn R Hale
  - Rockefeller University, New York, NY 10065, USA
  - Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Nathalie E Blachere
  - Rockefeller University, New York, NY 10065, USA
  - Howard Hughes Medical Institute, Rockefeller University, New York, NY 10065, USA
- Myles J Lewis
  - Queen Mary University of London & NIHR BRC Barts Health NHS Trust, London E1 4NS, UK
- Shafaq Sikandar
  - Queen Mary University of London & NIHR BRC Barts Health NHS Trust, London E1 4NS, UK
- Costantino Pitzalis
  - Queen Mary University of London & NIHR BRC Barts Health NHS Trust, London E1 4NS, UK
  - Department of Biomedical Sciences, Humanitas University & IRCC Humanitas Research Hospital, Milan 20072, Italy
- Fan Zhang
  - University of Colorado School of Medicine, Aurora, CO 80045, USA
- Susan Goodman
  - Hospital for Special Surgery, New York, NY 10021, USA
- Robert B Darnell
  - Rockefeller University, New York, NY 10065, USA
  - Howard Hughes Medical Institute, Rockefeller University, New York, NY 10065, USA
- Fei Wang
  - Weill Cornell Medicine, New York, NY 10065, USA
- Dana E Orange
  - Rockefeller University, New York, NY 10065, USA
  - Hospital for Special Surgery, New York, NY 10021, USA
|
2
|
Augustine J, Jereesh AS. Identification of gene-level methylation for disease prediction. Interdiscip Sci 2023; 15:678-695. [PMID: 37603212 DOI: 10.1007/s12539-023-00584-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 07/30/2023] [Accepted: 08/01/2023] [Indexed: 08/22/2023]
Abstract
DNA methylation is an epigenetic alteration that plays a fundamental part in governing gene regulatory processes. The DNA methylation mechanism affixes methyl groups to specific cytosine residues, influencing chromatin architecture. Multiple studies have demonstrated that DNA methylation's regulatory effect on genes is linked to the onset and progression of several disorders. Researchers have recently uncovered thousands of phenotype-related methylation sites through epigenome-wide association studies (EWAS). However, combining the methylation levels of several sites within a gene to determine gene-level DNA methylation remains challenging. In this study, we propose the supervised UMAP Assisted Gene-level Methylation method (sUAGM) for disease prediction, based on supervised UMAP (Uniform Manifold Approximation and Projection), a manifold learning method for dimensionality reduction. The gene-level methylation values generated by the proposed method are evaluated by applying various feature selection and classification algorithms to three distinct DNA methylation datasets derived from blood samples. Performance is assessed using classification accuracy, F1 score, Matthews Correlation Coefficient (MCC), Kappa, Classification Success Index (CSI) and Jaccard Index. The Support Vector Machine with linear kernel (SVML) classifier with Recursive Feature Elimination (RFE) performs best across all three datasets. In a comparative analysis, our method outperformed existing gene-level and site-level approaches, achieving 100% accuracy and F1 score with fewer genes. Functional analysis of the top 28 genes selected from the Parkinson's disease dataset revealed a significant association with the disease.
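As an illustration of the evaluation metrics named in this abstract, the following is a minimal sketch of how accuracy, F1, MCC, Cohen's kappa and the Jaccard index can be derived from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and not taken from the paper; CSI is omitted because the abstract does not define it.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Return common agreement metrics for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Matthews Correlation Coefficient: balanced even for skewed classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = ((accuracy - p_expected) / (1 - p_expected)
             if p_expected != 1 else 0.0)
    # Jaccard index for the positive class.
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc,
            "kappa": kappa, "jaccard": jaccard}

# Hypothetical counts for a disease-vs-control classifier.
example = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(example)
```

Reporting MCC and kappa alongside accuracy is useful because both remain informative when one class dominates the dataset.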
Affiliation(s)
- Jisha Augustine
  - Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Cochin, Kerala, 682022, India
- A S Jereesh
  - Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Cochin, Kerala, 682022, India
|
3
|
Arafa A, El-Fishawy N, Badawy M, Radad M. RN-Autoencoder: Reduced Noise Autoencoder for classifying imbalanced cancer genomic data. J Biol Eng 2023; 17:7. [PMID: 36717866 PMCID: PMC9887895 DOI: 10.1186/s13036-022-00319-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2022] [Accepted: 12/12/2022] [Indexed: 01/31/2023] [Open access]
Abstract
BACKGROUND: In the current genomic era, gene expression datasets have become one of the main tools used in cancer classification. The curse of dimensionality and class imbalance are both inherent characteristics of these datasets, and both degrade the performance of most classifiers applied to genomic data. RESULTS: This paper introduces the Reduced Noise-Autoencoder (RN-Autoencoder) for pre-processing imbalanced genomic datasets for precise cancer classification. First, RN-Autoencoder addresses the curse of dimensionality by using an autoencoder for feature reduction, generating new extracted data with lower dimensionality. Next, RN-Autoencoder passes the extracted data to the Reduced Noise-Synthetic Minority Over-sampling Technique (RN-SMOTE), which efficiently solves the class imbalance problem in the extracted data. RN-Autoencoder was evaluated using different classifiers and various imbalanced datasets with different imbalance ratios. The results show that classifier performance improved with RN-Autoencoder, exceeding the performance obtained with the original data and with the extracted data by margins that depend on the classifier, dataset and evaluation metric. RN-Autoencoder was also compared with the current state of the art, yielding increases in test accuracy of up to 18.017, 19.183, 18.58 and 8.87% on the colon, leukemia, Diffuse Large B-Cell Lymphoma (DLBCL) and Wisconsin Diagnostic Breast Cancer (WDBC) datasets, respectively. CONCLUSION: RN-Autoencoder is a model for cancer classification using imbalanced gene expression datasets. It uses an autoencoder to reduce the high dimensionality of the gene expression datasets and then handles the class imbalance using RN-SMOTE. RN-Autoencoder was evaluated using many different classifiers and many different imbalanced datasets. The performance of many classifiers improved, and some classified cancer with 100% performance on all metrics used. In addition, RN-Autoencoder outperformed many recent works using the same datasets.
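The oversampling step described above can be illustrated with a minimal sketch of plain SMOTE-style interpolation between minority-class neighbours. This is not the authors' RN-SMOTE (its noise-reduction stage is not reproduced here), and the function name `smote_like`, its parameters and the toy points are assumptions made for illustration only.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of the base point, excluding the point itself.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist2(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic

# Hypothetical low-dimensional codes, e.g. the output of an autoencoder.
minority_codes = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like(minority_codes, n_new=5)
print(new_points)
```

Applying the oversampling after dimensionality reduction, as the abstract describes, means neighbours are searched in the compact extracted space rather than among thousands of gene features.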
Affiliation(s)
- Ahmed Arafa
  - Faculty of Electronic Engineering, Menoufia University, El-Gish Street, Box No. 32951, Menouf, Menoufia, Egypt (GRID grid.411775.1; ISNI 0000 0004 0621 4712)
- Nawal El-Fishawy
  - Faculty of Electronic Engineering, Menoufia University, El-Gish Street, Box No. 32951, Menouf, Menoufia, Egypt (GRID grid.411775.1; ISNI 0000 0004 0621 4712)
- Mohammed Badawy
  - Faculty of Electronic Engineering, Menoufia University, El-Gish Street, Box No. 32951, Menouf, Menoufia, Egypt (GRID grid.411775.1; ISNI 0000 0004 0621 4712)
- Marwa Radad
  - Faculty of Electronic Engineering, Menoufia University, El-Gish Street, Box No. 32951, Menouf, Menoufia, Egypt (GRID grid.411775.1; ISNI 0000 0004 0621 4712)
|
4
|
Zhang C, Cao X. Biological gene extraction path based on knowledge graph and natural language processing. Front Genet 2023; 13:1086379. [PMID: 36712855 PMCID: PMC9880067 DOI: 10.3389/fgene.2022.1086379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Accepted: 12/09/2022] [Indexed: 01/15/2023] [Open access]
Abstract
The continuous progress of society and the vigorous development of science and technology have brought new prospects for maintaining health and for preventing and controlling disease. At the same time, as bioinformatics technology has advanced, biological gene research has undergone revolutionary changes. However, one long-standing problem in genetic research continues to trouble researchers: how to find the most useful genes among a large number of sample genes, so as to avoid unnecessary work and reduce research costs. Studying the extraction path of biological genes can help researchers extract the most valuable genes and avoid wasting time and energy. To address this problem, this paper used the Bhattacharyya distance index and the Gini index to screen the sample genes when extracting the characteristic genes of breast cancer. From the 49 selected public genes, 6 principal components were extracted by principal component analysis (PCA), and the experimental results were then tested. When the number of characteristic genes was set to the optimal value of 5, the gene recognition rate reached its maximum of 90.31%, which met the experimental requirements. In addition, the experiment showed that the characteristic gene extraction method designed in this paper removed 99.75% of redundant genes, which can greatly reduce the time and cost of research.
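The two screening criteria mentioned in this abstract can be sketched as follows. This is only an illustrative reading: the Bhattacharyya distance is computed under a univariate Gaussian assumption per class, and the Gini index is taken as the Gini impurity of a set of class labels. The function names and the toy expression values are hypothetical, and the paper may define both indices differently.

```python
import math
from statistics import mean, pvariance

def bhattacharyya_gaussian(a, b):
    """Bhattacharyya distance between two samples of one gene's expression,
    assuming each class is Gaussian with non-zero variance."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = pvariance(a), pvariance(b)
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return mean_term + var_term

def gini_impurity(labels):
    """Gini impurity of a label list: 0 for a pure set, higher when mixed."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

# Hypothetical expression values of one gene in two sample groups.
healthy = [1.0, 1.2, 0.9, 1.1]
tumour_far = [3.0, 3.3, 2.9, 3.1]    # well separated from healthy
tumour_near = [1.1, 1.3, 1.0, 1.2]   # heavily overlapping with healthy

print(bhattacharyya_gaussian(healthy, tumour_far))
print(bhattacharyya_gaussian(healthy, tumour_near))
print(gini_impurity(["a", "a", "b", "b"]))
```

A gene whose class distributions are far apart gets a large Bhattacharyya distance and is a stronger candidate to keep, while a gene whose best split leaves high Gini impurity separates the classes poorly.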
Affiliation(s)
- Canlin Zhang
  - Sorenson Communications, Salt Lake City, UT, United States
- Xiaopei Cao (corresponding author)
  - College of Creative Culture and Communication, Zhejiang Normal University, Jinhua, Zhejiang, China
|
5
|
Shakya AK, Ramola A, Singh S, Vidyarthi A. Optimum supervised classification algorithm identification by investigating PlanetScope and Skysat multispectral satellite data of Covid lockdown. GEOSYSTEMS AND GEOENVIRONMENT 2022:100163. [PMCID: PMC9756603 DOI: 10.1016/j.geogeo.2022.100163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
This research identifies the optimum supervised classification algorithm by modeling Covid-19 lockdown situations around the world. The deadly Covid-19 virus suddenly stopped the fast-moving world: all commercial and noncommercial activities halted for an uncertain period during 2020-2021. In this work, object-based image classification approaches are used to compare pre-Covid and post-Covid (at the time of lockdown) images of the study areas. The study areas are Washington DC, USA; Sao Paulo, Brazil; Cairo, Egypt; the Afghanistan/Iran border; and Beijing, China. The study areas have different geographical conditions but experienced similar Covid-19 lockdowns. Six supervised image classification techniques are used to classify the satellite data of the study areas: Parallelepiped classification (PPC), Minimum distance classification (MDC), Mahalanobis distance classification (MaDC), Maximum likelihood classification (MLC), Spectral angle mapper classification (SAMC) and Spectral information divergence classification (SIDC). Based on the classification results and statistical features, PPC obtained the least significant results. In contrast, the most reliable results and highest classification accuracies were obtained with the MDC, MaDC and MLC classification algorithms.
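Two of the six classifiers listed above, minimum distance classification and the spectral angle mapper, are simple enough to sketch per pixel. The snippet below is an illustrative sketch only: the class names, the four-band reflectance values and the function names are invented for the example and do not come from the study.

```python
import math

def minimum_distance_class(pixel, class_means):
    """MDC: assign the pixel to the class whose mean spectrum is nearest
    in Euclidean distance."""
    return min(class_means, key=lambda c: math.dist(pixel, class_means[c]))

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra, insensitive to overall brightness."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = math.hypot(*pixel) * math.hypot(*reference)
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def sam_class(pixel, references):
    """SAMC: assign the pixel to the class with the smallest spectral angle."""
    return min(references, key=lambda c: spectral_angle(pixel, references[c]))

# Hypothetical mean reflectances in four bands (blue, green, red, NIR).
class_spectra = {
    "water": [0.10, 0.10, 0.05, 0.02],
    "vegetation": [0.05, 0.08, 0.04, 0.50],
}
sample_pixel = [0.06, 0.09, 0.05, 0.45]

print(minimum_distance_class(sample_pixel, class_spectra))
print(sam_class(sample_pixel, class_spectra))
```

Because the spectral angle ignores vector length, SAMC tolerates illumination differences between scenes, whereas MDC compares absolute reflectance values.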
Affiliation(s)
- Amit Kumar Shakya (corresponding author)
  - Department of Electronics and Communication Engineering, Sant Longowal Institute of Engineering and Technology, Longowal 148106 Sangrur, India
- Ayushman Ramola
  - Department of Electronics and Communication Engineering, Sant Longowal Institute of Engineering and Technology, Longowal 148106 Sangrur, India
- Surinder Singh
  - Department of Electronics and Communication Engineering, Sant Longowal Institute of Engineering and Technology, Longowal 148106 Sangrur, India
- Anurag Vidyarthi
  - Department of Electronics and Communication, Graphic Era (Deemed to be University), Dehradun 248002, Uttarakhand, India
|
6
|
Qin X, Zhang S, Yin D, Chen D, Dong X. Two-stage feature selection for classification of gene expression data based on an improved Salp Swarm Algorithm. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2022; 19:13747-13781. [PMID: 36654066 DOI: 10.3934/mbe.2022641] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
Microarray technology has developed rapidly in recent years, producing large amounts of ultra-high-dimensional gene expression data. However, because of the extreme imbalance between sample size and dimensionality in gene expression data, screening important genes is very challenging. For small-sample, high-dimensional biomedical data, this paper proposes a two-stage feature selection framework that combines wrapper, embedded and filter methods to avoid the curse of dimensionality. The framework uses weighted gene co-expression network analysis (WGCNA), random forest and minimal redundancy maximal relevance (mRMR) for first-stage feature selection. In the second stage, a new gene selection method based on an improved binary Salp Swarm Algorithm is proposed, which is combined with machine learning methods to adaptively select feature subsets suited to the classification algorithm. Finally, classification accuracy is evaluated using six methods: LightGBM, RF, SVM, XGBoost, MLP and KNN. To verify the performance of the framework and the effectiveness of the proposed algorithm, the number of genes selected and the classification accuracy were compared with those of five other intelligent optimization algorithms. The results show that the proposed framework achieves accuracy equal to or higher than that of other advanced intelligent algorithms on 10 datasets, exceeding 97.6% on all of them. This indicates that the proposed method can solve feature selection problems for high-dimensional data, and that the framework is not restricted to particular datasets and can be applied in other fields involving feature selection.
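The mRMR step named in the first stage can be illustrated with a small greedy sketch. As a simplifying assumption, absolute Pearson correlation stands in for the mutual-information terms usually used in mRMR; the WGCNA, random forest and Salp Swarm stages are not shown, and the gene names and values are invented for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def mrmr(features, target, k):
    """Greedy mRMR: repeatedly add the feature with the highest
    relevance-to-target minus mean redundancy-to-selected score."""
    relevance = {f: abs(pearson(v, target)) for f, v in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(abs(pearson(features[f], features[s]))
                             for s in selected) / len(selected)
            return relevance[f] - redundancy
        best = max((f for f in features if f not in selected), key=score)
        selected.append(best)
    return selected

# Hypothetical expression values for three genes across six samples.
labels = [0, 0, 0, 1, 1, 1]
genes = {
    "g1": [1, 2, 3, 7, 8, 9],
    "g1_dup": [1, 2, 3, 7, 8, 9.5],   # nearly a copy of g1
    "g2": [5, 1, 3, 6, 4, 8],         # weaker but less redundant
}
print(mrmr(genes, labels, k=2))
```

In this toy data the near-duplicate gene is passed over in the second round even though it is individually more relevant than `g2`, which is the behaviour the redundancy penalty is meant to produce.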
Affiliation(s)
- Xiwen Qin
  - School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Shuang Zhang
  - School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Dongmei Yin
  - School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Dongxue Chen
  - School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
- Xiaogang Dong
  - School of Mathematics and Statistics, Changchun University of Technology, Changchun 130012, China
|
7
|
Liu Z, Wang R, Zhang W. Improving the generalization of unsupervised feature learning by using data from different sources on gene expression data for cancer diagnosis. Med Biol Eng Comput 2022; 60:1055-1073. [DOI: 10.1007/s11517-022-02522-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2021] [Accepted: 01/30/2022] [Indexed: 10/19/2022]
|
8
|
Asad M, Halim Z, Waqas M, Tu S. An In-ad contents-based viewability prediction framework using Artificial Intelligence for Web Ads. Artif Intell Rev 2021. [DOI: 10.1007/s10462-021-10013-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/21/2022]
|
9
|
Ullah S, Halim Z. Imagined character recognition through EEG signals using deep convolutional neural network. Med Biol Eng Comput 2021; 59:1167-1183. [PMID: 33945075 DOI: 10.1007/s11517-021-02368-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/03/2020] [Accepted: 04/27/2021] [Indexed: 11/28/2022]
Abstract
An electroencephalography (EEG)-based brain computer interface (BCI) enables people to interact directly with computing devices through their brain signals. A BCI typically interprets EEG signals to reflect the user's intent or other mental activity. Motor imagery (MI) is a commonly used technique in BCIs in which a user is asked to imagine moving a certain part of the body, such as a hand or a foot. By correctly interpreting the signal, one can perform a multitude of tasks such as controlling a wheelchair, playing computer games, or even typing text. However, the use of motor-imagery-based BCIs outside the laboratory is limited by their lack of reliability. This work focuses on another kind of mental imagery, namely visual imagery (VI). VI is the manipulation of visual information that comes from memory. This work presents a deep convolutional neural network (DCNN)-based system for recognizing visually imagined letters of the English alphabet, so as to enable typing directly via brain signals. The DCNN learns to extract the spatial features hidden in the EEG signal. Unlike many deep neural networks that classify raw EEG signals, this work transforms the raw signals into band powers using the Morlet wavelet transform. The proposed approach is evaluated on two publicly available benchmark MI-EEG datasets and a visual imagery dataset collected specifically for this work. The results demonstrate that the proposed model outperforms existing state-of-the-art methods for MI-EEG classification and yields an average accuracy of 99.45% on the two public MI-EEG datasets. The model also achieves an average recognition rate of 95.2% for the 26 letters of the English alphabet. Graphical abstract: overall working of the proposed solution for imagined character recognition through EEG signals.
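The transformation of a raw signal into band power with a Morlet wavelet can be sketched for a single channel and a single frequency as below. This is a minimal illustration under assumed settings (a 128 Hz sampling rate, 5 wavelet cycles, a synthetic 10 Hz sine as input); the function name `morlet_power` and all parameter values are hypothetical and are not taken from the paper.

```python
import cmath
import math

def morlet_power(signal, fs, freq, n_cycles=5):
    """Mean power of `signal` around `freq` (Hz), estimated by correlating
    the signal with a complex Morlet wavelet sampled at rate `fs`."""
    sigma = n_cycles / (2 * math.pi * freq)   # Gaussian width in seconds
    half = int(3 * sigma * fs)                # wavelet support: +/- 3 sigma
    if len(signal) <= 2 * half:
        raise ValueError("signal too short for this frequency")
    wavelet = [
        cmath.exp(2j * math.pi * freq * (k / fs))
        * math.exp(-((k / fs) ** 2) / (2 * sigma ** 2))
        for k in range(-half, half + 1)
    ]
    norm = sum(abs(w) for w in wavelet)
    powers = []
    for i in range(half, len(signal) - half):
        acc = sum(signal[i + k] * wavelet[k + half].conjugate()
                  for k in range(-half, half + 1))
        powers.append(abs(acc / norm) ** 2)
    return sum(powers) / len(powers)

# Synthetic two-second test signal: a pure 10 Hz oscillation at 128 Hz.
FS = 128
example_signal = [math.sin(2 * math.pi * 10 * n / FS) for n in range(2 * FS)]

alpha_power = morlet_power(example_signal, FS, freq=10)
beta_power = morlet_power(example_signal, FS, freq=25)
print(alpha_power, beta_power)
```

Repeating this over several frequencies and channels yields a compact band-power representation that a convolutional network could take as input instead of raw samples.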
Affiliation(s)
- Sadiq Ullah
  - The Machine Intelligence Research Group (MInG), Faculty of Computer Science and Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi, Pakistan
  - Department of Computer Science, Namal Institute, Mianwali, Pakistan
- Zahid Halim
  - The Machine Intelligence Research Group (MInG), Faculty of Computer Science and Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi, Pakistan
|
10
|
Deshpande NM, Gite S, Aluvalu R. A review of microscopic analysis of blood cells for disease detection with AI perspective. PeerJ Comput Sci 2021; 7:e460. [PMID: 33981834 PMCID: PMC8080427 DOI: 10.7717/peerj-cs.460] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/04/2020] [Accepted: 03/06/2021] [Indexed: 05/07/2023]
Abstract
BACKGROUND: Any contamination in the human body can prompt changes in blood cell morphology and in various cell parameters. Microscopic images of blood cells are examined to recognize contamination inside the body and to anticipate diseases and abnormalities. Appropriate segmentation of these cells makes disease detection more accurate and robust. Microscopic blood cell analysis is a critical activity in pathological analysis. It involves identifying the relevant disease after accurate localization, followed by classification of abnormalities, which plays an essential role in the diagnosis of various disorders, treatment planning, and assessment of treatment outcomes. METHODOLOGY: This paper surveys the different areas in which microscopic imaging of blood cells is used for disease detection. Research papers in this area were obtained from a popular search engine, Google Scholar. The articles were searched starting from the basics of blood, such as its composition, followed by blood staining, which is mandatory before microscopic analysis. Different methods for classification and segmentation of blood cells are reviewed. Microscopic analysis using image processing, computer vision and machine learning is the main focus of the review. The methodologies that different researchers have employed for blood cell analysis with these algorithms are the key point of the study. RESULTS: Different methodologies used for microscopic analysis of blood cells are analyzed and compared according to different performance measures, and a conclusion is drawn from this extensive review. CONCLUSION: Researchers have employed a range of machine learning and deep learning algorithms for segmentation of blood cell components and for disease detection based on microscopic analysis. There is scope for improvement in terms of different performance evaluation parameters. Different bio-inspired optimization algorithms can be used for improvement. Explainable AI can analyze the features of an AI-based system and make the system more trusted and commercially suitable.
Affiliation(s)
- Nilkanth Mukund Deshpande
  - Department of Electronics and Telecommunication, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, Maharashtra, India
  - Electronics & Telecommunication Department, Dr. Vithalrao Vikhe Patil College of Engineering, Ahmednagar, Maharashtra, India
- Shilpa Gite
  - Department of Computer Science, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, Maharashtra, India
  - Symbiosis Center for Applied Artificial Intelligence (SCAAI), Symbiosis International (Deemed University), Pune, Maharashtra, India
- Rajanikanth Aluvalu
  - Department of CSE, Vardhaman College of Engineering, Hyderabad, Telangana, India
|
11
|
A novel binary chaotic genetic algorithm for feature selection and its utility in affective computing and healthcare. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-05347-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 10/23/2022]
|
12
|
A Self-Care Prediction Model for Children with Disability Based on Genetic Algorithm and Extreme Gradient Boosting. MATHEMATICS 2020. [DOI: 10.3390/math8091590] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
Detecting self-care problems is one of the important and challenging issues for occupational therapists, since it requires a complex and time-consuming process. Machine learning algorithms have recently been applied to overcome this issue. In this study, we propose a self-care prediction model called GA-XGBoost, which combines genetic algorithms (GAs) with extreme gradient boosting (XGBoost) to predict self-care problems of children with disabilities. Because the choice of feature subset affects model performance, we use a GA to search for the optimum feature subset and thereby improve the model's performance. To validate the effectiveness of GA-XGBoost, we present six experiments: comparison of GA-XGBoost with other machine learning models, comparison with previous study results, a statistical significance test, impact analysis of feature selection, comparison with other feature selection methods, and sensitivity analysis of GA parameters. In the experiments, we use accuracy, precision, recall, and F1 score to measure the performance of the prediction models. The results show that GA-XGBoost performs better than the other prediction models and the previous study results. In addition, we design and develop a web-based self-care prediction tool to help therapists diagnose the self-care problems of children with disabilities, so that appropriate treatment or therapy can be provided to each child to improve their therapeutic outcome.
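The idea of using a genetic algorithm to search for a feature subset can be sketched with a binary mask per individual. In the paper the fitness of a mask would come from the XGBoost model's performance; here a toy fitness function is substituted so the sketch stays self-contained, and every name and parameter value (`ga_select`, `toy_fitness`, population size, mutation rate) is an assumption made for illustration.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30,
              p_mut=0.05, seed=0):
    """Evolve binary feature masks (1 = keep feature) and return the best.
    Uses binary tournament selection, one-point crossover, bit-flip
    mutation and elitism. Requires n_features >= 2."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def pick():
        a, b = rng.sample(pop, 2)            # binary tournament
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]                 # elitism: keep the current best
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]      # one-point crossover
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

# Toy stand-in for cross-validated model accuracy: features 0, 3 and 5 are
# treated as informative, and every selected feature carries a small cost.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    gain = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return gain - 0.1 * sum(mask)

best_mask = ga_select(8, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Replacing `toy_fitness` with a function that trains and scores a classifier on the masked features would turn this into a wrapper-style selector, at the cost of one model fit per fitness evaluation.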
|