1
Bhandari N, Walambe R, Kotecha K, Khare SP. A comprehensive survey on computational learning methods for analysis of gene expression data. Front Mol Biosci 2022; 9:907150. [PMID: 36458095 PMCID: PMC9706412 DOI: 10.3389/fmolb.2022.907150] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/29/2022] [Accepted: 09/28/2022] [Indexed: 09/19/2023]
Abstract
Computational analysis methods including machine learning have a significant impact in the fields of genomics and medicine. High-throughput gene expression analysis methods such as microarray technology and RNA sequencing produce enormous amounts of data. Traditionally, statistical methods are used for comparative analysis of gene expression data. However, more complex analysis for classification of sample observations, or discovery of feature genes requires sophisticated computational approaches. In this review, we compile various statistical and computational tools used in analysis of expression microarray data. Even though the methods are discussed in the context of expression microarrays, they can also be applied for the analysis of RNA sequencing and quantitative proteomics datasets. We discuss the types of missing values, and the methods and approaches usually employed in their imputation. We also discuss methods of data normalization, feature selection, and feature extraction. Lastly, methods of classification and class discovery along with their evaluation parameters are described in detail. We believe that this detailed review will help the users to select appropriate methods for preprocessing and analysis of their data based on the expected outcome.
Affiliation(s)
- Nikita Bhandari
  - Computer Science Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
- Rahee Walambe
  - Electronics and Telecommunication Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
  - Symbiosis Center for Applied AI (SCAAI), Symbiosis International (Deemed University), Pune, India
- Ketan Kotecha
  - Computer Science Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
  - Symbiosis Center for Applied AI (SCAAI), Symbiosis International (Deemed University), Pune, India
- Satyajeet P. Khare
  - Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Pune, India
2
Archer R, Hock E, Hamilton J, Stevens J, Essat M, Poku E, Clowes M, Pandor A, Stevenson M. Assessing prognosis and prediction of treatment response in early rheumatoid arthritis: systematic reviews. Health Technol Assess 2019; 22:1-294. [PMID: 30501821 DOI: 10.3310/hta22660] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 12/20/2022]
Abstract
BACKGROUND Rheumatoid arthritis (RA) is a chronic, debilitating disease associated with reduced quality of life and substantial costs. It is unclear which tests and assessment tools allow the best assessment of prognosis in people with early RA and whether or not variables predict the response of patients to different drug treatments.
OBJECTIVE To systematically review evidence on the use of selected tests and assessment tools in patients with early RA (1) in the evaluation of a prognosis (review 1) and (2) as predictive markers of treatment response (review 2).
DATA SOURCES Electronic databases (e.g. MEDLINE, EMBASE, The Cochrane Library, Web of Science Conference Proceedings; searched to September 2016), registers, key websites, hand-searching of reference lists of included studies and key systematic reviews and contact with experts.
STUDY SELECTION Review 1 - primary studies on the development, external validation and impact of clinical prediction models for selected outcomes in adult early RA patients. Review 2 - primary studies on the interaction between selected baseline covariates and treatment (conventional and biological disease-modifying antirheumatic drugs) on salient outcomes in adult early RA patients.
RESULTS Review 1 - 22 model development studies and one combined model development/external validation study reporting 39 clinical prediction models were included. Five external validation studies evaluating eight clinical prediction models for radiographic joint damage were also included. c-statistics from internal validation ranged from 0.63 to 0.87 for radiographic progression (different definitions, six studies) and 0.78 to 0.82 for the Health Assessment Questionnaire (HAQ). Predictive performance in external validations varied considerably. Three models [(1) Active controlled Study of Patients receiving Infliximab for the treatment of Rheumatoid arthritis of Early onset (ASPIRE) C-reactive protein (ASPIRE CRP), (2) ASPIRE erythrocyte sedimentation rate (ASPIRE ESR) and (3) Behandelings Strategie (BeSt)] were externally validated using the same outcome definition in more than one population. Results of the random-effects meta-analysis suggested substantial uncertainty in the expected predictive performance of models in a new sample of patients. Review 2 - 12 studies were identified. Covariates examined included anti-citrullinated protein/peptide antibody (ACPA) status, smoking status, erosions, rheumatoid factor status, C-reactive protein level, erythrocyte sedimentation rate, swollen joint count (SJC), body mass index and vascularity of synovium on power Doppler ultrasound (PDUS). Outcomes examined included erosions/radiographic progression, disease activity, physical function and Disease Activity Score-28 remission. There was statistical evidence to suggest that ACPA status, SJC and PDUS status at baseline may be treatment effect modifiers, but not necessarily that they are prognostic of response for all treatments. Most of the results were subject to considerable uncertainty and were not statistically significant.
LIMITATIONS The meta-analysis in review 1 was limited by the availability of only a small number of external validation studies. Studies rarely investigated the interaction between predictors and treatment.
SUGGESTED RESEARCH PRIORITIES Collaborative research (including the use of individual participant data) is needed to further develop and externally validate the clinical prediction models. The clinical prediction models should be validated with respect to individual treatments. Future assessments of treatment by covariate interactions should follow good statistical practice.
CONCLUSIONS Review 1 - uncertainty remains over the optimal prediction model(s) for use in clinical practice. Review 2 - in general, there was insufficient evidence that the effect of treatment depended on baseline characteristics.
STUDY REGISTRATION This study is registered as PROSPERO CRD42016042402.
FUNDING The National Institute for Health Research Health Technology Assessment programme.
Affiliation(s)
- Rachel Archer
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Emma Hock
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Jean Hamilton
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- John Stevens
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Munira Essat
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Edith Poku
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Mark Clowes
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Abdullah Pandor
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
- Matt Stevenson
  - School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK
3
Oh H, Park J, Seo W. Identification of symptom clusters and their synergistic effects on quality of life in rheumatoid arthritis patients. Int J Nurs Pract 2018; 25:e12713. [PMID: 30456915 DOI: 10.1111/ijn.12713] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/23/2018] [Revised: 10/24/2018] [Accepted: 10/27/2018] [Indexed: 11/28/2022]
Abstract
AIMS To examine the presence of symptom clusters and synergistic effects of symptom clusters on quality of life in rheumatoid arthritis patients. BACKGROUND Rheumatoid arthritis patients frequently experience multiple concurrent symptoms of pain, fatigue, and depression. DESIGN A nonexperimental, cross-sectional correlation design. METHODS The study participants were 179 rheumatoid arthritis patients. Data were collected between August and December 2016. A hypothetical model was developed based on the Theory of Unpleasant Symptoms Model: physiological antecedents included disease activity and obesity; symptoms of pain, fatigue, and depression were hypothesized as being clustered, and quality of life was taken as the outcome variable. RESULTS Disease activity had significant direct effects on pain, fatigue, and depression and indirect effects on fatigue and depression, whereas obesity had a significant direct effect on fatigue alone. Three symptom clusters, namely, pain-fatigue, fatigue-depression, and pain-fatigue-depression, were identified and found to have significant synergistic effects on quality of life. CONCLUSIONS Our findings support the importance of managing clusters of symptoms simultaneously, that is, collective symptom management. Inter-cluster dynamics between symptoms should be considered when nurses develop symptom management strategies or self-management programs to improve the quality of life of rheumatoid arthritis patients.
Affiliation(s)
- HyunSoo Oh
  - Department of Nursing, Inha University, Incheon, Republic of Korea
- JiSuk Park
  - Department of Nursing, Inha University Hospital, Incheon, Republic of Korea
- WhaSook Seo
  - Department of Nursing, Inha University, Incheon, Republic of Korea
4
Chetina EV, Markova GA. [Upcoming value of gene expression analysis in rheumatology]. BIOMEDIT︠S︡INSKAI︠A︡ KHIMII︠A︡ 2018; 64:221-232. [PMID: 29964257 DOI: 10.18097/pbmc20186403221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Rheumatoid arthritis (RA) is a chronic inflammatory disease of unknown etiology, which involves disturbances in immune system signaling pathways, damage to other tissues, pain, and joint destruction. Modern treatment attempts to correct the pathophysiological and biochemical mechanisms disrupted by the disease. However, because RA patients are heterogeneous, a personalized approach to treatment is required; the choice of personalized treatment is complicated by the variability of patients' responses to treatment. Gene expression analysis might serve as a tool for disease control and therapy personalization, for inhibiting inflammation and pain as well as for preventing joint destruction.
Affiliation(s)
- E V Chetina
  - Nasonova Research Institute of Rheumatology, Moscow, Russia
- G A Markova
  - Nasonova Research Institute of Rheumatology, Moscow, Russia
5
K Nearest Neighbor Algorithm Coupled with Metabonomics to Study the Therapeutic Mechanism of Sendeng-4 in Adjuvant-Induced Rheumatoid Arthritis Rat. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE 2018; 2018:2484912. [PMID: 29681970 PMCID: PMC5842719 DOI: 10.1155/2018/2484912] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/14/2017] [Accepted: 01/24/2018] [Indexed: 01/13/2023]
Abstract
As a traditional Mongolian medicine, Sendeng-4 (SD) has been widely used to treat rheumatoid arthritis (RA) in Inner Mongolia and exhibits a good curative effect. Unfortunately, due to geographical factors, it is difficult to popularize this drug throughout the whole country, and the mechanism of action of SD has been unclear. In this study, a serum metabolite profile analysis was performed to identify potential biomarkers associated with adjuvant-induced RA and investigate the mechanism of action of SD. Ultraperformance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC-Q-TOF-MS) was performed for the metabonomics analysis. K nearest neighbor (KNN) models were established in both positive and negative spectra for classifying data from the control, model, and SD administration groups. Accuracy rate for classification was 95.8% in positive ion mode and 91.7% in negative ion mode. Orthogonal partial least squares discriminant analysis (OPLS-DA) enabled the identification of 12 metabolites as potential biomarkers of adjuvant-induced RA. After treatment with SD, the levels of uridine triphosphate, calcitroic acid, dynorphin B (6-9), and docosahexaenoic acid were restored to normal, indicating that SD likely ameliorated RA by regulating the levels of these biomarkers. This study identified early biomarkers of RA and elucidated the underlying mechanism of action of SD, which is worth further investigation for development as a clinical therapy.
6
Tchetina E, Markova G. The clinical utility of gene expression examination in rheumatology. Mediterr J Rheumatol 2017; 28:116-126. [PMID: 32185269 PMCID: PMC7046055 DOI: 10.31138/mjr.28.3.116] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/27/2017] [Accepted: 05/24/2017] [Indexed: 01/09/2023]
Abstract
Rheumatoid arthritis (RA) is a chronic inflammatory disease with unknown etiology that affects various pathways within the immune system, involves many other tissues and is associated with pain and joint destruction. Current treatments fail to address pathophysiological and biochemical mechanisms involved in joint degeneration and the induction of pain. Moreover, RA patients are extremely heterogeneous and require specific treatments, the choice of which is complicated by the fact that not all patients respond equally to therapy. Gene expression analysis offers tools for patient management and personalization of patient care to meet individual needs in controlling inflammation and pain and delaying joint destruction.
Affiliation(s)
- Elena Tchetina
  - Immunology and Molecular Biology Laboratory, Nasonova Research Institute of Rheumatology, Moscow, Russia
- Galina Markova
  - Immunology and Molecular Biology Laboratory, Nasonova Research Institute of Rheumatology, Moscow, Russia
7
Márquez A, Martín J, Carmona FD. Emerging aspects of molecular biomarkers for diagnosis, prognosis and treatment response in rheumatoid arthritis. Expert Rev Mol Diagn 2016; 16:663-75. [DOI: 10.1080/14737159.2016.1174579] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/13/2022]
8
Wolf BJ, Slate EH, Hill EG. Ordinal Logic Regression: A classifier for discovering combinations of binary markers for ordinal outcomes. Comput Stat Data Anal 2015; 82:152-163. [PMID: 25892835 DOI: 10.1016/j.csda.2014.08.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
In medicine, it is often useful to stratify patients according to disease risk, severity, or response to therapy. Since many diseases arise from complex gene-gene and gene-environment interactions, patient strata may be defined by combinations of genetic and environmental factors. Traditional statistical methods require specifying interactions a priori, making it difficult to identify high-order interactions. Alternatively, machine learning methods can model complex interactions; however, these models are often difficult to interpret in a clinical setting. Logic regression (LR) enables modeling a binary outcome using logical combinations of binary predictors, yielding easily interpretable models. However, LR, as currently available, cannot model ordinal responses. This paper extends LR to model an ordinal response, and the resulting method is called Ordinal Logic Regression (OLR). Several simulations comparing OLR and Classification and Regression Trees (CART) demonstrate that OLR is superior to CART for identifying variable interactions associated with an ordinal response. OLR is applied to data from a study to determine associations between genetic and health factors with severity of adult periodontitis.
Affiliation(s)
- Bethany J Wolf
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC 29464
- Elizabeth H Slate
  - Department of Statistics, Florida State University, Tallahassee, FL 32306
- Elizabeth G Hill
  - Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC 29464
9
Shazadi K, Petrovski S, Roten A, Miller H, Huggins RM, Brodie MJ, Pirmohamed M, Johnson MR, Marson AG, O'Brien TJ, Sills GJ. Validation of a multigenic model to predict seizure control in newly treated epilepsy. Epilepsy Res 2014; 108:1797-805. [PMID: 25282706 DOI: 10.1016/j.eplepsyres.2014.08.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 04/16/2014] [Revised: 08/01/2014] [Accepted: 08/31/2014] [Indexed: 11/16/2022]
Abstract
A multigenic classifier based on five single nucleotide polymorphisms (SNPs) was previously reported to predict treatment response in an Australian newly-diagnosed epilepsy cohort using a k-nearest neighbour (kNN) algorithm. We assessed the validity of this classifier in predicting response to initial antiepileptic drug (AED) treatment in two UK cohorts of newly-diagnosed epilepsy and investigated the utility of these five SNPs in predicting seizure control in general. The original Australian cohort constituted the training set for the classifier and was used to predict response to the first well-tolerated AED monotherapy in independently recruited UK cohorts (Glasgow, n=281; SANAD, n=491). A "leave-one-out" cross-validation was also employed, with training sets derived internally from the UK datasets. The multigenic classifier using the Australian cohort as the training set was unable to predict treatment response in either UK cohort. In the "leave-one-out" analysis, the five SNPs collectively predicted treatment response in both Glasgow and SANAD patients prescribed either carbamazepine or valproate (Glasgow OR=3.1, 95% CI=1.4-6.6, p=0.018; SANAD OR=2.8, 95% CI=1.3-6.1, p=0.048), but not those receiving lamotrigine (Glasgow OR=1.3, 95% CI=0.6-2.8, p=1.0; SANAD OR=2.2, 95% CI=0.9-5.4, p=0.36) or other AEDs (Glasgow OR=0.6, 95% CI=0.2-2.0, p=1.0; SANAD OR=1.9, 95% CI=0.9-4.2, p=0.36). The Australian-based multigenic kNN model is not predictive of initial treatment response in UK cohorts of newly-diagnosed epilepsy. However, the five SNPs identified in the original Australian study appear to collectively have a predictive influence in UK patients prescribed either carbamazepine or valproate.
Affiliation(s)
- Kanvel Shazadi
  - Department of Molecular and Clinical Pharmacology, University of Liverpool, Liverpool, UK.
- Slavé Petrovski
  - Department of Medicine (RMH/WH), University of Melbourne, Melbourne, VIC, Australia; Department of Neurology, Royal Melbourne Hospital, Melbourne, VIC, Australia; BioGrid Australia, Melbourne, VIC, Australia.
- Annie Roten
  - Department of Neurology, Royal Melbourne Hospital, Melbourne, VIC, Australia.
- Hugh Miller
  - Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia.
- Richard M Huggins
  - Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia.
- Munir Pirmohamed
  - Department of Molecular and Clinical Pharmacology, University of Liverpool, Liverpool, UK.
- Anthony G Marson
  - Department of Molecular and Clinical Pharmacology, University of Liverpool, Liverpool, UK.
- Terence J O'Brien
  - Department of Medicine (RMH/WH), University of Melbourne, Melbourne, VIC, Australia; Department of Neurology, Royal Melbourne Hospital, Melbourne, VIC, Australia; BioGrid Australia, Melbourne, VIC, Australia.
- Graeme J Sills
  - Department of Molecular and Clinical Pharmacology, University of Liverpool, Liverpool, UK.
10
Burska AN, Roget K, Blits M, Soto Gomez L, van de Loo F, Hazelwood LD, Verweij CL, Rowe A, Goulielmos GN, van Baarsen LGM, Ponchel F. Gene expression analysis in RA: towards personalized medicine. THE PHARMACOGENOMICS JOURNAL 2014; 14:93-106. [PMID: 24589910 PMCID: PMC3992869 DOI: 10.1038/tpj.2013.48] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 08/16/2013] [Revised: 10/29/2013] [Accepted: 11/26/2013] [Indexed: 12/13/2022]
Abstract
Gene expression has recently been at the forefront of advances in personalized medicine, notably in the fields of cancer and transplantation, providing a rationale for a similar approach in rheumatoid arthritis (RA). RA is a prototypic inflammatory autoimmune disease with a poorly understood etiopathogenesis. Inflammation is the main feature of RA; however, many biological processes are involved at different stages of the disease. Gene expression signatures offer management tools to meet the current needs for personalization of RA patients' care. This review analyses currently available information with respect to RA diagnosis, prognosis and prediction of response to therapy, with a view to highlighting the abundance of data, whose comparison is often inconclusive due to the mixed use of material sources, experimental methodologies and analysis tools, reinforcing the need for harmonization if gene expression signatures are to become a useful clinical tool in personalized medicine for RA patients.
Affiliation(s)
- A N Burska
  - Leeds Institute of Rheumatic and Musculoskeletal Medicine and Leeds Musculoskeletal Biomedical Research Unit, The University of Leeds, Leeds, UK
- K Roget
  - TcLand Expression, Huningue, France
- M Blits
  - Department of Pathology and Rheumatology, Inflammatory Disease Profiling Unit, VU University Medical Center, Amsterdam, The Netherlands
- L Soto Gomez
  - School of Law, The University of Leeds, Leeds, UK
- F van de Loo
  - Department of Rheumatology Research and Advanced Therapeutics, Nijmegen Centre for Molecular Life Sciences, Nijmegen, The Netherlands
- L D Hazelwood
  - School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, UK
- C L Verweij
  - Department of Pathology and Rheumatology, Inflammatory Disease Profiling Unit, VU University Medical Center, Amsterdam, The Netherlands
- A Rowe
  - Janssen Research and Development, High Wycombe, UK
- G N Goulielmos
  - Molecular Medicine and Human Genetics Section, Department of Medicine, University of Crete, Heraklion, Greece
- L G M van Baarsen
  - Clinical Immunology and Rheumatology, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
- F Ponchel
  - Leeds Institute of Rheumatic and Musculoskeletal Medicine and Leeds Musculoskeletal Biomedical Research Unit, The University of Leeds, Leeds, UK
11
Yildirim P, Çeken Ç, Hassanpour R, Tolun MR. Prediction of Similarities Among Rheumatic Diseases. J Med Syst 2010; 36:1485-90. [DOI: 10.1007/s10916-010-9609-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 08/05/2010] [Accepted: 10/05/2010] [Indexed: 11/24/2022]